In natural language processing models like BERT, what do the "attention mask" and "pad token id" primarily contribute to?
A) Sentence segmentation
B) Named entity recognition
C) Masked language modeling
D) Sequence classification
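
For context on the two terms in the question, here is a minimal sketch in plain Python of how a pad token id and an attention mask are typically used when batching sequences of different lengths. The helper `pad_batch` is hypothetical (not from any library), the token ids are illustrative, and a pad id of 0 is an assumption that matches the `[PAD]` entry of the bert-base-uncased vocabulary; other models may use a different id.

```python
# Assumption: 0 is the pad token id (as in the bert-base-uncased vocabulary).
PAD_TOKEN_ID = 0


def pad_batch(sequences, pad_token_id=PAD_TOKEN_ID):
    """Hypothetical helper: right-pad token-id lists to a common length.

    Returns (input_ids, attention_mask), where the mask holds 1 for real
    tokens and 0 for padding positions the model should ignore.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        # Fill the tail of shorter sequences with the pad token id.
        input_ids.append(list(seq) + [pad_token_id] * n_pad)
        # Mark real tokens with 1 and padded positions with 0.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


# Illustrative token ids for two sequences of different lengths.
batch = [[101, 7592, 102], [101, 7592, 2088, 999, 102]]
input_ids, attention_mask = pad_batch(batch)
print(input_ids)
print(attention_mask)
```

In this sketch the pad token id only fills out shorter sequences so the batch is rectangular, and the attention mask tells the model which positions are padding and should not be attended to.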