In natural language processing models like BERT, what do the "attention mask" and "pad token id" primarily contribute to?
A) Sentence segmentation
B) Named entity recognition
C) Masked language modeling
D) Sequence classification
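
For context, here is a minimal sketch of what these two inputs do mechanically when sequences of different lengths are batched together. It is plain Python and not any library's actual implementation. The function name `pad_batch`, the token ids, and the choice of 0 as the pad token id are illustrative assumptions.

```python
from typing import List, Tuple

# Assumption for illustration: the pad token id is 0.
PAD_TOKEN_ID = 0


def pad_batch(
    batch: List[List[int]], pad_token_id: int = PAD_TOKEN_ID
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad every sequence to the longest length in the batch.

    Returns (input_ids, attention_mask). The mask holds 1 for real
    tokens and 0 for padding, so the model can ignore padded positions.
    """
    max_len = max(len(seq) for seq in batch)
    input_ids, attention_mask = [], []
    for seq in batch:
        n_pad = max_len - len(seq)
        # Fill the tail with the pad token id so all rows share one length.
        input_ids.append(seq + [pad_token_id] * n_pad)
        # Mark real tokens with 1 and padded positions with 0.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


# Hypothetical token ids for two sentences of different lengths.
ids, mask = pad_batch([[101, 7592, 102], [101, 7592, 2088, 999, 102]])
print(ids)
print(mask)
```

The pad token id only fills the shorter rows out to a common length, and the attention mask tells the model which positions are that filler.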