Research Article


Gender Bias in Occupation Classification from the New York Times Obituaries

Year 2022, Volume 24, Issue 71, 425-436, 16.05.2022
https://doi.org/10.21205/deufmd.2022247109

Abstract

Technological innovations such as artificial intelligence can amplify the prejudices already present in society, regardless of the developers' intentions. Researchers should therefore be aware of the ethical issues that may accompany a developed product or solution. In this study, we investigate the effect of gender bias, one such social prejudice, on occupation classification. For this purpose, we created a new dataset by collecting obituaries from the New York Times website and provide it in two versions: with and without gender indicators. The class distributions in this dataset show a dependence between the gender and occupation variables, so gender indicators are expected to affect occupation prediction. To test this effect, we perform occupation classification using SVM (Support Vector Machine), HAN (Hierarchical Attention Network), and DistilBERT-based classifiers. In addition to these single-task models, we evaluate a multi-task model in which occupation and gender are learned jointly, to gain further insight into the relationship between gender and occupation in classification problems. Experimental results reveal that gender bias affects occupation classification.

References

  • Thelwall, M. 2018. Gender Bias in Sentiment Analysis. Online Information Review, Vol. 42, p. 7, DOI: https://doi.org/10.1108/OIR-05-2017-0139
  • Bölükbaşı, T., Chang, K.-W., Zou, J., Saligrama, V., Kalai, A. 2016. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. Proceedings of the 30th International Conference on Neural Information Processing Systems, 5-10 December, Barcelona, Spain, 4356-4364.
  • Caliskan, A., Bryson, J. J., Narayanan, A. 2017. Semantics derived automatically from language corpora contain human-like biases. Science, Vol. 356, p. 183-186, DOI: 10.1126/science.aal4230
  • Garg, N., Schiebinger, L., Jurafsky, D., Zou, J. 2018. Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences, Vol. 115, p. E3635-E3644. DOI: 10.1073/pnas.1720347115
  • Buolamwini, J., Gebru, T. 2018. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. Proceedings of the 1st Conference on Fairness, Accountability and Transparency, 23-24 February, New York, USA, 77-91.
  • Caplar, N., Tacchella, S., Birrer, S. 2017. Quantitative Evaluation of Gender Bias in Astronomical Publications from Citation Counts. Nature Astronomy, Vol. 1, p. 8, DOI: https://doi.org/10.1038/s41550-017-0141
  • Fu, L., Danescu-Niculescu-Mizil, C., Lee, L. 2016. Tie Breaker: Using Language Models to Quantify Gender Bias in Sports Journalism. Proceedings of the IJCAI Workshop on NLP Meets Journalism, 10 July, New York, USA.
  • De-Arteaga, M., Romanov, A., Wallach, H., Chayes, J., Borgs, C., Chouldechova, A., Geyik, S., Kenthapadi, K., Kalai, A. T. 2019. Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting. Proceedings of the ACM Conference on Fairness, Accountability, and Transparency, 29-31 January, Atlanta, USA, 120-128.
  • Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L. 2018. Deep Contextualized Word Representations. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2018), 1-6 June, New Orleans, USA, 2227-2237. DOI: 10.18653/v1/N18-1202
  • Devlin, J., Chang, M., Lee, K., Toutanova, K. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2019), 2-7 June, Minneapolis, USA, 4171-4186. DOI: 10.18653/v1/N19-1423
  • Basta, C., Costa-Jussa, M., Casas, N. 2019. Evaluating the Underlying Gender Bias in Contextualized Word Embeddings. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), 28 July-2 August, Florence, Italy, 33-39. DOI: 10.18653/v1/W19-3805
  • Tan, Y.C., Celis, L.E. 2019. Assessing Social and Intersectional Biases in Contextualized Word Representations. Advances in Neural Information Processing Systems (NEURIPS 2019), 8-14 December, Vancouver, Canada, 13209-13220.
  • Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D., Wu, J., Winter, C., Hesse, C., Chen, M., Sigler, E., Litwin, M., Gray, S., Chess, B., Clark, J., Berner, C., McCandlish, S., Radford, A., Sutskever, I., Amodei, D. 2020. Language Models are Few-Shot Learners. Advances in Neural Information Processing Systems (NEURIPS 2020), 1877-1901.
  • Garrido-Muñoz, I., Montejo-Ráez, A., Martínez-Santiago, F., Ureña-López, L.A. 2021. A Survey on Bias in Deep NLP. Applied Sciences, Vol. 11, p. 3184. DOI: 10.3390/app11073184
  • Webster, K., Wang, X., Tenney, I., Beutel, A., Pitler, E., Pavlick, E., Chen, J., Petrov, S. 2020. Measuring and Reducing Gendered Correlations in Pre-trained Models. https://arxiv.org/abs/2010.06032
  • Romanov, A., De-Arteaga, M., Wallach, H., Chayes, J., Borgs, C., Chouldechova, A., Geyik, S., Kenthapadi, K., Rumshisky, A., Kalai, A. 2019. What's in a Name? Reducing Bias in Bios without Access to Protected Attributes. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2019), 2-7 June, Minneapolis, USA, 4187-4195. DOI: 10.18653/v1/N19-1424
  • Kiritchenko, S., Mohammad, S. 2018. Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems. Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics, June 5-6, New Orleans, USA, 43-53. DOI: 10.18653/v1/S18-2005
  • Stanovsky, G., Smith, N.A., Zettlemoyer, L. 2019. Evaluating Gender Bias in Machine Translation. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), 28 July-2 August, Florence, Italy, 1679-1684. DOI: 10.18653/v1/P19-1164
  • Prates, M.O.R., Avelar, P.H., Lamb, L.C. 2020. Assessing gender bias in machine translation: a case study with Google Translate, Neural Comput & Applic, Vol. 32, p. 6363-6381. DOI: 10.1007/s00521-019-04144-6
  • Subramanian, S., Han, X., Baldwin, T., Cohn, T., Frermann, L. 2021. Evaluating Debiasing Techniques for Intersectional Biases. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), 2492-2498.
  • Basu Roy Chowdhury, S., Ghosh, S., Li, Y., Oliva, J., Srivastava, S., Chaturvedi, S. 2021. Adversarial Scrubbing of Demographic Information for Text Classification. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), 550-562.
  • Bureau of Labor Statistics. 2018 Standard Occupational Classification. https://www.bls.gov/soc/2018. (Access Date: 20 September 2019).
  • Yang, Z., Yang, D., Dyer, C., He, X., Smola, A., Hovy, E. 2016. Hierarchical Attention Networks for Document Classification. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2016), June, San Diego, USA, 1480-1489. DOI: 10.18653/v1/N16-1174
  • Sanh, V., Debut, L., Chaumond, L., Wolf, T. 2020. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. http://arxiv.org/abs/1910.01108
  • Ruder, S. 2017. An Overview of Multi-Task Learning in Deep Neural Networks. https://ruder.io/multi-task/. (Access Date: December 2019).

Details

Primary Language English
Subjects Engineering
Journal Section Research Article
Authors

Ceren Atik 0000-0002-0843-7026

Selma Tekir 0000-0002-0488-9682

Publication Date May 16, 2022
Published in Issue Year 2022

Cite

APA Atik, C., & Tekir, S. (2022). Gender Bias in Occupation Classification from the New York Times Obituaries. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen Ve Mühendislik Dergisi, 24(71), 425-436. https://doi.org/10.21205/deufmd.2022247109
AMA Atik C, Tekir S. Gender Bias in Occupation Classification from the New York Times Obituaries. DEUFMD. May 2022;24(71):425-436. doi:10.21205/deufmd.2022247109
Chicago Atik, Ceren, and Selma Tekir. “Gender Bias in Occupation Classification from the New York Times Obituaries”. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen Ve Mühendislik Dergisi 24, no. 71 (May 2022): 425-36. https://doi.org/10.21205/deufmd.2022247109.
EndNote Atik C, Tekir S (May 1, 2022) Gender Bias in Occupation Classification from the New York Times Obituaries. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi 24 71 425–436.
IEEE C. Atik and S. Tekir, “Gender Bias in Occupation Classification from the New York Times Obituaries”, DEUFMD, vol. 24, no. 71, pp. 425–436, 2022, doi: 10.21205/deufmd.2022247109.
ISNAD Atik, Ceren - Tekir, Selma. “Gender Bias in Occupation Classification from the New York Times Obituaries”. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi 24/71 (May 2022), 425-436. https://doi.org/10.21205/deufmd.2022247109.
JAMA Atik C, Tekir S. Gender Bias in Occupation Classification from the New York Times Obituaries. DEUFMD. 2022;24:425–436.
MLA Atik, Ceren and Selma Tekir. “Gender Bias in Occupation Classification from the New York Times Obituaries”. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen Ve Mühendislik Dergisi, vol. 24, no. 71, 2022, pp. 425-36, doi:10.21205/deufmd.2022247109.
Vancouver Atik C, Tekir S. Gender Bias in Occupation Classification from the New York Times Obituaries. DEUFMD. 2022;24(71):425-36.
