This project investigates how written English has evolved over the past century, using a machine learning model trained on leading articles from The New York Times published between 1920 and 2020. The primary aim is to predict the year in which a given sentence could have been written from its linguistic patterns, including word usage and sentence structure. Analyzing these patterns gives insight into how the styles and trends of written English have changed over time. The model's predictions are grounded in extensive data analysis and machine learning techniques, which supports a high degree of accuracy. The study highlights the dynamic nature of language and demonstrates how computational methods can be applied in linguistic research. Its findings are relevant to historical linguistics and literary studies because they offer a quantifiable method for tracking linguistic change. The work can also aid the development of tools for temporal text classification, benefiting fields such as digital humanities and archival studies. Understanding how language evolves is important for preserving cultural heritage and for improving communication strategies across media.
language evolution, machine learning, historical linguistics, text analysis, computational linguistics
Primary Language | English |
---|---|
Subjects | Data Mining and Knowledge Discovery, Artificial Intelligence (Other) |
Journal Section | Articles |
Authors | |
Early Pub Date | December 22, 2024 |
Publication Date | December 22, 2024 |
Submission Date | November 8, 2024 |
Acceptance Date | December 21, 2024 |
Published in Issue | Year: 2024, Volume: 8, Issue: 2 |