Identifying the author of a given text is a well-studied yet complicated task. It requires thorough knowledge of different authors’ writing styles and the ability to discriminate between them. As the main contribution of this paper, we propose to perform this task using machine learning and deep learning methods, state-of-the-art algorithms that are used in numerous complex Natural Language Processing (NLP) problems. We performed our experiments on a text corpus of daily newspaper columns written by thirty authors. The experimental results show that document embeddings trained via a neural network architecture achieve cutting-edge accuracy in learning writing styles and identifying the authors of given writings, even though the dataset has a considerably unbalanced distribution. We present our experimental results and release our code as a GitHub repository for interested readers and NLP enthusiasts, who can reproduce and confirm the results and modify the code according to their own needs.
Natural Language Processing, Document Embeddings, Logistic Regression, Support Vector Machines, Author Identification
Primary Language | English |
---|---|
Subjects | Engineering |
Journal Section | Articles |
Authors | Şükrü Ozan, Davut Emre Taşar, Umut Özdil |
Supporting Institution | TÜBİTAK |
Project Number | 3190585 |
Thanks | This work is part of the project “General Purpose Chatbot Application That Can Produce Meaningful Dialog via Machine Learning Algorithms”, supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) under the TEYDEB-1501 program, Project No. 3190585. |
Publication Date | June 28, 2021 |
Acceptance Date | May 3, 2021 |
Published in Issue | Year 2021, Volume 17, Issue 2 |
BibTeX | @article{cbayarfbe846016, journal = {Celal Bayar University Journal of Science}, issn = {1305-130X}, eissn = {1305-1385}, address = {}, publisher = {Celal Bayar University}, year = {2021}, volume = {17}, number = {2}, pages = {137--143}, doi = {10.18466/cbayarfbe.846016}, title = {Deep Feature Generation for Author Identification}, key = {cite}, author = {Ozan, Şükrü and Taşar, Davut Emre and Özdil, Umut} } |
APA | Ozan, Ş., Taşar, D. E. & Özdil, U. (2021). Deep Feature Generation for Author Identification. Celal Bayar University Journal of Science, 17 (2), 137-143. DOI: 10.18466/cbayarfbe.846016 |
MLA | Ozan, Ş., Taşar, D. E., Özdil, U. "Deep Feature Generation for Author Identification". Celal Bayar University Journal of Science 17 (2021): 137-143 <https://dergipark.org.tr/en/pub/cbayarfbe/issue/63104/846016> |
Chicago | Ozan, Ş., Taşar, D. E., Özdil, U. "Deep Feature Generation for Author Identification". Celal Bayar University Journal of Science 17 (2021): 137-143 |
RIS | TY - JOUR T1 - Deep Feature Generation for Author Identification AU - Şükrü Ozan, Davut Emre Taşar, Umut Özdil Y1 - 2021 PY - 2021 N1 - doi: 10.18466/cbayarfbe.846016 DO - 10.18466/cbayarfbe.846016 T2 - Celal Bayar University Journal of Science JF - Journal JO - JOR SP - 137 EP - 143 VL - 17 IS - 2 SN - 1305-130X-1305-1385 M3 - doi: 10.18466/cbayarfbe.846016 UR - https://doi.org/10.18466/cbayarfbe.846016 Y2 - 2021 ER - |
EndNote | %0 Celal Bayar University Journal of Science Deep Feature Generation for Author Identification %A Şükrü Ozan, Davut Emre Taşar, Umut Özdil %T Deep Feature Generation for Author Identification %D 2021 %J Celal Bayar University Journal of Science %P 1305-130X-1305-1385 %V 17 %N 2 %R doi: 10.18466/cbayarfbe.846016 %U 10.18466/cbayarfbe.846016 |
ISNAD | Ozan, Şükrü, Taşar, Davut Emre, Özdil, Umut. "Deep Feature Generation for Author Identification". Celal Bayar University Journal of Science 17 / 2 (June 2021): 137-143. https://doi.org/10.18466/cbayarfbe.846016 |
AMA | Ozan Ş., Taşar D. E., Özdil U. Deep Feature Generation for Author Identification. CBUJOS. 2021; 17(2): 137-143. |
Vancouver | Ozan Ş., Taşar D. E., Özdil U. Deep Feature Generation for Author Identification. Celal Bayar University Journal of Science. 2021; 17(2): 137-143. |
IEEE | Ş. Ozan, D. E. Taşar and U. Özdil, "Deep Feature Generation for Author Identification", Celal Bayar University Journal of Science, vol. 17, no. 2, pp. 137-143, Jun. 2021, doi:10.18466/cbayarfbe.846016 |