<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.4 20241031//EN"
        "https://jats.nlm.nih.gov/publishing/1.4/JATS-journalpublishing1-4.dtd">
<article  article-type="research-article"        dtd-version="1.4">
            <front>

                <journal-meta>
                                    <journal-id></journal-id>
            <journal-title-group>
                                                                                    <journal-title>Bilişim Teknolojileri Dergisi</journal-title>
            </journal-title-group>
                            <issn pub-type="ppub">1307-9697</issn>
                                        <issn pub-type="epub">2147-0715</issn>
                                                                                            <publisher>
                    <publisher-name>Gazi Üniversitesi</publisher-name>
                </publisher>
                    </journal-meta>
                <article-meta>
                                        <article-id pub-id-type="doi">10.17671/gazibtd.714447</article-id>
                                                                <article-categories>
                                            <subj-group  xml:lang="en">
                                                            <subject>Computer Software</subject>
                                                    </subj-group>
                                            <subj-group  xml:lang="tr">
                                                            <subject>Bilgisayar Yazılımı</subject>
                                                    </subj-group>
                                    </article-categories>
                                                                                                                                                        <title-group>
                                                                                                                        <trans-title-group xml:lang="en">
                                    <trans-title>Preparing Interdisciplinary Graduate Course Contents Using Natural Language Processing Techniques</trans-title>
                                </trans-title-group>
                                                                                                                                                                                                <article-title>Doğal Dil İşleme Teknikleri Kullanılarak Disiplinler Arası Lisansüstü Ders İçeriği Hazırlanması</article-title>
                                                                                                    </title-group>
            
                                                    <contrib-group content-type="authors">
                                                                        <contrib contrib-type="author">
                                                                    <contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-2166-1102</contrib-id>
                                                                <name>
                                    <surname>Albayrak</surname>
                                    <given-names>Ahmet</given-names>
                                </name>
                                                                    <aff>DÜZCE ÜNİVERSİTESİ, TEKNOLOJİ FAKÜLTESİ</aff>
                                                            </contrib>
                                                                                </contrib-group>
                        
                                        <pub-date pub-type="pub" iso-8601-date="20201030">
                    <day>30</day>
                    <month>10</month>
                    <year>2020</year>
                </pub-date>
                                        <volume>13</volume>
                                        <issue>4</issue>
                                        <fpage>373</fpage>
                                        <lpage>383</lpage>
                        
                        <history>
                                    <date date-type="received" iso-8601-date="20200404">
                        <day>04</day>
                        <month>04</month>
                        <year>2020</year>
                    </date>
                                                    <date date-type="accepted" iso-8601-date="20200811">
                        <day>11</day>
                        <month>08</month>
                        <year>2020</year>
                    </date>
                            </history>
                                        <permissions>
                    <copyright-statement>Copyright © 2008, Bilişim Teknolojileri Dergisi</copyright-statement>
                    <copyright-year>2008</copyright-year>
                    <copyright-holder>Bilişim Teknolojileri Dergisi</copyright-holder>
                </permissions>
            
                                                                                                <trans-abstract xml:lang="en">
                            <p>In this study, natural language processing methods, a family of data mining techniques, were used to prepare the content of an interdisciplinary course planned to be offered at the graduate level. The course is called Data Science and Applications. Data science is an interdisciplinary field that fundamentally draws on statistics and computer science. No course with a similar name appears in the literature. Data science is a data-first approach that is applied in many fields; because its application area is so broad, the course was named Data Science and Applications. Papers published at a long-running IEEE conference were used as the data set for determining the course content. The conference, Data Science and Advanced Analytics, will be held for the seventh time this year. Papers accepted to the conference in 2015, 2016, 2017 and 2018 were included in the data set. The titles and keywords of the papers were analyzed with natural language processing techniques to determine the course content. After the data set was prepared, it was cleaned, and the paper titles were then split into words. Word frequencies were computed over the tokenized data set, and the twenty most frequent words were selected. The Apache Spark NTK package was used for natural language processing. Because the twenty selected words are atomic, the main topic headings were derived from them by induction.</p></trans-abstract>
                                                                                                                                    <abstract><p>Bu çalışmada lisansüstü seviyede açılması düşünülen disiplinler arası bir dersin içeriğinin hazırlanması için veri madenciliği tekniklerinden doğal dil işleme yöntemleri kullanılmıştır. Lisansüstü ders, Veri Bilimi ve Uygulamaları adını taşımaktadır. Veri bilimi temelde istatistik ve bilgisayar bilimlerini içine alan disiplinler arası bir kavramdır. Dersin benzer bir ad ile literatürde yeri yoktur. Veri bilimi yaklaşımı veriyi öncelikleyen ve oldukça fazla alanda uygulanan bir yaklaşımdır. Uygulama alanı çok geniş olduğundan derse Veri Bilimi ve Uygulamaları adı verilmiştir. IEEE’nin yıllardır düzenlediği bir konferansta basılan bildiriler ders içeriğinin belirlenmesinde veri seti olarak kullanılmıştır. Data Science and Advanced Analytics adındaki konferansın bu yıl 7.’si düzenlenecektir. 2015, 2016, 2017 ve 2018 yıllarında konferansa kabul edilen bildiriler veri setinde kullanılmıştır. Bildirilerin başlık kısımları ve anahtar kelimeler doğal dil işleme teknikleri ile analiz edilerek ders içeriği belirlenmiştir. Bu çalışmada ilk olarak veri seti hazırlandıktan sonra, veri üzerinde veri temizleme işlemi yapılmış, ardından bildiri başlıkları sözcüklere ayrılmıştır. Sözcüklere ayrılan veri seti içinde sözcüklerin frekansları bulunarak frekansa göre ilk yirmi sözcük seçilmiştir. Doğal dil işleme sürecinde Apache Spark NTK paketi kullanılmıştır. Seçilen 20 sözcük atomik olduğundan tümevarım yöntemi ile ana konu başlıkları belirlenmiştir.</p></abstract>
                                                            
            
                                                                                        <kwd-group>
                                                    <kwd>veri bilimi</kwd>
                                                    <kwd>doğal dil işleme</kwd>
                                                    <kwd>ders içeriği hazırlama</kwd>
                                                    <kwd>veri bilimcisi</kwd>
                                                    <kwd>konu modelleme</kwd>
                                            </kwd-group>
                            
                                                <kwd-group xml:lang="en">
                                                    <kwd>data science</kwd>
                                                    <kwd>natural language processing</kwd>
                                                    <kwd>course content preparation</kwd>
                                                    <kwd>data scientist</kwd>
                                                    <kwd>topic modeling</kwd>
                                            </kwd-group>
                                                                                                                                        </article-meta>
    </front>
    <back>
                            <ref-list>
                                    <ref id="ref1">
                        <label>1</label>
                        <mixed-citation publication-type="journal">G. Strawn, “Data Scientist”, IT Prof., 18(3), 55–57, 2016.</mixed-citation>
                    </ref>
                                    <ref id="ref2">
                        <label>2</label>
                        <mixed-citation publication-type="journal">M. Kim, T. Zimmermann, R. Deline, and A. Begel, “Data scientists in software teams: State of the art and challenges”, IEEE Trans. Softw. Eng., 44(11), 1024–1038, 2018.</mixed-citation>
                    </ref>
                                    <ref id="ref3">
                        <label>3</label>
                        <mixed-citation publication-type="journal">C. Costa and M. Y. Santos, “The data scientist profile and its representativeness in the European e-Competence framework and the skills framework for the information age”, Int. J. Inf. Manage., 37(6), 726–734, 2017.</mixed-citation>
                    </ref>
                                    <ref id="ref4">
                        <label>4</label>
                        <mixed-citation publication-type="journal">V. Dhar, “Data science and prediction”, Commun. ACM, 56(12), 64–73, 2013.</mixed-citation>
                    </ref>
                                    <ref id="ref5">
                        <label>5</label>
                        <mixed-citation publication-type="journal">F. W. Spaid and J. C. Frishett, “Incipient separation of a supersonic, turbulent boundary layer, including effects of heat transfer”, AIAA Journal, 10(7), 1972.</mixed-citation>
                    </ref>
                                    <ref id="ref6">
                        <label>6</label>
                        <mixed-citation publication-type="journal">F. Provost and T. Fawcett, “Data Science and its Relationship to Big Data and Data-Driven Decision Making”, Big Data, 1(1), 51–59, 2013.</mixed-citation>
                    </ref>
                                    <ref id="ref7">
                        <label>7</label>
                        <mixed-citation publication-type="journal">H. Hu, Y. Luo, Y. Wen, Y. S. Ong, and X. Zhang, “How to Find a Perfect Data Scientist: A Distance-Metric Learning Approach”, IEEE Access, 6, 60380–60395, 2018.</mixed-citation>
                    </ref>
                                    <ref id="ref8">
                        <label>8</label>
                        <mixed-citation publication-type="journal">Internet: McKinsey, Big data: the next frontier for innovation, competition, and productivity, http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation, 21.01.2020.</mixed-citation>
                    </ref>
                                    <ref id="ref9">
                        <label>9</label>
                        <mixed-citation publication-type="journal">Internet: Columbia University, Data Science Institute. Columbia University, https://datascience.columbia.edu/master-of-science-in-data-science, 15.02.2020.</mixed-citation>
                    </ref>
                                    <ref id="ref10">
                        <label>10</label>
                        <mixed-citation publication-type="journal">E. Saquete, D. Tomás, P. Moreda, P. Martínez-Barco, and M. Palomar, “Fighting post-truth using natural language processing: A review and open challenges”, Expert Syst. Appl., 141, 112943, 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref11">
                        <label>11</label>
                        <mixed-citation publication-type="journal">E. Saquete, D. Tomás, P. Moreda, P. Martínez-Barco, and M. Palomar, “Fighting post-truth using natural language processing: A review and open challenges”, Expert Syst. Appl., 141, 112943, 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref12">
                        <label>12</label>
                        <mixed-citation publication-type="journal">M. Pejic-Bach, T. Bertoncel, M. Meško, and Ž. Krstić, “Text mining of industry 4.0 job advertisements”, Int. J. Inf. Manage., 50, 416–431, 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref13">
                        <label>13</label>
                        <mixed-citation publication-type="journal">M. Giménez, J. Palanca, and V. Botti, “Semantic-based padding in convolutional neural networks for improving the performance in natural language processing. A case of study in sentiment analysis”, Neurocomputing, 378, 315–323, 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref14">
                        <label>14</label>
                        <mixed-citation publication-type="journal">F. Salo, M. Injadat, A. B. Nassif, A. Shami, and A. Essex, “Data mining techniques in intrusion detection systems: A systematic literature review”, IEEE Access, 6, 56046–56058, 2018.</mixed-citation>
                    </ref>
                                    <ref id="ref15">
                        <label>15</label>
                        <mixed-citation publication-type="journal">K. A. Renn and E. R. Jessup-Anger, “Preparing new professionals: Lessons for graduate preparation programs from the national study of new professionals in student affairs”, J. Coll. Stud. Dev., 49(4), 319–335, 2008.</mixed-citation>
                    </ref>
                                    <ref id="ref16">
                        <label>16</label>
                        <mixed-citation publication-type="journal">I. Y. Song and Y. Zhu, “Big data and data science: what should we teach?”, Expert Syst., 33(4), 364–373, 2016.</mixed-citation>
                    </ref>
                                    <ref id="ref17">
                        <label>17</label>
                        <mixed-citation publication-type="journal">Y. Zhang, M. Chen, and L. Liu, “A review on text mining”, Proc. IEEE Int. Conf. Softw. Eng. Serv. Sci. ICSESS, 681–685, 2015.</mixed-citation>
                    </ref>
                                    <ref id="ref18">
                        <label>18</label>
                        <mixed-citation publication-type="journal">I. Yahav, O. Shehory, and D. Schwartz, “Comments Mining With TF-IDF: The Inherent Bias and Its Removal”, IEEE Trans. Knowl. Data Eng., 31(3), 437–450, 2019.</mixed-citation>
                    </ref>
                                    <ref id="ref19">
                        <label>19</label>
                        <mixed-citation publication-type="journal">X. Chen, D. Zou, G. Cheng, and H. Xie, “Detecting latent topics and trends in educational technologies over four decades using structural topic modeling: A retrospective of all volumes of Computers &amp; Education”, Comput. Educ., 151, 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref20">
                        <label>20</label>
                        <mixed-citation publication-type="journal">L. Yao, Y. Zhang, Q. Chen, H. Qian, B. Wei, and Z. Hu, “Mining coherent topics in documents using word embeddings and large-scale text data”, Eng. Appl. Artif. Intell., 64, 432–439, 2017.</mixed-citation>
                    </ref>
                                    <ref id="ref21">
                        <label>21</label>
                        <mixed-citation publication-type="journal">M. Pejic-Bach, T. Bertoncel, M. Meško, and Ž. Krstić, “Text mining of industry 4.0 job advertisements”, Int. J. Inf. Manage., 50, 416–431, 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref22">
                        <label>22</label>
                        <mixed-citation publication-type="journal">M. Giménez, J. Palanca, and V. Botti, “Semantic-based padding in convolutional neural networks for improving the performance in natural language processing. A case of study in sentiment analysis”, Neurocomputing, 378, 315–323, 2020, doi: 10.1016/j.neucom.2019.08.096.</mixed-citation>
                    </ref>
                                    <ref id="ref23">
                        <label>23</label>
                        <mixed-citation publication-type="journal">F. Salo, M. Injadat, A. B. Nassif, A. Shami, and A. Essex, “Data mining techniques in intrusion detection systems: A systematic literature review”, IEEE Access, 6, 56046–56058, 2018.</mixed-citation>
                    </ref>
                                    <ref id="ref24">
                        <label>24</label>
                        <mixed-citation publication-type="journal">K. A. Renn and E. R. Jessup-Anger, “Preparing new professionals: Lessons for graduate preparation programs from the national study of new professionals in student affairs”, J. Coll. Stud. Dev., 49(4), 319–335, 2008.</mixed-citation>
                    </ref>
                                    <ref id="ref25">
                        <label>25</label>
                        <mixed-citation publication-type="journal">E. Ustun, “Learning Analytics and Applications in Higher Education”, Bilişim Teknolojileri Dergisi, 13(3), 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref26">
                        <label>26</label>
                        <mixed-citation publication-type="journal">H. Polat and M. Korpe, “Extracting Close Meaning Concepts from GNAT Parliamentary Minutes”, Bilişim Teknolojileri Dergisi, 11(3), 2018.</mixed-citation>
                    </ref>
                                    <ref id="ref27">
                        <label>27</label>
                        <mixed-citation publication-type="journal">Ç. Aci and A. Çirak, “Turkish News Articles Categorization Using Convolutional Neural Networks and Word2Vec”, Bilişim Teknolojileri Dergisi, 12(3), 2019.</mixed-citation>
                    </ref>
                            </ref-list>
                    </back>
    </article>
