The global use of solar photovoltaic systems is accelerating rapidly owing to steadily falling costs and improvements in cell efficiency [1]. Solar cells, which generate electricity directly from sunlight, are expected to play a major role in addressing the global energy crisis in an environmentally friendly and sustainable way. Among them, organic solar cells have attracted a great deal of attention owing to several advantages, including a wide range of applicable materials, low cost, and compatibility with flexible substrates [2]. An organic solar cell, a third-generation photovoltaic cell, consists of an organic photovoltaic active layer placed between a transparent electrode and a metal electrode [3].
As scientific and technological innovation has advanced across many fields, a large body of patent documents has been produced and accumulated. Patents are the largest source of technical information in the world and a carrier of scientific and technological knowledge. Patent applications and registrations in the domain of organic solar cells have increased rapidly over the past two decades.
Patent text mining is an effective method for supporting decisions about technological development. With the explosive growth in the number of patents, analyzing them manually has become time-consuming and labor-intensive, and the volume now exceeds what human analysts can process. Quickly extracting useful information and the underlying patterns from this mass of data has become a difficult problem. It is therefore necessary to develop intelligent data analysis methods.
Python is an interpreted, high-level, general-purpose programming language, and data analysis and visualization are among the areas where it excels. This paper works in Python throughout: the NLTK package is used to preprocess the patent text data, the TF-IDF method is used to vectorize the text, and the K-means++ algorithm is used to cluster the data. The NetworkX library is used to create, manipulate, and analyze networks, and structural hole theory is used to identify key nodes.
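The workflow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the four documents are invented stand-ins for patent abstracts, scikit-learn's built-in tokenizer and English stop-word list take the place of the NLTK preprocessing step, the choice of two clusters is an assumption, and the keyword network (including the bridging term `interlayer`) is hypothetical. Burt's constraint, as computed by `networkx.constraint`, is used here as one common structural-hole measure; a low value marks a node that bridges otherwise separate groups.

```python
import networkx as nx
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy stand-ins for patent abstracts (invented for illustration only).
docs = [
    "organic solar cell with polymer active layer",
    "polymer donor and fullerene acceptor in organic solar cell active layer",
    "transparent electrode of indium tin oxide on flexible substrate",
    "flexible substrate coated with transparent electrode film",
]

# TF-IDF vectorization. The built-in tokenizer and stop-word list are a
# simplification that replaces NLTK preprocessing in this sketch.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

# K-means clustering with k-means++ seeding; k=2 is an assumption
# that suits this toy corpus.
kmeans = KMeans(n_clusters=2, init="k-means++", n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Hypothetical keyword network: two tightly linked groups of terms
# joined only through the bridging term "interlayer".
G = nx.Graph()
G.add_edges_from([
    ("polymer", "donor"), ("donor", "acceptor"), ("polymer", "acceptor"),
    ("electrode", "substrate"), ("substrate", "oxide"), ("electrode", "oxide"),
    ("interlayer", "acceptor"), ("interlayer", "electrode"),
])

# Burt's constraint for every node; the node with the lowest constraint
# spans a structural hole and is treated as the key node.
constraint = nx.constraint(G)
key_node = min(constraint, key=constraint.get)

print("cluster labels:", list(labels))
print("key node:", key_node)
```

In this toy network the bridging term has the lowest constraint because its two neighbours are not connected to each other, which is the situation structural hole theory singles out.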
Beijing University of Chemical Technology
H1916
The authors thank ICRERA 2019 for its recommendation.
Primary Language | English |
---|---|
Subjects | Engineering |
Journal Section | Articles |
Authors | |
Project Number | H1916 |
Publication Date | June 30, 2020 |
Published in Issue | Year 2020 Volume: 4 Issue: 2 |