Sliding-window data summarization, a quantity-based summarization method, is commonly used in data stream clustering, where recent data matter more than older data. In this method, the w most recent data points (w being a predefined window width) are taken as the summary each time a new data point arrives, and the window slides forward one point at a time. The model therefore reprocesses all the data in the window on every arrival, which reduces performance, so new approaches are needed in this area. In this study, a new sliding window model named ImpSlidingWindow (ISW) is proposed as a solution to this problem. In the proposed model, the clustering model runs whenever a certain number of data points has accumulated rather than on every data entry. The sliding window width is divided into four equal parts, and the clustering model runs at the end of each part. As a result, the clustering model runs four times per window width instead of once for every data point in the window, which yields a significant performance increase. When the proposed model was applied to KD-AR Stream, an algorithm proposed in the data stream clustering area, an improvement of up to 80% in run-time performance was observed.
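The triggering scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `BatchedSlidingWindow`, the callback `cluster_fn`, and the helper `count_runs` are hypothetical names introduced here, and the sketch assumes the window width is an exact multiple of the number of parts. It only counts how often clustering would be triggered and does not reproduce KD-AR Stream or the reported run-time figures.

```python
from collections import deque
from typing import Callable, Deque, List


class BatchedSlidingWindow:
    """Sliding window that triggers clustering once per `width // parts` arrivals.

    Hypothetical illustration of the idea in the abstract, not the ISW code.
    """

    def __init__(self, width: int, cluster_fn: Callable[[List[float]], None],
                 parts: int = 4) -> None:
        # Assumption: the width divides evenly into `parts` equal segments.
        if width <= 0 or parts <= 0 or width % parts != 0:
            raise ValueError("width must be a positive multiple of parts")
        self.window: Deque[float] = deque(maxlen=width)  # keeps the w newest items
        self.step = width // parts                       # arrivals between runs
        self.cluster_fn = cluster_fn
        self.pending = 0                                 # arrivals since last run

    def push(self, item: float) -> bool:
        """Add one item; return True if clustering was triggered."""
        self.window.append(item)  # the oldest item is dropped once the window is full
        self.pending += 1
        if self.pending == self.step:
            self.pending = 0
            self.cluster_fn(list(self.window))  # cluster the current window content
            return True
        return False


def count_runs(width: int, parts: int, n_items: int) -> int:
    """Count how many times clustering is triggered for `n_items` arrivals."""
    calls: List[List[float]] = []
    win = BatchedSlidingWindow(width, calls.append, parts)
    for i in range(n_items):
        win.push(float(i))
    return len(calls)
```

Under these assumptions, `count_runs(8, 4, 8)` triggers clustering 4 times for 8 arrivals, while setting `parts` equal to `width`, as in `count_runs(8, 8, 8)`, reproduces the conventional one-run-per-arrival behaviour with 8 runs.
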
Primary Language | Turkish |
---|---|
Subjects | Engineering |
Journal Section | Articles |
Authors | |
Publication Date | October 31, 2019 |
Published in Issue | Year 2019 |