In recent years, Algerian universities have recognized
the value of electronic archiving and archive digitization for better management of their documents.
Systems that can analyze and understand archival
documents have therefore become a necessity. The present paper follows this trend: it
proposes a system for analyzing the physical structure of Algerian
baccalaureate transcripts stored in university archives. The proposed
system proceeds in two phases: 1) preprocessing, in which several
operations are applied to reduce the noise present in the input
images; and 2) segmentation, which starts by removing the
transcript border. It then extracts the text lines and blocks using the
run-length smoothing algorithm (RLSA) and projection profile analysis. Next, it
classifies the blocks into three types: textual block, table, and graphic.
Finally, it recovers the textual content from the textual blocks and tables.
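
To illustrate the two segmentation techniques named above, the following is a minimal sketch, not the authors' implementation. It assumes a binarized image stored as a list of rows with 1 for ink and 0 for background. The function names, the `threshold` and `min_ink` parameters, and the choice to leave runs touching the image edge unfilled are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed representation: 1 = ink (foreground), 0 = background.
Image = List[List[int]]


def rlsa_horizontal(image: Image, threshold: int) -> Image:
    """Horizontal run-length smoothing.

    Background runs of length <= threshold that lie between two ink
    pixels on the same row are filled with ink. Runs touching the left
    or right edge are left untouched (an assumption of this sketch).
    """
    out = [row[:] for row in image]  # work on a copy
    for row in out:
        last_ink = -1  # column of the most recent ink pixel in this row
        for x, pixel in enumerate(row):
            if pixel == 1:
                gap = x - last_ink - 1
                if last_ink >= 0 and 0 < gap <= threshold:
                    for k in range(last_ink + 1, x):
                        row[k] = 1
                last_ink = x
    return out


def horizontal_profile(image: Image) -> List[int]:
    """Number of ink pixels in each row."""
    return [sum(row) for row in image]


def text_line_bands(image: Image, min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return inclusive (top, bottom) row ranges whose profile is >= min_ink.

    Each band is a candidate text line; rows below min_ink act as separators.
    """
    bands: List[Tuple[int, int]] = []
    start = None
    for y, count in enumerate(horizontal_profile(image)):
        if count >= min_ink and start is None:
            start = y
        elif count < min_ink and start is not None:
            bands.append((start, y - 1))
            start = None
    if start is not None:
        bands.append((start, len(image) - 1))
    return bands
```

In this sketch, smoothing merges nearby characters into solid runs, and the row profile of the smoothed image then separates candidate lines at rows with no ink. A vertical pass and a column profile would follow the same pattern on the transposed image.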
Keywords: structure of document, document understanding, segmentation, document image
Primary Language | English |
---|---|
Subjects | Software Engineering (Other) |
Section | Articles |
Authors | |
Publication Date | 23 September 2019 |
Acceptance Date | 8 September 2019 |
Published in Issue | Year 2019 Volume: 2 Issue: 1 |
International Journal of Informatics and Applied Mathematics