
TEXT AND IMAGE EXTRACTION FROM HISTORICAL DOCUMENTS

Publication Date : 06/04/2016



Author(s) :

Madhuri Thete, Sonali Pawar, Sapana Tekule, Pramod Patel


Volume/Issue :
Volume 3, Issue 3 (04 - 2016)



Abstract :

Historical document images are segmented into regions of different content. A binarized version of the document is used to separate text elements from non-text elements. The non-text regions are then refined into drawings, background and noise. Spatial and color features are exploited to ensure coherent regions in the final segmentation. First, the binarized version of the document is used to detect and extract text, and the text is then removed using a text stroke analysis technique. A classifier is used to form the remaining classes, such as background and noise. The analysis process consists of two main steps: page segmentation and block classification. In the first step, a document image is segmented into homogeneous regions. The classification step attempts to determine whether each segmented region is text, picture, drawing, and so on. Each region is then fed into an appropriate algorithm, according to its type, for further processing. We present a method to segment historical document images into regions of different content. First, we separate text elements from non-text elements using a binarized version of the document. Then, we refine the segmentation of the non-text regions into drawings, background and noise. At this stage, spatial and color features are exploited to ensure coherent regions in the final segmentation. Experiments show that the proposed approach achieves better segmentation quality than other methods. We evaluate segmentation quality on 252 pages of a historical manuscript, for which the proposed method achieves about 92% and 90% segmentation accuracy for drawings and text elements, respectively.
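
The two-step process described in the abstract (segment the page, then classify each region) can be illustrated with a minimal sketch. This is not the authors' implementation: Otsu thresholding stands in for the unspecified binarization step, and a simple bounding-box size rule stands in for the text stroke analysis and the classifier. All function names and the thresholds `max_text_size` and `min_size` are hypothetical, chosen only for illustration.

```python
from collections import deque

import numpy as np


def otsu_threshold(gray):
    """Return the Otsu threshold of a uint8 grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                # probability of the dark class
    mu = np.cumsum(prob * np.arange(256))  # cumulative mean intensity
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu[-1] * omega - mu) ** 2 / denom  # between-class variance
    sigma_b[denom == 0] = 0.0
    return int(np.argmax(sigma_b))


def binarize(gray):
    """Mark dark (ink) pixels as foreground."""
    return gray <= otsu_threshold(gray)


def connected_components(mask):
    """Return bounding boxes (r0, c0, r1, c1) of 4-connected foreground regions."""
    height, width = mask.shape
    seen = np.zeros(mask.shape, dtype=bool)
    boxes = []
    for r in range(height):
        for c in range(width):
            if not mask[r, c] or seen[r, c]:
                continue
            seen[r, c] = True
            queue = deque([(r, c)])
            r0 = r1 = r
            c0 = c1 = c
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                r0, r1 = min(r0, y), max(r1, y)
                c0, c1 = min(c0, x), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            boxes.append((r0, c0, r1, c1))
    return boxes


def classify_component(box, max_text_size=12, min_size=2):
    """Hypothetical size rule: tiny = noise, small = text, large = drawing."""
    r0, c0, r1, c1 = box
    size = max(r1 - r0 + 1, c1 - c0 + 1)
    if size < min_size:
        return "noise"
    if size <= max_text_size:
        return "text"
    return "drawing"


def segment_page(gray):
    """Binarize the page, then label each connected region by type."""
    return [(box, classify_component(box))
            for box in connected_components(binarize(gray))]
```

Calling `segment_page` on a grayscale page array returns a list of bounding boxes paired with a label. A size rule like this would fail on real manuscripts, where drawings and text often touch or overlap, which is presumably why the paper relies on stroke analysis together with spatial and color features.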






March 28, 2016