NANO SCIENTIFIC RESEARCH CENTRE PVT.LTD., AMEERPET, HYD
WWW.NSRCNANO.COM, 09640648777, 09652926926
JAVA PROJECTS LIST--2013
JAVA 2013 IEEE PAPERS
Handwritten Chinese Text Recognition by Integrating Multiple Contexts
Abstract
This paper presents an effective approach for the offline recognition of
unconstrained handwritten Chinese texts. Under the general integrated
segmentation-and-recognition framework with character oversegmentation, we
investigate three important issues: candidate path evaluation, path search, and
parameter estimation. For path evaluation, we combine multiple contexts
(character recognition scores, geometric and linguistic contexts) from the
Bayesian decision view, and convert the classifier outputs to posterior probabilities
via confidence transformation. In path search, we use a refined beam search
algorithm to improve the search efficiency and, meanwhile, use a candidate
character augmentation strategy to improve the recognition accuracy. The
combining weights of the path evaluation function are optimized by supervised
learning using a Maximum Character Accuracy criterion. We evaluated the recognition
performance on a Chinese handwriting database CASIA-HWDB, which contains nearly
four million character samples of 7,356 classes and 5,091 pages of
unconstrained handwritten texts. The experimental results show that confidence
transformation and combining multiple contexts improve the text line
recognition performance significantly. On a test set of 1,015 handwritten
pages, the proposed approach achieved a character-level accurate rate of 90.75
percent and a correct rate of 91.39 percent, both far superior to the
best results reported in the literature.
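To make the idea of combining multiple contexts concrete, the following is a minimal sketch of a weighted path evaluation function. It assumes that every candidate character on a path already has a log-posterior from the character classifier, a log-probability from the language model, and a log-score from a geometric model. The class name, method name, and the two-weight form are illustrative assumptions; the abstract does not give the exact function, which may use more geometric terms and a different normalization.

```java
// Hypothetical sketch: scores one candidate segmentation-recognition path
// as a weighted sum of per-character log-scores. Not the paper's exact formula.
final class PathScore {

    /**
     * @param logCharProb log posterior of each character from the classifier
     * @param logLangProb log probability of each character from the language model
     * @param logGeoProb  log score of each character from the geometric model
     * @param wLang       combining weight for the linguistic context (assumed learned)
     * @param wGeo        combining weight for the geometric context (assumed learned)
     */
    static double score(double[] logCharProb, double[] logLangProb,
                        double[] logGeoProb, double wLang, double wGeo) {
        if (logCharProb.length != logLangProb.length
                || logCharProb.length != logGeoProb.length) {
            throw new IllegalArgumentException("score arrays must have equal length");
        }
        double total = 0.0;
        for (int i = 0; i < logCharProb.length; i++) {
            // Each context contributes additively in the log domain.
            total += logCharProb[i] + wLang * logLangProb[i] + wGeo * logGeoProb[i];
        }
        return total;
    }
}
```

A path with a higher score is preferred during search. In the approach described above, the weights would be set by supervised learning rather than by hand.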
Existing System
In the context of handwritten text (character string) recognition, many works
have contributed to the related issues of oversegmentation, character
classification, confidence transformation, language model, geometric model, path
evaluation and search, and parameter estimation. For oversegmentation,
connected component analysis has been widely adopted, but the splitting of
connected (touching) characters has been a concern. After generating candidate
character patterns by combining consecutive primitive segments, each candidate
pattern is classified using a classifier to assign similarity/dissimilarity scores
to some character classes. Character classification involves character
normalization, feature extraction, and classifier design. For classification of
Chinese characters with a large number of classes, the most widely used classifiers
are the modified quadratic discriminant function (MQDF) and the nearest
prototype classifier (NPC). The MQDF provides higher accuracy than the NPC but incurs
high storage and computation costs.
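As an illustration of the simpler of the two classifiers, here is a minimal sketch of a nearest prototype classifier. It assumes each class is represented by one or more prototype feature vectors and that squared Euclidean distance is the dissimilarity score. The class and method names are hypothetical, and feature extraction and prototype learning are outside the sketch.

```java
// Hypothetical sketch of a nearest prototype classifier (NPC).
// Assumes squared Euclidean distance as the dissimilarity measure.
final class NearestPrototypeClassifier {
    private final double[][] prototypes; // one feature vector per prototype
    private final int[] labels;          // class label of each prototype

    NearestPrototypeClassifier(double[][] prototypes, int[] labels) {
        if (prototypes.length == 0 || prototypes.length != labels.length) {
            throw new IllegalArgumentException("need one label per prototype");
        }
        this.prototypes = prototypes;
        this.labels = labels;
    }

    /** Squared Euclidean distance between two vectors of equal length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the label of the prototype closest to the feature vector x. */
    int classify(double[] x) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < prototypes.length; i++) {
            double dist = squaredDistance(x, prototypes[i]);
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return labels[best];
    }
}
```

The sketch stores only the prototype vectors and compares the input against each of them, which suggests why an NPC is comparatively light on storage and computation.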
Proposed System
This system focuses on the recognition of text lines, which are assumed to have been
segmented externally. For the convenience of academic research and benchmarking,
the text lines in our database have been segmented and annotated at character
level. First, the input text line image is oversegmented into a sequence of
primitive segments using the connected component-based method. Consecutive
primitive segments are combined to generate candidate character patterns,
forming a segmentation candidate lattice. After that, each
candidate pattern is classified to assign a number of candidate character
classes, and all the candidate patterns in a candidate segmentation path generate
a character candidate lattice.
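The candidate generation step can be sketched as follows. The sketch assumes the primitive segments are indexed from left to right and that a candidate character pattern may span at most a fixed number of consecutive segments. That limit, and all class and method names, are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: builds the candidate patterns of a segmentation
// candidate lattice from a sequence of primitive segments.
final class CandidateLattice {

    /** A candidate pattern covering primitive segments [start, end), end exclusive. */
    static final class Candidate {
        final int start;
        final int end;

        Candidate(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Enumerates every run of 1..maxSegments consecutive primitive segments. */
    static List<Candidate> enumerate(int numSegments, int maxSegments) {
        List<Candidate> out = new ArrayList<Candidate>();
        for (int start = 0; start < numSegments; start++) {
            for (int len = 1; len <= maxSegments && start + len <= numSegments; len++) {
                out.add(new Candidate(start, start + len));
            }
        }
        return out;
    }

    /** Counts the segmentation paths, i.e. ways to cover all segments with candidates. */
    static long countPaths(int numSegments, int maxSegments) {
        long[] ways = new long[numSegments + 1];
        ways[0] = 1; // the empty prefix has exactly one segmentation
        for (int end = 1; end <= numSegments; end++) {
            for (int len = 1; len <= maxSegments && len <= end; len++) {
                ways[end] += ways[end - len];
            }
        }
        return ways[numSegments];
    }
}
```

For example, with 4 primitive segments and at most 2 segments per candidate, enumerate returns 7 candidate patterns and countPaths returns 5 segmentation paths. The number of paths grows quickly with line length, which is why an efficient path search matters.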
Software Requirement Specification
Software Specification
Operating System : Windows XP
Technology : JAVA 1.6
Minimum Hardware Specification
Processor : Pentium IV
RAM : 512 MB
Hard Disk : 80GB