android - Determining title in business card reading
I am making a business card reader app. I am using Tesseract OCR to get the text from the image. The text is printed on the business card in a format like:
Mark Henry (name)
Asst. Professor (profession)
XYZ University (employer)
But how do I decide which text is the user's name, which one is the user's company, and which one is the job title? Is there an algorithm for this, or some other approach?
P.S. The above sequence can change.
This is an ideal problem for natural language processing. You can train a classifier to presume that phrases like 'professor of', 'assistant to', etc. are more likely a job description, and that text like 'Mark', 'Andrew', etc. is a name. It is fuzzy logic, and a guess at best.
Example (training and test data for a classifier, from http://textblob.readthedocs.org/en/latest/classifiers.html):
>>> train = [
...     ('I love this sandwich.', 'pos'),
...     ('this is an amazing place!', 'pos'),
...     ('I feel very good about these beers.', 'pos'),
...     ('this is my best work.', 'pos'),
...     ("what an awesome view", 'pos'),
...     ('I do not like this restaurant', 'neg'),
...     ('I am tired of this stuff.', 'neg'),
...     ("I can't deal with this", 'neg'),
...     ('he is my sworn enemy!', 'neg'),
...     ('my boss is horrible.', 'neg')
... ]
>>> test = [
...     ('the beer was good.', 'pos'),
...     ('I do not enjoy my job', 'neg'),
...     ("I ain't feeling dandy today.", 'neg'),
...     ("I feel amazing!", 'pos'),
...     ('Gary is a friend of mine.', 'pos'),
...     ("I can't believe I'm doing this.", 'neg')
... ]
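The same idea can be adapted to business card lines by swapping the 'pos'/'neg' labels for 'name', 'title' and 'company'. The following is only a minimal sketch, not the TextBlob API: it is a small hand-written naive Bayes classifier using the Python standard library, and the training lines, the label names, and the helper functions (`tokenize`, `train_model`, `classify`, `label_card`) are all assumptions made up for illustration. A real app would need far more training data and would still only be guessing.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical training data: (card line, label). Far too small for real use.
TRAIN = [
    ("mark henry", "name"),
    ("andrew smith", "name"),
    ("sarah johnson", "name"),
    ("asst professor", "title"),
    ("assistant to the director", "title"),
    ("professor of physics", "title"),
    ("senior software engineer", "title"),
    ("xyz university", "company"),
    ("acme technologies inc", "company"),
    ("global solutions ltd", "company"),
]


def tokenize(line):
    """Lowercase the line and split it into alphabetic words."""
    return re.findall(r"[a-z]+", line.lower())


def train_model(examples):
    """Count words per label; these counts are the whole model."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab


def classify(line, model):
    """Return the most probable label (naive Bayes, add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior of the label
        # +1 in the denominator reserves probability mass for unseen words
        denom = sum(word_counts[label].values()) + len(vocab) + 1
        for word in tokenize(line):
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def label_card(lines, model):
    """Classify each OCR line on its own, so line order does not matter."""
    return [(line, classify(line, model)) for line in lines]


model = train_model(TRAIN)
# The lines are deliberately given in a different order than on the card.
card = label_card(["XYZ University", "Mark Henry", "Asst Professor"], model)
for line, label in card:
    print(f"{label}: {line}")
```

Because each line is classified independently, the order in which the fields appear on the card does not affect the result. OCR noise and names that never appear in the training data will still cause wrong guesses, so it is worth letting the user correct the detected fields.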