Metadata
- Source
- DECA-188
- Type
- Bug
- Priority
- Major
- Status
- Open
- Resolution
- N/A
- Assignee
- N/A
- Reporter
- Jonathan Hung
- Created
- 2011-10-31T12:13:01.330-0400
- Updated
- 2013-01-27T12:21:44.954-0500
- Versions
- 0.5
- 0.6
- 0.7
- Fixed Versions
- Future
- Component
- genpdf
Description
The OCR'ed text in Type 2 PDFs is of poor quality even when the source documents are reasonably well photographed. The generated text is expected to be more legible and machine-readable.
Image 1 - generated from the original computer-generated document (see attached 1-1-1.png).
Image 2 - photograph of Image 1 (see: http://source.fluidproject.org/svn/design/decapod/testing-images/2-1-1.png ).
PDF 1 - the Type 2 generated PDF of Image 1 (See attached 1-1-1-t2.pdf).
PDF 2 - the Type 2 generated PDF of Image 2 (See attached 2-1-1[t2].pdf).
Shouldn't the Type 2 results for PDF 1 and PDF 2 be of comparable quality?
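One way to make "comparable quality" measurable, offered as a minimal sketch and not as anything Decapod itself provides: extract the text layer from each PDF with whatever tool is at hand, then score each against the known source text. The function names and sample strings below are hypothetical, and the score is the ratio from Python's standard `difflib`, not a formal OCR accuracy metric.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so layout differences are ignored."""
    return " ".join(text.lower().split())


def text_similarity(reference: str, candidate: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the normalized texts are identical."""
    matcher = SequenceMatcher(
        None, normalize(reference), normalize(candidate), autojunk=False
    )
    return matcher.ratio()


if __name__ == "__main__":
    # Hypothetical placeholders: in practice these would be the known source
    # text and the text layers extracted from PDF 1 and PDF 2.
    source_text = "The quick brown fox jumps over the lazy dog."
    pdf1_text = "The quick brown fox jumps over the lazy dog."
    pdf2_text = "Tne qu1ck brovvn fox jurnps over the lazy d0g."

    print("PDF 1 similarity:", round(text_similarity(source_text, pdf1_text), 3))
    print("PDF 2 similarity:", round(text_similarity(source_text, pdf2_text), 3))
```

A large gap between the two scores would support the report, while near-equal scores would match the later observation that the two documents show no significant difference.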
Attachments
Comments
-
Jonathan Hung commented
2011-10-31T12:15:10.263-0400 Original computer generated image.
-
Jonathan Hung commented
2011-10-31T12:59:44.099-0400 generated PDF of image 1-1-1.png
-
Jonathan Hung commented
2011-10-31T13:01:01.362-0400 generated PDF of image 2-1-1.png (photograph of computer generated document).
-
tamir@tamirhassan.com commented
2013-01-27T12:21:44.954-0500 I've tried it out on the current (latest) version and can't see any significant difference in OCR quality between the two documents.