[sdiy] Can anyone OCR the AN23.PDF File Here?

Andrew Simper andy at cytomic.com
Fri Jul 7 03:12:27 CEST 2017


On 7 July 2017 at 01:30, Bernard Arthur Hutchins Jr <bah13 at cornell.edu> wrote:
>
>
> Since there is no improvement in the figures/equations, and the text is a serious downgrade, tell me again (anyone) why an OCR/ebook is a good idea here.
>
>
> Bernie
>
>

Bernie,

I admire your patience on this topic. It has always been clear to me
that trying to convert the text into computer characters is a bad
idea, but it is also very clear to me that having everything as PDF is
a good idea, even if it's just entire pages as images.

The best option from my point of view, provided it's made clear that
searching may not find all instances, is to display the original whole
page images for the old PDFs, but also include the OCR text as a
searchable part of the document. This way a bulk search will
hopefully give you at least one hit on each PDF that the keyword is
in. Even if the OCR doesn't get the keyword right in every instance
(Rf vs Ri), it should at least point you to the right document, which
would be a massive help. This process could be entirely automated, no
proofing required.
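
As a rough illustration of the bulk search described above, here is a
minimal Python sketch. It assumes the OCR text of each PDF has already
been extracted into a dictionary keyed by file name. The function name
find_documents, the file name AN24.PDF and the sample text are all
made up for the example, and nothing here is the actual tooling used
for the collection.

```python
def find_documents(ocr_text_by_pdf, keyword):
    """Return the names of PDFs whose OCR text contains the keyword.

    One correct occurrence is enough for a hit, so a document is still
    found even if the OCR misread the keyword elsewhere in it.
    """
    needle = keyword.lower()
    return sorted(
        name
        for name, text in ocr_text_by_pdf.items()
        if needle in text.lower()
    )


# Hypothetical OCR output: "Rf" was misread as "Ri" once in AN23.PDF,
# but the second occurrence was recognised correctly.
sample = {
    "AN23.PDF": "the feedback resistor Ri sets the gain ... choose Rf = 10k",
    "AN24.PDF": "envelope generator timing",
}

hits = find_documents(sample, "Rf")
print(hits)
```

The search only reports which documents to open. The reading itself
would still be done from the original page images, so OCR errors never
reach the reader.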

I know Bernie likes paper, and so do I. When I really want to
understand something and work through it, I print out the single PDF
I'm working on so I can scribble on it, hold it, and understand it
better. Most of the time I don't need to do this; just reading it on
screen is enough. For me, having a copy to read in digital form is
better than no copy, and it is also better than having the entire
collection as paper. I'm younger and used to PDFs. Being able to
search the entire collection for a keyword, and then read the full
image scan of the PDF in question error free, is something I would
definitely pay USD 40 for.

Bernie, what is your opinion on this?

Cheers,

Andy


