[sdiy] Can anyone OCR the AN23.PDF File Here?
Mike HEQX
mike at heqx.com
Fri Jul 7 00:03:19 CEST 2017
I'll do it. I have already converted it, so I'll just give it a read and
fix up the graphics.
Mike
On 7/6/2017 1:51 PM, Rob Kam wrote:
> Thanks for the challenge Bernie but no thanks. I don't have the
> patience to correct the OCR.
>
> Rob
>
> ------------------------------------------------------------------------
> *From:* Bernard Arthur Hutchins Jr <bah13 at cornell.edu>
> *To:* Rob Kam <robkam at ymail.com>
> *Cc:* "synth-diy at synth-diy.org" <synth-diy at synth-diy.org>
> *Sent:* Thursday, 6 July 2017, 18:30
> *Subject:* Re: [sdiy] Can anyone OCR the AN23.PDF File Here?
>
> Thanks Rob -
>
> True - the equations are now usable, but slightly more blurred than my
> original PDF. Likewise, the figures are now OK but of slightly lower
> quality, which does NOT matter much for hand drawings.
>
> I did note a lot of OCR misreads in the text. A careful proofing of
> the text took me 18 minutes and there are 25 errors, some not at all
> obscure, and about 13 of which I had to look at the original to see
> what they were supposed to be. (One was hard to detect since it
> substituted an Rf for an Ri, a disaster). A full proofread/correction
> would take at least 30 minutes (188 eight-hour days for 6000 pages).
> And I wrote this! Almost certainly a volunteer would have more
> trouble and miss errors.
>
> In the spirit of no good deed going unpunished, Rob, let me put you on
> the spot. Take your scan, find and fix the 25 errors. Let us know how
> easy/hard this was and the time it took, and show your results.
>
> I will post the "solution" to the "find the errors" this evening if I
> get the chance.
>
> Since there is no improvement in the figures/equations, and the text
> is a serious downgrade, tell me again (anyone) why an OCR/ebook is a
> good idea here.
>
> Bernie
>
>
> ------------------------------------------------------------------------
> *From:* Rob Kam <robkam at ymail.com>
> *Sent:* Thursday, July 6, 2017 7:24 AM
> *To:* Bernard Arthur Hutchins Jr
> *Cc:* synth-diy at synth-diy.org
> *Subject:* RE: [sdiy] Can anyone OCR the AN23.PDF File Here?
> There’s a second attempt at http://www.sdiy.info/AN23b.rtf converting
> the equations to images instead, (and still manually tweaking the
> OCR). It took six minutes to do from the scan/PDF and the text still
> needs comparing and correcting against the original.
> There are already experts at this sort of project at Archive.org, who
> have been doing this for a number of years:
> https://archive.org/details/texts&tab=about
>
> To put my two cents in, the synth DIY community should see whether
> they are able to raise the funds to compensate (against unsold
> hardcopy, ebooks etc.) for releasing Electronotes under a
> non-commercial Creative Commons licence
> https://creativecommons.org/licenses/by-nc/2.0/uk/
> Rob
> *From:* Bernard Arthur Hutchins Jr <bah13 at cornell.edu>
> *Sent:* 06 July 2017 01:42
> *To:* Rob Kam <robkam at ymail.com>; mskala at ansuz.sooke.bc.ca
> *Cc:* synth-diy at synth-diy.org
> *Subject:* Re: [sdiy] Can anyone OCR the AN23.PDF File Here?
> Thanks Rob -
> But manual identification and 5 minutes/page is no good for the
> small improvement. Still months of 8-hour days to do 6000 pages. My
> PDF is still much better already. The equations are still unusable.
> It makes the same text errors, apparently. Why not just say it
> can't do this? It wasn't intended to.
> Thanks for trying - useful data point!
> Bernie
> ------------------------------------------------------------------------
> *From:* Rob Kam <robkam at ymail.com>
> *Sent:* Wednesday, July 5, 2017 6:47 PM
> *To:* Bernard Arthur Hutchins Jr; mskala at ansuz.sooke.bc.ca
> *Cc:* synth-diy at synth-diy.org
> *Subject:* RE: [sdiy] Can anyone OCR the AN23.PDF File Here?
> Hi Bernie,
>
> At http://www.sdiy.info/AN23.rtf this took 10 minutes to OCR with ABBYY
> FineReader 12 <http://www.abbyy.com/en-gb/support/finereader-12/>,
> first manually identifying areas of text vs. images. Obviously it
> still needs further corrections.
>
> Rob
>
>
>
>
> _______________________________________________
> Synth-diy mailing list
> Synth-diy at synth-diy.org
> http://synth-diy.org/mailman/listinfo/synth-diy