[sdiy] Can anyone OCR the AN23.PDF File Here?
Rob Kam
robkam at ymail.com
Thu Jul 6 13:24:48 CEST 2017
There's a second attempt at <http://www.sdiy.info/AN23b.rtf>, this time
converting the equations to images instead (and still manually tweaking the
OCR). It took six minutes to do from the scan/PDF, and the text still needs
comparing and correcting against the original.
There are already experts at this sort of project at Archive.org, who have
been doing this for a number of years:
<https://archive.org/details/texts&tab=about>
To put my two cents in, the synth DIY community should see whether they are
able to raise the funds to compensate (against unsold hardcopy, ebooks etc.)
for releasing Electronotes under a non-commercial Creative Commons licence
https://creativecommons.org/licenses/by-nc/2.0/uk/
Rob
From: Bernard Arthur Hutchins Jr [mailto:bah13 at cornell.edu]
Sent: 06 July 2017 01:42
To: Rob Kam <robkam at ymail.com>; mskala at ansuz.sooke.bc.ca
Cc: synth-diy at synth-diy.org
Subject: Re: [sdiy] Can anyone OCR the AN23.PDF File Here?
Thanks Rob -
But manual identification and 5 minutes/page is no good for the small
improvement. Still months of 8-hour days to do 6000 pages. My PDF is still
much better already. The equations are still unusable. It makes the same
text errors, apparently. Why not just say it just can't do this? Wasn't
intended to.
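
As a rough check on the estimate above, the arithmetic can be sketched in a few lines of Python. The page count, per-page time and day length are the figures from the message; the 21 working days per month is an assumption added here for illustration, not something stated in the thread.

```python
# Rough check of the workload estimate quoted above.
PAGES = 6000             # page count from the message
MINUTES_PER_PAGE = 5     # per-page figure from the message
HOURS_PER_DAY = 8        # working-day length from the message
WORKDAYS_PER_MONTH = 21  # assumption: about 21 working days in a month

total_hours = PAGES * MINUTES_PER_PAGE / 60
working_days = total_hours / HOURS_PER_DAY
months = working_days / WORKDAYS_PER_MONTH

print(f"{total_hours:.0f} hours, {working_days:.1f} working days, "
      f"about {months:.1f} months")
# prints "500 hours, 62.5 working days, about 3.0 months"
```

Under those assumptions the job comes to roughly three months of full-time work, which is consistent with "months of 8-hour days".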
Thanks for trying - useful data point!
Bernie
_____
From: Rob Kam <robkam at ymail.com>
Sent: Wednesday, July 5, 2017 6:47 PM
To: Bernard Arthur Hutchins Jr; mskala at ansuz.sooke.bc.ca
Cc: synth-diy at synth-diy.org
Subject: RE: [sdiy] Can anyone OCR the AN23.PDF File Here?
Hi Bernie,
The version at <http://www.sdiy.info/AN23.rtf> took 10 minutes to OCR with
ABBYY FineReader 12 <http://www.abbyy.com/en-gb/support/finereader-12/>,
first manually identifying areas of text vs. images. Obviously it still
needs further corrections.
Rob