Scanning old documents (like EN)
KA4HJH
ka4hjh at gte.net
Thu Feb 17 00:13:41 CET 2000
Aside from the tangle of legal issues involved, the fact is that
scanning old documents and touching up the results is a
time-consuming, laborious business. Something the size of the EN
collection would be a massive undertaking.
Then there's OCR, Optical Character Recognition. The idea is to
turn the scanned text back into ASCII text, which takes up a LOT
less space in a pdf file!
As an experiment I have already done this with a couple of
interesting old articles, with impressive results (I think). The only
problem is that it took FOREVER to get one done. Part of the problem
stems from the fact that things like parts lists and technical jargon
are not readily recognized by the OCR software, and have to be
corrected by hand. "S5" becomes "SS" and "R11" becomes "Rll". It's
still faster than retyping the original but a real pain. The better
programs, like OmniPage, can be "trained" to a certain extent, but I
have never had the opportunity to look into this.
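Some of that hand correction could probably be automated after the OCR pass. What follows is only a minimal sketch in Python, not anything taken from OmniPage or any other OCR package: the function name fix_designators, the lookalike table, and the list of designator prefixes are all assumptions made up for illustration. It looks for a component prefix letter followed by one to three characters that are digits or letters commonly mistaken for digits, and swaps the lookalikes for the digits they resemble.

```python
import re

# Letters that OCR tends to confuse with digits.
# This table is an assumption for illustration and is far from complete.
LOOKALIKES = str.maketrans({"l": "1", "I": "1", "O": "0", "S": "5"})

# Assumed designator prefixes: R resistor, C capacitor, S switch,
# Q transistor, D diode. The numeric part may contain lookalike letters.
DESIGNATOR = re.compile(r"\b([RCSQD])([0-9lIOS]{1,3})\b")


def fix_designators(line: str) -> str:
    """Replace digit lookalikes in the numeric part of each designator."""
    return DESIGNATOR.sub(
        lambda m: m.group(1) + m.group(2).translate(LOOKALIKES),
        line,
    )


if __name__ == "__main__":
    print(fix_designators("Rll 10k, SS SPST"))
```

A pattern this loose will also mangle ordinary short words (for example "SO" would turn into "S0"), so it is best run only on lines known to come from a parts list, with the result still checked by eye.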
Once I had the text converted and the images tweaked I recreated the
original article in FrameMaker, then used Acrobat Distiller 4 to
create the final pdf. Not only does it print great, it looks great on
the screen, too. MUCH more readable than a scan. You can zoom in as
deep as you want.
I hope to have this online so everyone can take a look at it this week.
Terry Bowman, KA4HJH
"The Mac Doctor"
ICQ: 45652354