[sdiy] Remember the Internet Archive (Re: EKO Stradivarius)
john at sleefamily.org
Sun Apr 3 08:58:08 CEST 2022
In addition to Brian’s comments, if you are also doing OCR:
* scan in greyscale, not 1-bit; in my experience (several million pages, using Kadmos OCR library) that helps OCR accuracy significantly
* find some PDF software that lets you place the OCR text as an invisible layer under the image, so you can search but see the original text (and thus not get confused by OCR errors)
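One plausible reason greyscale helps, sketched below in plain Python: a 1-bit scan makes the black/white decision in the scanner, at a fixed threshold, and faint strokes can vanish before the OCR engine ever sees them. The pixel values and the threshold of 128 are made-up illustrations, not measurements from any scanner or from Kadmos.

```python
# Two scanlines as 8-bit greyscale (0 = black, 255 = white).
# All values are hypothetical, chosen only to illustrate the point.
bold_stroke = [255, 250, 200, 120, 40, 10, 40, 120, 200, 250, 255]
faint_stroke = [255, 230, 180, 140, 180, 230, 255]


def to_one_bit(pixels, threshold=128):
    """Collapse 8-bit greyscale to 1-bit: 0 = black, 1 = white."""
    return [1 if p >= threshold else 0 for p in pixels]


bold_bw = to_one_bit(bold_stroke)
faint_bw = to_one_bit(faint_stroke)

print("bold  :", bold_bw)   # the stroke survives, but its soft edges are gone
print("faint :", faint_bw)  # every pixel is white: the stroke has disappeared
```

With the greyscale data kept, a later stage can still pick a different threshold (or none at all); with the 1-bit data, the faint stroke is unrecoverable.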
Choice of scanner also makes a huge difference to workflow, and I don’t mean just autofeed. Stuff like Kodak’s PerfectPage tech can dramatically reduce post-scan cleanup work.
I’ve been out of that game for a decade, but it was an interesting stream of work and I miss it.
On Sun, 3 Apr 2022, at 03:30, Brian Willoughby wrote:
> Since archival is being suggested, I would like to recommend that the
> scan be done carefully and a proper file format be chosen.
> First of all, JPEG (.jpg) is a bad format for text and line drawings
> like schematics. The format was designed for full color photographs of
> natural scenes, but JPEG will distort the edges of high-contrast images
> like black ink on white paper. Choose TIFF or PNG or even fax
> compression, all of which are lossless.
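A toy illustration of the edge problem, using only the Python standard library. This is not JPEG: it is a 1-D DCT over one 8-pixel row with a single made-up quantisation step (`STEP = 50`), with no 2-D blocks, no standard quantisation tables and no entropy coding. It only shows the general mechanism (transform, coarse quantisation, inverse transform) next to a lossless round trip through DEFLATE, the compression that PNG uses.

```python
import math
import zlib

N = 8  # one 8-pixel row; real JPEG transforms 8x8 blocks


def scale(k):
    """Normalisation factor for the orthonormal DCT-II."""
    return math.sqrt((1 if k == 0 else 2) / N)


def dct(x):
    """Orthonormal 1-D DCT-II of an N-sample row."""
    return [scale(k) * sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N)
                           for n in range(N))
            for k in range(N)]


def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    return [sum(scale(k) * coeffs[k] * math.cos(math.pi * (n + 0.5) * k / N)
                for k in range(N))
            for n in range(N)]


# Hard edge: white paper on the left, black ink on the right.
edge = [255, 255, 255, 255, 0, 0, 0, 0]

STEP = 50  # hypothetical quantisation step, chosen only for illustration
quantised = [round(c / STEP) * STEP for c in dct(edge)]
lossy = [round(v) for v in idct(quantised)]

# Lossless path: compress and decompress the same row with DEFLATE.
lossless = list(zlib.decompress(zlib.compress(bytes(edge))))

print("original:", edge)
print("lossy   :", lossy)     # no longer a clean step; values may overshoot 0..255
print("lossless:", lossless)  # identical to the original
```

A real encoder would clamp the lossy values back into 0..255, but the flat white and flat black regions still come back uneven, which is the kind of edge distortion described above.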
> Second of all, sometimes it's necessary to pay attention to the color
> of the material behind the pages when scanning, to avoid bleedthrough
> of the image on the back side. When working on troublesome sources,
> using 8-bit monochrome scans and then applying a little bit of contrast
> adjustment after the scan can really help with the clarity of the image.
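The contrast adjustment described above can be sketched as a linear stretch between a chosen black point and white point. The pixel values and the two points (40 and 200) are hypothetical; in practice they would be picked by looking at the actual scan.

```python
def stretch_contrast(pixels, black_point=40, white_point=200):
    """Linearly map black_point..white_point onto 0..255, clipping the rest.

    Anything at or above white_point becomes pure white, which is what
    removes light-grey bleedthrough; anything at or below black_point
    becomes pure black.
    """
    span = white_point - black_point
    out = []
    for p in pixels:
        scaled = (p - black_point) * 255 / span
        out.append(max(0, min(255, round(scaled))))
    return out


# One row of an 8-bit greyscale scan (0 = black, 255 = white), made up for
# illustration: paper ~250, bleedthrough ~210-215, soft ink edge 100, ink 20-30.
row = [250, 215, 100, 30, 20, 210, 248]

print(stretch_contrast(row))  # [255, 255, 96, 0, 0, 255, 255]
```

The bleedthrough pixels land on pure white while the soft edge of the ink keeps an intermediate grey, which is not possible if the scan was already 1-bit.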
> I'm always appreciative of any scan, but there's always more demand
> when no archive exists. Unfortunately, a poor scan might make it less
> likely that someone will re-scan to make the schematics more readable.
> I mention this because I've attempted repairs of vintage synths where
> the specific information that I needed was obscured due to
> bleedthrough and/or JPEG edge artifacts. These are reasonably easy to avoid,
> but it's just that the automatic settings on scanners don't always do
> the best job. It takes some careful parameter settings.
> Brian Willoughby
> On Apr 2, 2022, at 08:42, Anthony Carrico <acarrico at memebeam.org> wrote:
>> On 4/2/22 04:14, Rutger Vlek wrote:
>>> I'm not sure what exactly you are suggesting
>>> This is a rare 70s machine, so the service manual may never have existed in digital form. So there is little chance of finding it in the archive. Or are you suggesting to put it there?
>> Sorry--I'm suggesting putting it there!
>> You have one of the few copies, and you have taken the first step by digitizing it, so you might as well take the next step and archive it. That way you will make the schematic available for posterity, and you will be able to easily refer to the archived copy in your posts to Synth-diy and GearSpace.
>> Anthony Carrico