[sdiy] Can anyone OCR the AN23.PDF File Here?
Mike HEQX
mike at heqx.com
Sun Jul 16 06:14:24 CEST 2017
Well, the OCR that I did was very accurate and super quick. It did not
like the formulae or the graphics, of course, but keywords could be
parsed out of the recognized text with a good script. For that you need
a master list of terms related to electronics, which shouldn't be too
hard to get.
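
Here is a rough sketch of the kind of script I mean, in Python. The term
list, the function name and the sample text are all made up for
illustration; a real master list would be far longer, and this assumes
the OCR output is already available as plain text.

```python
import re
from collections import Counter

# Hypothetical master list of electronics terms (illustration only).
MASTER_TERMS = [
    "op-amp", "power supply", "bipolar", "unipolar", "switching",
    "linear", "balanced", "unbalanced", "transistor",
]

def find_keywords(ocr_text, terms=MASTER_TERMS):
    """Count case-insensitive whole-term hits for each master-list term."""
    # Collapse whitespace so a term split across an OCR line break still matches.
    text = re.sub(r"\s+", " ", ocr_text.lower())
    hits = Counter()
    for term in terms:
        # Reject matches glued to other letters or hyphens, so "balanced"
        # is not counted inside "unbalanced". Plurals are not matched either.
        pattern = r"(?<![\w-])" + re.escape(term.lower()) + r"(?![\w-])"
        count = len(re.findall(pattern, text))
        if count:
            hits[term] = count
    return hits

sample = ("The Op-Amp stage runs from a bipolar power\n"
          "supply. The output is unbalanced; a second op-amp buffers it.")
print(find_keywords(sample))
```

The per-page counts could then feed the index, with anything the OCR
mangled (formulae, graphics) simply producing no hits.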
Mike
On 7/15/2017 9:29 PM, Brian Willoughby wrote:
> On Jul 6, 2017, at 3:07 PM, Mike HEQX wrote:
>> You're right about that Dave. You don't need to be able to search every single word per page. That is why a good taxonomical index is the way to go.
> Well, you can't make a good taxonomical index without at least getting the key words recognized correctly. Unless you're going to hand enter those key words, then you really should start with a good OCR.
>
> In my experience, it's helpful to have more than just the subject of the article. If there is a particular kind of power supply (bipolar, unipolar, switching, linear) or a different kind of output (balanced, unbalanced) or a different kind of gain stage (discrete transistor, op-amp, special-purpose integrated circuit), then you will benefit later if you can search for all of these minor details.
>
> For an example of the categories that I think are important when indexing a series of electronics circuit articles, see the following:
>
> http://www.diyaudio.com/forums/pass-labs/187649-concise-guide-published-amplifier-circuits-nelson-pass.html
>
> The PDF is a hand-made taxonomical index to about three decades of articles written by Nelson Pass. It goes deeper than the titles of the articles, but obviously doesn't provide a full text search.
>
> Brian Willoughby
>