[sdiy] re-publishing typewritten material
mskala at ansuz.sooke.bc.ca
Mon Nov 9 14:59:48 CET 2020
On Sun, 8 Nov 2020, Barry Klein wrote:
> about to redraw the schematics. As it stands, I have the book in PDF form
> with a file size of about 40MB (334 pages). Not something you can email. I
> am sure there are those out there that would take on the job of doing all this
> for hundreds of dollars, but I don't believe the cost would be justified.
I'm surprised that it's only 40M for 334 pages of un-OCRed scanned
material. My Leapfrog VCF manual is about 40M for 76 pages, and that's
rendered directly to PDF from LaTeX, not scanned. I would expect my file
to be much more compact per page than scanned material. If your book is
40M for 334 scanned pages, that suggests the PDF conversion involved some
very aggressive lossy compression (JPEG-style), and the compressed PDF is
necessarily going to look dirtier than the original scans did. Some of
the problems you're seeing may be coming from that, rather than being
things that really need editing in the original scans.
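To put rough numbers on that comparison, here is a small Python sketch
using only the figures quoted in this thread. It assumes "M" means 2^20
bytes, which is my guess at the units and not something either file
reports. It works out to roughly 120K per page for the scanned book
against roughly 540K per page for the LaTeX-rendered manual.

```python
# Per-page size comparison from the figures quoted in this thread.
# Assumption: "M" means 2**20 bytes.
MIB = 1024 * 1024
KIB = 1024

def bytes_per_page(total_mib, pages):
    """Average bytes per page for a file of total_mib MiB."""
    return total_mib * MIB / pages

scanned = bytes_per_page(40, 334)  # the scanned book
latex = bytes_per_page(40, 76)     # the LaTeX-rendered manual

print(f"scanned book: {scanned / KIB:.0f} KiB per page")
print(f"LaTeX manual: {latex / KIB:.0f} KiB per page")
print(f"ratio: {latex / scanned:.1f}x more bytes per page in the LaTeX file")
```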
Editing PDFs is Hell in any case. It's like trying to edit Gerbers
without the original CAD files from which they were generated. If there's
any possibility of getting the original scanner output as image files -
one file per page, preferably lossless TIFF format, and I'd expect them to
be at least 1M per page (totalling much more than 40M!) - then those would
be much easier to edit. Convert them to PDF only as the very last step
before going to print, and expect the final PDF to be bigger than 40M.
Commercial printers are well-equipped to handle PDF files that are
hundreds of megabytes or more.
If there's no possibility of getting the original scans and you must work
from your 40M PDF, then the first step is to convert it into a set of
per-page lossless image files that can be edited. That part is easy.
Under Linux I'd use "pdftopng" and it would be done in a few seconds.
There may be online tools that will do this for free - just upload your
file and download the results. I could help do it but given my own life
situation the only way I could do it for free would be for a
non-commercial project that asked me nicely; anybody who is commercial
would have to pay me.
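For anyone who wants to script the conversion step, a minimal Python
sketch that wraps the Xpdf "pdftopng" tool might look like the
following. The 300 dpi default, the "page" file-name root, and the
function names are my own assumptions for illustration; the resolution
should really be chosen to match whatever the pages were scanned at.

```python
import shutil
import subprocess
from pathlib import Path

def pdftopng_command(pdf_path, out_dir, dpi=300):
    """Build the pdftopng command line: one PNG per page of pdf_path.

    The dpi default is an assumption, not taken from the original scans.
    """
    root = Path(out_dir) / "page"  # output file names start with this root
    return ["pdftopng", "-r", str(dpi), str(pdf_path), str(root)]

def split_pdf(pdf_path, out_dir, dpi=300):
    """Run pdftopng and return the per-page PNG files in page order."""
    if shutil.which("pdftopng") is None:
        raise RuntimeError("pdftopng not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(pdftopng_command(pdf_path, out_dir, dpi), check=True)
    # pdftopng zero-pads the page number, so a plain sort is page order.
    return sorted(Path(out_dir).glob("page-*.png"))

# Example (hypothetical file names):
#   pages = split_pdf("book.pdf", "pages")
```

PNG is lossless, so nothing further is lost at this step, but it cannot
bring back detail that the PDF's own compression already threw away.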
Once converted to editable files, editing would be the time-consuming part
because that requires human intervention on every page, but it sounds like
you're willing to do the editing yourself. Converting the edited per-page
files back to PDF for the final print is harder than going the other way,
but still a pretty simple thing to automate.
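As one possible way to automate the trip back to PDF, the sketch below
builds a command line for the "img2pdf" tool, which embeds images into a
PDF without recompressing them. The choice of img2pdf, the file names,
and the helper names are my assumptions, not a fixed recipe. The one
thing worth getting right is page order: if the edited files don't have
zero-padded numbers, a plain alphabetical sort puts page-10 before
page-2, so this sorts on the numeric page number instead.

```python
import re
from pathlib import Path

def page_number(path):
    """Return the last run of digits in the file name as an integer."""
    digits = re.findall(r"\d+", Path(path).stem)
    if not digits:
        raise ValueError(f"no page number in {path}")
    return int(digits[-1])

def img2pdf_command(page_files, output_pdf):
    """Build an img2pdf command with the pages in numeric order."""
    ordered = sorted(page_files, key=page_number)
    return ["img2pdf", *map(str, ordered), "-o", str(output_pdf)]

# Example (hypothetical file names); pass the result to subprocess.run:
#   img2pdf_command(Path("edited").glob("page-*.png"), "book-print.pdf")
```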
An important question is how clean you really need it to be, and how much
time or money you're willing to invest in that. Human editing is
definitely going to cost you at least a dollar per page if you pay for it,
so at least $334 for this book, and you mention hundreds of dollars being
too much. That means you either can't have human editing, or you have to
get it without paying for it. But it sounds like you may be willing to do
the editing yourself, so
then the question is how to make that do-able; and I think avoiding PDF
format for as much of the workflow as possible would help a lot.
--
Matthew Skala
mskala at ansuz.sooke.bc.ca People before tribes.
https://ansuz.sooke.bc.ca/