A quest on converting PDFs to rasterized PNGs with good quality and readability

Lately I’ve scratched a personal itch of mine: writing some kind of fiction or short story.
I started with a fan fiction named “IVO” (which should give you quite a big hint about its subject). That one is currently stalled, and I have since moved on to writing short stories which, I must say, I’m enjoying much more.

One issue I’ve faced is that my following is on Twitter, which doesn’t support sharing PDFs or other files: only images and videos.
This meant that I had to convert my PDFs (or Word documents) to images in order to share them there.

PDFs and Word documents store text as vector data which, to put it simply, is “resolutionless”: you can make it as big or as small as you want and it will always look good.
This is great because it means that text will look sharp and be easily readable on every device and screen, regardless of its resolution.
And the file is usually really small in size.

PNGs and other images, on the other hand, are raster formats, which means that they are bound to a resolution. In other words, if you need text to be clear and readable, you have to use high resolutions, which not only means big files but also leaves you at the mercy of the interpolation algorithm of your OS, browser or image viewer.
Ideally you’d want an image at a certain resolution so that every person can view it at a 1:1 scale and see it in its entirety.
This is of course impossible because each screen has a different pixel density.
Just think about this: my 11″ MacBook Air has a lower resolution than my 6″ smartphone.

With this consideration in mind, we have to find some sort of middle ground to make everyone happy, so here’s the million-dollar question: what resolution do we need?

Initially I did something horrible: I just took a screenshot of each PDF.
That gave me a resolution of ~135DPI (1366×768 on an 11.6″ screen), but it had a saving grace: Preview in macOS uses absurdly strong anti-aliasing (more on this later) so the text, while being incredibly pixelated, was still readable.
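For the curious, the ~135DPI figure is just the screen diagonal in pixels divided by the diagonal in inches. A minimal sketch (the function name `screen_ppi` is made up for this example):

```shell
# Pixel density of a screen from its resolution and its diagonal size:
#   ppi = sqrt(width_px^2 + height_px^2) / diagonal_inches
# screen_ppi is a hypothetical helper name, not an existing tool.
screen_ppi() {
  awk -v w="$1" -v h="$2" -v d="$3" \
    'BEGIN { printf "%.0f\n", sqrt(w * w + h * h) / d }'
}

# The 1366x768, 11.6" screen mentioned above
screen_ppi 1366 768 11.6   # prints 135
```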

Afterwards, I used Adobe Acrobat’s own export to image at 300DPI and the result was, honestly, terrible.
But now, you might ask, how could a higher resolution image look worse than a lower one?

Introducing anti-aliasing, a term most gamers are probably aware of.
To put it briefly, anti-aliasing is a technique that smooths jagged edges in low-resolution graphics by filling the pixels along an edge with intermediate shades instead of pure foreground or background.
In simple graphics such as fonts, it also makes the strokes look slightly bolder and, as a result, more readable.

Like I said, the macOS Preview app applies a huge amount of AA, to the point where the font almost looks as if it’s in bold, while Acrobat, on the other hand, doesn’t apply any AA at all!

So after doing lots of tests I could conclude the following:

  • Anything 300DPI or lower is just not usable unless you’re using a very simple font like Courier.
  • At 600DPI the situation improves significantly and even more refined fonts such as Times New Roman become pleasing and more readable.
  • 1200DPI is what I’d recommend using though, because at this resolution you won’t need any form of AA and the text will be perfect on whatever device/screen you read it on.
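To get a feel for what those numbers mean in pixels, here is a rough sketch that converts a page size and a DPI value into image dimensions. The A4 default (210×297 mm) is an assumption made for the example, since the page size isn't stated here, and `page_pixels` is a made-up helper name.

```shell
# Pixel size of a page rendered at a given DPI: pixels = millimetres / 25.4 * dpi.
# The 210 x 297 mm default is A4, which is an assumption made for this example.
# page_pixels is a hypothetical helper name.
page_pixels() {
  awk -v dpi="$1" -v w_mm="${2:-210}" -v h_mm="${3:-297}" \
    'BEGIN { printf "%.0fx%.0f\n", w_mm / 25.4 * dpi, h_mm / 25.4 * dpi }'
}

for dpi in 300 600 1200; do
  printf '%4d DPI: %s px\n' "$dpi" "$(page_pixels "$dpi")"
done
```

Doubling the DPI quadruples the pixel count, which is why the file size grows so quickly at the high end.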

Now I asked myself: why the hell doesn’t Acrobat apply AA?

Honest reply: I don’t know. Seriously.
If you search the Internet you’ll find lots of people asking the same question, and no one, not even Adobe employees, seems to know why Acrobat has never given the user an option to apply AA when exporting PDFs to images.
Another issue with exporting super-high-resolution images containing text is that, when opened and scaled down to screen resolution, most image viewers will make the text appear incredibly thin unless you zoom in quite a bit, worsening the reading experience.

But luckily open source comes in to save the day once again: enter Ghostscript, a free and open source piece of software which… does stuff.
Yeah, it is a fundamental building block for handling PostScript and PDF in the open source world, from managing documents to printing: it would take me 20 pages to list all the things GS does.

Anyway, the good news is that Ghostscript allows us to apply pretty strong AA when exporting a PDF to images, but that’s not all.

When I first tried at 300DPI I got abysmal results, even with AA.
That led me to a solution which, once again, takes a page from the gaming world: super-sampling.
This is a technique where an image is rendered at a higher resolution than your screen can show and then downsampled to fit your screen: the result is a “fake AA” which improves the image quality in a quite noticeable way.
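To see why this works, here is a toy sketch of the idea on a single row of pixels. It uses a plain box average, which is an assumption made to keep the illustration simple and not necessarily the filter Ghostscript uses internally; `downsample_row` is a made-up name.

```shell
# Toy super-sampling: average every block of `factor` hard-edged samples
# (0 = paper, 1 = ink) into one output pixel, printed as ink coverage in percent.
# downsample_row is a hypothetical helper; the box average is a simplification.
downsample_row() {
  awk -v s="$1" -v f="$2" 'BEGIN {
    n = length(s)
    out = ""
    sep = ""
    for (i = 1; i + f - 1 <= n; i += f) {
      ink = 0
      for (j = 0; j < f; j++) ink += substr(s, i + j, 1)
      out = out sep int(100 * ink / f)
      sep = " "
    }
    print out
  }'
}

# 16 high-resolution samples across the edge of a glyph, reduced 8x to 2 pixels
downsample_row 0000111111111111 8   # prints: 50 100
```

The pixel that is only half covered by the glyph comes out as 50% gray instead of being forced to pure black or white, and that intermediate shade is exactly what anti-aliasing looks like.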

So I just did that.
I rendered the PDF at 1600DPI, applied the strongest AA (and interpolation) and then downsampled by a factor of 8, down to 200DPI.
The result was, IMHO, perfect.

But don’t take my word for it, see for yourself (make sure to click on the image and open it to see it in a 1:1 scale or 100% zoom):

Left: macOS’s own Preview app – incredibly strong AA, to the point where the font looks bold

Center: my current Ghostscript exported PNG with AA and super-sampling at 200DPI

Right: Adobe Acrobat export at 300DPI

Bear in mind that the font I use is URW’s P052, an open source clone of Palatino. It is my favorite font for fiction, but it is also incredibly detailed, so it needs particular care when being rasterized.

This is the Ghostscript command I used:

gs -dSAFER -dQUIET -dNOPLATFONTS -dNOPAUSE -dBATCH -sOutputFile="${1%.pdf}/${1%.pdf}-%d.png" -sDEVICE=pnggray -r1600 -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -dDOINTERPOLATE -dDownScaleFactor=8 -dUseTrimBox "${1}"

And here’s a brief explanation of the relevant options:

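$1 is the first argument to the script, i.e. the PDF file, and ${1%.pdf} is that name with the .pdf extension stripped, so the pages end up in a directory named after the document.

-sDEVICE=pnggray = PNG in grayscale (makes the file a lot smaller)
-r1600 = render at 1600DPI
-dTextAlphaBits=4 and -dGraphicsAlphaBits=4 = max anti-aliasing for text and graphics
-dDOINTERPOLATE = interpolation for better font rendering
-dDownScaleFactor=8 = scale down 8 times (crude super-sampling)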

This also gives me a very small file size, since the final image is just 200DPI.
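If you want to reuse the command above, it could be wrapped in a small function along these lines. This is only a sketch: `pdf_to_png`, `TARGET_DPI`, `FACTOR` and `DRY_RUN` are names invented here, not Ghostscript features. It computes the render resolution from the target DPI and the downscale factor, and it creates the output directory first since, as far as I know, Ghostscript won't create it for you. Unlike the original one-liner, it also strips any leading directory from the PNG file name.

```shell
# Hypothetical wrapper around the gs command above.
# TARGET_DPI = resolution of the final PNGs, FACTOR = super-sampling factor,
# DRY_RUN    = if set, only print the command instead of running it.
# The body runs in a subshell ( ... ), so its variables don't leak out.
pdf_to_png() (
  pdf=$1
  target_dpi=${TARGET_DPI:-200}
  factor=${FACTOR:-8}
  render_dpi=$(( target_dpi * factor ))   # 200 * 8 = 1600 by default
  name=${pdf%.pdf}                        # output directory, e.g. "story"

  set -- gs -dSAFER -dQUIET -dNOPLATFONTS -dNOPAUSE -dBATCH \
    "-sOutputFile=${name}/${name##*/}-%d.png" \
    -sDEVICE=pnggray "-r${render_dpi}" \
    -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -dDOINTERPOLATE \
    "-dDownScaleFactor=${factor}" -dUseTrimBox "$pdf"

  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"                    # show what would be run
    exit 0
  fi
  mkdir -p "$name" && "$@"
)
```

Calling `pdf_to_png story.pdf` would then write story/story-1.png, story/story-2.png and so on, while `DRY_RUN=1 pdf_to_png story.pdf` just prints the command so you can check the flags first.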

Anyway, that’s pretty much it, hope you enjoyed the read and happy exporting!
