There's a certain kind of computer science paper from the 90s, very frequently e...

svat · on Jan 9, 2022

(TLDR: Compare [2] and [4].)

These are examples of files that have gone the route DVI → PS → PDF, where the PS file contained Type 3 (bitmap) fonts without hinting instructions for on-screen viewing. If you have the original PS file you can often fix them with Heiko Oberdiek's amazing `pkfix` tool, run after `pkfix-helper` if needed [1].

(You can zoom the PDF to 500% to see the font's glyph shapes down to actual pixels; this pixelation is not inherently a problem as these PDFs look fine when printed on a typical high-resolution printer: try it!)

In more detail for your example PDF [2]: running `pdfinfo` gives:

    $ pdfinfo H-SIGMOD1999.pdf
    Title:          main.dvi
    Creator:        dvips 5.58 Copyright 1986, 1994 Radical Eye Software
    Producer:       Acrobat Distiller Command 3.0 for Solaris 2.3 and later (SPARC)
    CreationDate:   Thu Jan 24 16:43:42 2002 PST

So presumably:

• `main.tex` has been typeset with TeX into `main.dvi` at some point (the paper has a "September 1998" and "SIGMOD 1999" in the title, so presumably it's from then).

• This `main.dvi` has been converted into a `.ps` file at some point, using dvips 5.58 from 1994 (dvips 5.70 was in 1997).

• This `.ps` file has been converted into `.pdf` using Acrobat Distiller on SPARC, presumably in 2002.
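The reading of the `pdfinfo` fields above can be sketched in a few lines of Python. This is only an illustrative heuristic, not a real tool: the function names `parse_pdfinfo` and `guess_pipeline` are made up here, and the rules (a title ending in `.dvi`, "dvips" in Creator, "Distiller" in Producer) are assumptions drawn from this one example.

```python
import re

def parse_pdfinfo(text):
    """Parse pdfinfo-style 'Key:   value' lines into a dict."""
    info = {}
    for line in text.splitlines():
        # Split at the first colon only, so times like 16:43:42 stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info

def guess_pipeline(info):
    """Guess the conversion steps from the metadata (heuristic, assumed rules)."""
    steps = []
    if info.get("Title", "").lower().endswith(".dvi"):
        steps.append("TeX -> DVI")
    match = re.search(r"dvips\S*\s+([\d.]+)", info.get("Creator", ""))
    if match:
        steps.append("dvips " + match.group(1) + " -> PS")
    if "Distiller" in info.get("Producer", ""):
        steps.append("Distiller -> PDF")
    return steps

# The pdfinfo output quoted above, pasted in as a string.
sample = """Title:          main.dvi
Creator:        dvips 5.58 Copyright 1986, 1994 Radical Eye Software
Producer:       Acrobat Distiller Command 3.0 for Solaris 2.3 and later (SPARC)
CreationDate:   Thu Jan 24 16:43:42 2002 PST"""

print(guess_pipeline(parse_pdfinfo(sample)))
```

In practice you would feed it the captured output of `pdfinfo` rather than a pasted string.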

Now, with some searching online we can actually find the original PS file: it's at [3] (dated "24-Sep-1998 19:15"). dvips 5.58 is too old for pkfix to work its magic directly. But by running pkfix-helper first (which guesses the fonts based on font metrics, and in this case seems to have guessed mostly correctly, though the superscript font for footnotes is wrong), then pkfix, and then converting to PDF, we get this equivalent I just made (compare with your example [2]): [4]
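Roughly, the sequence of steps looks like the sketch below. The file names are placeholders I'm assuming for illustration, `ps2pdf` (from Ghostscript) is just one possible way to do the final conversion, and only the plain "input output" form of each command is used; any extra options you might need (e.g. forcing a font in pkfix-helper) are left out.

```python
import subprocess
from pathlib import Path

def repair_commands(src_ps, out_pdf):
    """Build the three commands: pkfix-helper, then pkfix, then PS -> PDF."""
    stem = Path(src_ps).stem
    helped = stem + "-helper.ps"   # intermediate file names are assumptions
    fixed = stem + "-pkfix.ps"
    return [
        ["pkfix-helper", src_ps, helped],  # guess fonts for old dvips output
        ["pkfix", helped, fixed],          # swap bitmap fonts for Type 1
        ["ps2pdf", fixed, out_pdf],        # convert the repaired PS to PDF
    ]

def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)

# Dry run: only prints the commands, nothing is executed.
run(repair_commands("tr-98-033.ps", "tr-98-033.pdf"))
```

Check the pkfix-helper output before trusting it, since its font guesses can be wrong (as with the footnote superscripts here).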

[1]: https://en.wikipedia.org/w/index.php?title=Pkfix&oldid=10194... and https://ctan.org/pkg/pkfix-helper?lang=en

[2]: https://courses.cs.duke.edu/spring02/cps296.1/papers/H-SIGMO...

[3]: https://www1.icsi.berkeley.edu/ftp/global/global/pub/techrep...

[4]: https://shreevatsa.net/post/2022-pkfix-etc/tr-98-033_pkfix-h...

bombcar · on Jan 9, 2022

In theory something like a Retina display could help, by overrendering and then scaling down. Too bad the DVI and PDF don't embed the original LaTeX.

mwcampbell · on Jan 9, 2022

There are also some old papers, converted to PDF by way of dvips and Ghostscript, that are completely unreadable with a screen reader unless you OCR them, because the text in the PDF is somehow not using Unicode or even ASCII. You'll see the same problem if you run pdftotext or similar on the document. My usual example is this:

https://legacy.cs.indiana.edu/~dfried/mex.pdf

(I only know about this paper because someone linked to it on HN several years ago.)

I'd be interested to know more about how the conversion to PDF ended up like this.

knuthsat · on Jan 9, 2022

This is Computer Modern optimized for printing. It's not bad: the pixels construct the glyphs in a way that makes them good for printing.

Bitmap version of the font, I guess.

gkbfj · on Jan 9, 2022

DVI files do not contain fonts. What happened here is that the DVI-to-PDF conversion targeted low-resolution bitmap output (for screen/download, not for printing). The vector font specified by the DVI file was rasterized at that point. Later, PDF-compatible (PostScript) vector versions of the Computer Modern fonts became available.

https://wearables.cc.gatech.edu/resources/latex2pdf.html