There's a certain kind of computer science paper from the 90s that you run into constantly if you're familiar with the research. The typesetting is incredibly ugly, and always ugly in the same characteristic way. See for example [1] (just the first one I found; some are quite a bit worse). I want to know the story behind this.
In the case of my example, note that when you load this paper in a browser, the tab will say "main.dvi". So I'm guessing the paper was typeset in LaTeX, published as DVI, and when PDF came along they converted it, but the DVI -> PDF conversion algorithm was really bad.
These are examples of files that have gone the route DVI → PS → PDF, where the PS file contained Type 3 (bitmap) fonts without hinting instructions for on-screen viewing. If you have the original PS file you can often fix them with Heiko Oberdiek's amazing `pkfix` tool (and Scott Pakin's `pkfix-helper` if needed) [1].
(You can zoom the PDF to 500% to see the font's glyph shapes down to actual pixels; this pixelation is not inherently a problem as these PDFs look fine when printed on a typical high-resolution printer: try it!)
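To check whether a given PDF has this problem, one option is to list its fonts with Poppler's `pdffonts` and look for Type 3 entries. The following is a minimal sketch, assuming `pdffonts` is installed and prints its usual two header lines followed by one row per font; the function names `type3_font_rows` and `has_type3_fonts` are made up for illustration.

```python
import subprocess


def type3_font_rows(pdffonts_output: str) -> list[str]:
    """Return the rows of `pdffonts` output that mention a Type 3 font.

    Assumption: the first two lines are the column names and a dashed
    rule, and the type column reads "Type 3" for bitmap-style fonts.
    """
    rows = pdffonts_output.splitlines()[2:]  # skip the two header lines
    return [row for row in rows if "Type 3" in row]


def has_type3_fonts(pdf_path: str) -> bool:
    """Run `pdffonts` on a file and report whether any Type 3 font is listed."""
    result = subprocess.run(
        ["pdffonts", pdf_path], capture_output=True, text=True, check=True
    )
    return bool(type3_font_rows(result.stdout))
```

A Type 3 font is not necessarily a bitmap font, so a hit here is only a hint; zooming in, as described above, is the quick visual confirmation.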
In more detail for your example PDF [2]: running `pdfinfo` gives:
    $ pdfinfo H-SIGMOD1999.pdf
    Title: main.dvi
    Creator: dvips 5.58 Copyright 1986, 1994 Radical Eye Software
    Producer: Acrobat Distiller Command 3.0 for Solaris 2.3 and later (SPARC)
    CreationDate: Thu Jan 24 16:43:42 2002 PST
So presumably:
• `main.tex` was typeset with TeX into `main.dvi` at some point (the paper has "September 1998" and "SIGMOD 1999" in the title, so presumably it's from then).
• This `main.dvi` was converted into a `.ps` file at some point, using dvips 5.58 from 1994 (dvips 5.70 dates from 1997).
• This `.ps` file was converted into `.pdf` using Acrobat Distiller on SPARC, presumably in 2002.
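The same reading of the metadata can be sketched in code. This is only an illustration of the reasoning above, not a reliable detector: the helper names `parse_pdfinfo` and `guess_pipeline` are hypothetical, and the rules simply look for ".dvi" in the title, "dvips" in the creator, and "Distiller" in the producer.

```python
def parse_pdfinfo(output: str) -> dict[str, str]:
    """Split `pdfinfo` output into a {field: value} dictionary."""
    fields = {}
    for line in output.splitlines():
        # Split at the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def guess_pipeline(fields: dict[str, str]) -> list[str]:
    """Guess the conversion steps from the Title, Creator and Producer fields."""
    steps = []
    if fields.get("Title", "").lower().endswith(".dvi"):
        steps.append("TeX -> DVI")
    if "dvips" in fields.get("Creator", "").lower():
        steps.append("DVI -> PS (dvips)")
    if "distiller" in fields.get("Producer", "").lower():
        steps.append("PS -> PDF (Acrobat Distiller)")
    return steps


PDFINFO_OUTPUT = """\
Title: main.dvi
Creator: dvips 5.58 Copyright 1986, 1994 Radical Eye Software
Producer: Acrobat Distiller Command 3.0 for Solaris 2.3 and later (SPARC)
CreationDate: Thu Jan 24 16:43:42 2002 PST
"""

for step in guess_pipeline(parse_pdfinfo(PDFINFO_OUTPUT)):
    print(step)
```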
Now, with some searching online we can actually find the original PS file: it's at [3] (dated "24-Sep-1998 19:15"). dvips 5.58 is too old for pkfix to work its magic directly, so I ran pkfix-helper first. It guesses the fonts based on their metrics, and in this case seems to have guessed mostly correctly, though the superscript font for footnotes is wrong. Then I ran pkfix and converted the result to PDF, which gives this equivalent I just made (compare with your example [2]): [4]
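For anyone wanting to repeat this, the order of operations can be sketched as follows. This assumes `pkfix-helper`, `pkfix` and Ghostscript's `ps2pdf` are on the PATH and that each accepts an input file followed by an output file; check the man pages for your versions, since no options are passed here. The function names and the intermediate file names are invented for the sketch.

```python
import subprocess
from pathlib import Path


def repair_commands(ps_in: str, pdf_out: str, use_helper: bool = True) -> list[list[str]]:
    """Build the command lines for pkfix-helper -> pkfix -> ps2pdf.

    Intermediate file names are derived from the output name
    (hypothetical naming, chosen for this sketch).
    """
    stem = Path(pdf_out).with_suffix("")
    helped = f"{stem}-helped.ps"
    fixed = f"{stem}-fixed.ps"

    commands = []
    source = ps_in
    if use_helper:
        # Only needed when the dvips output is too old for pkfix alone.
        commands.append(["pkfix-helper", source, helped])
        source = helped
    commands.append(["pkfix", source, fixed])
    commands.append(["ps2pdf", fixed, pdf_out])
    return commands


def repair(ps_in: str, pdf_out: str, use_helper: bool = True) -> None:
    """Run each step in order, stopping at the first failure."""
    for command in repair_commands(ps_in, pdf_out, use_helper):
        subprocess.run(command, check=True)
```

Since pkfix-helper is guessing, it is worth comparing the result against the original, as with the footnote superscripts above.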
There are also some old papers, converted to PDF by way of dvips and Ghostscript, that are completely unreadable with a screen reader unless you OCR them, because the text in the PDF is somehow not using Unicode or even ASCII. You'll see the same problem if you run pdftotext or similar on the document. My usual example is this:
DVI files do not contain fonts. What happened here is that the DVI to PDF conversion targeted a low-resolution bitmap output (for screen/download, not for printing). The vector font specified by the DVI file was rasterized at that point. Later, PDF-compatible (PostScript) vector versions of the Computer Modern fonts became available.
[1] https://courses.cs.duke.edu/spring02/cps296.1/papers/H-SIGMO...