• Tip of the Week: How to convert typeset PDF pages into raster image formats

    Posted by Graham on November 22, 2017

    Suppose you want to convert one or more pages of your typeset document’s PDF file into an image file format such as PNG or JPEG—for example, to use them in a web page or to produce graphics with nicely typeset text for sharing on social media. How can you do that?

    Clearly, the conversion of pages in your document PDF will need to happen after LaTeX has finished typesetting and the final PDF file has been fully output—so where do you put the command(s) to do the conversion? The answer is to use a so-called latexmkrc file which you can very easily create and add to your Overleaf project.

    How Overleaf compiles LaTeX: latexmk

    Overleaf’s automatic compilation of LaTeX is powered by a large Perl script called latexmk, which is widely used to automate the processes involved in LaTeX-based typesetting. latexmk lets users customize its behaviour through a local configuration file called latexmkrc: a simple text file that you create and add to your Overleaf project. Interested readers may like to know that we’ve published a number of articles showing how to use latexmk on Overleaf’s servers; for example:

    Because latexmk is a Perl program, a latexmkrc file is, in effect, a small Perl script—but don’t worry if you’re not familiar with Perl programming because our example latexmkrc contains just a single line and we produced the following video to show you how to create a latexmkrc file within your Overleaf project:

    Note: there is no voiceover/sound in that YouTube video.

    Some notes on the conversion

    The latexmkrc we have used contains the single line:

    END { system('convert -density 600 -background white -flatten output.pdf pages.png'); }
    

    This uses Perl’s system(...) function to run a program called convert, which does the actual conversion. The command rasterizes the PDF file (output.pdf) at 600 dpi (using -density 600) and outputs a PNG file called pages.png. In addition to PNG, convert can generate images in other graphics formats, including JPEG. The meaning of the -density option is documented in ImageMagick’s online help.
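
    As a rough sketch of the JPEG case, the latexmkrc below mirrors the command above but writes pages.jpg instead. The output file name and the saving and restoring of $? are assumptions added here for illustration, not part of the original example. Inside a Perl END block, $? holds the exit status the script is about to return, and system(...) overwrites it, so the sketch puts latexmk's own status back afterwards.

    ```perl
    # latexmkrc: hypothetical JPEG variant of the one-line example
    END {
        my $latexmk_status = $?;    # exit status latexmk is about to return

        # List form of system(): arguments are passed to convert directly,
        # without going through a shell.
        system('convert', '-density', '600', '-background', 'white', '-flatten',
               'output.pdf', 'pages.jpg');

        # system() leaves convert's status in $?; report anything non-zero.
        warn 'convert did not finish cleanly, $? = ', $?, "\n" if $? != 0;

        $? = $latexmk_status;       # restore latexmk's own exit status
    }
    ```

    Note that -flatten combines all the images it reads into a single one, so this form is probably best suited to a single-page PDF or to a single selected page.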

    Example image

    Here is an example PNG graphic produced using the above command—for clarity, the black border was added by CSS and is not present in the converted image itself:

    A sample typeset page rasterized to a PNG graphic

    Note that the convert program has a -trim option that you can use to remove white space surrounding the typeset text.
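
    A minimal sketch of how -trim might be added, assuming the same output.pdf as above. The output name pages-trimmed.png and the +repage option (which resets the canvas geometry to the trimmed area) are additions made for this illustration. The options are placed after the input file so that they act on the image once it has been read.

    ```perl
    # latexmkrc: hypothetical variant that crops the surrounding white space
    END {
        system('convert', '-density', '600', 'output.pdf',
               '-background', 'white', '-flatten',   # place the page on a white background
               '-trim', '+repage',                   # then remove the uniform white border
               'pages-trimmed.png');
    }
    ```

    Trimming after flattening means the border being removed is the plain white background, which is usually what you want for a typeset page.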

    Choosing a particular page in your PDF

    If you want to generate a PNG file from a particular page in your PDF you can do that by putting the desired page number in square brackets [...] after the PDF file name. For example:

    END { system('convert -density 600 -background white -flatten output.pdf[1] page2.png'); }
    

    will output the second page of output.pdf as a PNG file called page2.png—the convert program uses zero-based page numbers and refers to page 1 of your PDF (output.pdf) using output.pdf[0]; page 2 is output.pdf[1] and so forth.
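
    Because the zero-based index is easy to get wrong, one option is to let Perl do the subtraction. The sketch below is only an illustration: the variable names $page and $index are hypothetical, and the page number is hard-coded.

    ```perl
    # latexmkrc: hypothetical variant that selects a page by its 1-based number
    END {
        my $page  = 2;            # page number as shown in a PDF viewer (1-based)
        my $index = $page - 1;    # convert counts pages from zero

        # Builds the arguments "output.pdf[1]" and "page2.png" when $page is 2.
        system('convert', '-density', '600', '-background', 'white', '-flatten',
               "output.pdf[$index]", "page$page.png");
    }
    ```

    Passing the arguments as a list also keeps the square brackets away from the shell, which might otherwise try to treat them as a file name pattern.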

    convert is a very powerful graphics-processing utility and is part of the ImageMagick toolset—it has a wealth of command line options and interested readers are encouraged to explore its capabilities.