How does \expandafter work: TeX uses temporary token lists


Expansion and internal token lists

So far we have explored tokens, token lists and the core principles behind TeX’s concept of expansion. In this section we’ll use the TeX primitive \jobname to introduce an important aspect of expansion processing: TeX’s use of temporary token lists—which are a fundamental aspect of how \expandafter works, as we’ll see later in this article series.

\jobname is an expandable TeX primitive whose expansion generates a series of character tokens representing the name of the main input .tex file, without its .tex extension. For example, suppose the following text is part of a .tex file called mycode.tex:

    The name of my file is \jobname .tex %Note the space after \jobname

This would typeset

    The name of my file is mycode.tex

When we use the \jobname command, the resulting typeset characters are not read from your physical .tex file, so where do they come from? Where does TeX store, and read, those tokens? Invisible to the user (i.e., deep inside TeX itself), the expansion process for \jobname creates a temporary token list constructed from the series of character tokens which represent the name of your file. Once that token list has been created, TeX temporarily “switches its gaze” away from its current input source (here, our .tex file) to read character tokens from the temporary token list. Each time TeX needs another token of input, it reads the next token from that internal list, and it continues to do so until it reaches the end of the list, at which point TeX resumes reading tokens from its previous input source: here, the text in our .tex input file.

As shown in the following diagram, TeX resumes reading the .tex input file at the exact location where it stopped after processing \jobname: after reading the space character but before reading the “.” character. The period (.) is, in practice, waiting to be read from TeX’s input buffer, a small area of TeX’s memory designed to hold a line of text read from the .tex file. TeX reads and processes your .tex file one line at a time; it does not read the entire file into memory.

When studying the following graphic, read from the bottom and work upwards to follow the process flow.

[Figure: How TeX expands \jobname]

Referring back to our discussion of expansion, we can observe that the expansion of \jobname resulted in the \jobname command (token) being removed from the input and replaced with tokens arising from expansion: the temporary token list generated to hold the name of the .tex file.
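
One way to see those replacement tokens for yourself is to capture them in a macro and ask TeX to describe it. The following plain TeX sketch assumes it is saved as mycode.tex; the macro name \thisjob is a hypothetical name chosen purely for illustration.

    % Save as mycode.tex and run with plain TeX (for example: tex mycode)
    \edef\thisjob{\jobname}% \edef expands \jobname and stores the resulting character tokens
    \message{[\meaning\thisjob]}% should write [macro:->mycode] to the terminal and log
    \bye

Because \edef expands \jobname while it builds the definition, \thisjob holds the character tokens themselves, not the \jobname token.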

Expandable commands (such as \jobname) are not the only TeX primitives that “secretly” create and use token lists to achieve their effect. For example, the commands \uppercase and \lowercase both create internal token lists to change the case of their argument. Once the case-changing work is done, TeX switches to read character tokens from the token lists generated by those commands. Token lists are TeX’s only “token data storage” mechanism—apart from writing data out to a physical disk file.
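
As a small illustration, again assuming plain TeX and a file called mycode.tex, the sketch below passes some text containing \jobname to \uppercase. \uppercase changes only the character tokens in its argument; control sequence tokens such as \jobname are copied to the internal token list unchanged.

    % Assumes plain TeX and a file called mycode.tex
    \uppercase{my file is \jobname}% expected to typeset: MY FILE IS mycode
    \bye

The file name stays in lowercase because \jobname is expanded only after the case change, when TeX reads it back from the token list that \uppercase created.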

Sources of tokens: TeX is a master juggler

When TeX processes a typical document it has to manage many sources of tokens: input coming from numerous physical disk files and innumerable internal token lists created during processing. In this section we'll very briefly explore how TeX manages to “juggle” those input sources.

Suppose we want a simple macro that typesets the name of our .tex file:

    \def\myfile{The name of my file is \jobname .tex}

Later, at some point in our .tex file, we call the macro \myfile. TeX temporarily switches from creating tokens from the text in your .tex file to reading tokens from the \myfile definition (a token list) stored in its memory. When TeX executes the \myfile macro (processes its tokens) it will detect a token representing the \jobname command, whose expansion creates yet another temporary token list from which TeX has to read tokens. Even in this simple scenario TeX has to manage three sources of input:

the .tex text file containing the \myfile macro;
the token list which stores the definition of the \myfile macro;
a token list created by the \jobname command within the \myfile macro.

As TeX processes a document it constantly switches between input sources (physical files and token lists), so how does it keep track of them? The answer is that, internally, TeX engines maintain a so-called input stack, which acts as a sort of “memory” that allows TeX to remember what it was doing (where it was reading from) as it switches between sources of input.

Without going too far into the details, the internal code within TeX engines uses a global variable called curinput (current input) which, among other things, tells TeX whether it is currently reading from a physical file or a token list. curinput also points TeX to the location (in the current token list or its text buffer) from which it should get the next token. If TeX is reading from a token list, curinput also records what type of token list is being processed: for example, whether it is the list of tokens stored as a macro or whether those tokens arose from a different source.

When required, the curinput variable will be changed to point to a new input source and TeX’s current “input state” (source and location) will be saved in the input stack so that TeX can later return to that exact location (position in a .tex file or the next token in a token list). Once that new input source is exhausted (e.g., no more tokens in the token list or the end of a file is reached) it is popped off the stack and curinput is updated to ensure TeX reverts to getting tokens from the previous source.
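
TeX gives a glimpse of its input stack whenever it displays the context lines that accompany an error message or a \show command. The following sketch assumes plain TeX (where \errorcontextlines is 5 by default) and a file called mycode.tex. The \show\relax inside the macro is added purely as a probe; it is not part of the \myfile definition used earlier.

    % Assumes plain TeX and a file called mycode.tex
    \def\myfile{The name of my file is \jobname \show\relax .tex}
    \myfile % in the default interaction mode TeX pauses at \show; press Return to continue
    \bye

The context lines should show two levels of the input stack: the token list of \myfile, split at the point TeX has reached, and below it the line of mycode.tex containing the call. The token list created by \jobname does not appear because, by the time \show runs, TeX has already finished reading it and removed it from the stack.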

Digging deeper (optional reading)

The following sections provide additional background information for readers who enjoy the details.

Real token lists

The following graphic was generated using Overleaf’s custom build of Knuth’s TeX, which provides access to TeX’s internal data and data structures. This illustration of a token list builds on the simplified version presented above and includes additional data: for example, it shows that the characters generated by \jobname have category code 12, not the category code 11 usually assigned to letters. In this diagram “node” is just the name given to a unit of memory storage used by TeX.

[Figure: Inside a TeX token list]
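
The category code difference has a practical consequence that is easy to check. The sketch below, assuming plain TeX and a file called mycode.tex, compares the tokens produced by \jobname with the same characters typed directly into the file; \thisjob and \typed are hypothetical macro names used only for this illustration.

    % Assumes plain TeX and a file called mycode.tex
    \edef\thisjob{\jobname}% characters produced by \jobname: category code 12
    \def\typed{mycode}%      letters read from the file: category code 11
    \ifx\thisjob\typed
      \message{[same tokens]}%
    \else
      \message{[different tokens]}% expected branch: the category codes differ
    \fi
    \bye

\ifx compares the two macro definitions token by token, taking both character code and category code into account, so the comparison is expected to fail even though both macros would typeset identically.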

How TeX reads and processes \jobname

Also, for completeness, here is an overview of TeX’s “thought processes” as it detects \jobname in our input .tex file. In this graphic we see how TeX detects an escape character (\ with category code 0), processes the character sequence jobname, generates a token and looks up the meaning of the \jobname command. That lookup tells TeX that \jobname has a command code > 100, indicating it is an expandable command.

[Figure: How TeX scans for and processes \jobname]