How does \expandafter work: A detailed study of consecutive \expandafter commands

Part 1 Part 2 Part 3 Part 4 Part 5 Part 6

Case study: Understanding `\expandafter\expandafter\expandafter...`

Advanced macros, such as those found in LaTeX packages, often make extensive use of multiple consecutive \expandafter commands to perform sophisticated token “juggling”. For most of us, such macros can be difficult to understand or daunting to write. In this section we’ll review the mechanics underlying TeX’s processing of consecutive \expandafter commands:

\expandafter\expandafter\expandafter...

To assist our explanation we’ll add subscripts to each \expandafter—to indicate which one we are referring to:

        \expandafter₁\expandafter₂\expandafter₃...

In addition, we’ll extend the notation for tokens processed by each \expandafter to use $\mathrm{T^i_1}$ and $\mathrm{T^i_2}$ representing the tokens $\mathrm{T_1}$ and $\mathrm{T_2}$ read-in by \expandafter with subscript i: \expandafter_i $\mathrm{T^i_1T^i_2}$. We’ll also assume that two tokens $\mathrm{T_X}$ and $\mathrm{T_Y}$ follow after \expandafter₃ so that our input looks like this: \expandafter₁\expandafter₂\expandafter₃ $\mathrm{T_{X}T_{Y}}$.

When TeX starts to process this input, for \expandafter₁ it will see

$\mathrm{T^1_1} =\ $\expandafter₂ which is saved for later re-insertion back into the input
$\mathrm{T^1_2} =\ $\expandafter₃, which is expanded

If we refer back to our earlier discussion of the code inside TeX which implements \expandafter, it is with expansion of \expandafter₃ that we see recursion taking place. To process \expandafter₁ TeX has already called its internal function expand(), so to process (expand) \expandafter₃ TeX is making a second call to expand()—from within the expand() function itself.

For \expandafter₃ we have

$\mathrm{T^3_1 = T_X}$, which is saved for later re-insertion back into the input
$\mathrm{T^3_2 = T_Y}$, which we’ll assume is expandable

Let’s further assume that the expansion of $\mathrm{T_Y}$ yields the following sequence of tokens: $\mathrm{{T^1_Y}{T^2_Y}{T^3_Y}}\cdots\mathrm{T^N_Y}$. We’ve now reached the end of the expansion process started by the command sequence \expandafter₁\expandafter₂\expandafter₃ $\mathrm{T_{X}T_{Y}}$ and TeX proceeds to “unwind” the process of recursion which started with \expandafter₁.

After processing \expandafter₃ TeX has, in its memory, a token list containing tokens from the expansion of $\mathrm{T_Y\text{: }{T^1_Y}{T^2_Y}{T^3_Y}}\cdots\mathrm{T^N_Y}$. TeX now starts to re-insert the tokens it saved whilst processing \expandafter commands:

TeX begins with re-inserting token $\mathrm{T_X}$ saved by \expandafter₃. $\mathrm{T_X}$ is re-inserted in front of the expansion of $\mathrm{T_Y}$, which results in a token sequence: $\mathrm{{{T_X}T^1_Y}{T^2_Y}{T^3_Y}}\cdots\mathrm{T^N_Y}$.
However, we still need to complete the process started by \expandafter₁, which saved the token representing \expandafter₂.
The final sequence of tokens assembled by TeX, ready for reading in the next stage of TeX’s processing, is \expandafter_{2 (token)} $\mathrm{T_{X}T^1_{Y}T^2_{Y}T^3_{Y}\cdots T^N_Y}$.
TeX has now finished the “first round” of processing and switches to reading the sequence of token lists it has generated—that sequence starts with \expandafter_{2 (token)} which TeX proceeds to process. For \expandafter₂ we have
- $\mathrm{T^2_1} =\ \mathrm{T_X}$ which is saved for later re-insertion back into the input
- $\mathrm{T^2_2} = \mathrm{T^1_Y}$ which is the first token arising from expansion of $\mathrm{T_Y}$; if expandable, it is expanded
If we assume token $\mathrm{T^1_Y}$, the first token from expansion of $\mathrm{T_Y}$, expands to $\mathrm{{T^A_{Y1}}{T^B_{Y1}}{T^C_{Y1}}}$ then, after TeX has re-inserted $\mathrm{T_X}$, the resulting token sequence to be re-processed by TeX would be: \[\mathrm{{T_X}{T^A_{Y1}}{T^B_{Y1}}{T^C_{Y1}}{T^2_Y}{T^3_Y}\cdots{T^N_Y}}\]
which we can re-state as

\[\mathrm{T_X}\text{<expansion of the first token in }\mathrm{T_Y}\text{><remaining tokens in }\mathrm{T_Y}\text{>}\]
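As a compact summary, the two rounds of processing described above can be written as a short chain of rewrites. This is only a restatement in the notation already introduced, and it assumes, as before, that $\mathrm{T_Y}$ and $\mathrm{T^1_Y}$ are both expandable:

\[
\begin{aligned}
&\mathtt{\backslash expandafter}_1\,\mathtt{\backslash expandafter}_2\,\mathtt{\backslash expandafter}_3\,\mathrm{T_X}\,\mathrm{T_Y}\\
&\quad\longrightarrow\ \mathtt{\backslash expandafter}_2\,\mathrm{T_X}\,\mathrm{T^1_Y T^2_Y T^3_Y}\cdots\mathrm{T^N_Y}
&&\text{(first round: } \mathrm{T_Y} \text{ is expanded)}\\
&\quad\longrightarrow\ \mathrm{T_X}\,\mathrm{T^A_{Y1} T^B_{Y1} T^C_{Y1}}\,\mathrm{T^2_Y T^3_Y}\cdots\mathrm{T^N_Y}
&&\text{(second round: } \mathrm{T^1_Y} \text{ is expanded)}
\end{aligned}
\]

In both rounds $\mathrm{T_X}$ is only saved and re-inserted; it is never expanded by the \expandafter commands themselves.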

The following diagram illustrates the token lists created by the TeX code

    \expandafter₁\expandafter₂\expandafter₃T_XT_Y

[Image: TeX processing multiple \expandafter commands]

Theory into practice

By way of an example, we’ll define the following macros to serve as $\mathrm{T_X}$ and $\mathrm{T_Y}$

$\mathrm{T_X}=\ $\foo where we define \foo as \def\foo#1{\textbf{#1}}
$\mathrm{T_Y}=\ $\bar where we first define \def\abc{Hello}, \def\xyz{, World!} and then \def\bar{\abc\xyz}

We’ll use the following code fragment to demonstrate our previous analysis:

\expandafter\expandafter\expandafter\foo\bar

From our discussion, the result of \expandafter₁\expandafter₂\expandafter₃ $\mathrm{T_X}\mathrm{T_Y}$ produces a sequence of tokens of the form:

\[\mathrm{T_X}\text{<expansion of the first token in }\mathrm{T_Y}\text{><remaining tokens in }\mathrm{T_Y}\text{>}\]

where the exact sequence depends on the nature of token $\mathrm{T_Y}$. If we insert our example commands using $\mathrm{T_X}=\ $\foo and $\mathrm{T_Y}=\ $\bar, which is defined as \def\bar{\abc\xyz}, we see:

the first token in $\mathrm{T_Y}$ is \abc and its expansion is a sequence of character tokens: Hello
the only remaining token in $\mathrm{T_Y}$ is the single token representing \xyz.

If we plug this information into our “analysis” we get

\[ \begin{align*} &\mathrm{T_X}\text{<expansion of the first token in }\mathrm{T_Y}\text{><remaining tokens in }\mathrm{T_Y}\text{>}\\ &=\text{foo}_{\text{token}}\text{<expansion of \\abc>}_\text{token list (characters)}\text{xyz}_\text{token}\\ &=\text{foo}_\text{token}\text{Hello}_\text{token list (characters)}\text{xyz}_\text{token}\\[10pt] \end{align*} \]

Note that subscripts _token and _{token list (characters)} are used to emphasize (remind us) that TeX is reading integer token values, not text characters, hence there is no need to show any space characters or other delimiters after \foo: such delimiters have long since been processed or discarded; here we are firmly in TeX’s inner world of token lists and integer token values.
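One way to check this analysis is to let TeX report which token ends up directly after the \expandafter commands. The following plain TeX sketch is an illustration rather than part of the original example: it swaps \foo for the primitive \show, which reads the next token without expanding it and writes its meaning to the terminal and log file.

```latex
% Plain TeX sketch: run with `tex`, pressing Return at each \show prompt.
% Using \show in place of \foo is an assumption made for illustration.
\def\abc{Hello}
\def\xyz{, World!}
\def\bar{\abc\xyz}% overwrites plain TeX's math accent \bar in this test file

% No expansion: \show reports \bar as a macro whose body is \abc\xyz.
\show\bar

% \bar is expanded once, so \show sees \abc and reports its body, Hello.
\expandafter\show\bar

% \bar is expanded, then \abc: \show sees the character token H.
\expandafter\expandafter\expandafter\show\bar

\bye
```

In the last two cases the tokens that \show does not consume (\xyz, and then ello followed by \xyz) stay in the input and are simply typeset.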

When TeX processes the token-list sequence produced by our \expandafter commands it will typeset Hello, World!—only H is typeset in bold. We could achieve the same result by writing the equivalent TeX code \foo Hello\xyz. Observe that the definition of \foo used a single parameter; consequently, \foo absorbs the single H character token for its argument, leaving the remaining character tokens (ello) untouched.

Notes:

writing \foo\bar produces very different output: the token for \bar would be used as the argument for \foo, which results in typesetting Hello, World! with everything in bold.
writing \expandafter\foo\bar causes \bar to be expanded, which produces two tokens: $\text{abc}_\text{token}\text{xyz}_\text{token}$. Then, after the $\text{foo}_\text{token}$ is re-inserted by \expandafter, TeX processes the token sequence \[\text{foo}_\text{token}\text{abc}_\text{token}\text{xyz}_\text{token}\] which typesets Hello, World! with only Hello in bold. Here, the single token $\text{abc}_\text{token}$ is processed as the argument for the macro token $\text{foo}_\text{token}$, leaving the token $\text{xyz}_\text{token}$ untouched and its content typeset in the current font.
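The three variants discussed above can be compared side by side in a small test file. This is a minimal sketch that assumes a standard LaTeX installation and the article class; the macro definitions are the ones given in the text.

```latex
\documentclass{article}
\begin{document}
% Definitions from the text. Note that \def silently overwrites LaTeX's
% math accent \bar, which is harmless in this small test file.
\def\foo#1{\textbf{#1}}
\def\abc{Hello}
\def\xyz{, World!}
\def\bar{\abc\xyz}

% \bar and then \abc are expanded before \foo runs: only H is bold.
\expandafter\expandafter\expandafter\foo\bar

% No expansion before \foo runs: \bar is the argument, everything is bold.
\foo\bar

% \bar is expanded once before \foo runs: \abc is the argument, Hello is bold.
\expandafter\foo\bar
\end{document}
```

Each variant is placed in its own paragraph so that the three results appear on separate lines of the output.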

Note on `\expandafter` and macros with arguments

When using \expandafter to force expansion of macros it’s worth knowing how macro expansion works—particularly for macros that take arguments. Before TeX can run a macro—i.e., read and process tokens contained within the macro’s definition—TeX needs to get the macro “ready to run” by performing the initial macro-expansion process. If a macro’s definition includes the use of parameters (#1, #2, ... #9), part of the macro-expansion process requires TeX to scan the input looking for tokens comprising the argument(s) provided by the user: those argument tokens are absorbed (removed) from the input. During macro expansion TeX reads and absorbs tokens from the input to create mini token-lists, one list per argument; those token lists will subsequently be inserted into the appropriate location within the body of the macro—when TeX executes it. The final step in macro-expansion involves TeX locating the macro definition stored in memory and arranging for that location to become the source from which TeX will read its next set of input tokens. Macro execution commences when TeX starts to read and process those tokens, feeding-in the token lists previously created to store the arguments.
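The absorption of arguments during expansion can be observed directly. The plain TeX sketch below uses two hypothetical macro names, \first and \result, which do not appear elsewhere in this article. It places a single \expandafter in front of each token that has to be skipped, so that \first is expanded before \def stores its replacement text.

```latex
% Plain TeX sketch; \first and \result are hypothetical names.
\def\first#1#2{#1}% a macro with two undelimited arguments

% Each \expandafter saves one token (\def, \result, then the left brace)
% and the last one expands \first. During that expansion TeX absorbs
% {A} and {B} from the input and leaves only the token A behind.
\expandafter\def\expandafter\result\expandafter{\first{A}{B}}

\show\result % reports \result as a macro whose body is A
\bye
```

Without the \expandafter commands, \def\result{\first{A}{B}} would store the unexpanded tokens, and the arguments would only be absorbed later, each time \result is used.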

Delimiter tokens are also absorbed

If the original macro definition also used tokens acting as delimiters, TeX would also need to compare the original macro definition with the user’s use (call) of that macro, looking to find and match delimiter tokens. Once matched/located, delimiter tokens are subsequently ignored because their sole purpose is to act as “punctuation”, helping TeX pick-out and identify the actual tokens destined to form each argument.
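A delimited argument can be inspected in the same way. In this plain TeX sketch, \upto and \result are hypothetical names chosen for illustration; the full stop in the parameter text of \upto acts as the delimiter.

```latex
% Plain TeX sketch; \upto and \result are hypothetical names.
\def\upto#1.{[#1]}% argument #1 runs up to the first full stop

% \upto is expanded before \def stores its replacement text. TeX absorbs
% abc as the argument, matches the full stop and discards it; the tokens
% def are not part of the call and are left untouched.
\expandafter\def\expandafter\result\expandafter{\upto abc.def}

\show\result % body is [abc]def (\show itself adds a final full stop)
\bye
```

The stored body contains no full stop between the closing bracket and def, which shows that the delimiter was consumed during macro expansion and not passed on.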
