Converting markdown to LaTeX with md2tex

Tags: #markdown #latex #python #md2tex

This post is part of a series: Making an integrated book/website about music coding using MkDocs, LaTeX, and Python

Markdown is a friendly format for writing readable plain text with rich text features. Although interpretations of its syntax differ, it is widely used in technical writing for a reason: Markdown is easy to learn, yet expressive enough to represent many features of more verbose markup languages like HTML. In MkDocs, the conversion from markdown to HTML is handled by the excellent Python-Markdown library.

But what about converting markdown to LaTeX? There are several tools for this, including the Swiss army knife of document converters, pandoc. When I tried pandoc on the markdown sources for my book, it produced a lot of extraneous TeX commands. I would have to clean those up to get a usable TeX file, which was not ideal. I probably should have explored pandoc’s config options at this stage, but instead I looked for other tools that were more focused.

Enter md2tex

I eventually found a great Python CLI tool called md2tex, which I forked and extended with some features. It does exactly what I need: it converts the most common features of markdown to TeX.

Tools like pandoc and Python-Markdown convert the input to an intermediate representation called an abstract syntax tree (AST) before rendering the final output.

md2tex is simpler: It uses Python regular expressions to directly find markdown patterns and replace them with the matching LaTeX code. While this approach only works one way (from markdown to TeX), it lets us focus on supporting any markdown features that have a reasonable LaTeX equivalent.
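
To make the regex approach concrete, here is a minimal sketch in the same spirit. It is not md2tex's actual code: the rule table and the function name `convert_sketch` are made up for illustration, and real markdown needs many more rules and more careful edge-case handling.

```python
import re

# Hypothetical (pattern, replacement) rules, applied in order.
# In a replacement template, "\\" is a literal backslash and "\1" is group 1.
RULES = [
    (re.compile(r"^# (.+)$", re.MULTILINE), r"\\section{\1}"),       # "# Heading"
    (re.compile(r"\*\*(.+?)\*\*"), r"\\textbf{\1}"),                 # "**bold**"
    (re.compile(r"\[([^\]]+)\]\(([^)]+)\)"), r"\\href{\2}{\1}"),     # "[text](url)"
]


def convert_sketch(md: str) -> str:
    """Rewrite markdown patterns directly into LaTeX, one rule at a time."""
    for pattern, replacement in RULES:
        md = pattern.sub(replacement, md)
    return md


print(convert_sketch("# Sample document\n\nSome **bold text** and [a link](https://example.com)."))
```

In a sketch like this, supporting another markdown feature mostly means adding one more pattern to the table, which is the appeal of the design compared to building an AST first.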

From CLI to module

md2tex was initially made for CLI usage, but I needed to use it as a module in another Python script. With a small refactoring, I moved the main conversion functionality to a single convert function which can be imported elsewhere.

md2tex converts markdown to LaTeX
from md2tex import convert
 
md = """
# Sample document
 
Paragraph text.
 
- Here is a list item with some **bold text**.
- Another list item with [a link](https://example.com).
"""
 
latex = convert(md)
print(latex)
LaTeX result
\section{Sample document}
 
Paragraph text.
 
\begin{itemize}
\item Here is a list item with some \textbf{bold text}.
\item Another list item with \href{https://example.com}{a link}. 
\end{itemize}

Improving md2tex

I needed some additional functionality that md2tex didn’t provide out of the box. The codebase of md2tex is well organised and a pleasure to work with, so I began to add the functionality that I needed and fix a few bugs. The main features I added are introduced below.

Definition lists

In the book, I sometimes needed to define a set of terms in audio synthesis or explain the purpose of a handful of SuperCollider classes. For that purpose, Markdown has definition lists, which correspond rather nicely to LaTeX’s description environment (and HTML’s Description List elements).

Markdown definition list
scide
 
:   SuperCollider's IDE with integrated docs.
 
sclang
 
:   The standard SuperCollider interpreter.
 
scsynth
 
:   The SuperCollider sound server.
LaTeX result
\begin{description}
\item[scide] SuperCollider's IDE with integrated docs.
 
\item[sclang] The standard SuperCollider interpreter.
 
\item[scsynth] The SuperCollider sound server.
\end{description}
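
A rough sketch of how such a conversion could be done with a regular expression. This is an illustration under simplifying assumptions, not the code in md2tex: the names `DEFINITION` and `definition_list_to_latex` are hypothetical, each term has exactly one single-line definition, and the input is assumed to contain only the definition list.

```python
import re

# A term line, optional blank lines, then a line starting with ":" and spaces.
DEFINITION = re.compile(
    r"^(?P<term>[^\n:][^\n]*)\n(?:[ \t]*\n)*:[ \t]+(?P<text>[^\n]+)$",
    re.MULTILINE,
)


def definition_list_to_latex(md: str) -> str:
    """Turn 'term / :   definition' pairs into a LaTeX description environment."""
    items = [
        f"\\item[{m['term'].strip()}] {m['text'].strip()}"
        for m in DEFINITION.finditer(md)
    ]
    if not items:
        return md  # nothing that looks like a definition list
    return "\\begin{description}\n" + "\n\n".join(items) + "\n\\end{description}"


sample = "scide\n\n:   SuperCollider's IDE with integrated docs.\n\nsclang\n\n:   The standard SuperCollider interpreter.\n"
print(definition_list_to_latex(sample))
```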

Source code

As the book is about coding and the first edition contains 347 code examples, it was important to have useful and well organised source code displays. I updated md2tex with features for this, including the captions and highlighted lines shown in this example:

Markdown fenced code block
```sc title="Some cool sounds" hl-lines="1"
{ Pulse.ar }.play;
{ PinkNoise.ar }.play;
```
LaTeX result: minted environment with caption
\begin{listing}[H]
\begin{minted}[highlightlines={1}]{sc}
{ Pulse.ar }.play;
{ PinkNoise.ar }.play;
\end{minted}
\caption{Some cool sounds}
\end{listing}
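
The mapping from fence attributes to minted options can be sketched as follows. This is a simplified illustration rather than the implementation in md2tex: the function names are hypothetical, the attribute names `title` and `hl-lines` are taken from the example above, and attribute values are assumed to use straight double quotes.

```python
import re

FENCE = "`" * 3  # built programmatically so the sketch can live inside a fenced block

# Opening fence with language and attributes, the code body, then the closing fence.
BLOCK = re.compile(
    rf"^{FENCE}(?P<lang>\w+)(?P<attrs>[^\n]*)\n(?P<code>.*?)\n{FENCE}[ \t]*$",
    re.MULTILINE | re.DOTALL,
)
ATTR = re.compile(r'([\w-]+)="([^"]*)"')  # key="value" pairs on the fence line


def block_to_minted(match: re.Match) -> str:
    """Build a listing/minted environment from one fenced code block."""
    attrs = dict(ATTR.findall(match["attrs"]))
    options = f"[highlightlines={{{attrs['hl-lines']}}}]" if "hl-lines" in attrs else ""
    lines = [
        r"\begin{listing}[H]",
        rf"\begin{{minted}}{options}{{{match['lang']}}}",
        match["code"],
        r"\end{minted}",
    ]
    if "title" in attrs:
        lines.append(rf"\caption{{{attrs['title']}}}")
    lines.append(r"\end{listing}")
    return "\n".join(lines)


def convert_code_blocks(md: str) -> str:
    # A function replacement copies the code body verbatim, backslashes included.
    return BLOCK.sub(block_to_minted, md)


sample = FENCE + 'sc title="Some cool sounds" hl-lines="1"\n{ Pulse.ar }.play;\n{ PinkNoise.ar }.play;\n' + FENCE
print(convert_code_blocks(sample))
```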

In the final rendering from LaTeX to PDF, having SuperCollider’s somewhat esoteric syntax highlighted properly required a bit of effort. I chose the LaTeX package minted over another popular choice called listings because this would provide better support for SuperCollider syntax. I’ll explain this in the next blog post.

Citations

As an academic textbook, my book contains citations and references to other sources. The mkdocs-bibtex plugin conveniently rendered citations from the same bibliography data format that LaTeX uses: BibTeX (note that this plugin has since been discontinued). Most citation managers like Zotero and Mendeley support exporting bibliographic data to BibTeX and related data formats. My setup uses a BibTeX variant called BibLaTeX. I know, it may be a bit confusing if you are new to LaTeX and citation processing. If you want to know more about the different citation packages in LaTeX, be my guest.

There is no standard syntax for citations in markdown. But pandoc defines one that is widely used, including by mkdocs-bibtex. As in LaTeX, each source is identified with a unique citation key which usually takes the form @eskildsen2025. That same citation key is used in LaTeX, so translating between the formats is not too complicated:

pandoc-style citations in markdown
SuperCollider uses unit generators [@eskildsen2025, p. 60].
LaTeX result
SuperCollider uses unit generators \parencite[p. 60]{eskildsen2025}.
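
Since the citation key is shared, the translation can be expressed as a single substitution. The sketch below is an assumption about how this could be done, not the code in md2tex: the names are hypothetical, and it only handles one key per bracket with an optional locator after a comma, not the full pandoc citation syntax.

```python
import re

# "[@key]" or "[@key, locator]"; keys may contain word characters, ":" and "-".
CITATION = re.compile(r"\[@(?P<key>[\w:-]+)(?:,\s*(?P<locator>[^\]]+))?\]")


def cite_to_latex(match: re.Match) -> str:
    """Map one pandoc-style citation to a BibLaTeX \\parencite command."""
    locator = match["locator"]  # None when no locator was given
    postnote = f"[{locator.strip()}]" if locator else ""
    return rf"\parencite{postnote}{{{match['key']}}}"


def convert_citations(md: str) -> str:
    return CITATION.sub(cite_to_latex, md)


print(convert_citations("SuperCollider uses unit generators [@eskildsen2025, p. 60]."))
```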

If you choose the APA citation style, this will render in the text body as: “SuperCollider uses unit generators (Eskildsen, 2025, p. 60).” An entry with the corresponding bibliographic data will be included in the reference list, formatted as specified in the APA citation style guide.

Math

My book is not heavy on math, since the target audience is music students in the humanities. But it is useful to be able to show simple equations.

Math notation is tricky to represent in plain text, but this is one of the areas where LaTeX really shines. Its math notation has been adopted in several markdown rendering frameworks, e.g. GitHub.

We can include LaTeX equations directly in the markdown files, surrounded by $$ at the block level and $ inline. With the help of KaTeX, a JavaScript library, those equations are rendered with nice math formatting in the browser. And for the LaTeX version, since the equations are already written in LaTeX, the equations themselves need no conversion.

Math equations in markdown
We calculate the density of grains with this formula:
 
$$
\text{\small density} = \text{\small trigger frequency} \times \text{\small grain duration}
$$
LaTeX result
We calculate the density of grains with this formula:
 
\[
\text{\small density} = \text{\small trigger frequency} \times \text{\small grain duration}
\]
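
The only change in this example is the delimiter: the `$$` pair becomes `\[` and `\]`, while the equation body is copied as-is. A minimal sketch of that step, with a hypothetical function name and assuming `$$` never appears outside display math:

```python
import re

DISPLAY_MATH = re.compile(r"\$\$(.+?)\$\$", re.DOTALL)


def convert_display_math(md: str) -> str:
    """Replace $$...$$ with \\[...\\]; inline $...$ is already valid LaTeX."""
    # A function replacement keeps backslashes in the equation untouched.
    return DISPLAY_MATH.sub(lambda m: "\\[" + m[1] + "\\]", md)


print(convert_display_math("$$\n\\text{density} = a \\times b\n$$"))
```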

Handling custom features outside of md2tex

While md2tex is useful, it is not the right tool for absolutely all aspects of converting my book to LaTeX. For instance, it does not process audio files for the simple reason that LaTeX does not have a mechanism for embedding audio into a PDF document. It does not know how to convert mermaid diagrams to LaTeX. It also is not designed to iterate over a bunch of markdown files and organise the converted output into a full LaTeX book.

If my contributions to md2tex were to be useful to anyone else, I should probably try to avoid scope creep. So, instead of making md2tex do everything, I created a Python script to take care of the rest of the conversion process. As I added features and worked with the output, this script quickly turned into a somewhat messy piece of code. It is very far from ideal, but it solves a lot of problems and keeps md2tex reasonably simple.

I’ll cover some of this in future blog posts.
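
To give an idea of the division of labour, here is a bare-bones sketch of a wrapper that iterates over markdown files and assembles a book. It is not my actual script: `build_book`, the `main.tex` file name and the flat directory layout are assumptions for illustration, and the conversion function is passed in as a parameter (in practice it could be the `convert` function from md2tex).

```python
from pathlib import Path
from typing import Callable


def build_book(src_dir: Path, out_dir: Path, convert: Callable[[str], str]) -> Path:
    """Convert every .md file in src_dir to .tex and write a main file that inputs them."""
    out_dir.mkdir(parents=True, exist_ok=True)
    inputs = []
    # Sorted file names define the chapter order, e.g. 01-intro.md, 02-synthesis.md.
    for md_path in sorted(src_dir.glob("*.md")):
        tex_path = out_dir / md_path.with_suffix(".tex").name
        latex = convert(md_path.read_text(encoding="utf-8"))
        tex_path.write_text(latex, encoding="utf-8")
        inputs.append(rf"\input{{{tex_path.stem}}}")
    main = out_dir / "main.tex"
    main.write_text(
        "\\documentclass{book}\n\\begin{document}\n"
        + "\n".join(inputs)
        + "\n\\end{document}\n",
        encoding="utf-8",
    )
    return main
```

Keeping this orchestration outside the converter means the converter only ever sees one markdown string at a time.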

This post is part of a series: Making an integrated book/website about music coding using MkDocs, LaTeX, and Python

  1. Part 1: An integrated book and website
  2. Part 2: Converting markdown to LaTeX with md2tex (this post)
  3. Part 3: Making a pygments lexer to syntax highlight SuperCollider code
  4. Part 4: New MkDocs plugins for embedding and visualizing audio