
Online platforms like Overleaf have significantly transformed LaTeX-based typesetting, making it more accessible and collaborative. However, if you are starting a major project in LaTeX, such as a thesis, developing a deeper understanding of the broader LaTeX ecosystem can greatly enhance your experience, even when working within Overleaf. In this post, we’ll briefly explore essential tools in the LaTeX workflow, highlight useful Overleaf features like Git integration for version control, and offer guidance on choosing templates, an often overlooked decision that can substantially affect your workflow.

Although this post is mostly about tools that improve your workflow, I conclude with some advice on the content of the thesis itself, to get you up to speed.

Quick overview of the tools

When producing high-quality documents with LaTeX, it helps to understand the different compilers included in major TeX distributions such as TeX Live or MiKTeX. The typography you intend to use often determines which compiler suits you best. The Overleaf documentation covers the history of these compilers; here we go over them concisely, so that you can quickly experiment with them on your command line (provided you have a TeX distribution such as TeX Live installed). Besides the compilers, TeX distributions also include tools that convert between the Device Independent (.dvi), PostScript (.ps) and PDF formats.

tex, latex and the DVI format

TeX and LaTeX are not the same. TeX was created by Donald Knuth and first released in 1978. A minimal working TeX example follows. After saving it as equation.tex, you should be able to compile it with tex equation.tex. The result will be a .dvi file.

$a^2 + b^2 = c^2$.
\bye

The DVI format, now largely obscure, is the output format of Knuth’s TeX and was used for printing. If needed, users would convert .dvi to PostScript using the dvips command (also included in TeX Live). PostScript superseded DVI as the format used for printing and has itself been largely replaced by PDF.

The word TeX can mean either the language (in which equation.tex is written) or Knuth’s implementation of it (i.e. the tex compiler).

LaTeX was created by Leslie Lamport as a wrapper on top of TeX. While plain TeX gives you low-level typesetting commands, LaTeX lets you describe a document in terms of its structure (title, chapters, sections). The following is a minimal working LaTeX example using the report class. After saving it as document.tex, you should be able to compile it with latex document.tex. The result will again be a .dvi file.

\documentclass{report}
\begin{document}
\title{Minimal Report Example}
\author{Your Name}
\date{\today}
\maketitle
\chapter{Introduction}
This is the first chapter of the report.
\section{Motivation}
We want a minimal example that compiles cleanly using the \texttt{report} class.
\end{document}

Like the word TeX, the word LaTeX can mean either the language (in which document.tex is written) or Lamport’s implementation of it (i.e. the latex compiler).

Originally, latex ran Donald Knuth’s TeX underneath. This is no longer the case: there is now a drop-in replacement for tex called pdftex, and in most modern TeX distributions latex actually runs pdftex underneath (see the next section).

pdftex and pdflatex

pdfTeX was developed by Hàn Thế Thành for his PhD thesis. As its name suggests, unlike Knuth’s TeX (which outputs DVI files), pdfTeX outputs PDF files. The same minimal example can be compiled with pdfTeX as

pdftex equation.tex

pdfTeX is also a drop-in replacement for Donald Knuth’s TeX. It was meant to be a maintained fork of Knuth’s tex and adds many features besides PDF support. With an extra command-line flag, it outputs DVI files instead:

pdftex -output-format=dvi equation.tex

Indeed, the latex command no longer uses Knuth’s TeX. It runs Hàn Thế Thành’s pdfTeX underneath, in DVI output mode, since latex compiles into the DVI format. It may also seem that, just as Lamport’s LaTeX is a wrapper on top of Knuth’s TeX, pdflatex is a wrapper on top of pdftex. The command used to compile document.tex seems to suggest the same:

pdflatex document.tex

While it is fine to think of it loosely this way, it is not technically accurate. pdflatex is not a separate binary but an alias for the pdftex command called with an extra argument. This argument loads a precompiled set of macros (a format file) that turns pdftex from a plain TeX compiler into a LaTeX compiler. Precisely, calling pdflatex document.tex is equivalent to calling

pdftex -fmt=pdflatex document.tex

xetex and xelatex

XeTeX was developed by Jonathan Kew. Just as pdfTeX was motivated by direct PDF output (instead of compiling to DVI and then converting that to PDF), XeTeX was motivated by adding Unicode and OpenType font support to TeX.

The relationship between xetex and xelatex is the same as that between pdftex and pdflatex (i.e. xelatex is not a separate binary, rather an alias for calling xetex with some precompiled macros, to extend it from a TeX compiler to a LaTeX compiler).

luatex and lualatex

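LuaTeX is another major engine. It embeds the Lua scripting language, which makes the engine programmable and lets much of the computation be delegated to Lua. Like XeTeX, it supports Unicode and OpenType fonts.

The relationship between luatex and lualatex is also the same as that between pdftex and pdflatex, or xetex and xelatex.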
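To see for yourself which engine ends up processing a file, the following minimal sketch may help. It assumes the iftex package (shipped with current TeX distributions) and its \ifPDFTeX, \ifXeTeX and \ifLuaTeX conditionals; the file name engine.tex is just a placeholder.

```latex
% engine.tex: prints the name of the engine that compiled it.
\documentclass{article}
\usepackage{iftex} % provides \ifPDFTeX, \ifXeTeX, \ifLuaTeX

\begin{document}
This file was compiled with
\ifPDFTeX pdfTeX\fi   % true for pdflatex, and for latex when it runs pdftex in DVI mode
\ifXeTeX XeTeX\fi     % true for xelatex
\ifLuaTeX LuaTeX\fi.  % true for lualatex
\end{document}
```

Compiling the same file with latex, pdflatex, xelatex and lualatex in turn should show that the first two report the same engine, in line with the discussion above.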

General advice: choosing a template, fonts and compiler

The LaTeX ecosystem is built on top of a multitude of packages developed by the open-source community. Unlike Python or other modern language ecosystems, LaTeX packages can often interfere with one another, and it’s not uncommon to encounter bugs or compatibility issues. In addition to the compilers discussed earlier, full TeX distributions (such as TeX Live and MiKTeX) come bundled with hundreds or thousands of packages. These packages are under active development, and since TeX Live is updated annually, version mismatches between your LaTeX code and included packages can sometimes cause documents that previously compiled successfully to break.

Given these factors, I generally recommend using templates that are actively maintained. Avoid choosing a template solely because it visually resembles your target format, especially if it hasn’t been updated in years. If your university provides a thesis or dissertation template, use it only if it’s well-documented and regularly maintained; otherwise, it may introduce more problems than it solves.

Modifying templates to match requirements

LaTeX templates are typically built on top of standard base classes like report, article, or book. Customizing elements such as page numbering, section headings, font sizes, and margin settings is usually straightforward once you’re familiar with the structure of the template. In short, most templates are highly customizable. You’re better off starting with a well-maintained and modern template that aligns with your academic discipline, then modifying it to meet specific formatting requirements, rather than starting from a fragile, poorly maintained institutional template that superficially matches your needs.
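As a rough illustration of how small such changes usually are, here is a sketch on top of the plain report class. The packages (geometry, setspace) are standard, but the specific values (12pt, A4 paper, one-inch margins, one-and-a-half line spacing) are placeholders, not requirements of any particular institution.

```latex
% Typical formatting tweaks on top of the report class.
\documentclass[12pt]{report}               % base font size
\usepackage[a4paper, margin=1in]{geometry} % paper size and margins
\usepackage{setspace}                      % line spacing commands
\onehalfspacing

\begin{document}
\pagenumbering{roman}  % front matter numbered i, ii, iii, ...
\tableofcontents
\clearpage             % flush the front matter before changing the numbering
\pagenumbering{arabic} % main matter numbered 1, 2, 3, ...

\chapter{Introduction}
Body text goes here.
\end{document}
```

A maintained template will often expose the same settings through its own options, so check its documentation before overriding them by hand.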

For example, for thesis writing, I greatly recommend mitthesis which is a very well maintained template and based on the LaTeX report class. I used it for thesis writing, and had to make a bunch of changes to comply with my institutional requirements. Almost any change is possible with minimal change to the codebase, and I greatly recommend learning how to make these changes even if you are a beginner in LaTeX.

Using older templates

If you decide to use an older template (such as one provided by your institution), Overleaf allows you to switch to an earlier TeX Live version in the project settings, which can help resolve compatibility issues caused by outdated code. This is a lesser-known but valuable feature of Overleaf, especially when working with legacy LaTeX documents.

Fonts and typography

Typography is often overlooked in scientific writing, yet it plays a crucial role in clarity and communication. Professor Edward Tufte, a leading advocate of data visualization and design, emphasizes the importance of effective and efficient typography.

Many open source developers generously make their fonts available to the general public. While the default font used by LaTeX (Computer Modern) is visually appealing, it may be a bit hard to read. I greatly recommend exploring common font schemes, many of which are included in the mitthesis template that I mentioned.

You should also be familiar with Serif and Sans Serif fonts. With the fontspec package (used with XeTeX and LuaTeX), the convention is to specify a Serif font using the \setmainfont command and a Sans Serif font using the \setsansfont command. If you then want the body text in the Sans Serif font, you need to explicitly call \renewcommand{\familydefault}{\sfdefault} to switch to Sans Serif mode. It’s a good idea to reserve \setmainfont for a Serif font, because some LaTeX classes, like beamer, use Sans Serif mode by default, and (unless you explicitly switch to Serif mode) the font specified in the \setsansfont command will be used. That said, it’s not wrong to specify a Sans Serif font as the default (i.e. in the \setmainfont command). Some font profiles in mitthesis (e.g. heros-stix2) do this.
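The convention can be sketched as follows. This assumes compilation with xelatex or lualatex and uses the TeX Gyre Termes and TeX Gyre Heros fonts purely as examples, loaded by file name on the assumption that your TeX distribution ships them under these names (TeX Live does).

```latex
% Compile with xelatex or lualatex; fontspec does not work with pdflatex.
\documentclass{report}
\usepackage{fontspec}

% Serif font for the main text.
\setmainfont{texgyretermes}[
  Extension      = .otf,
  UprightFont    = *-regular,
  BoldFont       = *-bold,
  ItalicFont     = *-italic,
  BoldItalicFont = *-bolditalic
]

% Sans Serif font, used by \textsf and by classes that default to sans.
\setsansfont{texgyreheros}[
  Extension      = .otf,
  UprightFont    = *-regular,
  BoldFont       = *-bold,
  ItalicFont     = *-italic,
  BoldItalicFont = *-bolditalic
]

% Uncomment to typeset the body text in the Sans Serif font instead.
% \renewcommand{\familydefault}{\sfdefault}

\begin{document}
Body text in the main font, with \textsf{a phrase in the sans font}.
\end{document}
```

If the fonts are installed system-wide, fontspec can usually also load them by family name, which keeps the preamble shorter.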

Bibliography schemes

Before you start typesetting, you might also want to switch from biblatex to natbib, whose \citet (textual) and \citep (parenthetical) commands give finer control over how references appear. This switch is harder to make later, as you will have to manually change each citation in the text to either \citet or \citep, depending on the context.
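For readers who have not used natbib, here is a minimal sketch of the two commands. The citation key knuth1984texbook, the file name refs.bib and the round option are choices made for this example only.

```latex
% Writes a one-entry bibliography file so the example is self-contained.
\begin{filecontents}{refs.bib}
@book{knuth1984texbook,
  author    = {Donald E. Knuth},
  title     = {The {\TeX}book},
  publisher = {Addison-Wesley},
  year      = {1984}
}
\end{filecontents}

\documentclass{article}
\usepackage[round]{natbib} % author-year citations with round brackets

\begin{document}
% Textual citation: the author name is part of the sentence.
\citet{knuth1984texbook} describes the language in detail.

% Parenthetical citation: the whole reference sits in brackets.
The language is described in detail elsewhere \citep{knuth1984texbook}.

\bibliographystyle{plainnat} % a natbib-compatible style
\bibliography{refs}
\end{document}
```

With these settings, \citet typesets as Knuth (1984) and \citep as (Knuth, 1984). The file needs the usual compile, bibtex, compile, compile cycle before the citations resolve.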

How to choose a compiler

We discussed four different LaTeX compilers above. So, how should one choose between them? At first glance, XeTeX may appear to be the obvious choice, as it supports Unicode and OpenType fonts and seems like a functional superset of pdfTeX. However, pdfTeX is often preferred in practice due to its maturity, simplicity, and slightly better performance. In most cases, unless you specifically need XeTeX’s advanced font-handling features, sticking with pdfTeX is the safer option.

For example, some projects such as mitthesis are configured in a way that assumes XeTeX users will supply their own fonts locally. This means that compiling with XeTeX may require downloading and placing specific fonts in your working directory. In contrast, when using pdfTeX, the same project will typically fall back to using the font packages bundled with your TeX distribution, making setup and compilation smoother.

This isn’t a limitation of XeTeX itself, which can certainly use system fonts or fonts shipped with the TeX distribution. Such templates default to modern OpenType or TrueType (.otf or .ttf) fonts when compiled with XeTeX because enhanced font support is one of the primary reasons to use XeTeX in the first place.

For completeness, pdfTeX can also use OpenType fonts, but doing so requires generating ancillary files and is not easy for the average user. The Overleaf documentation provides more information on font support across different compilers.
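The general pattern behind such templates can be sketched with the iftex package. This is not how mitthesis itself is implemented, only an illustration of the idea; the choice of TeX Gyre Termes, and of the tgtermes package as its pdfTeX counterpart, is an assumption made for the example.

```latex
% Selects fonts differently depending on the engine in use.
\documentclass{report}
\usepackage{iftex} % provides \iftutex (true for XeTeX and LuaTeX)

\iftutex
  % Unicode engines: load an OpenType font through fontspec.
  \usepackage{fontspec}
  \setmainfont{texgyretermes}[
    Extension      = .otf,
    UprightFont    = *-regular,
    BoldFont       = *-bold,
    ItalicFont     = *-italic,
    BoldItalicFont = *-bolditalic
  ]
\else
  % pdfTeX: fall back to the font package bundled with the distribution.
  \usepackage[T1]{fontenc}
  \usepackage{tgtermes}
\fi

\begin{document}
The same source compiles with pdflatex, xelatex and lualatex.
\end{document}
```

A template that instead expects the .otf files in the working directory would fail in the first branch until you supply them, which is the situation described above.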

Setting up an Overleaf project with Git/VSCode integration

One of the lesser-used Overleaf features is Git integration. If you have Overleaf Premium, you should absolutely take advantage of it. In your Overleaf project settings, you should see a Git URL that you can clone onto your computer, either with the command-line git tool or in your favorite Git app (including GitHub Desktop, which can also clone Git repositories hosted outside GitHub, such as Overleaf’s).

Syncing Overleaf with Git allows you to typeset in a hybrid manner, both on Overleaf as well as in your local code editor (VSCode). You can in fact use different compilers on Overleaf and locally, depending on your needs.

Working in a hybrid manner

What happens if you or one of your collaborators makes a modification on Overleaf? The change does not show up as a new commit in your local repository until you pull from origin. This is how Overleaf’s Git feature is set up. If there are conflicts between the online version and your local version, a good Git GUI app will help you resolve them efficiently. GitHub Desktop and VSCode are usually a great combination, and GitHub Desktop lets you resolve merge conflicts using VSCode.

In addition to Git, Overleaf provides a “History” feature, which is itself a form of version control. If you sync with Git, I recommend sticking with Git for version control. For example, the “revert” feature on Overleaf will still work, but it will pollute the commit history in your Git app, so I advise against using it when working in a hybrid manner.

Buildsystem and VSCode tasks

You are almost certainly familiar with the traditional LaTeX workflow that involves

<compiler> main.tex
bibtex main
<compiler> main.tex
<compiler> main.tex

where <compiler> can be any of those we discussed. In some cases biber may be used instead of bibtex.

Using latexmk

Instead of using the above sequence, the recommended way of building a LaTeX project is via latexmk. The following will use Lamport’s latex compiler and output a DVI.

latexmk main.tex

If you would like to use pdflatex (i.e. pdfTeX) instead, this becomes:

latexmk -pdf main.tex

Additionally, if you would like to use xelatex (i.e. XeTeX), this becomes:

latexmk -xelatex main.tex

By default, using xelatex with latexmk (via the -xelatex option) invokes xelatex with the -no-pdf flag. This makes it generate an intermediate XDV (extended DVI) file, which is converted to PDF by the xdvipdfmx tool (also included in TeX Live) only once the XDV file is up to date. The latexmk documentation describes this two-step process as a way to optimize processing time when several passes are needed. However, in most cases, this added complexity is unnecessary and only increases compilation time due to the extra conversion step. An alternative approach, what we might call the “unnatural” way of using XeTeX with latexmk, involves configuring latexmk to behave as if it’s using pdflatex, but substituting xelatex in its place:

latexmk -pdf -pdflatex='xelatex' main.tex

This bypasses the -no-pdf flag and allows xelatex to generate the PDF directly. In other words, the “unnatural” invocation of XeTeX in latexmk actually ends up calling xelatex in a more “natural” and efficient way.

Using tasks.json in VSCode

A great feature of VSCode is the .vscode/tasks.json file, in which you can define “tasks” to run. If you don’t currently have a tasks.json, pressing CMD + SHIFT + B (Ctrl + Shift + B on Windows and Linux) will prompt you to create one. Below is the tasks.json that I used for my thesis writing in VSCode. It defines a Build task and a Clean task, both of which use latexmk, plus a combined Build and Clean task that runs them in sequence. The build uses the xelatex compiler.

{
    // See https://go.microsoft.com/fwlink/?LinkId=733558
    // for the documentation about the tasks.json format
    "version": "2.0.0",
    "tasks": [
        {
            "label": "Build",
            "type": "shell",
            "command": "latexmk -pdf -pdflatex='xelatex' Thesis.tex"
        },
        {
            "label": "Clean",
            "type": "shell",
            "command": "latexmk -c"
        },
        {
            "label": "Build and Clean",
            // To skip cleaning, use: "dependsOn": ["Build"],
            "dependsOn": ["Build", "Clean"],
            "dependsOrder": "sequence",
            "group": {
                "kind": "build",
                "isDefault": true
            }
        }
    ]
}

The default build action, which runs on the CMD + SHIFT + B combination in VSCode, is configured to run both the Build and Clean tasks. To speed things up, you may edit the dependsOn line to run only Build and not Clean, but be aware:

  1. LaTeX projects can sometimes fail to compile even when the LaTeX code is valid (due to faulty intermediate files). If you disable the Clean action, consider running it manually if you face problems during compilation.
  2. If you disable the Clean action, either add a .gitignore file (see sample) so that temporary files are not committed to your repository, or run Clean manually before committing.

Some words on the thesis content

In some fields, like Deep Learning, the general trend is that theses are barely read, and the prevailing view is that not much time should be spent on writing them. In other words, their role has been reduced to that of a formality. That said, it doesn’t hurt to write a good thesis. Your thesis is also your opportunity to document whatever had to be trimmed from your published papers due to length limits, or to document research that could never be published.

Read theses in your area: During my undergraduate studies, I took a course in advanced programming, where we were taught techniques to write better code. The professor said something at the beginning of the course that stuck in my mind: “in order to write good code, you need to read good code”. The same applies to theses. During my PhD, I at times downloaded theses written by major contributors to the field of Deep Learning, as good benchmarks of what a thesis should look like.

Make essential connections to aid the reader: In some cases, my motivation to download and read theses was to get a more “textbook style” version of the authors’ papers, which means I was looking for elaborations rather than summaries. While time constraints may not allow you to elaborate on your papers, and the general trend is to summarize rather than elaborate, you give the reader an incentive to download and read your thesis if you make the essential connections that had to be trimmed from your published papers.

Recall your advisor meetings: Recall something you brought up in the meeting that was obvious to you but not to your advisor (or vice versa)? Those can make some really good footnotes! If you had to elaborate something you considered obvious to your advisor, it might be a good idea to do the same for other people reading your research.

Summary

We talked about the different compilers used in the LaTeX ecosystem and when one may prefer one over another. We discussed Overleaf’s Git integration and some recommendations for using it for version control of an Overleaf project, including the hybrid strategy of typesetting both on Overleaf and locally. We also covered proper usage of latexmk, the recommended way of compiling LaTeX projects, and accompanied it with a tasks.json file for VSCode that makes latexmk available via keyboard shortcuts.

We also briefly touched upon selecting templates, typography (one of the factors that affects which compiler you end up using), and other factors such as bibliography that you should consider before you start typesetting. I hope you enjoyed the read, and if you are writing your thesis soon, I hope this post will be of help to you.
