Writing the PhD thesis: the tools Part I


I’m getting ready to write my PhD thesis, and for some time now I’ve been gathering information on tools that can help me get from here to there. This first post is an attempt to organize this material and to write down my thoughts on the approach I plan to take for tackling the beast. This might not be exactly what I end up doing, so I’ll try to update the post with the changes or difficulties I find along the way, keeping it as a sort of living document.

First of all I’m going to describe very briefly my current situation and the kind of workflow I’d like to pursue. The main topic of my thesis is medical image processing and most of the work I’ve done has been in MATLAB. That means a considerable amount of math, a lot of images, a graph here and there, several diagrams, and a not-so-small amount of code. For the most part, the journal papers and conference proceedings I’ve published have been written in LaTeX. This is my starting point. So, down to business.

Writing

The obvious choice for writing (typesetting) my thesis is LaTeX. I plan to split the document into chapters by working with multiple files and using the \include or \input command. More information on LaTeX and thesis writing can be found at Using LaTeX to Write a PhD Thesis. When I worked in Windows I used TeXnicCenter along with the MiKTeX distribution for typesetting my LaTeX documents. It’s free, open source software and it was great several years ago. Since I moved to the Mac I’ve tried several front-ends and editors, and I finally settled on Texmaker. It is also open source and it meets most of my demands for a LaTeX editor: syntax highlighting, code completion, code folding, a spell checker, a document structure view and a built-in PDF viewer, among other features. One feature I really like is the clean command, which erases the files (dvi, toc, aux…) generated by a LaTeX compilation (except the ps and pdf files).
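As a rough sketch of the multi-file layout mentioned above, the following script scaffolds a main file with one \include per chapter. The directory name thesis, the chapters folder and the chapter names are placeholders I made up for illustration, not a layout I have settled on. Note that \include starts each file on a new page and lets you compile a subset with \includeonly, while \input simply pastes the file in place.

```shell
set -eu

# Hypothetical layout: thesis/main.tex plus one file per chapter.
mkdir -p thesis/chapters

# Main file: class declaration plus one \include per chapter.
cat > thesis/main.tex <<'EOF'
\documentclass{report}
% \includeonly{chapters/methods}  % uncomment to compile a single chapter
\begin{document}
\include{chapters/introduction}
\include{chapters/methods}
\end{document}
EOF

# One stub per chapter; the chapter names are placeholders.
for chapter in introduction methods; do
  printf '\\chapter{%s}\n' "$chapter" > "thesis/chapters/$chapter.tex"
done

# Show the resulting layout.
ls thesis thesis/chapters
```

With this in place, each chapter lives in its own file and can be edited in whatever editor you prefer.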

This front-end for LaTeX has worked quite well for me. However, for some time now I’ve been looking to switch to an editor with a rather simple interface that is still powerful enough for most of the things I do. The main reason is that, when writing, I’d like to focus more on the writing than on the formatting[1]. I can always worry about formatting later. That’s one of the reasons that led me to write critical parts of a paper in something as simple as TextEdit, and later paste the text into Texmaker. For instance, I’d like to be able to disable syntax highlighting, hard wrap my text, spell check, and things of this sort. I’ve been giving TeXShop and TeXworks a try, and they seem to fulfill most of my requirements.

Nevertheless, I’ve recently been exploring another approach: using your favorite text editor from within any app through a global keyboard shortcut. This is done by a small app called QuickCursor. I’ve been using it with a really nice distraction-free text editor called Byword, which I highly recommend for writing a draft or a simple paragraph; it also has Markdown support. I’m actually writing this post in Byword. While working in any given app, I highlight the paragraph I want to modify (or nothing if the document is empty) and hit ctrl+cmd+B, and the text is sent to Byword. I do my writing in a distraction-free environment, in full screen if I want, hit cmd+S and cmd+W, and my text appears back in the app I was originally working in. So far this seems like an elegant and time-saving solution to my problem. You can use any text editor you want; I also have a shortcut for TextWrangler. I’ve also considered using TextMate, but I’ve never been convinced about paying around €45 for a programming text editor, however powerful it may be. If you think otherwise, please leave a comment; I may yet reconsider.

To sum up: write first, worry about formatting later, because LaTeX lets you do this somewhat effortlessly.

Tracking changes

I also want to keep track of the changes to my thesis, and the best way of doing it is with version control software. Most of the resources found on the web will tell you to go svn all the way, but I think that’s mainly because it’s been around for quite a while. After considerable online sleuthing, I’ve decided to go with git. Version control software is excellent for keeping revisions of my thesis, comparing revisions, and getting back to an old revision. Before I explored the version control paradigm, I used to keep my revisions in a simple yet impractical way. One thing I used to do was duplicate a file and rename it to something like myfile_old.tex; then I could run latexdiff and check the changes I had made to the document, and so on. The other thing was to comment out parts I wanted to remove but still wanted to keep in case I later changed my mind. For small documents this might not be much of a problem, but for a thesis it doesn’t really work out.

This somewhat naïve approach also lacks an important element of revisions: proper documentation. This is where revision control software thrives: documenting changes helps you keep in perspective what you have done and how you have done it. It can also help you resume work; if you’ve been working on something else for a week or two, getting back to your document and to the idea you were developing is not easy. With properly documented changes it becomes less difficult.

One clear advantage of git is that it is a distributed version control system, so it’s easy to work on multiple machines and to work offline. Being able to commit to your repository when you’re offline is very useful, particularly since not having internet access can provide a distraction-free environment that makes it much easier to write. However, keeping a sort of master repository on a server for backup is not a bad idea. What I plan to do is host this master repository in Dropbox[2]. This way I’ll have the advantage of backup with version control, while avoiding the constant uploading of files by Dropbox every time I hit cmd+S.
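Here is a minimal sketch of how that setup could look: the working repository lives outside Dropbox, and a bare repository inside the synced folder acts as the master copy. Everything runs in a scratch directory so it can be tried safely; the Dropbox path, the git/thesis.git location, the laptop-copy folder and the identity are stand-ins I chose for illustration.

```shell
set -eu
demo=$(mktemp -d)              # scratch area so nothing real is touched
dropbox="$demo/Dropbox"        # stand-in for ~/Dropbox (assumption)
mkdir -p "$dropbox/git"

# 1. Working repository, kept outside the synced folder.
git init -q "$demo/thesis"
cd "$demo/thesis"
git config user.name "Demo Author"          # placeholder identity
git config user.email "demo@example.com"
printf '%s\n' '\documentclass{report}' '\begin{document}' 'Draft.' '\end{document}' > main.tex
git add main.tex
git commit -q -m "Start thesis skeleton"

# 2. Bare "master" repository inside the synced folder.
git init -q --bare "$dropbox/git/thesis.git"
git remote add origin "$dropbox/git/thesis.git"

# 3. Push when you want a backup; saving a file alone touches nothing in Dropbox.
git push -q -u origin HEAD

# 4. On another machine, clone from the synced folder (simulated here).
git clone -q "$dropbox/git/thesis.git" "$demo/laptop-copy"
```

Commits stay local until the push, which is what keeps Dropbox from syncing on every save.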

Another advantage of version control I hadn’t mentioned is the ability to keep working on your thesis while your advisor reviews a revision you sent. When you get the advisor’s comments back, you can compare them directly against the specific revision you sent. If there are no conflicts you can accept all changes and continue to work. You can even experiment with crazy changes, knowing that you can easily get back to a stable version.
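The idea above could be sketched like this, again in a scratch directory. The tag name sent-to-advisor-1, the branch name crazy-idea, the file contents and the identity are all hypothetical; the point is only to show a tagged revision, continued work, and an experiment kept on its own branch.

```shell
set -eu
demo=$(mktemp -d)              # scratch directory for the dry run
git init -q "$demo/thesis"
cd "$demo/thesis"
git config user.name "Demo Author"          # placeholder identity
git config user.email "demo@example.com"

# Mark the exact revision that goes to the advisor.
echo 'First draft of the introduction.' > introduction.tex
git add introduction.tex
git commit -q -m "Draft the introduction"
git tag -a sent-to-advisor-1 -m "Revision sent to advisor"

# Keep writing while waiting for feedback.
echo 'Paragraph written while waiting for feedback.' >> introduction.tex
git commit -q -am "Extend the introduction"

# When the comments arrive, see what has changed since that revision.
git diff --stat sent-to-advisor-1 HEAD

# Try a crazy change on its own branch.
git checkout -q -b crazy-idea
echo 'Experimental restructuring.' >> introduction.tex
git commit -q -am "Try a different structure"

# Return to the stable line; the experiment stays on its branch.
git checkout -q -
cat introduction.tex
```

If the experiment works out it can be merged later; if not, the branch can simply be left alone or deleted.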

Best practices

There are some best practices for the workflow I’ve roughly described. For completeness I’ll include them here, but they were taken from this link.

  1. Define the directory structure for your thesis. You can change it later and use git to track the changes. Having a good structure will make your life easier.
  2. Work with multiple files (use include and/or input in LaTeX). You can split them by chapters or sections. This will make it easier to track changes that involve specific parts of your thesis (e.g. git log content/introduction.tex).
  3. Track only the files you are going to touch, not the auto-generated ones. Creating a proper .gitignore file will help you a lot (LaTeX generates plenty of working files).
  4. As in programs, do micro-commits, that is: one commit per idea/feature/fix/activity.
  5. Every time you commit, write meaningful (high-level) messages that explain what you were trying to achieve with each change. After a week you might not remember what you tried to accomplish.
  6. Keeping track of every activity/idea/fix [see (4) and (5)] can be very helpful for knowing how much you have done (using git log). You can write the progress report for your supervisor(s) based on git log. Even better, you can share the repository with your supervisors (using a web interface), and they can check whatever you have been doing in your thesis. For the next meeting, they will know what to expect (this depends on how fond your supervisors are of following an RSS feed).
  7. Using git will help keep you in a good mood (sometimes you will feel you have not done much, but having a record of every change will help you keep things in perspective).
  8. For every progress report you send, create a tag. For the next report, you can check out both versions and apply latexdiff. This is useful for tracking changes between the versions you submit for revision. It will also help you check whether you addressed the feedback you received on the previous report.
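Several of these practices can be tried together in a short dry run. This is only a sketch under my own assumptions: the .gitignore entries are a partial list, the tag names report-1 and report-2 and the sentences in the file are placeholders, and latexdiff is only called if it happens to be installed (otherwise a plain diff is shown).

```shell
set -eu
demo=$(mktemp -d)              # scratch directory for the dry run
git init -q "$demo/thesis"
cd "$demo/thesis"
git config user.name "Demo Author"          # placeholder identity
git config user.email "demo@example.com"

# Ignore auto-generated LaTeX files (a partial list, extend as needed).
printf '%s\n' '*.aux' '*.log' '*.toc' '*.out' '*.bbl' '*.blg' '*.synctex.gz' > .gitignore

mkdir content
echo 'First sentence of the introduction.' > content/introduction.tex
touch content/introduction.aux     # pretend LaTeX has run; git skips this file
git add .gitignore content
git commit -q -m "Add introduction with problem statement"
git tag report-1                   # hypothetical tag for the first progress report

# One commit per idea, with a message that says what was attempted.
echo 'Second sentence, added after the first report.' >> content/introduction.tex
git commit -q -am "Introduction: motivate the proposed approach"
git tag report-2

# History of one part of the thesis only.
git log --oneline -- content/introduction.tex

# Extract both tagged versions and mark up the differences.
git show report-1:content/introduction.tex > "$demo/old.tex"
git show report-2:content/introduction.tex > "$demo/new.tex"
if command -v latexdiff >/dev/null 2>&1; then
  latexdiff "$demo/old.tex" "$demo/new.tex" > "$demo/diff.tex"
else
  diff "$demo/old.tex" "$demo/new.tex" || true   # plain-text fallback
fi
```

Using git show to extract the tagged versions avoids switching the working copy back and forth just to produce the comparison.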

Other resources


  1. You might want to check Academic workflows on Mac for nice posts on this subject.

  2. Two resources I found quite useful for taking this approach were using-dropbox-git-repository and Using Git for writing thesis.

10 thoughts on “Writing the PhD thesis: the tools Part I”

  1. Congratulations on getting ready to write! It is a big project and you are wise to put thought into your approach from a toolbox perspective. In reference to my older post on SVN, if I were to write my thesis today, I would use git and github rather than svn on google code. I used what was available at the time and it worked well… but now there are options I like better.

    Five years ago (when I was writing) I would have recommended textmate as I was also using it regularly. I have since switched back to vim (macvim in particular) as I grew up in vi and am most efficient there. In terms of recommendation, vi is a great editor but hard to learn. Writing a PhD thesis is hard enough, so be careful adding difficult steps!

    • Thanks! I was really looking forward to getting some feedback. I believe the whole thesis writing endeavor can be both rewarding and frustrating, so it’s best to prepare beforehand. Likewise, documenting one’s approach could lead to a satisfactory conclusion, and being able to share it with others is quite satisfying in its own right.

  2. Very nice summary. I also wrote my thesis using latex/git, and I think it is fairly optimal. I feel sorry for the people that run manual versioning/backup by copying files around on different disks. Or the ones that just don’t and take their chances. Not once during my writing was I worried that my work would be lost, with mirrors of my git repository in many locations.

    • Like you say, latex/git seems the way to go. Your bash script is a nice addition to a seamless workflow. I also feel sorry for the people that use computers as if they were simple typewriters.

  4. Thank you for this really good blog entry, which made me stumble upon your blog. As you don’t seem to have blogged for a while, I hope you pick it up again. I really enjoy reading your blog posts.

    If you are still working on it: All the best with your PhD thesis!

If you liked this post please leave a comment or consider subscribing.
