How do you even lose data when using git?

I originally wrote this article for Volume 124, Issue 5 of mathNEWS.

It’s 18:00 on a Saturday night, and you see your friend Joseph force push to your git repo for compilers. Six minutes later, a commit is added to that repo: “Joseph is an idiot.” What on earth happened, and how can you prevent this catastrophe from afflicting your group?

Root cause analysis determined the cause of this particular disaster to be “Untracked files”. How? git status lists files that haven’t been committed to the repository under the heading “Untracked files”. However, if having “Untracked files” is the normal state for your repository, then you’ll have to manually sift through the output of git status to realize you forgot to commit a file you actually want to keep around for later. The chances of you doing that for each commit you make the night before the deadline? Nil. So you’ll lose the files once you git checkout another branch… and once you realize, it’s too late, and you’ll have to recreate the files from memory. The loss of precious minutes when you least have them to spare!
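If you'd rather not sift through that output by eye, here is a minimal sketch of a checker you could run before each commit. It assumes git is on your PATH and relies on the `?? ` prefix that `git status --porcelain` puts in front of untracked entries; the function names are my own invention, and paths with unusual characters (which git quotes) are not handled.

```python
import subprocess


def untracked_from_porcelain(porcelain_output: str) -> list[str]:
    """Return the paths that `git status --porcelain` marks as untracked."""
    untracked = []
    for line in porcelain_output.splitlines():
        # Untracked entries start with "?? "; other prefixes describe
        # tracked files with staged or unstaged changes.
        if line.startswith("?? "):
            untracked.append(line[3:])
    return untracked


def list_untracked(repo_dir: str = ".") -> list[str]:
    """Run git in repo_dir and list its untracked paths (hypothetical helper)."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return untracked_from_porcelain(result.stdout)
```

Calling `list_untracked()` from inside a repository should return an empty list once everything you care about is either committed or ignored, which is exactly the state this article is arguing for.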

How can you protect your code and your loved ones from this disaster? It’s easy! You simply need to make judicious use of .gitignore. Create a file named .gitignore in your repository and commit it; each non-blank line that doesn’t start with a # contains a pattern matching files that git shouldn’t track or add to the repository. Chances are, you won’t even have to write the file yourself; you can probably concatenate the files for your two or three favourite programming languages and editors (say, Go.gitignore, Erlang.gitignore, and Global/Kate.gitignore) from the collection at https://github.com/github/gitignore.
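The concatenation step is simple enough to sketch. The snippet below assumes you have already downloaded the template text yourself; the helper names are hypothetical, and the template contents in the usage example are made-up stand-ins rather than the real files from that collection.

```python
def combine_templates(templates: dict[str, str]) -> str:
    """Join several .gitignore templates, labelling each with a comment header."""
    sections = []
    for name, body in templates.items():
        sections.append(f"# ---- {name} ----\n{body.strip()}\n")
    return "\n".join(sections)


def active_patterns(gitignore_text: str) -> list[str]:
    """Return the lines git would treat as patterns: non-blank and not comments."""
    patterns = []
    for line in gitignore_text.splitlines():
        line = line.rstrip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


# Usage with stand-in template contents (not the real files):
combined = combine_templates({
    "Go.gitignore": "# binaries\n*.exe\n",
    "Kate.gitignore": "*.kate-swp\n",
})
print(active_patterns(combined))
```

Keeping a header comment per template makes it easier to see later which rule came from where when one of them turns out to be too sweeping.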

“But I use LaTeX,” you might protest, “and I also need to commit .log files generated by my code!” I do agree that TeX.gitignore contains some rather sweeping patterns, such as *.log and *.out. There’s a simple pattern to handle this too: I create a directory named latex or report at the top level of my repository and place my .tex files in it, along with a copy of TeX.gitignore renamed to .gitignore. Since the rules in latex/.gitignore only affect files stored under latex/, you’re free to add .log and .out files anywhere else in the tree.
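To make the scoping rule concrete, here is a deliberately simplified model of it. This is not git's actual matcher: it assumes every pattern is a plain glob with no slashes, and it ignores negation (`!`), anchoring, and directory-only patterns. The `is_ignored` function and the rules dictionary are purely illustrative.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_ignored(path: str, ignore_files: dict[str, list[str]]) -> bool:
    """Check a repo-relative path against per-directory .gitignore patterns.

    ignore_files maps a directory ("" for the repository root) to the
    slash-free glob patterns found in that directory's .gitignore.
    """
    target = PurePosixPath(path)
    for directory, patterns in ignore_files.items():
        # A .gitignore only governs paths underneath its own directory.
        if directory and PurePosixPath(directory) not in target.parents:
            continue
        if any(fnmatchcase(target.name, pattern) for pattern in patterns):
            return True
    return False


# Illustrative rules: TeX-style patterns live only in latex/.gitignore.
rules = {"": ["*.class"], "latex": ["*.log", "*.out"]}
print(is_ignored("latex/a1_sub.log", rules))  # True: under latex/
print(is_ignored("q1/run.log", rules))        # False: outside latex/
```

For real repositories, `git check-ignore -v <path>` will tell you which rule in which file is responsible for ignoring a given path.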

In fact, I’ve got a bit of a pattern going now with how I start my assignments:

  • se465 (folder so I can tell each "a1" repo apart)
    • a1 (this is a git repo)
      • .gitignore
      • latex
        • .gitignore
        • Makefile
        • a1_sub.tex
      • q1
        • q1.java
      • ...
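If you find yourself building this layout by hand every week, a small script can do it for you. The following is only a sketch: `scaffold_assignment` is a name I made up, the two patterns written to latex/.gitignore are a stand-in for the full TeX.gitignore, and the script creates empty files without running `git init` for you.

```python
from pathlib import Path

# Stand-in for the real TeX.gitignore; only the two patterns named above.
LATEX_IGNORE = "*.log\n*.out\n"


def scaffold_assignment(parent: Path, course: str, assignment: str) -> Path:
    """Create <parent>/<course>/<assignment> with the layout shown above."""
    repo = parent / course / assignment
    (repo / "latex").mkdir(parents=True, exist_ok=True)
    (repo / "q1").mkdir(exist_ok=True)

    # Top-level ignore file: fill it with your language and editor templates.
    (repo / ".gitignore").touch()

    # The LaTeX rules stay confined to the latex/ directory.
    (repo / "latex" / ".gitignore").write_text(LATEX_IGNORE)
    (repo / "latex" / "Makefile").touch()
    (repo / "latex" / f"{assignment}_sub.tex").touch()

    (repo / "q1" / "q1.java").touch()
    return repo
```

For example, `scaffold_assignment(Path.home() / "school", "se465", "a1")` would create the tree above under a hypothetical `~/school` directory, ready for `git init` inside `a1`.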

Now if only I had figured this out in first year. But now that I have, you don’t have to!

Just take a little bit of time when you start your next CS assignment, and make sure git tells you about only the Untracked files you care about!
