23. Making Git Comfortable

Learning Goals

After this lesson, you should be able to:

  • Create a .gitignore file to make Git ignore specific files

  • Describe some popular Git configuration changes

  • Create Git aliases

  • Describe some strategies to fix problems with repositories

  • Name some references to turn to for help with Git

23.1. Ignoring Files

Let’s briefly consider files that you don’t want Git to track. These are typically:

  • Private files that shouldn’t be shared. Examples in this category include configuration files with passwords, authentication tokens, SSH keys, and data sets with personally identifiable information or other sensitive information.

  • Nuisance files that aren’t a meaningful part of your project. Examples include hidden files created automatically by your computer’s operating system and intermediate files generated by your project’s workflows.

  • Large files (larger than roughly 1-10 MB), because most Git hosting services have repository size limits and large files tend to slow down Git. Examples in this category include data sets and outputs. It’s usually easiest to store these files on a private server or a regular file hosting service, such as Google Drive or Box. If you need to distribute large files with your repository, consider using a tool such as Git Large File Storage or Data Version Control.

As an example, here’s a file listing for a directory containing resources for a workshop about text mining in R that DataLab offers:

ls -a
.   .DS_Store   D3VIS   gephi_tutorial   r_networks   scraping
..  .git        data    kumu_tutorial    readme.txt   text_mining

Notice the .DS_Store file and .git/ directory. Both are hidden: .DS_Store is a file that macOS creates automatically to store folder display settings, and .git/ is the directory where Git keeps the repository’s data. Git knows to ignore its own .git/ directory, but we need to tell it explicitly to ignore .DS_Store using a special file called .gitignore.

You can create a .gitignore file from scratch with a text editor. In the file, simply add the names of files and directories that you want Git to ignore, putting one per line. For example:

.httr-oauth
.DS_Store
.config
data/*

Note in this example that we’ve also added the data/ folder and any of its contents to .gitignore. This is for two reasons, which we discussed earlier: 1) it’s usually not a good idea to track data files with Git; 2) the free version of GitHub (which you’re probably using) puts a cap on the total size of a repository. It would be a waste of space to have data eating away at that size limit.

Generally speaking, you should place your .gitignore file in the root of your repository, where its patterns apply to the entire repository. GitHub provides a repository of template .gitignore files for many languages and tools, called github/gitignore.
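To check that ignore rules behave as expected, here is a small sketch that recreates the example above in a throwaway repository. The temporary directory and the file data/big.csv are made up for illustration. The git check-ignore command reports whether a path matches an ignore rule.

```shell
# Work in a throwaway repository so nothing real is affected.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Write the ignore rules from the example above, one pattern per line.
printf '%s\n' '.httr-oauth' '.DS_Store' '.config' 'data/*' > .gitignore

# Create two files that should be ignored and one that should not.
mkdir data
touch .DS_Store data/big.csv readme.txt

# git check-ignore exits with status 0 if the path is ignored.
if git check-ignore --quiet .DS_Store; then ds_store="ignored"; else ds_store="not ignored"; fi
if git check-ignore --quiet data/big.csv; then big_csv="ignored"; else big_csv="not ignored"; fi
if git check-ignore --quiet readme.txt; then readme="ignored"; else readme="not ignored"; fi

echo ".DS_Store:    $ds_store"
echo "data/big.csv: $big_csv"
echo "readme.txt:   $readme"

# Ignored files are left out of the list of untracked files.
git status --short
```

If the rules work, git status should list only .gitignore and readme.txt as untracked.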

23.2. Configuring Git

Proper configuration can make Git more comfortable and convenient to use. The git config manual page lists Git’s many configuration options. You can set options globally or per-repository; in practice, we’ve found the former more useful. To set an option, use git config set --global with the option’s name and setting, as explained in Configuring Git. Settings are stored in a configuration file, so if you prefer, you can edit the file directly. The set and edit subcommands were added in Git 2.46; on older versions, use git config --global with the option’s name and setting, and git config --global --edit, instead. To open the global configuration file in Git’s default text editor, run:

git config edit --global

The file contains sections with key-value pairs. Each section begins with a header in square brackets [ ]. Key-value pairs are indented within each section and written key = value. Lines beginning with the number sign # are ignored as comments. For example:

# Name and email.
[user]
  name = Nick Ulle
  email = naulle@ucdavis.edu

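Options listed in the manual have at least two parts separated by dots (.): the first part is the section and the last part is the key. For instance, the option user.name corresponds to the section user and key name. Some options, such as remote.origin.url, also have a middle part that names a subsection.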
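As a rough illustration of how dotted option names map onto the file format, the sketch below writes two options to a scratch file rather than to your real configuration. The name, the email address, and the temporary file are placeholders. It uses the older form of git config, with the option’s name followed by its setting, which also works on Git versions before 2.46; with a recent Git, git config set does the same job.

```shell
# Use a scratch file so your real configuration is left alone.
cfg="$(mktemp)"

# Each option name is section.key; Git creates the section as needed.
git config --file "$cfg" user.name "Ada Example"
git config --file "$cfg" user.email "ada@example.com"

# Show the file Git wrote: a [user] section with two key-value pairs.
cat "$cfg"

# Read a single setting back by its dotted name.
name="$(git config --file "$cfg" user.name)"
echo "$name"
```

To change your actual global configuration, replace --file "$cfg" with --global.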

Git’s default text editor is usually vi, but if you don’t like vi you can set it to something else with the core.editor option. For example, if you prefer micro, you can set:

[core]
  editor = micro

The setting should either be the name of a program you can run from the terminal (that is, in the PATH; see Environment Variables) or the absolute path to a program.

Tip

Two settings we recommend for a smoother Git experience are:

[push]
  autoSetupRemote = true
[pull]
  ff = only

The push.autoSetupRemote option controls whether Git automatically sets a local branch’s upstream when you run git push. With the option set to true, the first time you run git push on a branch, its upstream will be set to the specified remote.

The pull.ff option controls whether git pull is allowed to create a merge commit. With pull.ff set to only, Git aborts any pull that can’t be completed as a fast-forward, that is, any pull that would require a regular merge. This is helpful for avoiding accidental merge commits, and you can still run a regular merge when needed with git merge.
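To see what pull.ff does in practice, here is a minimal sketch that builds two throwaway repositories whose histories diverge and then tries to pull. The directory names upstream and local and the identity Ada Example are hypothetical, and the option is set per-repository here so that your global configuration is untouched.

```shell
work="$(mktemp -d)"
cd "$work"

# Placeholder identity so commits work without any global configuration.
export GIT_AUTHOR_NAME="Ada Example" GIT_AUTHOR_EMAIL="ada@example.com"
export GIT_COMMITTER_NAME="Ada Example" GIT_COMMITTER_EMAIL="ada@example.com"

# "upstream" plays the role of the remote; "local" is a clone of it.
git init --quiet upstream
git -C upstream commit --quiet --allow-empty -m "Initial commit"
git clone --quiet upstream local
git -C local config pull.ff only

# Make the histories diverge: one new commit on each side.
git -C upstream commit --quiet --allow-empty -m "Upstream work"
git -C local commit --quiet --allow-empty -m "Local work"

# This pull would need a regular merge, so Git should refuse it.
if git -C local pull --quiet; then result="merged"; else result="refused"; fi
echo "$result"
```

The script should print refused, along with an error message from Git saying that a fast-forward isn’t possible. The local repository is left unchanged, so you can decide how to combine the two histories.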

23.2.1. Aliases

In Git, an alias is a name for another command. Aliases are great shortcuts for commands that are long or difficult to remember. You can set aliases in the alias section of Git’s configuration file; the key should be whatever you want the alias’ name to be. The value of the alias should be a Git command without git at the beginning. For example, to make an alias unstage that runs git reset -- to remove a change from the staging area:

[alias]
  unstage = reset --

You can also set aliases with git config set --global.

After setting an alias, you can use it like any other Git command. So to use the unstage alias in a repository:

git unstage

Any arguments you pass to an alias are appended to the end of the command. For instance, if you only want to remove changes to README.md from the staging area, you can specify this when you run git unstage:

git unstage README.md
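The following sketch tries the unstage alias in a throwaway repository. The file notes.txt and the identity Ada Example are made up for illustration, and the alias is set per-repository with the older form of git config so that your global configuration is untouched.

```shell
# Work in a throwaway repository with one initial commit.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git -c user.name="Ada Example" -c user.email="ada@example.com" \
  commit --quiet --allow-empty -m "Initial commit"

# Define the alias for this repository only.
git config alias.unstage 'reset --'

# Stage two new files.
echo "Hello" > README.md
echo "Notes" > notes.txt
git add README.md notes.txt

# Runs "git reset -- README.md", so only README.md leaves the staging area.
git unstage README.md

status="$(git status --short)"
echo "$status"
```

In the status output, notes.txt should still be marked as added (A), while README.md should be back to untracked (??).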

Tip

Here are a few more useful aliases:

[alias]
  # Show a diff for changes in the staging area.
  staged = diff --staged

  # Show a graph of the repository's 10 most recent commits.
  ls = log --graph --oneline --all -10

  # Show the first commit for a file or directory.
  origin = log --follow --diff-filter=A --

  # Show number of commits per author for a file or directory.
  authors = shortlog --numbered --summary --

  # Print the path to the top level of a repository.
  root = rev-parse --show-toplevel

With the root alias, to quickly change to the top level of a repository from any of its subdirectories, run:

cd "$(git root)"
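Here is a small sketch of the root alias at work in a throwaway repository. The subdirectory analysis/figures is hypothetical, and the alias is set per-repository. The command substitution is quoted so that paths containing spaces are handled correctly.

```shell
# Work in a throwaway repository with a nested subdirectory.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config alias.root 'rev-parse --show-toplevel'
mkdir -p analysis/figures

# Move deep into the repository, then jump back to the top level.
cd analysis/figures
cd "$(git root)"

# The .git directory is only present at the top level.
if [ -d .git ]; then location="top level"; else location="elsewhere"; fi
echo "$location"
```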

23.3. Getting Help

Git has a well-deserved reputation as a tool that’s difficult to learn and use. One of the reasons for this is the minimal design of Git’s original command line interface. Since Git’s first release in 2005, contributors have made several improvements to the interface, but it still has rough edges. Hopefully it will continue to improve.

In this reader, we’ve focused on a small set of tasks we consider important, and where the Git interface provides multiple ways to do something, we’ve attempted to choose the simplest (which is generally the newest). Nevertheless, if you continue using Git, you’ll eventually encounter a problem or need to do something that we didn’t explain.

There’s a wealth of information about Git online. The book Pro Git by Chacon and Straub is a great reference to learn more and look for help. The website Dangit, Git!?! explains how to solve several frequently encountered problems, and Julia Evans’ blog post Confusing Git Terminology explains Git jargon. Stack Overflow, a programming question and answer site, contains many questions and answers about Git, and if you can’t find one that helps you, you can post a new question.