23. Making Git Comfortable
Learning Goals
After this lesson, you should be able to:
Create a .gitignore file to make Git ignore specific files
Describe some popular Git configuration changes
Create Git aliases
Describe some strategies to fix problems with repositories
Name some references to turn to for help with Git
23.1. Ignoring Files
Let’s briefly consider files that you don’t want Git to track. These are typically:
Private files that shouldn’t be shared. Examples in this category include configuration files with passwords, authentication tokens, SSH keys, and data sets with personally identifiable information or other sensitive information.
Nuisance files that aren’t a meaningful part of your project. Examples include hidden files created automatically by your computer’s operating system and intermediate files generated by your project’s workflows.
Large files (over 1-10 MB), because most Git hosting services have repository size limits and large files tend to slow down Git. Examples in this category include data sets and outputs. It’s usually easiest to store these files on a private server or a regular file hosting service, such as Google Drive or Box. If you need to distribute large files with your repository, consider using tools such as Git Large File Storage and Data Version Control.
As an example, here’s a file listing for a directory containing resources for a workshop about text mining in R that DataLab offers:
ls -a
. .DS_Store D3VIS gephi_tutorial r_networks scraping
.. .git data kumu_tutorial readme.txt text_mining
Notice the .DS_Store file and .git/ directory. These are hidden
configuration files for macOS and Git, respectively. Git is pretty smart, and
it knows to ignore its own hidden configuration files, but we need to tell it
explicitly to ignore the other one using a special file called .gitignore.
You can create a .gitignore file from scratch with a text editor. In the
file, simply add the names of files and directories that you want Git to
ignore, putting one per line. For example:
.httr-oauth
.DS_Store
.config
data/*
Note in this example that we’ve also added the data/ folder and any of its
contents to .gitignore. This is for two reasons, which we discussed
earlier: 1) it’s usually not a good idea to track data files with Git; 2) the
free version of GitHub (which you’re probably using) puts a cap on the total
size of a repository. It would be a waste of space to have data eating away at
that size limit.
Generally speaking, you should place your .gitignore file in the root of your
repository, where it will control Git behavior for the whole repository. GitHub
maintains a repository of template .gitignore files for various types of
development (the github/gitignore repository on GitHub).
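To check that your ignore rules do what you expect, you can ask Git directly. The following is a minimal sketch that assumes a Unix-like shell. It builds a throwaway repository in a temporary directory (the file names data/big.csv and readme.txt are made up for illustration) and uses git check-ignore to report which pattern, if any, matches each path:

```shell
# Work in a throwaway repository so nothing real is affected.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Write the ignore rules from the example above, one pattern per line.
printf '%s\n' '.httr-oauth' '.DS_Store' '.config' 'data/*' > .gitignore

# Create two files that match a pattern and one that doesn't.
mkdir data
touch .DS_Store data/big.csv readme.txt

# Ask Git which .gitignore line is responsible for ignoring each path.
git check-ignore -v .DS_Store data/big.csv

# Ignored files are left out of the list of untracked files.
git status --short
```

Keep in mind that .gitignore only affects untracked files. If Git is already tracking a file, adding it to .gitignore won’t change that; you also have to remove it from the index with git rm --cached.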
23.2. Configuring Git
Proper configuration can make Git more comfortable and convenient to use. The
git config manual page lists Git’s many configuration
options. You can set options globally or per-repository; in practice, we’ve
found the former more useful. To set an option, use git config set --global
with the option’s name and setting, as explained in Configuring Git.
Settings are stored in a configuration file, so if you prefer, you can edit the
file directly. To open the global configuration file in Git’s default text
editor, run:
git config edit --global
The file contains sections with key-value pairs. Each section begins with a
header in square brackets [ ]. Key-value pairs are indented within each
section and written key = value. Lines beginning with the number sign # are
ignored as comments. For example:
# Name and email.
[user]
    name = Nick Ulle
    email = naulle@ucdavis.edu
Most options listed in the manual have two parts separated by a dot (.);
these are the section and key. For instance, the option user.name
corresponds to the section user and key name.
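As a rough illustration of how dotted option names map onto the file, the sketch below writes the example above to a temporary file and reads it back with git config --file, which points Git at a specific configuration file instead of the usual ones. The temporary file and the cfg variable exist only for this demonstration:

```shell
# Write a small configuration file like the example above.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
# Name and email.
[user]
    name = Nick Ulle
    email = naulle@ucdavis.edu
EOF

# Look up values by their dotted section.key names.
git config --file "$cfg" --get user.name
git config --file "$cfg" --get user.email

# List every setting in the file in section.key=value form.
git config --file "$cfg" --list
```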
Git’s default text editor is usually vi, but if you don’t like vi you can
set it to something else with the core.editor option. For example, if you
prefer micro, you can set:
[core]
    editor = micro
The setting should either be the name of a program you can run from the
terminal (that is, in the PATH; see Environment Variables) or the
absolute path to a program.
Tip
Two settings we recommend for a smoother Git experience are:
[push]
    autoSetupRemote = true
[pull]
    ff = only
The push.autoSetupRemote option controls whether Git automatically sets a
local branch’s upstream when you run git push. With the option set to true,
the first time you run git push on a branch that has no upstream, its upstream
will be set to the branch it pushes to on the remote.
The pull.ff option controls what Git does when you run git pull. With
pull.ff set to only, Git will abort any pull that requires a regular (not
fast-forward) merge. This is helpful for avoiding accidental merge commits, and
you can still run a regular merge when needed with git merge.
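If you’d rather set these two options from the command line than edit the file, something like the following should work. This sketch uses the long-standing git config <name> <value> form, which should behave the same as git config set in recent versions, and it applies the settings to a scratch repository with --local so that it doesn’t change your real configuration. Replace --local with --global to apply them to all of your repositories. Note that push.autoSetupRemote is only recognized by relatively recent versions of Git (2.37 or later, to the best of our knowledge):

```shell
# Scratch repository, so your own settings are left untouched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Equivalent to adding the [push] and [pull] sections shown above.
git config --local push.autoSetupRemote true
git config --local pull.ff only

# Read the settings back to confirm they were stored.
git config --get push.autoSetupRemote
git config --get pull.ff
```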
23.2.1. Aliases
In Git, an alias is a name for another command. Aliases are great shortcuts
for commands that are long or difficult to remember. You can set aliases in the
alias section of Git’s configuration file; the key should be whatever you
want the alias’ name to be. The value of the alias should be a Git command
without git at the beginning. For example, to make an alias unstage that
runs git reset -- to remove a change from the staging area:
[alias]
    unstage = reset --
You can also set aliases with git config set --global.
After setting an alias, you can use it like any other Git command. So to use the unstage alias in a repository:
git unstage
Any arguments you pass to an alias are appended to the end of the command. For
instance, if you only want to remove changes to README.md from the staging
area, you can specify this when you run git unstage:
git unstage README.md
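Here’s a sketch of the whole workflow in a scratch repository: defining the unstage alias from the command line, staging a change, and then unstaging it. The repository location, the Demo User identity, and the file contents are placeholders for illustration, and the alias is set with --local so your global configuration is left alone:

```shell
# Scratch repository with one commit to work against.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config --local user.name "Demo User"
git config --local user.email "demo@example.com"
echo "hello" > README.md
git add README.md
git commit -q -m "Add README"

# Define the alias; the value is the Git command without the leading "git".
git config --local alias.unstage 'reset --'

# Stage a change, then list what is staged.
echo "more text" >> README.md
git add README.md
git diff --staged --name-only

# Runs "git reset -- README.md"; the argument is appended to the alias.
git unstage README.md

# Nothing is staged any more, but the edit is still in the working tree.
git diff --staged --name-only
git diff --name-only
```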
Tip
Here are a few more useful aliases:
[alias]
    # Show a diff for changes in the staging area.
    staged = diff --staged
    # Show a graph of the repository's 10 most recent commits.
    ls = log --graph --oneline --all -10
    # Show the first commit for a file or directory.
    origin = log --follow --diff-filter=A --
    # Show number of commits per author for a file or directory.
    authors = shortlog --numbered --summary --
    # Print the path to the top level of a repository.
    root = rev-parse --show-toplevel
With the root alias, to quickly change to the top level of a repository from
any of its subdirectories, run:
cd `git root`
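Backticks work, but in most modern shells the $( ) form of command substitution is easier to read and nest, and wrapping it in double quotes keeps the command working when the repository’s path contains spaces. A small sketch, assuming the root alias above is defined (here it’s set in a scratch repository, and the docs/notes directory is made up for illustration):

```shell
# Scratch repository with a nested subdirectory.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config --local alias.root 'rev-parse --show-toplevel'
mkdir -p docs/notes
cd docs/notes

# Jump back to the top level of the repository from the subdirectory.
cd "$(git root)"
pwd
```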
23.3. Getting Help
Git has a well-deserved reputation as a tool that’s difficult to learn and use. One of the reasons for this is the minimal design of Git’s original command line interface. Since Git’s first release in 2005, contributors have made several improvements to the interface, but it still has rough edges. Hopefully it will continue to improve.
In this reader, we’ve focused on a small set of tasks we consider important, and where the Git interface provides multiple ways to do something, we’ve attempted to choose the simplest (which is generally the newest). Nevertheless, if you continue using Git, you’ll eventually encounter a problem or need to do something that we didn’t explain.
There’s a wealth of information about Git online. The book Pro Git by Chacon and Straub is a great reference to learn more and look for help. The website Dangit, Git!?! explains how to solve several frequently encountered problems, and Julia Evans’ blog post Confusing Git Terminology explains Git jargon. Stack Overflow, a programming question and answer site, contains many questions and answers about Git, and if you can’t find one that helps you, you can post a new question.