GitHub flow

Submit pull requests from any branch – except master

Mauro Lepore https://github.com/maurolepore
07-25-2020

If your team uses the GitHub flow, you should submit your pull requests from any branch – except master. The reasons are not immediately obvious. This post explains why violating the GitHub flow sometimes gives you no trouble, while other times it leaves you with a mess.

Say you want to contribute to a project – let’s call it upstream. You don’t own upstream, so you can’t push commits to upstream/master, the branch master of the project upstream. That’s okay: you can still submit a pull request from your fork of upstream – let’s call it origin.

As you do own origin, can you push commits to origin/master then submit a pull request into upstream/master? Yes, you “can”; but that is a bad idea.

Consider this example. After you fork upstream, you add two commits to origin/master and submit your first pull request into upstream/master. This goes smoothly.

But soon the histories of origin and upstream start to diverge – and you may not even notice. The maintainer squashes your two commits into a new, single commit that tells the story of your pull request more succinctly. The Git histories of the two repos – origin and upstream – are already different. And it gets worse: other people’s pull requests are also merged into upstream/master, and it now has changes that your origin/master lacks.

Your second pull request exposes the problem. You add a new commit to origin/master and submit another pull request into upstream/master. Now you’ve got a mess: although they were already merged, the two commits you submitted before still show up in this second pull request, because upstream/master contains only the squashed commit, not your originals; and the upstream commits that your origin/master lacks can cause merge conflicts with your new commit.
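
To see why the old commits reappear, it can help to think of a pull request as the commits found on the head branch but not on the base branch. The sketch below is a toy Python model of the example, not how Git or GitHub is actually implemented: it treats each branch as a plain list of commit IDs, and the commit names (base, A, B, S, X, C) and the helper pull_request_commits are hypothetical.

```python
# Toy model: a branch is an ordered list of commit IDs (oldest first).
# All commit names and the helper below are hypothetical illustrations.

def pull_request_commits(head, base):
    """Return the commits on `head` that are missing from `base`, in order."""
    base_ids = set(base)
    return [commit for commit in head if commit not in base_ids]

# You fork upstream, then add commits A and B to origin/master.
upstream_master = ["base"]
origin_master = ["base", "A", "B"]
first_pr = pull_request_commits(origin_master, upstream_master)

# The maintainer squashes A and B into one new commit S,
# and someone else's pull request adds commit X.
upstream_master = ["base", "S", "X"]

# You add commit C on top of your old origin/master.
origin_master = ["base", "A", "B", "C"]
second_pr = pull_request_commits(origin_master, upstream_master)
missing_locally = pull_request_commits(upstream_master, origin_master)

print(first_pr)         # ['A', 'B']
print(second_pr)        # ['A', 'B', 'C']
print(missing_locally)  # ['S', 'X']
```

In this model the second pull request lists A and B again because upstream/master only knows the squashed commit S, and S and X are the changes your branch has never seen.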

Avoid the mess: submit your pull requests from any branch – except master. But too often, “just remember” doesn’t work. Instead, it’s best to use systems that automatically enforce the behaviour you want. For R users it gets better: the pull-request helpers from the usethis package implement the GitHub flow automatically. Why not automate repetitive tasks like pull requests?
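
A toy Python model can show why a separate branch avoids the problem. It is only a sketch under simplifying assumptions: a branch is a plain list of commit IDs, a pull request is the set of commits on the head branch that are missing from the base, and the names (base, S, X, C, topic_branch, pull_request_commits) are hypothetical.

```python
# Toy model: a branch is an ordered list of commit IDs (oldest first).
# All names here are hypothetical illustrations, not Git internals.

def pull_request_commits(head, base):
    """Return the commits on `head` that are missing from `base`, in order."""
    base_ids = set(base)
    return [commit for commit in head if commit not in base_ids]

# upstream/master after a squash-merge (S) and someone else's change (X).
upstream_master = ["base", "S", "X"]

# You never commit to master, so your master can simply follow upstream.
local_master = list(upstream_master)

# New work starts on a topic branch created from the up-to-date master.
topic_branch = local_master + ["C"]

pr = pull_request_commits(topic_branch, upstream_master)
print(pr)  # ['C']
```

Because master only ever mirrors upstream, the pull request contains nothing but the new commit.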

Citation

For attribution, please cite this work as

Lepore (2020, July 25). Data science at 2DII: GitHub flow. Retrieved from https://2degreesinvesting.github.io/posts/2020-07-25-gh-flow/

BibTeX citation

@misc{lepore2020github,
  author = {Lepore, Mauro},
  title = {Data science at 2DII: GitHub flow},
  url = {https://2degreesinvesting.github.io/posts/2020-07-25-gh-flow/},
  year = {2020}
}