git feature branch workflow and pull requests

github
git
Tags: #<Tag:0x00007fdab01c6140> #<Tag:0x00007fdab01c5fd8>

#1

We had a discussion in rOpenSci Slack recently. Here is the summary:

I got a pull request to one of the R packages I maintain on GitHub the other day. First of all, awesome! Getting PRs is great. Second, the PR was coming from the master branch on their fork. This made me wonder if it would make sense in a pull request template to ask the pull requester to make their changes on a feature branch.

There’s a few different questions in there:

  1. Is “feature branch” the best term to use?
  2. Is it a good idea to request this workflow?

The question garnered a lot of conversation. Here’s a summary of the conversation:

feature branches?

For what to call the branch, in addition to “feature branch” there was:

  • new branch
  • named branch
  • fix branch

Consensus i think was that “new branch” might be better terminology than “feature branch”; not every PR is a “feature” per se, so “new” is more encompassing of all types of PRs. I still like “feature branch” myself because it seems this is the most widely used variety, but I think I’d parenthetically explain what it is.

Below, I’ll use “feature branch” for simplicity.

As for what to name each feature branch, there’s no hard and fast rule that I’m aware of. This discussion has some ideas though.

workflow

There was some concern that if you request that folks use a feature branch it might raise the barrier to contribution: if they haven’t heard of feature branches they may throw their hands up and walk away. Is this likely to happen? Anyone have thoughts?

Jenny Bryan posed a question on twitter: maybe GitHub could allow the repo maintainer to disallow PRs from a master branch. I’m in support of this :ballot_box_with_check:

Some entirely get rid of their master branch, or at least don’t work on the master branch.

There was a fair amount of consensus, I think (correct me if I’m wrong), that it’s ideal if PRs do not come from the users master branch, but from a feature branch. Most agreed that asking users submitting a PR to use a feature branch is a good idea: if they haven’t heard of it, they’d be introduced to the concept; it’s good idea to not work on master if you are working on a fork that is meant to contribute back to the original repo; we agreed that it’s good practice to work on feature branches, so introducing people to the concept is a win.

Here’s a scenario to demonstrate:

  • Fork a repo, and pull down localy (assuming the default branch is master)
  • Make changes on the master branch
  • Push up to GitHub
  • Submit pull request

The above is fine. Trouble comes when any additional PRs are made by the same user. They might make changes on their local copy on master branch, and then try to pull in changes from upstream master, which will not end well. If the user uses feature branches for each pull request, then the workflow becomes easier:

  • sync with upstream default branch
  • make a feature branch
  • do work on that feature branch
  • push feature branch up to web
  • submit PR

A few feature branch workflow info/help pages/slides:

Other bits

Your thoughts?


#2

Excellent summary. Thank you!
Motivated by this I reviewed http://happygitwithr.com/remotes.html and I found it very clear.


#3

+1 for encouraging people to use feature branches. Also helps with the (not infrequent when working with novices and large datasets) problem of accidentally committing large files (> 100 MB) to master and then having to do some git filter-branch open heart git surgery and then git push origin master --force to remedy, which really messes things up.


#4

Only replying to put a link to this resource to learn how to make a PR, that includes the creation of a new branch. https://github.com/thisisnic/first-contributions


#5

didn’t think about that, but definitely agree - one issue in one my pkgs comes to mind https://github.com/ropensci/rnoaa/issues/216


#6

I use the GitHub.com interface quite often for submitting small PRs (correcting typos). Editing a file (clicking the pencil) in a repo you don’t have access to automatically creates a fork and a branch patch-1 in the background, so that’s good. :+1: It also doesn’t ask anything extra from the user. Maybe this is a workflow that can be recommended over forking first, editing next (which by default is on master).

Is submitting a PR from a forked master mainly a problem for the user submitting or for the maintainer of the upstream repo? You mention the trouble of additional PRs made by the same user. I see this as an edge case, as it is for users who make multiple contributions and needed to do them locally (so not quick doc fixes): that’s already a category of user who is more tech savvy, but you can definitely advise them to not work in master. When I submit another PR, I tend to solve this by completely deleting my forked repo before starting again :grimacing:, but I understand this might not be the workflow everyone follows.

In my experience, cloning the original repo and then working on it (master or other-branch) is more annoying mistake to make, as you then can’t push your changes back.


#7

I think of it as a problem for both.

  • contributor: following PRs are more problematic if all work is on master. i’m not sure if this is an edge case or not, but for contributors to repos I maintain, I get more than 1 PR from many of them, so it seems relatively common to make more than 1 PR to the same repo. I agree if you know it will be a one off contribution, PR’ing from master seems less bad
  • maintainer:
    • a logistical thing for me with PRs is I can never remember the commands to pull in the PR branch changes to have a look locally, so I copy/paste the recommended commands from github, and if the contributor is on master, and it’s not their 1st PR, then the first command to checkout the contributor branch (master) fails because that branch already exists from a previous PR - anyway, its a bit silly, but that’s my workflow for better or worse.
  • the more important philosophical point though I think is that it is easier to reason about the tangle of branches (on the contributors and maintainers local machines, and the github repo) if they have names for specific features rather than all done on master

seems like a reasonable workflow.

good point! agree. I don’t have a good sense of how often people run in to this?


#8

This is great and little known. It keeps maintainers happy and is simple enough to encourage small contributions. Alternative workflows that require the contributor to clone locally have an overhead that I consider before bothering.

Has someome already written something for this particular ‘trick’ that would make it easy to include in an ISSUE_TEMPLATE.md?


#9

I haven’t seen that in a CONTRIBUTING file or similar yet. Might be worth pointing to a link that shows briefly how to do it