Open-Source Contributor's Guide to Using Git
(Cross-posted from BigData Boutique Blog)
Having re-started development of the Apache Lucene.NET project, we get asked quite often about the "right" way of contributing code. This is a quick-n-dirty guide for those who are new(ish) to git. Follow the guidelines below and your journey of contributing to Lucene.NET, or to any other open-source project that uses git, will be as smooth as it can be.
Get the source code
Before you do anything at all, you will need to have the source code locally.
git clone <repository_location>
This will clone the remote repository to your local disk, and check out the master branch by default.
Fix a bug / add a feature
So you want to contribute code. First, create a local branch tracking the latest check-in to upstream:
git checkout -b <meaningful_branch_name> -t origin/master
This assumes origin/master to be the current development branch (that is: the upstream repository is called origin, and the development branch there is master). This is not always the case, so pay attention to the project guidelines.
Executing the above command will create a local branch, tracking the latest code from upstream. Make sure you name it properly (like
Naming your branch properly is super important - this is not an optional step!
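If you want to try the branching step without touching a real project, here is a minimal sketch that runs entirely in a throwaway sandbox. The directory layout, the demo identity and the branch name fix-readme-typo are all made up for illustration; only the git checkout line is the step from this guide.

```shell
# Sandbox only: build a throwaway "upstream" and a clone of it in a temp dir.
sandbox=$(mktemp -d)
up="$sandbox/upstream"
work="$sandbox/work"

# Demo identity so commits work even on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$up"
git -C "$up" symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
echo "hello" > "$up/README"
git -C "$up" add README
git -C "$up" commit -q -m "Initial commit"

git clone -q "$up" "$work"

# The step from the guide: a new local branch that tracks origin/master.
# "fix-readme-typo" is a hypothetical name; pick one that describes your change.
git -C "$work" checkout -q -b fix-readme-typo -t origin/master

# Show the current branch and the upstream branch it tracks.
git -C "$work" status -sb
branch_name=$(git -C "$work" rev-parse --abbrev-ref HEAD)
tracked=$(git -C "$work" rev-parse --abbrev-ref '@{upstream}')
```

The last two lines are just a way to double-check the result: the first reads the name of the branch you are on, the second reads the remote branch it is set up to track.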
Make sure to always be in sync
Don't let your copy go stale. Fetch changes from upstream frequently, and keep your development branches up to date with it.
Always prefer to rebase, as opposed to merging with upstream. Merging often creates additional noise that we want to avoid.
git pull --rebase origin master
This will update your current branch with the latest changes from upstream (again, assuming upstream is called origin in your local copy, and master is the current development branch).
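To see what syncing by rebase actually does to your history, here is a small sandbox sketch. It spells the sync out as an explicit fetch followed by a rebase onto origin/master. The repository paths, file names, commit messages and the my-feature branch are all invented for the demo.

```shell
# Sandbox only: an "upstream" repo and a working clone in a temp dir.
sandbox=$(mktemp -d)
up="$sandbox/upstream"
work="$sandbox/work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$up"
git -C "$up" symbolic-ref HEAD refs/heads/master
echo "base" > "$up/base.txt"
git -C "$up" add base.txt
git -C "$up" commit -q -m "Base commit"

# Clone, branch off origin/master, and commit some local work.
git clone -q "$up" "$work"
git -C "$work" checkout -q -b my-feature -t origin/master
echo "feature" > "$work/feature.txt"
git -C "$work" add feature.txt
git -C "$work" commit -q -m "Add feature"

# Meanwhile, upstream moves on.
echo "more" > "$up/upstream.txt"
git -C "$up" add upstream.txt
git -C "$up" commit -q -m "Upstream change"

# Sync: fetch the new upstream commits, then replay local work on top of them.
git -C "$work" fetch -q origin
git -C "$work" rebase -q origin/master

# History is now linear: upstream's commits first, your commit on top, no merge commit.
git -C "$work" log --oneline
history=$(git -C "$work" log --format=%s)
merges=$(git -C "$work" rev-list --merges --count HEAD)
```

Had you merged instead, the log would contain an extra merge commit; with the rebase, your commit simply sits on top of the upstream change.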
Sometimes you have edits you don't want to commit yet, but also don't want to lose. When switching branches, or before doing a rebase, you will need to set them aside somehow. Use stashing:
git stash
Recovering them later is done via:
git stash pop
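A quick way to get comfortable with stashing is to try the round trip in a scratch repository. The sketch below is only an illustration; the repository path and the notes.txt file are made up.

```shell
# Sandbox only: a scratch repository in a temp dir.
sandbox=$(mktemp -d)
repo="$sandbox/repo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$repo"
echo "v1" > "$repo/notes.txt"
git -C "$repo" add notes.txt
git -C "$repo" commit -q -m "Add notes"

# An uncommitted edit we want to keep, but not commit yet.
echo "work in progress" >> "$repo/notes.txt"

# Stash it: the working tree goes back to the last commit.
git -C "$repo" stash -q
clean_status=$(git -C "$repo" status --porcelain)

# ...this is where you would switch branches or rebase...

# Pop it: the edit comes back as an uncommitted change.
git -C "$repo" stash pop -q
restored_status=$(git -C "$repo" status --porcelain)
git -C "$repo" status -s
```

Between the stash and the pop, status reports nothing to commit; after the pop, notes.txt shows up as modified again.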
Read more here.
In your branch, you may find yourself committing changes quite frequently. That's perfect!
However, to make your Pull Request as tiny and tidy as possible - please do squash all the commits you've made to your branch into one nice commit. Still, make sure to have all the details and reasoning documented within this one commit.
Squashing is a bit of a pro thing - here is a good guide about doing it right.
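If interactive rebase feels intimidating, one non-interactive way to squash is a soft reset back to the point where your branch left upstream, followed by a single fresh commit. This is a sketch of that alternative, not necessarily what the linked guide describes, and everything named here (paths, files, branch, commit messages) is invented for the demo. It rewrites history, so only do this on your own branch.

```shell
# Sandbox only: an "upstream" repo and a working clone in a temp dir.
sandbox=$(mktemp -d)
up="$sandbox/upstream"
work="$sandbox/work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$up"
git -C "$up" symbolic-ref HEAD refs/heads/master
echo "base" > "$up/base.txt"
git -C "$up" add base.txt
git -C "$up" commit -q -m "Base commit"

git clone -q "$up" "$work"
git -C "$work" checkout -q -b my-feature -t origin/master

# Three small work-in-progress commits on the branch.
for n in 1 2 3; do
  echo "step $n" > "$work/step$n.txt"
  git -C "$work" add "step$n.txt"
  git -C "$work" commit -q -m "WIP step $n"
done

# Squash: move the branch pointer back to where it forked from origin/master,
# keeping all changes staged, then record them as one commit.
base=$(git -C "$work" merge-base HEAD origin/master)
git -C "$work" reset -q --soft "$base"
git -C "$work" commit -q -m "Add steps 1 to 3" \
  -m "One commit, with the what and the why written out in full."

# Exactly one commit now sits on top of origin/master.
git -C "$work" log --oneline
count=$(git -C "$work" rev-list --count origin/master..HEAD)
```

The second -m becomes the body of the commit message, which is where the details and reasoning from the individual commits should end up.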
Before submitting a Pull Request (PR), make sure you have squashed and then rebased. This will keep your PR small and tidy, and allow much easier interaction with the community. It is also a lot easier for the maintainers to pull your changes in like that.
GitHub makes Pull Requests very easy and joyful, and you should use them. Torvalds disagrees, but who cares. We are all for making a healthy development environment, and I'm OK with him not being my role model for that...
In the Pull Request, please provide a meaningful explanation of what was done, how, and why. It is a place for discussing those changes, and can also serve as documentation for our children and grandchildren to understand why we coded things the way we did. Or just for the senile developer on our team...
Before sending a PR though, please make sure to squash your changes, and then rebase with upstream - to minimize the chance of requiring a merge later, as well as to make sure nobody has already worked on the stuff you just did.
There's a great guide on GitHub on Pull Requests - do read it if you are new to the concept!
Merge / diff tools
Have fun, be nice
Open-source will only thrive if we are nice to one another and respect contributors' time and effort. This doesn't mean we can't be critical of proposed changes - but we should always remember we only do this because it's important, and fun.