[Scummvm-devel] On using git correctly: Rebase vs. merge
Max Horn
max at quendi.de
Tue Feb 15 17:17:33 CET 2011
Hi there,
so I am not the best qualified person to explain the following. But since people need to use git *now* and can't wait forever, I'll give it a try anyway. People who know better are welcome to correct me or to elaborate on what I say. And "somebody" should add something on this subject to <http://wiki.scummvm.org/index.php/Git_tips>.
This post is meant to be an overview of the difference between "rebase" and "merge", when to use which, and how your typical git workflow might look. I'll recommend some links at the end for further reading.
And I am going to illustrate this with a concrete example from today. Note that it is not my intention to complain about that commit or the committer, nothing bad happened here! It's just an example (and not the first, actually).
Before I begin, one further remark: If in doubt, ask on #scummvm or here, and we'll be happy to help you. I myself still need to learn lots of stuff, and won't be shy to do just that :).
So, let's look at the current history of our main repository. We could do this on <https://github.com/scummvm/scummvm/network>, or we could use the "gitk" command for a nice graphical view, or any of the other GUI tools. But since Git offers a neat pseudo-graphical view for this, which I can easily reproduce in this email, I'll use that:
$ git log --graph --oneline --decorate
* c9e3636 (HEAD, origin/master, origin/HEAD, master) Merge branch 'master' of github.com:scummvm/scummvm
|\
| * 402ac93 HUGO: more refactoring and encapsulation
* | 9505bec SCI: Removed several redundant helper functions
|/
* f103051 GIT: Ignore Visual Studio precompiled headers folder
* ee09af6 SCI: Fix loading SCI32 games
* 8ef4594 SCI2+: Set the correct segment for SCI32 strings/arrays when loading
[...]
What you see here is that two people made changes based on revision f103051, independently. First, Arnaud committed his code, creating revision 402ac93. Then shortly afterwards, Filippos pushed revs 9505bec and c9e3636; the latter merged Filippos' 9505bec and Arnaud's 402ac93.
As a result, we now have a "diamond" in our history.
Now, why did this happen? When Filippos tried to push his commit, it initially failed, because in the meantime Arnaud had pushed his commit, which modified the "head" of the repository. Both commits touched completely different files, and in Subversion, the commit would have just gone through. With git, this is not possible. This is actually a feature: In SVN, Filippos would have been left with a checkout that does not reflect any actual revision of the repository; rather, some files would be from one revision, while others would be from another. This can lead to nasty side effects. For example, you commit something, and it goes through fine, and your working copy compiles; yet when somebody does a fresh checkout (or you do a "svn up"), the code fails to compile. Whoops!
So, the "git push" failed, complaining that origin/master was changed remotely. To overcome this, Filippos probably did a "git pull".
Now, "git pull" first runs "git fetch" (which gets the latest changes from the remote origin, i.e. our github repository). Then, if you have no local commits, it does a so-called "fast-forward", and then it is more or less equivalent to "svn up".
But if you have local commits, things are different. You see, in git, the parent(s) of a commit are an intrinsic property of a commit (and for good reasons, too). So, Filippos had his local commit 9505bec, which was based on f103051, which used to be the head of his repository. After the "git fetch", though, there was a new head: Arnaud's 402ac93. And git refuses to just change the parents of a commit, unless explicitly told to (a very human stance, I think). Instead, it let the two commits coexist in peace, as siblings. And then it automatically performed a merge to produce a common offspring of the two commits, combining their changes (at this point we should stop using family analogies; in git, incest, err, merging, is a quite commonly used feature).
Which is all not that bad -- as long as it does not get out of hand, as can be seen here: <http://img821.imageshack.us/i/mergemess.png/> (watch for familiar names ;).
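If you want to see this for yourself without touching the real repository, here is a minimal sketch that reproduces the diamond with throwaway repositories in a temporary directory. The directory names, file names and commit messages are made up for illustration, and it assumes a reasonably recent git (one that knows "git -C" and "--no-edit").

```shell
#!/bin/sh
# Sketch: reproduce the "diamond" with throwaway repositories.
# All directory names, file names and commit messages are hypothetical.
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.org

# A shared repository with one base commit on "master", plus two clones.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo base > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "base commit"
git clone -q --bare seed origin.git
git clone -q origin.git arnaud
git clone -q origin.git filippos

# Arnaud commits and pushes first.
echo hugo > arnaud/hugo.txt
git -C arnaud add hugo.txt
git -C arnaud commit -q -m "HUGO: some change"
git -C arnaud push -q origin master

# Filippos commits on top of the old head, so his push is rejected.
echo sci > filippos/sci.txt
git -C filippos add sci.txt
git -C filippos commit -q -m "SCI: some change"
if git -C filippos push -q origin master 2>/dev/null; then
    push_rejected=no
else
    push_rejected=yes
fi

# A merging pull keeps both commits as siblings and adds a merge commit.
git -C filippos pull -q --no-rebase --no-edit origin master
merges=$(git -C filippos rev-list --merges --count HEAD)
git -C filippos log --graph --oneline
echo "push rejected: $push_rejected, merge commits: $merges"
```

The graph printed at the end should have the same shape as the one from our main repository shown earlier.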
So, how can we avoid this? By insisting on the parent of Filippos' commit! In git, you can do this using "rebase" (the "base" here is the parent of the commit, and "re-basing" just means we change the base=parent; I'll ignore the multi-parent scenario here for simplicity). If you run "git pull --rebase", git will rebase instead of merging. So, in this particular example, it would have taken the commit 9505bec, and tried to change its parent from f103051 to 402ac93. Since the two commits changed different files, this would have worked, and resulted in a new commit with a new SHA-1 id, but otherwise essentially identical to the original commit. Only now, there would have been no need for a merge, and history would have stayed linear.
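Here is a sketch of the same situation, resolved with "git pull --rebase" instead. Again, everything happens in throwaway repositories, and the names are made up for illustration.

```shell
#!/bin/sh
# Sketch: the same race as before, resolved with "git pull --rebase".
# All directory names, file names and commit messages are hypothetical.
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.org

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo base > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "base commit"
git clone -q --bare seed origin.git
git clone -q origin.git arnaud
git clone -q origin.git filippos

# Arnaud pushes first; Filippos commits locally on the old head.
echo hugo > arnaud/hugo.txt
git -C arnaud add hugo.txt
git -C arnaud commit -q -m "HUGO: some change"
git -C arnaud push -q origin master
arnaud_head=$(git -C arnaud rev-parse HEAD)

echo sci > filippos/sci.txt
git -C filippos add sci.txt
git -C filippos commit -q -m "SCI: some change"
old_id=$(git -C filippos rev-parse HEAD)

# Rebase the local commit onto the fetched head instead of merging.
git -C filippos pull -q --rebase origin master
new_id=$(git -C filippos rev-parse HEAD)
new_parent=$(git -C filippos rev-parse HEAD^)
merges=$(git -C filippos rev-list --merges --count HEAD)

# The push now goes through as a plain fast-forward.
git -C filippos push -q origin master
git -C filippos log --graph --oneline
```

The rebased commit has a different id than the original ("$old_id" vs. "$new_id"), its parent is Arnaud's commit, and the log shows a straight line without any merge commit.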
It is even possible to tell git to use "--rebase" on "git pull" by default -- globally, or just for a specific repository. To get this for the "master" branch of one repository, run "git config branch.master.rebase true" while inside your clone of the scummvm git repository.
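A small sketch of what that command does, run in a throwaway repository so nothing in your real configuration changes. The repository name "demo" is made up; the global variant in the final comment is an assumption about newer git versions (1.7.9 and later, as far as I know), so check "git help config" before relying on it.

```shell
#!/bin/sh
# Sketch: set and read back the per-branch setting in a throwaway repository.
set -e
cd "$(mktemp -d)"
git init -q demo
cd demo

# The command from the text: "git pull" on master will now rebase by default.
git config branch.master.rebase true

# The setting ends up in .git/config of this one repository only.
setting=$(git config branch.master.rebase)
echo "branch.master.rebase = $setting"

# Assumption (newer git versions only): a setting for all branches and repositories:
#   git config --global pull.rebase true
```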
BUT watch out, there is a big caveat to that: If you do this, things will work nicely most of the time. But not always: Sometimes, git is not able to perform the rebase on its own. If Filippos' commit had modified the same files as Arnaud's, in the same lines, then git would not have known how to handle this, and the rebase would have failed. If you have only a handful of local commits, this is usually not so bad, and you can recover from this using the information git prints on the command line (basically, you manually fix the conflicts; then "git add" the conflicted files; then run "git rebase --continue"). But sometimes it *does* get problematic, and then you might need some experience, or need to read up on this (or ask for help from folks in #scummvm) to recover.
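The recovery steps can be sketched as follows. To keep it short, this uses two local branches in a throwaway repository rather than a remote, but a "git pull --rebase" that runs into a conflict stops in the same way. The branch name "topic", the file name and the "fixed" file content are made up for illustration.

```shell
#!/bin/sh
# Sketch: a rebase that stops on a conflict, and how to finish it.
# Branch name, file name and file contents are hypothetical.
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.org

git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
echo original > file.txt
git add file.txt
git commit -q -m "base commit"

# Two commits that change the same line in different ways.
git checkout -q -b topic
echo "topic version" > file.txt
git commit -q -am "my local change"
git checkout -q master
echo "upstream version" > file.txt
git commit -q -am "somebody else's change"

# The rebase cannot decide between the two versions and stops.
git checkout -q topic
if git rebase master >/dev/null 2>&1; then
    conflicted=no
else
    conflicted=yes
fi

# 1. fix the conflict by hand (here: simply write the wanted content),
# 2. "git add" the file, 3. let the rebase continue.
echo "upstream and topic version" > file.txt
git add file.txt
GIT_EDITOR=true git rebase --continue >/dev/null

# If things go wrong instead, "git rebase --abort" restores the old state.
git log --graph --oneline
```

GIT_EDITOR=true is only there so that the script does not stop to open an editor for the commit message; when doing this by hand you would leave it out.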
Here is another caveat: If you have already published a commit (pushed it to some public repository), then you should *not* rebase, as that will replace the original commit and modify *public* history, which will cause problems for anybody working based on your changes. To be specific: NEVER EVER rebase a commit that is already in the main scummvm repository. You are free to rebase stuff that is in your personal public forks of scummvm (so if Paul wants to rewrite history in <https://github.com/dreammaster/scummvm>, it's up to him to decide whether this is acceptable or not). But if we did this in our main repository, we'd potentially make life very difficult for a ton of people.
One more thing: If you plan to make bigger local changes, involving several commits, then usually you are best off working on a branch, and then actually using "merge" at the end to integrate your working branch back into the "master" branch once your work is finished. But it's difficult to compress the "golden rules" for this into a short sentence, so I'll not try to do this. We really need to explain this properly in the "workflow" section of the "git tips" wiki page, or link to some suitable existing tutorial somewhere.
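Just to give an idea of the shape of such a workflow (not the "golden rules"!), here is a minimal sketch in a throwaway repository. The branch name "bigger-change", the files and the commit messages are made up for illustration.

```shell
#!/bin/sh
# Sketch: several commits on a working branch, merged back into master.
# Branch name, file names and commit messages are hypothetical.
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.org

git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "base commit"

# Do the bigger piece of work on its own branch, in several commits.
git checkout -q -b bigger-change
echo "step 1" > feature.txt
git add feature.txt
git commit -q -m "feature: step 1"
echo "step 2" >> feature.txt
git commit -q -am "feature: step 2"

# Meanwhile, master moves on with unrelated work.
git checkout -q master
echo other > other.txt
git add other.txt
git commit -q -m "unrelated work on master"

# Once the work is finished, merge the branch back into master.
git merge -q --no-edit bigger-change
merges=$(git rev-list --merges --count HEAD)
git log --graph --oneline
```

Here the merge commit is intended: it records where the working branch was integrated, while the individual commits on the branch stay intact.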
For more information, including stuff about using "git merge" and "git rebase" directly (and not just as part of "git pull"), read here:
<http://www.jarrodspillers.com/2009/08/19/git-merge-vs-git-rebase-avoiding-rebase-hell/>
<http://blog.experimentalworks.net/2009/03/merge-vs-rebase-a-deep-dive-into-the-mysteries-of-revision-control/>
<http://gitguru.com/2009/02/03/rebase-v-merge-in-git/>
Beyond that:
Official git documentation center:
<http://git-scm.com/documentation>
Official git tutorial:
<http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html>
A visual git "cheat sheet" summarizing the most important commands:
<https://git.wiki.kernel.org/index.php/GitCheatSheet>
For those who want more details:
<http://book.git-scm.com/>
Cheers,
Max