Ok, I actually watched thirty minutes of that rambling, incoherent sermon. To sum up: the distributed SCM model is better because it's obviously better and it's the way Linus works naturally and that's why it's better. A subtheme was: if you don't already know why it's better Linus pities you because you're an idiot.
I loved when the one Googler asked a practical question about merges, and Linus goes off on the "network of trust." At some point, someone, somewhere, still has to merge and reconcile all these free-floating repositories into a single official version. The Googler said that specifically: massive code base, thousands of different versions and flavors, how do you manage this? Linus' response? Well, in his world everyone pulls from his branch so it's all cool. He in turn pulls from the branches of a few people he trusts (and presumably does the same manual merge/regression test cycle we all do regardless of what SCM we're using).
I don't get it, and Sourceninja, I'd love to hear more about how your company is actually using Git. How many developers, what the cycle looks like, etc. At my outfit we have about 20 devs on SVN. At a team level we're responsible for making sure our commits build and pass tests. We do a nightly integration build off head (typically), and this enforces a mindset where you work in smaller, manageable pieces. Our repo is in one place, so we know that the work we're paying developers to do is backed up and safe, and the integration build tells us that what is being committed to that repo works and has value.
I'm interested in anything you or anyone else can share with us about how Git is used to accomplish these goals in a distributed manner. Unless the last thirty minutes of that video is much more informative than the first, I'm not going to get it from Linus.
We recently (because of constant begging) started to look at something better than svn. The 3 choices we identified were mercurial, bazaar, and git. We tested each one by using it for a normal workflow: we created a repository, did a checkout, made some changes, then tried to merge in a huge bunch of changes (as if some developer did a major check-in after you did your checkout). All of them handled it with flying colors. The only real differences between the three were a little bit of speed (but nothing the user could notice) and how they handled branching. The difference between them and svn was that they were much more flexible in terms of workflow.
I'll break down our workflow with svn.
You check out from the server, you make your changes and test them. You merge in any changes made since your checkout, and you commit. This workflow works and it's really the only way you can do it with svn. It's flawed. I'll break down the flaws we identified.
1) People were hesitant to commit. They didn't want to commit incomplete or broken code. This is because making new branches to test something, or for experimental features that may not make it into head is hard as hell and impractical with svn. Ever read the documents on merging branches for svn? It's pages long.
2) Because of 1 we ended up with a Trunk(HEAD/production) branch, a Test branch (for QA) and then every single developer having a single branch with their name on it. Even then people still were not committing anything until stuff was finished, so all history was basically lost and commits were large commits done daily or weekly. QA would check out the user's branch, verify it, merge it into test (a pain in the ass), test it, then move it to production.
3) It was impossible to commit without access to the network. You can't commit if you are not on our network. Well, you can work, but you can't commit and thus are losing revision history. This means either you keep dozens of copies of your working directory, or you just hope you never screw anything up and need to diff. If I decide to go do a few hours of work at a coffee shop and don't have vpn access, I could have just one commit (after I got home or back to work) for hours of work and multiple new features.
4) We were using svn as one repository for many projects. We have over 100 projects. Each one has its own folder with trunk, tags, branches inside of it. When the repository got corrupted and we restored from backup, we lost history in quite a few projects. If they had been separate repositories, then we would have only lost history in the one that went sour.
5) We were dependent on the server for our code. This is a problem because I have interns and work studies. I do not want them to have commit access, but I want their revision history. This was a very annoying problem as they had to email me diffs for each revision to approve before I did the commit.
6) SVN invades your project. Take a look at how many .svn folders you can find in a single checkout. Seriously, why?
There were other issues, but git solved the major ones and improved each developer's workflow. Our current workflow looks like so (and hopefully you will see how it fixed the problems above).
First, we have a git server that we use to hold our main code trees. Each project is now its own git repository. Now when users check out (clone) a repository, they have their own local copy. This means they can make any branches they want for features, do check-ins as often as they want, etc. None of that will ever affect the server. In fact none of them can actually do commits to the server.
The server now only holds two branches, master (HEAD/production) and test (for QA). The user can use whatever workflow they want on their own copy of the repository. I, for example, make a new branch for each feature I need to add. I have a project right now that has 5 branches in my local copy. When finished, I merge them back to master and ask QA to do a checkout from my computer (a pull, in git terms). Some people just do all their work in master and when finished ask QA to do a checkout from them. QA then takes the code, verifies it, merges it into test, and pushes it up to the server. Once testing has passed, it is merged into master on the server.
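As a rough sketch of that round trip, here is the same flow with local directories standing in for the server, a developer's machine, and a QA machine. The directory names, file name, branch name feature_x, and the throwaway identity are all made up for illustration; in real use the paths would be network URLs.

```shell
set -e
# Throwaway identity so the sketch runs on a machine with no git config.
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com

root=$(mktemp -d)    # scratch directory standing in for three machines

# Seed a project, then publish it as the bare "server" with master and test.
git init -q "$root/seed" && cd "$root/seed"
git symbolic-ref HEAD refs/heads/master      # name the first branch master
echo "v1" > app.txt && git add app.txt && git commit -q -m "Initial import"
git clone -q --bare "$root/seed" "$root/server.git"
git --git-dir="$root/server.git" branch test master

# Developer: clone, work on a local feature branch, merge it back to master.
git clone -q "$root/server.git" "$root/dev" && cd "$root/dev"
git checkout -q -b feature_x
echo "feature" >> app.txt && git commit -q -am "Add feature x"
git checkout -q master
git merge -q feature_x

# QA: take the developer's master, merge it into test, push test to the server.
git clone -q "$root/server.git" "$root/qa" && cd "$root/qa"
git checkout -q -b test origin/test
git fetch -q "$root/dev" master
git merge -q FETCH_HEAD
git push -q origin test

# Once testing passes, QA promotes test to master on the server.
git checkout -q master
git merge -q test
git push -q origin master
```

Note that the developer never pushes anything; only the QA clone writes to the server, which matches the setup described above.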
Because of the local repositories, we are also not tied to the server. In fact, the server could crash and we could still do pulls, commits, and pushes by just using one of the other developers' repositories. After the system admins rebuild the repository server, one of us could just push our changes up.
This has other advantages: my interns can check out from the server. They make their changes and when finished they ask me to do a checkout from them. I can do a checkout from them, verify it and push it on up.
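A small sketch of that recovery, again with local directories as stand-ins. The paths, file name, and identity are hypothetical, and only master is restored here; any other branches would need to be pushed the same way.

```shell
set -e
# Throwaway identity so the sketch runs on a machine with no git config.
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com

root=$(mktemp -d)

# A server repository and one developer clone of it.
git init -q "$root/seed" && cd "$root/seed"
git symbolic-ref HEAD refs/heads/master
echo "v1" > app.txt && git add app.txt && git commit -q -m "Initial import"
git clone -q --bare "$root/seed" "$root/server.git"
git clone -q "$root/server.git" "$root/dev"

# The server dies.
rm -rf "$root/server.git"

# The developer keeps committing locally in the meantime.
cd "$root/dev"
echo "offline work" >> app.txt
git commit -q -am "Work done while the server was down"

# The admins rebuild an empty server at the same location,
# and the developer pushes the full history back up.
git init -q --bare "$root/server.git"
git --git-dir="$root/server.git" symbolic-ref HEAD refs/heads/master
git push -q origin master
```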
Originally we had the developers just pushing their changes to the test branch on the server, with a tag set for the 'official test' version. This worked great, but eventually the QA guys decided they liked doing the checkouts directly from the developers. Granted, our developer team is small. We have 6 developers and the QA people are really just myself and the other senior programmer. In addition to our normal programming work, it's our job to verify any code before it is released for general consumption. We agreed that if we had more developers, we would probably not be pulling from them directly all the time, but having them all merge themselves into a branch that we could then just test and release.
Merging is dead simple in git. Merging in SVN was a major event for us. Merges were planned, prepared for, and usually done as a team. As I've said before, the documents on the merge situations in svn could fill a small book. In most cases where only one or two developers did work on a project, the merge was simple. But in our larger projects where we had 4, 5, sometimes 8 or 9 people (interns) working on a project, the merge could be overwhelming. I can do most merges in git by myself. In fact many of them are just fast-forward merges and require no merge commit at all. Also, because having a local repository has encouraged more commits, our revision history is more meaningful, more up to date, and it is easier to see who, when, why, and where a piece of code was changed, added, or deleted. If I find a merge with a conflict that I can't easily resolve just by looking at a quick diff (which has been actually fairly rare), I just tell the developer to check out test and try to merge his master into it. Once he works it out, I check out from him, test it and push it into the server's test branch. It hasn't happened yet however.
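To illustrate the fast-forward case: when master has not moved since the branch was created, merging just advances master to the branch tip and records no merge commit. A minimal sketch, with a made-up file and branch name and a throwaway identity:

```shell
set -e
# Throwaway identity so the sketch runs on a machine with no git config.
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com

repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "base" > notes.txt && git add notes.txt && git commit -q -m "Base"

# Work happens on a branch while master stays where it was.
git checkout -q -b feature
echo "more" >> notes.txt && git commit -q -am "Feature work"

# --ff-only refuses to do anything other than a fast-forward.
git checkout -q master
git merge -q --ff-only feature

git rev-list --count master            # total commits reachable from master
git rev-list --count --merges master   # how many of those are merge commits
```

If master had gained commits of its own in the meantime, --ff-only would fail and a plain git merge would create a merge commit instead.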
I haven't had to merge more than one repository at the same time. Usually we are on top of it enough that after you merge one, you can get everyone to pull the changes to their repo, rebase their project and then let us know when they are ready for us to pull from them. However, the merge tools are very smart, and I'm confident it would be easier than it would be in subversion. Because of the pace of commits in subversion (and the annoyances of merging branches frequently) we were constantly trying to merge two branches into the test branch at the same time.
Other things we now enjoy.
1) The fact that git only has one .git folder for the entire repository. You want to stop tracking a work directory, just delete that .git folder.
2) How git handles working directories. You don't need another working directory for each branch. Switching branches changes the files in your working directory to match that branch. I find this very intuitive to work with. Some of our other developers do not and still do a separate work directory per branch.
3) git is fast. Nuff said
4) git is powerful. There are tons of commands for git that we have never used. But I can see situations where they could be useful. For example, you can commit parts of files, shelve work in progress, merge from multiple repositories, etc.
5) Building a git repository/working with git is easy. Go into a directory you want to track and type "git init". Boom, it's now a git repository (a "git add" and "git commit" later, your files are tracked). You want a new branch? Go into that same directory and type "git branch branch_name". Boom, new branch. Want to merge that branch back to master? Switch back to the master branch ("git checkout master") and type "git merge branch_name". That's it. You want to delete that branch now because it's merged and no longer needed? "git branch -d branch_name". The branch is gone.
6) We found out some of us were already using git. Seriously we were using git-svn to do our svn checkouts and using git locally anyway. This gave us the advantages of git without anyone else being the wiser.
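The commands from item 5 can be strung together into one runnable sequence. This is only a sketch: the scratch directory, the readme.txt file, and the throwaway identity are assumptions added so it runs anywhere; the git commands themselves are the ones named above, plus the add/commit steps needed to have something to branch from.

```shell
set -e
# Throwaway identity so the sketch runs on a machine with no git config.
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com

project=$(mktemp -d) && cd "$project"
echo "hello" > readme.txt

# Turn the directory into a repository and record a first commit.
git init -q .
git symbolic-ref HEAD refs/heads/master
git add readme.txt
git commit -q -m "First commit"

# New branch; "git branch" creates it, "git checkout" switches to it.
git branch branch_name
git checkout -q branch_name
echo "change" >> readme.txt
git commit -q -am "Work on branch_name"

# Merge it back to master, then delete the branch once it is merged.
git checkout -q master
git merge -q branch_name
git branch -d branch_name > /dev/null
```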
That is not to say that git is some kind of holy grail. Honestly, in testing mercurial, bazaar, and git, the only advantage git had was the way it handled branching. Most of us liked the branching method git used vs each branch having its own folder. There are also great reasons to use a central source code repository. None of them fit our workflow, however. We tend to encourage developers to be flexible and come up with their own workflows that improve their performance. Because of this we have guys running linux, windows, and osx. Sometimes they work alone and sometimes in teams. Forcing them into a central repository workflow was limiting. Now I can check out from my co-worker to help him with a bug, and he can check out from me after I fix it. We can take our notebooks to Pizza Hut and work through lunch and still do commits. The repository has gone from a place we keep production code and testing builds, to a place we store a history of changes and a tool we use for tracking down bugs, bailing ourselves out when we screw up, and keeping a detailed history of our work. I've gone from doing once a day (or in a lot of cases once a week) commits to multiple commits a day with one push to the main server only when it makes sense (i.e., for testing).
However, we have lost some things. For example: the server is backed up nightly. My notebook is not officially backed up (I run my own Time Machine, but that's it). This means we have very loose control of where our source is. If the server goes down, I lose at most a day of work. If someone's notebook died, and they haven't pushed in 3 or 4 days, we could lose 3 or 4 days of work. We had the same problem either way, however, as our developers did not commit to the server until the code was no longer in a broken state. Thus commits were very rarely made every single day. At least with local repositories we have a history that can be pushed up to the server.
Git is not exactly 100% windows friendly. There are hundreds of tools to use svn on windows. Git has about 3. They work, but they're not awe-inspiring. However, bazaar (which is just as good imho) has wonderful windows tools.
Beyond my job, I use git for my own private projects. I don't have a server. I have a local repository on my notebook, and a 'bare' repository on a usb drive. I push changes to the usb drive when they are 'production' ready and make branches on my notebook and develop from there. This gives me a great revision history and backup with no need for a server. I initially tried this with subversion, but quickly found a problem. Subversion doesn't work over the afp:// protocol (my usb drive is plugged into my airport extreme). Git, bazaar, and mercurial all work in this use case. I also work for a startup where all 3 of the developers (including myself) are at remote locations (i.e., our homes across the country). Having a central repository would mean we would need to lease hosting or rent an office. We have neither; instead we just check out from each other. I'm sure this isn't sustainable in the long term, but as we ramp up for our initial release, this works just great.
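A sketch of that no-server setup, assuming the drive is mounted somewhere on the filesystem. Here a scratch directory plays the part of the mount point, and the remote name usb, the repository names, and the experiment branch are all made up for illustration.

```shell
set -e
# Throwaway identity so the sketch runs on a machine with no git config.
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com

root=$(mktemp -d)
usb="$root/usb-drive"    # hypothetical mount point of the usb drive
mkdir -p "$usb"

# A bare repository (history only, no working files) on the drive.
git init -q --bare "$usb/project.git"

# The working repository on the notebook.
git init -q "$root/notebook" && cd "$root/notebook"
git symbolic-ref HEAD refs/heads/master
echo "draft" > main.txt && git add main.txt && git commit -q -m "First version"
git remote add usb "$usb/project.git"

# Day-to-day work stays on local branches on the notebook.
git checkout -q -b experiment
echo "idea" >> main.txt && git commit -q -am "Try an idea"

# When it is 'production' ready, merge to master and push only that to the drive.
git checkout -q master
git merge -q experiment
git push -q usb master
```

The experiment branch never leaves the notebook; the drive only ever sees master.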
For a company, it will depend on their culture. For an individual or small group, you would be stupid not to use a distributed system.