minrwhite · August 29, 2015 14:27 · Apr 27, 2015
diff --git a/git-dmz-flow.md b/git-dmz-flow.md
@@ -0,0 +1,138 @@
+# Git DMZ Flow
+
+I've been asked a few times over the last few months to put together a full write-up of the Git workflow we use at RichRelevance (and at Precog before), since I have referenced it in passing quite a few times in tweets and in person.  The workflow is appreciably different from GitFlow and its derivatives, and thus it brings with it a different set of tradeoffs and optimizations.  To that end, it would probably be helpful to go over exactly which workflow properties I find to be beneficial or even necessary.
+
+- Two developers working on independent features must *never* be blocked by each other
+    + No code freeze!  Ever!  For any reason!
+- A developer must be able to base derivative work on another developer's work, without waiting for any third party
+- Two developers working on inter-dependent features (or even the same feature) must be able to do so without interference from (or interfering with) any other parties
+- Developers must be able to work on multiple features simultaneously, or at least *move on* to a new feature without delay once the first feature is completed
+- No code can ever enter master without code review
+- Code review should be a painless and natural part of *every* developer's work day
+- The continuous integration build on master must never break.  Ever.  For any reason or any length of time.
+    + This is related to the first point. If one developer breaks the build, no one else can base work on master until it is fixed!
+- This workflow must scale to any sized team and any number of concurrent features in development
+
+The scaling aspect of a workflow is generally overlooked.  It's a serious question though, because tricks and shortcuts that work just fine for a team of two generally end up imposing completely unacceptable overhead when you try to apply the same workflow to a team of hundreds.  We need to think very carefully about choke points and process overhead, and then work aggressively to nuke those workflow elements from orbit.
+
+You'll notice that the primary themes of this list are a) non-interference, and b) collaboration.  Developers should be free to do their work, never impeded by the actions (or inactions) of some other developers.  Any time a developer has to sound the alarm to the entire team, saying that the build is broken and they're working on it, or they're working on cutting a release and no one should push code, that is a massive amount of time wasted on the part of the whole team.  These moments must be ruthlessly eliminated.  Similarly, developers must be free to collaborate efficiently in an ad hoc, peer-to-peer fashion, without interfering with others and without friction or overhead.
+
+## Overview
+
+Just to get a picture in your mind, here is a diagram of the full workflow (including the optional release extensions).  This may be of assistance in digesting the wall of text to come:
+
+[![Git DMZ Flow](https://docs.google.com/drawings/d/1UIDn1xTXAaDyYvBFQpUFIHb1FGyVTT9MO3O8ZlAx0jY/pub?w=1239&h=2045)](https://docs.google.com/drawings/d/1UIDn1xTXAaDyYvBFQpUFIHb1FGyVTT9MO3O8ZlAx0jY/edit)
+
+This is the raster version of the diagram.  The vector version is linked.
+
+## The Basics
+
+We need a single, "anointed" (if you will) shared repository that everyone can go to in order to get the stable, shared, most up-to-date work on the project.  After all, it is a collaborative project.  Everyone's work needs to find its way into a single location *eventually*.  This shared repository will contain a `master` branch that is pristine.  It will *always* be safe to base work on the `master` branch in the shared upstream repository.  The code in that branch will always be self-contained and fully functional, building, tested and ready to deploy.  It will *never* fail to build, because failing to build would prevent a developer from basing work on it.
+
+In order to preserve the pristine and reliable nature of `master`, we're going to impose some constraints.  No one can ever put code *directly* into `master`.  Everyone has to work on feature branches, and those feature branches can only be brought into `master` by way of the pull request.  Pull requests will be built automatically and their build status checked before merge into `master`, thus ensuring that code coming into master has always a) passed automated checks like compilation and testing, and b) passed code review.  Pull requests should be used for everything, from tiny whitespace changes to sweeping refactorings.  A healthy team opens many, many pull requests every day.
+
+In fact, it's not unusual to see developers opening multiple pull requests in a single day!  This is a *feature* of the pull request workflow, because it means that the developer in question was able to perform multiple tasks without inhibition at their maximum productivity rate.  That productivity rate just happened to exceed the rate at which pull requests could be merged, *and this is fine!*  A healthy project with a healthy team should always have a few outstanding pull requests, working their way through the pipeline.
+
+### Forking
+
+Of course, if no one can push to master directly, we need *some shared place* for pull requested work to live before it gets merged.  There are two possible answers here.  One answer is to use branches in the shared repository, and the other is to give every developer their own fork, to which they alone have access.  Both approaches *work*, but in general the forking approach scales much better.
+
+The reason for this is the fact that branches on the shared repository are shared with *all* developers.  Not all developers may be interested in your refactoring of this tiny module off in an obscure part of the codebase.  Why should they even have to see your branch in the first place?  They're going to see the pull request; that's more than enough.  Putting branches on the shared repository causes all of the developers to live in the same effective namespace, stepping on each other with each new branch created or deleted.  It becomes very easy to accidentally push to the wrong place, or delete the wrong branch, or similar.  These problems are manageable in a team of a dozen or so.  They are categorically unmanageable in a team of hundreds.  Remembering our scaling mandate, we choose forking instead.
+
+Every developer's fork is their own.  They own it.  No one else (not even project managers!) can push to a developer's personal fork.  Everyone is free to *pull* from everyone else's fork, but only the owner can push.  In this way, every developer is given their own "shared scratch space".  This gives them the freedom to push whatever branches or tags they want, allowing maximal freedom in ad hoc sharing between developers, without ever entering the mainline repository (more on this in a bit).
+
+Whenever a developer has completed some work, they push that work as a branch to their own fork and then submit a pull request to the shared repository.  Once their pull request is merged, they can delete the branch at their leisure, knowing that the existence of this branch on their own fork isn't hurting anyone other than their own branch-cleaning OCD.
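+
+To make the mechanics concrete, here is a minimal sandbox sketch of that last step.  None of this is lifted from our actual setup: the remote names `shared` and `mine`, the branch name `feature/faster-parser`, and the two throwaway bare repositories standing in for hosted ones are all assumptions for illustration.  The pull request itself is opened in whatever hosting tool you use, so it doesn't appear here.
+
+```shell
+set -e
+# Sandbox: two bare repositories stand in for the hosted shared repo and one developer's fork.
+sandbox=$(mktemp -d)
+cd "$sandbox"
+export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
+export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com
+
+git init -q --bare shared.git
+git init -q --bare fork.git
+git init -q work
+cd work
+git symbolic-ref HEAD refs/heads/master   # keep the branch name independent of local git defaults
+git remote add shared ../shared.git       # hypothetical name for the shared repository
+git remote add mine ../fork.git           # hypothetical name for the developer's own fork
+
+# Sandbox seeding only: give the shared master some history to branch from.
+echo "hello" > README
+git add README
+git commit -q -m "Initial commit"
+git push -q shared master
+
+# The day-to-day part: cut a feature branch from the shared master and push it to the fork.
+git fetch -q shared
+git checkout -q -b feature/faster-parser shared/master
+echo "work" > parser.txt
+git add parser.txt
+git commit -q -m "Speed up the parser"
+git push -q mine feature/faster-parser
+
+# The branch now lives in the fork; the shared repository has never seen it.
+fork_has_branch=$(git ls-remote mine refs/heads/feature/faster-parser | wc -l | tr -d ' ')
+shared_has_branch=$(git ls-remote shared refs/heads/feature/faster-parser | wc -l | tr -d ' ')
+echo "fork: $fork_has_branch, shared: $shared_has_branch"
+```
+
+The point of the exercise is the last two lines: the branch exists in the fork and nowhere else, so deleting it later is purely the owner's business.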
+
+### Branches
+
+Speaking of branches, some conventions for branching need to be observed in order to avoid mass chaos.  In general, rewriting public history (via rebase, filter-branch, amend, etc) is *bad*.  Very, very bad.  The reason it is bad is it inhibits collaboration!
+
+Consider the following scenario…  I'm working on some internal API that enables some cool thing in an easier way than was possible before.  I've been working on this feature mostly by myself, but of course the rest of the team all knows about what I've been doing because we've been *communicating* (the most important element of *any* workflow).  The feature is basically done, and so I push it to my fork and submit a pull request.
+
+Now, that pull request is under review, so I go on and work on other things.  People leave comments on my code, telling me to correct my indentation or adjust some elements of the algorithm, maybe add a test or two.  You know, standard stuff.  I address those comments as I have the time, pushing new changes to my branch, and eventually the branch will wind its way to a state which the rest of the team deems acceptable for merging into the shared `master`.
+
+In the meantime, some other members of the team would *really* love to use my fancy new internal API, because they want to do that cool thing more easily than was possible before!  They don't want to wait for my code to land in `master`, they want to use it *now*.  They're willing to take the changes as-is, knowing that the code review will change a few things; they'll deal with that down the line, and the productivity gains from having my code *now* are worth a little merge headache later on.
+
+Now, my changes are all in their own branch, nicely hanging out in my fork.  There's no reason why they can't just merge my branch into *their* branch (which they're still working on)!  Even though my branch has yet to land in `master`, they're free to base work on it and reap the benefits right away.  Remember, no developer should *ever* have to wait for another developer to complete some task, and that includes code review!  If you need some work that another developer has pull requested, you should use that branch without waiting for it to land!  This is a *huge* productivity win.
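+
+As a rough sketch of what "just merge my branch" looks like on the other developer's machine, the sandbox below fetches an in-review branch straight from a colleague's fork and merges it.  The names `alice`, `feature/internal-api` and `feature/report` are made up for the example, and for brevity the colleague's fork also carries `master` here (in real life that would come from the shared repository).
+
+```shell
+set -e
+sandbox=$(mktemp -d)
+cd "$sandbox"
+export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
+export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com
+
+# Sandbox setup: a colleague's fork holding a pull-requested branch.
+git init -q --bare alice-fork.git
+git init -q alice
+cd alice
+git symbolic-ref HEAD refs/heads/master
+echo "base" > base.txt
+git add base.txt
+git commit -q -m "Initial commit"
+git checkout -q -b feature/internal-api
+echo "api" > api.txt
+git add api.txt
+git commit -q -m "Add internal API"
+git push -q ../alice-fork.git master feature/internal-api
+cd ..
+
+# My own working copy, with my own branch in progress cut from the same master.
+git init -q me
+cd me
+git remote add alice ../alice-fork.git   # anyone may pull from anyone else's fork
+git fetch -q alice
+git checkout -q -b feature/report alice/master
+echo "report" > report.txt
+git add report.txt
+git commit -q -m "Start report"
+
+# Take the in-review work now rather than waiting for it to land in master.
+git merge -q --no-edit alice/feature/internal-api
+ls
+```
+
+This is a plain merge, not a cherry-pick or a copy: my branch now shares the colleague's actual commits, which is exactly why those commits must not be rewritten afterwards.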
+
+However, this productivity win is completely harpooned if I inconsiderately rebase my branch *after* I pull request it!  If I rewrite my history, they're now based on work which will *never* land in master, and those commits which are now part of *their* branch will actually cause merge conflicts when they try to file their own pull request.  Multiply this by a few other branches from other developers that they brought in, and you have a recipe for insane heartache and massive productivity loss as some poor schmuck who happened to have configured a fancy `mergetool` has to wade through a list of commits, weeding out the ones that got rebased.
+
+This is an unacceptable loss of productivity.  So we observe the following rule: if you have pushed code that is destined for master and it is *available to others*, you cannot rebase it.  At all.  Ever.  For any reason.  Those commits are no longer *yours*, they belong to the team, and you are no longer free to rewrite them.  You can rebase, amend and generally fiddle with history as much as you want *before* you push your commits, because what happens on your computer stays on your computer, but once you push, it's not yours anymore.
+
+It is useful to carve out an exception to this rule though, and that exception is for close collaboration with other developers on work that isn't *really* ready to go into `master`.  An example of this would be if you're working on a feature with two or three other developers.  You need a way to share work back and forth, work that you might collectively decide to rebase later.  You're not ready to pull request it, and you don't really want anyone basing *their* work on your work in progress (because it's not ready!), so you need a way of indicating to the rest of the team (who isn't privy to your little threesome of collaborative bliss) that this is not "public" code, it's just "shared".
+
+We do this by having different branch prefixes.  A branch that begins with `feature/` or `bug/` is designated for merger into master as-is (perhaps with some additional commits tacked on in response to the pull request review), but otherwise ready to go.  Such branches are safe to base work on, because they *will not* be rebased under any circumstances.  A branch that begins with `wip/` is *not* safe to base work on.  It's *not* ready to go into `master` and it might be rebased without warning.  So you can't just blindly merge `wip/` branches into your code.  Furthermore, you cannot pull request a `wip/` branch.  Only `feature/` or `bug/` branches are suitable for pull request.
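+
+The convention is simple enough to check mechanically.  The helper below is a hypothetical sketch (the function name `classify_branch` and its output words are mine, not part of any tool we used) that maps a branch name to what you're allowed to do with it.
+
+```shell
+# classify_branch NAME: prints "stable", "shared" or "unknown" based on the branch prefix.
+classify_branch() {
+  case "$1" in
+    feature/*|bug/*) echo "stable" ;;   # never rebased: safe to base work on, may be pull requested
+    wip/*)           echo "shared" ;;   # may be rebased: do not merge it or pull request it
+    *)               echo "unknown" ;;  # no convention applies to this name
+  esac
+}
+
+classify_branch feature/internal-api
+classify_branch wip/parser-rewrite
+```
+
+Something along these lines could sit in a hook or a review bot to reject pull requests opened from `wip/` branches, though plain team discipline works too.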
+
+In this way, we allow for collaboration and code sharing between ad hoc teams and groups, while also allowing for shared stable work and avoidance of slowdown due to code review.
+
+### Pull Requests
+
+Anyone should feel free to review a pull request from anyone else at any time.  If it's an open pull request against the shared repository, it's fair game!  Read the diff to learn about parts of the codebase that you maybe haven't seen yet.  Comment on the author's code style, or point out edge cases that they didn't test.  Many eyes make bugs shallow, and there is a whole wondrous class of bugs that are easy to catch and eliminate just by getting the entire team to take a cursory glance at your code *before* it goes into `master`.
+
+I generally work pretty heads-down on what I'm doing, but inevitably I'm going to run into a rough patch where I'm just not making any progress.  When I hit that point, my first "time wasting context shift" is to open up the pull requests page on the shared repository and start reading through things.  It gives my brain a break from what I'm working on, and allows me to take that time which *would* have been wasted on cat videos and instead spend it on something that helps the team: improving code quality in `master`.
+
+In general, you want to make sure that at least a few developers have reviewed a given pull request before it goes into `master`.  Anyone can review any pull request, but sometimes (often!) it is useful to ask specific developers for their eyes.  Maybe this pull request touches closely on someone else's work, or maybe you're unsure as to whether or not you "got it right" and you want to rope in the expertise of another specific developer.  It is very important that whatever tool you're using to host your repositories and open pull requests is also capable of "tagging" specific people, so that those people know they need to look at that PR before it gets approved.
+
+And obviously, just as a matter of basic etiquette, you don't want to merge a PR to `master` without getting the go-ahead from all of the individuals who were tagged by the original author!
+
+## Refinements
+
+Thus far, everything we have described fits a standard open source workflow to a 'T'.  There's really nothing novel or weird here, it's basically just the same workflow (with a few new conventions) that you might find in any project on GitHub, and in fact this is where the workflow *started* as a baseline.
+
+However, when we were first trying to employ this workflow at Precog, we very quickly ran into a simple problem: merging all of our pull requests took *time*.  Each pull request was of course built by continuous integration based on a merger with the current `master` state, and so we knew that each pull request individually was safe to merge.  Unfortunately, what we couldn't know was whether or not all of the pull requests were safe to merge *together*.  When you have a productive enough team (possibly simply due to size), it is the rule rather than the exception to have multiple pull requests which simultaneously need to be merged.  All of these pull requests individually build safely against `master`, but what about the aggregate merge of the whole set?
+
+It isn't uncommon to have two PRs touching very closely related code in such a way that, while they don't conflict on a syntactic level (i.e. in a way that would be detected by `git merge`), they do conflict on a *semantic* level (i.e. something that would break the build).  A good example is two PRs, one of which renames a function while the other adds brand new code that uses the old function name.  Both would individually build and test against `master`, and they wouldn't conflict in the merge, but if you merge them both to `master` at the same time, the build will break.
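+
+Here is a small, self-contained reproduction of that rename scenario.  Everything in it is invented for the demonstration: the function names `greet` and `say_hello`, the branch names, and the `build` helper, which simply runs the scripts as a stand-in for a real compile-and-test step.
+
+```shell
+set -e
+sandbox=$(mktemp -d)
+cd "$sandbox"
+export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
+export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com
+
+git init -q repo
+cd repo
+git symbolic-ref HEAD refs/heads/master
+printf 'greet() { echo hi; }\n' > lib.sh
+printf '. ./lib.sh\ngreet\n' > main.sh
+git add .
+git commit -q -m "Initial commit"
+
+# Stand-in build: run every entry-point script that exists; fail if any of them fails.
+build() {
+  for script in main.sh banner.sh; do
+    [ -f "$script" ] || continue
+    sh "$script" >/dev/null 2>&1 || return 1
+  done
+}
+
+# PR 1 renames the function and updates the existing caller.
+git checkout -q -b feature/rename master
+printf 'say_hello() { echo hi; }\n' > lib.sh
+printf '. ./lib.sh\nsay_hello\n' > main.sh
+git commit -q -a -m "Rename greet to say_hello"
+build && echo "feature/rename: ok"
+
+# PR 2 adds a brand new caller that still uses the old name.
+git checkout -q -b feature/banner master
+printf '. ./lib.sh\ngreet\n' > banner.sh
+git add banner.sh
+git commit -q -m "Add banner"
+build && echo "feature/banner: ok"
+
+# Git merges both without a single conflict, yet the combined result is broken.
+git checkout -q -b merge-set master
+git merge -q --no-edit feature/rename
+git merge -q --no-edit feature/banner
+if build; then merged=ok; else merged=broken; fi
+echo "merged: $merged"
+```
+
+Both branches pass on their own and the merges are clean, so nothing short of building the merged result can catch this.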
+
+And that's exactly what happened.  A lot.  So we started imposing constraints on the merge process.  Anyone merging pull requests into `master` had to do a full build *locally* of the merge set before pushing.  This eliminated the `master` breakage, but unfortunately it was a massive time sink.  It basically meant that anyone merging pull requests had to devote a significant chunk of their day (and CPU time!) to just sitting there waiting for a build to complete, and the amount of time wasted increased in proportion to the number of open pull requests.
+
+On top of this, when merging pull requests to `master` in this fashion, you can really only have *one* person doing it at a time (or at the very most, a small set of people who all communicate very closely).  The reason being that you want to make sure that the pull requests being merged and the state of `master` being merged against are *fixed*, and no one else is messing with them under your feet while you're trying to do the merge.  Thus, we appointed a "merge master", which eventually grew into a small team of "merge masters".  The general idea is just a very close-knit set of developers (or more generally, team leaders) who were responsible for the ultimate state of `master`, who had final say on what pull requests got merged and when, and who were able to coordinate closely enough to avoid stepping on each other's toes.
+
+Unfortunately, while this allocation of trusted merge masters makes a lot of sense from a code safety and organizational standpoint, it also serves to concentrate the productivity lost due to waiting for the aggregated merge to build.  It made the job of merge master extremely distasteful, and caused the merge masters to put off merging pull requests for days and even weeks at a time.  This is understandable, but it caused a large backlog in the workflow and it needed to be fixed.
+
+### The DMZ
+
+The solution to this was a special shared branch we called `dmz`.  Rather than allowing the merge masters to merge directly into `master`, which could break the build for everyone, we instead locked down `master` *completely* such that no member of the team was able to push to that branch.  Instead of pushing to `master`, merge masters were given the rights to push to `dmz`, a different branch in the shared repository.
+
+The `dmz` branch is special, because it is *only* used by two entities: the merge masters, and the continuous integration server.  Merge masters can merge pull requests directly into `dmz` without the need to build them locally first.  As long as the merge is clean, they can push the results to `dmz`.  If the merge *isn't* clean, the openers of the respective pull requests are informed that they need to fix some merge conflicts with the other branches, and that is that.  Merge masters no longer need to spend any time either a) fixing merge conflicts, or b) waiting for aggregated builds on their local machines.
+
+Whenever anyone (that is to say, any merge master) pushes to `dmz`, the continuous integration server starts a full build of the branch `HEAD`.  If and *only* if the build is clean, the CI server then *automatically* fast-forwards `master` to the clean `HEAD` of `dmz` and pushes the results.  If the build fails, `master` remains untouched in its previously-pristine state.  At this point, the merge masters can back out the erroneous merge (by rebasing out the commits and force-pushing `dmz`), passing along notification to the PR authors that conflicts need to be rectified before merge.
+
+Note that developers *do not* base branches off of `dmz`!  Branches are only based off of `master` or off of `feature/` or `bug/` branches already PRed.  Because no one bases off of `dmz`, not only is it safe to rebase out the failed merges, but failed merge builds don't affect anyone other than the merge masters!  The merge masters can very easily back out failed merges with a single command and some comments on a few PRs, so even the merge masters are not significantly impeded by such failures.
+
+`master` never breaks.  Ever.  It is *always* safe to base work on `master`, because the CI server *guarantees* that the current state of `master` is always pristine.  It is always cleanly building, and (if your test suite is comprehensive enough) always safe to deploy.
+
+The beauty of this is that no humans are involved in the decision to bless code for `master`.  It's all automated.  Machines cannot "cheat" code into `master`.  As long as you always go through the `dmz`, you will always have the hard, automated guarantee that `master` is safe and pristine, and no developer or deployment will ever be blocked by a build failure ever again.
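+
+For the curious, the promotion step can be sketched in a few lines of shell.  This is an illustration under assumptions rather than the script our CI server actually ran: the function name `promote_dmz`, the remote name `origin`, and the trivial `test -f app.txt` and `false` stand-ins for a real build are all hypothetical.  The part that matters is that `master` is only ever moved by a plain, non-forced push, which git refuses unless it is a fast-forward.
+
+```shell
+set -e
+# promote_dmz BUILD_CMD...: build the current dmz HEAD; fast-forward master only if the build passes.
+promote_dmz() {
+  git fetch -q origin
+  git checkout -q --detach origin/dmz
+  if "$@"; then
+    # Non-forced push: rejected by git unless master can be fast-forwarded to dmz.
+    git push -q origin origin/dmz:refs/heads/master
+  else
+    echo "build failed; master left untouched" >&2
+    return 1
+  fi
+}
+
+sandbox=$(mktemp -d)
+cd "$sandbox"
+export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
+export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com
+
+# Sandbox seeding: a shared repository whose dmz starts level with master.
+git init -q --bare shared.git
+git init -q merger
+cd merger
+git symbolic-ref HEAD refs/heads/master
+git remote add origin ../shared.git
+echo "v1" > app.txt
+git add app.txt
+git commit -q -m "Initial commit"
+git push -q origin master master:dmz
+
+# A merge master pushes a merged pull request to dmz (never to master).
+git checkout -q -b dmz
+echo "v2" > app.txt
+git commit -q -a -m "Merge pull request (stand-in commit)"
+git push -q origin dmz
+cd ..
+
+# The CI server's checkout promotes dmz after a passing build.
+git init -q ci
+cd ci
+git remote add origin ../shared.git
+promote_dmz test -f app.txt
+dmz_head=$(git ls-remote origin refs/heads/dmz | cut -f1)
+master_after_good=$(git ls-remote origin refs/heads/master | cut -f1)
+
+# A second merge breaks the build: master must stay where it was.
+cd ../merger
+echo "broken" > app.txt
+git commit -q -a -m "Merge pull request that breaks the build (stand-in)"
+git push -q origin dmz
+cd ../ci
+promote_dmz false || true
+master_after_bad=$(git ls-remote origin refs/heads/master | cut -f1)
+```
+
+After the first run `master` and `dmz` point at the same commit; after the second, `dmz` has moved on but `master` has not.  In a real setup the push rights on `master` would be restricted to the CI account by whatever access controls your hosting tool provides.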
+
+#### Distinction from GitFlow
+
+This is fundamentally where the Git DMZ Flow separates itself from GitFlow.  In GitFlow, you have a shared `develop` branch that anyone can push to.  Work is done in feature branches by convention, but ultimately anyone can just drop code into `develop`.  Now, it is understood in the workflow that `develop` is sort of unstable *for precisely this reason*, and this is why `develop` is separated from the deployable `master`.  But that doesn't prevent other developers from being hindered by the instability of `develop`, since everyone is cutting their feature branches from `develop`, not from `master`!
+
+With the Git DMZ Flow, all branches are always cut from `master`, which is just as pristine as `master` in GitFlow.  Work flows *efficiently* from feature branches back into `master`, via the pull request process, but the automated guarantees ensure that this efficiency does not come at the expense of stability.  Thus, the Git DMZ Flow does not require a separated `develop` branch; `master` *is* `develop`, but in a perfectly stable form!  This makes the process lighter-weight, less prone to blockage (no build failures, ever) and more streamlined in terms of branch maintenance.
+
+### Releases and Validation
+
+While it is absolutely our *preference* to employ continuous deployment, and `master` is certainly stable enough that we can do that, some companies do utilize a Q/A validation process on releases prior to pushing them to live servers.  These sorts of Q/A validations can reveal interesting problems that are hard to catch in automated test suites, and so there is legitimate value to them.  However, they do require *time* and *stability*.  It's very hard to do Q/A validation on a branch that is changing many, many times per day.
+
+To this end, we have an *optional* extension to the Git DMZ Flow which revolves around `release/` branches in the shared repository, cut from `master`, which are used by Q/A for validation.  This allows Q/A to draw a line in the sand, so to speak, and perform validation on a stable branch without blocking ongoing development in `master` (via pull requests, of course).  As emergency, show-stopper bugs are identified in the `release/` branch, `hotfix/` branches can be cut from `release/` (not from `master`!) which address these issues directly.  These branches are of course local to developer forks, just like `feature/`, `bug/` or `wip/` branches, but they are designed to more immediately address failings in a release operating under a deadline, without running afoul of ongoing development that may have changed the state of affairs in `master`.
+
+These `hotfix/` branches get pull requested back into the appropriate `release/` branch, thus still going through code review.  Unlike with `master` though, there is no need for a special "release dmz", since we expect that only one or (at most!) two `hotfix/` branches will be outstanding at any point in time, and their review and merge will be a very high priority for Q/A.
+
+Once all of the problems identified by Q/A have been addressed in `hotfix/` branches and merged into the `release/` branch, the release can be pushed to the live servers.  At this point, we want to bring the hotfixes *back* into `master`.  Fortunately, this is very easily done by pull requesting the entire `release/` branch back into `master`.  The `release/` branch will remain as a maintenance branch in the shared repository (in case further hotfixes are required), but the changes will make their way back into `master` via the pull request process and the `dmz`, just as with any other `feature/` or `bug/` branch.
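+
+The whole release round trip can be sketched in a single local repository.  Treat this as an illustration only: the branch names `release/1.0` and `hotfix/crash-on-start` are invented, and the local `git merge` calls stand in for pull requests (and, for the final merge into `master`, for the trip through `dmz`).
+
+```shell
+set -e
+sandbox=$(mktemp -d)
+cd "$sandbox"
+export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
+export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com
+
+git init -q repo
+cd repo
+git symbolic-ref HEAD refs/heads/master
+echo "v1" > app.txt
+git add app.txt
+git commit -q -m "Initial commit"
+
+# Q/A gets a stable branch cut from master.
+git branch release/1.0 master
+
+# Development carries on in master in the meantime.
+echo "new feature" > feature.txt
+git add feature.txt
+git commit -q -m "Ongoing feature work"
+
+# A show-stopper is fixed on a hotfix branch cut from the release, not from master.
+git checkout -q -b hotfix/crash-on-start release/1.0
+echo "v1 (fixed)" > app.txt
+git commit -q -a -m "Fix crash on start"
+
+# Stand-in for merging the hotfix pull request into the release branch.
+git checkout -q release/1.0
+git merge -q --no-edit --no-ff hotfix/crash-on-start
+
+# After the release ships, the whole release branch comes back into master.
+git checkout -q master
+git merge -q --no-edit release/1.0
+
+cat app.txt
+release_files=$(git ls-tree -r --name-only release/1.0)
+echo "release/1.0 contains: $release_files"
+```
+
+Afterwards `master` carries both the hotfix and the ongoing feature work, while `release/1.0` contains only what Q/A validated plus the hotfix.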
+
+In this way, hotfixes and stable long-term releases can be achieved without code freeze or blocking ongoing `master` development in any way, while also seamlessly porting those changes back into `master` and ensuring that they don't get "lost", as so often happens with last-minute release changes.
+
+## Overview and Experience
+
+The Git DMZ Flow allows for highly concurrent development in teams of arbitrary size, with code review as a fundamental part of everyone's daily workflow and no one person ever blocked by the work of another.  In practice, it scales extremely well, having been used in teams as small as two and up to well over a hundred developers, all on the same code base.  The code quality assurances encouraged and even enforced by this workflow result in catching bugs much earlier, while also raising team awareness of the state of the code and ongoing development.  Ad hoc collaboration and sub-team formation are encouraged and happen organically, as dictated by the needs of any given task, and work under review can be easily incorporated by other team members without awaiting "go ahead" from anyone.
+
+At this point, I personally can't imagine working with any other workflow.
+
+## Acknowledgments
+
+To be extremely clear, this workflow is *not* my creation!  The bulk of the inspiration for this workflow came from standard open source etiquette, with forking and pull requests for contributions.  There are obviously some significant differences though, and for those differences we have to thank (in no particular order):
+
+- Kris Nuttycombe
+- Derek Chen-Becker
+- Erik Osheim
+
+At least, these are the three significant designers who stick in my memory.  If I've missed someone (or mis-attributed), please let me know!