When it works, it works
Git is great. Push here, pull there. It works so well that you might be fatuously convinced it’s the perfect tool to deploy code to production. Even worse, it might appear to work well at this task, further reinforcing your choice. However, the Achilles’ heel of any DVCS is your origin provider. Let’s say that Bitbucket has borked their database for the 4th time this month, or GitHub is suffering yet another DDoS attack. Then we see posts opining about failed Git-based app deployments.
When shit goes wrong, things get complicated
Now shit’s gone wrong. No worries, there must be a more complicated way to solve what appeared to be a simple workflow. We’ve got all these Unix CLI tools and can bodge something together. I think I can just scp the files over. Wait, better rsync them, I’m not sure exactly which ones changed. Argh… so many flags, do I want to compare file checksums or timestamps? Maybe I’ll tarball up everything and push it over to the servers. What was the command string again to untar and ungzip? Crap, I included my file permissions and they don’t work on the server. Huh, how was I supposed to know the code stored running PIDs in various files sprinkled throughout the source? WTF, someone tweaked some of those settings files server-side and I just overwrote them. Fuck… I made a backup of that server directory before I started, right? Alright, Hail Mary time, I’ll just export my remote repo and import it as a different origin on the server. How the hell do I do that?
Shit goes wrong at the wrong time
The above might be a fun exercise on the QA server when it’s 3pm and everyone’s in the office on a slow Tuesday, but that’s not how these things unfold. Nope. What will really happen is that a hotfix needs to go out and gets assigned to the intern, because he needs the experience, you know. And because he’s the only guy on call over Thanksgiving, since everyone else is away on vacation. But now he’s riding the wheels off this Rube Goldberg machine, getting both hands stuck in the tar pit and only working himself deeper as he borks the entire production setup, and your site is down for the count at 2am on Black Friday.
Special snowflake servers
Using Git checkouts to update code encourages Special Snowflake servers. Each server is a unique, artisan-crafted piece of Unix art. Like literal snowflakes, no two are the same. No one really understands how it all works, and the little documentation that exists was last updated in the Bush administration. Running `git status` shows lots of little file changes to get things just right on each machine, some versioned, some not, so no one has had the balls to `git reset --hard` for years now.
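If you want to see just how special each snowflake is, the drift check is one command; any output at all means the working tree has diverged from HEAD. A minimal sketch (the fleet hosts and checkout path are hypothetical):

```shell
# Drift check for one checkout: prints one line per modified or
# untracked file, nothing at all if the tree matches HEAD.
drift() {
  git -C "$1" status --porcelain
}

# Across the fleet (hosts and path are hypothetical):
# for host in web01 web02 web03; do
#   echo "== $host =="; ssh "$host" 'git -C /srv/app status --porcelain'
# done
```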
The better, deterministic way
Deploy your code as a self-contained distributable. In Java we’ve got WAR and EAR files. In Play Framework we’ve got binary distributables you unzip and run. ASP.NET can be packaged, just like Ruby and many others. They’re like MREs, but you just unzip them, no need to add water. You don’t care what version of Scala is running on the server host, whether the proper DLL is loaded, or if you’re on the proper Ruby release. It Just Works™. When shit’s broken and your customers are screaming on Twitter, you want your code to Just Work.
Distributing the distributables
“The distributable is huge!” you warn. Sure, 78MB won’t fit on a floppy, but we’ve got 10G server interconnects; I think we’ll be OK. “But where will we serve those from?” you say, still unconvinced. How about AWS S3, with eleven nines of durability (99.999999999%)? Or you can set up your own OpenStack Swift object store if you’d prefer.
The process is simple:
- Build a commit on CI and run unit/integration tests
- Push the passing build’s distributable to S3
- Your deploy script on the server stops the app, downloads the build from S3, and starts the app
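That third step can be a dozen lines of shell. A sketch, where the app name, bucket, service commands, and paths are all hypothetical, with a DRY_RUN switch so you can rehearse it safely:

```shell
# Pull-style deploy: stop, fetch the passing build from S3, unpack,
# start. APP, BUCKET, and the /opt layout are stand-ins for your setup.
APP=myapp
BUCKET=s3://example-deploy-bucket

run() {  # echo instead of execute when DRY_RUN=1
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

deploy() {
  build_id="$1"
  artifact="${APP}-${build_id}.zip"
  run systemctl stop "$APP"
  run aws s3 cp "${BUCKET}/${artifact}" "/opt/${APP}/${artifact}"
  run unzip -o "/opt/${APP}/${artifact}" -d "/opt/${APP}/current"
  run systemctl start "$APP"
}

# Rehearse without touching anything:
DRY_RUN=1 deploy 1234
```

Keying the artifact name to the build (or commit hash) is the point: any box can be rolled forward or back to an exact, already-tested binary.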
If S3 is down (better take some real MREs into the basement, the end is near), you either:
- Download the distributable artifact from CI and scp it to the server
- If CI and S3 are down, build locally and scp it to the server
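Whichever fallback you take, checksum the artifact before it leaves your machine, so the bytes on the server provably match the build. A minimal sketch (the filenames are stand-ins):

```shell
# Record a checksum next to the artifact before copying it anywhere.
# "myapp.zip" stands in for your real distributable.
printf 'pretend-binary' > myapp.zip        # stand-in artifact
sha256sum myapp.zip > myapp.zip.sha256

# scp both files over, then verify on the server before unpacking;
# -c exits non-zero on any mismatch, --status keeps it quiet:
sha256sum -c --status myapp.zip.sha256
```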
The point is to have a canonical way to turn an explicit state of the source (i.e., a commit hash) into a binary that will consistently run as desired wherever you deploy it. No chasing thousands of source files. No trying to compile all the source on your workstation, and on your CI, and on your front-end servers. Fight entropy. Choose determinism.
Do you work in one of those scripting languages? Say PHP, Ruby, or Python. Ever had your SCM fail to update files because of open file handles held by running or zombie processes? Prepare yourself for some possible non-deterministic behavior when you deploy these apps via Git. It’s best to add some pgrep steps to run around and kill off the offending processes, but then you’ve got to ask yourself, “what life choices have led me to run around killing myriad processes on each deploy?”
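If you’re stuck doing that, here’s a hedged sketch of the cleanup step. The pattern is a placeholder you’d match to your app’s actual command line, and `-f` makes pkill match against the full command line rather than just the process name:

```shell
# Kill lingering workers that still hold the old code open.
# The pattern passed in is hypothetical -- match your real app.
reap_stale() {
  pattern="$1"
  pkill -TERM -f "$pattern" 2>/dev/null || true   # ask politely first
  sleep 2
  pkill -KILL -f "$pattern" 2>/dev/null || true   # then force stragglers
}

# e.g. reap_stale 'php-fpm: pool myapp'
```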
SCMs worse than Git
Git works pretty well, but what if you’re deploying with another SCM, like SVN? God help you, friend. The change databases that back your local SVN checkout can get corrupted in wondrous ways. The net result can be that SVN says you’re on revision X, and `svn status` shows no files changed locally. When you call `svn update` or check out the target revision, you’re told you’re already up to date… but you’re not. This is true FML territory. If your SCM cannot reliably track changes, it should be cast into a special circle of hell. Sadly, I’ve personally seen this happen three times in a single year. God help you, friend.