Monthly Archives: September 2021

Best is the enemy – stick with good

Working life, like any series of events, can be compared to other stories, such as those in cinema. Is your workday like Avengers: Endgame, when all your coworkers show up out of thin air and swarm to solve a difficult problem? Are you Malcolm Tucker of The Thick of It, helping your coworkers by providing astute observations and giving gentle constructive criticism? Or is it more like the middle of a 70s social realist movie, when the alcoholic father / engineering manager promises you that sure, it’s bad now – mistakes have been made – but you know it’ll be great, we’ll stand up Kubernetes and we’ll never deploy manually again?

Obviously – when they make a big-budget movie out of The Phoenix Project, we can just look at that, but until then – what film are you living now? Which one would you like to be living?

You want to believe assurances of a bright future, but deep down, you know you’ve heard it before. Perhaps problems indiscreetly alluded to in an early act come back in the final act to cause a massive, predictable calamity, making you think your movie has a poorly crafted arc. Perhaps you give your social realism engineering manager the “you need to cut down, think of the kids” speech – but it’s met with denial. “Our network guys are diligent [to be fair – they probably are, regular John C. Reillys the lot of them], it takes 2 minutes to make a configuration change – why would I take hours out of their day to write scripts to do things they complete in half an hour including the red tape we have imposed upon them? Do you realise how busy they are?”

What if you want to switch franchises – so to speak? Get into a better movie? Let’s say your film is the social realism one, and after a few accidents in the workplace, the union is shutting the site, and the owners are threatening to move production overseas. Car factory, sounds Birmingham-based going by the accents. Lovely soundtrack with early seventies Black Sabbath. Your character has to stop the mayhem on the factory floor so that the union will allow production to start before the owners scrap the factory for good. Your budget is £0, but you happen to have massive rolls of black and yellow adhesive tape, some PPE and a loudhailer. Basically, you can turn your film around, you can do it – but you do have to literally start doing something.

I’m writing this to continue on a ball of yarn I’ve been unravelling in other posts. Basically, I want to state that DevOps doesn’t need to mean shiny and new. Any type of automation that does the job is fine. You don’t have to change platforms; you can – and I personally think you should – start by automating the existing stuff and not by building a new feature-complete platform. Take the first step! Stop dreaming about a service mesh and Kubernetes. It won’t happen soon enough.

This next bit will be very Marvel-oriented, by the way, but feel free to translate this to your own cinematic universe. It’s like – you can manage to automate and ship software reliably, but you may not be ready to be Tony Stark. You are still part of the MCU, but you won’t be an arms dealer billionaire or even an Australian Norse space God, or even a fighter pilot with accidental alien superpowers and amnesia. The best you can hope for is to write PowerShell or bash. PowerShell and bash, perhaps. Cobble together some automation with whatever CLI you have lying around. Automate the simple things. Even if you are disrespected in the office like Agent Carter, you can eventually save the day. The big first step is to figure out how all your hand-crafted bespoke servers are really built and figure out how to build them from scratch with scripting. This is the painful, tedious first step that you have to take. How can I create my production environment using only scripting and free or affordable tools that my people already know how to use?

In too many companies the deployment automation is:

  1. Download packaged tested software from archive
  2. Disable monitoring to avoid scaring your on-call people
  3. Divert network traffic from node
  4. Decompress archive and copy files in place
  5. Restart services
  6. Re-enable traffic
  7. Repeat 2–6 for all other nodes in the load balancer
  8. Re-enable monitoring
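
The eight steps above can be sketched as one loop. This is only a rough illustration, not anyone's actual deployment tool: the `Actions` bundle and every callable in it (`download`, `set_monitoring`, `set_traffic`, `unpack`, `restart`) are hypothetical placeholders for whatever CLI calls you already have lying around, and the `try`/`finally` that switches monitoring back on after a failure is my own assumption about what you would want. It is written in Python, but the same shape works in PowerShell or bash.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Actions:
    """Hypothetical hooks; each one would wrap a CLI call you already use."""
    download: Callable[[str], str]               # version -> archive path
    set_monitoring: Callable[[str, bool], None]  # node, enabled
    set_traffic: Callable[[str, bool], None]     # node, enabled
    unpack: Callable[[str, str], None]           # node, archive path
    restart: Callable[[str], None]               # node


def rolling_deploy(version: str, nodes: Iterable[str], act: Actions) -> None:
    """Upgrade the nodes one at a time, following steps 1-8 of the list."""
    archive = act.download(version)              # step 1
    silenced = []                                # nodes with monitoring off
    try:
        for node in nodes:                       # step 7: repeat 2-6 per node
            act.set_monitoring(node, False)      # step 2
            silenced.append(node)
            act.set_traffic(node, False)         # step 3
            act.unpack(node, archive)            # step 4
            act.restart(node)                    # step 5
            act.set_traffic(node, True)          # step 6
    finally:
        for node in silenced:                    # step 8, also after a failure
            act.set_monitoring(node, True)
```

If a step blows up halfway through a node, that node stays out of the load balancer (it is probably broken), but monitoring comes back on so your on-call people can see what happened.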

This is not enough. There may be any number of unknown things that just live on your VMs without which things just wouldn’t work. Crucial OS settings that were made once that nobody remembers anymore. Such hidden things are the potentially big surprises that derail containerisation projects or cloud migrations. You need to Agent Carter the Whole Thing.

  1. Define networking. You have some leeway here – use a wildcard cert or generate a new short-lived cert, create a load balancer or just a rule for a central load balancer. This depends on what you have in your infrastructure and what tools you know how to use, but basically – if the starting state is nothingness, after the automation is run, there should be a way for the outside world to find your service and know if it is healthy. If things already exist, your scripting should only make expected changes to them and be able to run multiple times without accidentally causing mayhem. Make sure any WAF rules or similar that enable access to dependent services are also set up here. If you can’t reach a necessary service at all, this should be immediately obvious from tooling without even digging into logs.
  2. Define virtual servers. If all you have is the VMware CLI, then create a VM based off of a suitable template. If you have some fancy cloud provider, use the highest abstraction level you can get away with: an Azure Web App, AWS ECS or Lambda. Stay away from raw VMs if you’re running in the cloud; they are expensive.
  3. Install your servers to their desired infrastructure state and patch level. Ideally you use Ansible or even PowerShell Desired State Configuration. There are so many non-trendy options that you already probably have a few installed. Chef or Puppet work too, if you have guys that know that stuff already. Find out what people already know and pick the simplest technology. The specific technology you choose isn’t key here; the big idea is learning how to take empty metal and get your stuff working there without having to do any manual intervention whatsoever. All of the infrastructure must be code.
  4. Now you’re at the point where the previous list is relevant. Of course, depending on your choice of technology, you may not need to repoint load balancers, as some tools like Chef and Puppet support in-place upgrades. A central brain/source of truth will announce that new software exists, and you have to manage in-place upgrades through Ruby scripting if you’re unlucky, but it works. Either way, only here are we at what the previous CD solution thought was all of it.
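
The two properties asked for in step 1, scripting that can “run multiple times without accidentally causing mayhem” and unreachable dependencies being “immediately obvious”, are easier to show than to describe. Here is a minimal sketch, with the actual infrastructure replaced by an in-memory dictionary. Everything named here (`reconcile`, `tcp_probe`, `check_dependencies`, the `probe` parameter, the setting names in the usage note) is made up for illustration; in real life `current` would be read from your hypervisor, load balancer or OS, and each change would be a CLI call.

```python
import socket
from typing import Callable, Dict, List, Tuple


def reconcile(current: Dict[str, str], desired: Dict[str, str]) -> List[str]:
    """Bring `current` in line with `desired` and return the changes made.

    Keys that already match are left alone, so a second run changes nothing.
    Keys not mentioned in `desired` are never touched.
    """
    changes = []
    for key, value in desired.items():
        if current.get(key) != value:
            changes.append(f"{key}: {current.get(key)!r} -> {value!r}")
            current[key] = value  # a real script would call a CLI here
    return changes


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_dependencies(
    deps: List[Tuple[str, int]],
    probe: Callable[[str, int], bool] = tcp_probe,
) -> List[str]:
    """Return 'host:port' for every dependency that cannot be reached."""
    return [f"{host}:{port}" for host, port in deps if not probe(host, port)]
```

Starting from `{"hostname": "web1", "max_open_files": "1024"}` and asking for `{"max_open_files": "65536", "tls_cert": "wildcard"}`, the first call to `reconcile` reports two changes and the second reports none, and `hostname` is left alone throughout. Run `check_dependencies` at the end of the networking step and fail loudly if the list it returns is not empty.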

You aren’t done until you can spawn a service as easily as your users shout “Another!”. You can get to this point with tools your guys already know. It may not be as sexy as just flying straight through an enemy star destroyer using Helm whilst your mechanical keyboard glows in addressable LED colours, but the point is your organisation most likely possesses the skill to do this already. You must take the first step.

Whichever cinematic universe your life’s film belongs to, you should be proud of what you have achieved in the face of such adversity.