What can SREs do to make the DevOps philosophy tangible?
I asked a bunch of digital friends who happen to be SREs and DevOps Engineers what they do to pull off the DevOps philosophy in their teams. What I learned follows...
👋 Hey, Ash here! Welcome to this week's instalment of the Cruform newsletter. Each week I look into a topic relevant to how production software work gets done.
If you’re not a subscriber, here’s what you missed recently:
SRE: critical to the future of healthcare?
(New-ish subscribers, please be aware that I plan to focus on team-dynamics content once I run dry on SRE content, which might take a while!)
Earlier, I mentioned that I asked my SRE and DevOps friends about what they and their teammates do to make DevOps a reality.
Here are their responses, in a slightly less candid form 😆 than how they told me:
Obvious DevOps activities
Set up load and integration testing to prevent regressions
Optimise CI/CD flows (no more needs to be said about this)
Monitor the cloud environment
Build out initial cloud environments or projects for developers
Write code to automate away the stuff that should be automated
Resolve app downtime, inaccessibility and performance issues
Less obvious DevOps activities
Coach the very new (< 2 years experience) and very tenured (anyone who developed software before 2010) developers on tenets of DevOps
Learn and implement fixes for the problem across all future instances, not just the one that gets alerted (proactive vs reactive work)
Learn and/or teach processes and technologies that will empower engineering teams to adopt DevOps philosophies
Push hard on intracompany politics, e.g. sell the need for a faster approval workflow for firewall rule changes
Constantly work out how to reduce mean time to respond (MTTR) to problems when they arise (review past incidents regularly)
Avoid meetings — a bugbear for everyone who works in an office because most workplaces don't do meetings well
Explore new technologies in CNCF — many projects are coming out of sandbox and incubation, but a lot of us are yet to even look at the mature tools
Really push for an SRE shift: a monitoring, logging and alerting stack with dashboards for SLOs
Flush out ongoing problems with CI, such as downtime or slow responses on the build infrastructure, then make them go away
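On the MTTR point above: if you review past incidents regularly, it helps to have a number to track between reviews. Here is a minimal sketch of how that could be computed, not anyone's actual tooling. The incident records are made-up, the function name `mean_time_to_resolve` is hypothetical, and I'm assuming MTTR is measured from detection to resolution; swap in whichever timestamps your team uses.

```python
from datetime import datetime, timedelta

# Made-up incident records for illustration: (detected_at, resolved_at).
incidents = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 9, 45)),      # 45 min
    (datetime(2024, 1, 17, 22, 10), datetime(2024, 1, 17, 23, 40)),  # 90 min
    (datetime(2024, 2, 2, 14, 5), datetime(2024, 2, 2, 14, 20)),     # 15 min
]


def mean_time_to_resolve(records):
    """Average the detection-to-resolution duration over all incidents."""
    if not records:
        raise ValueError("need at least one incident")
    # Sum the individual durations, starting from a zero timedelta.
    total = sum(
        (resolved - detected for detected, resolved in records),
        timedelta(),
    )
    return total / len(records)


mttr = mean_time_to_resolve(incidents)
print(mttr)  # 0:50:00
```

Tracking this figure from one incident review to the next gives a rough sense of whether the fixes coming out of those reviews are actually speeding things up.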
And a few funnies to not take it all too seriously
Meeting about the meeting to plan for the next meeting so we're ready for the sprint planning meeting to discuss why we're not getting through all the tasks. It's a mystery but the PM wants another meeting because they don't understand what we discussed during the first meeting.
Show developers why their code sucks (side note: be careful and gentle with your developer comrades, alright? Radical candor as a philosophy only seems to work at Netflix because they work hard at it)