Familiar Smells I’ve Detected in Your Systems Engineering Organization and How to Fix Them
Dave Mangot
Over the course of my career, I’ve had the opportunity to work with a number of organizations on their operational maturity. After doing "systems archeology" each time I started at a new organization, I began to recognize certain signature "smells" that indicated something could be improved, and I often had a pretty good idea of how those situations came to be.
The volume of pager alerts can indicate a poor signal-to-noise ratio, overworked infrastructure, or a broken architecture. Elaborate change control can be a sign of inadequate testing or a lack of automation (as if a review by people unfamiliar with the changes makes the change safer). Recovery mechanisms that are never tested will not actually work when they are needed, except in the most trivial of cases.
There are many more such examples: single points of failure, competing change mechanisms, scaling challenges, outsourcing of manual automation (not a typo), badly scoped runbooks, immature monitoring, multi-generational monitoring systems, and more. All are signs that we can do better.
In this talk, we’ll talk about some fun that was had…
===
Original video: https://www.youtube.com/watch?v=lzl4nu0ZHQo#action=share
Downloaded by http://huffduff-video.snarfed.org/ on Mon, 28 Oct 2019 13:51:24 GMT