Is GitHub Alive?

Lately it’s been noticeable that my team and I have had to halt development or progress due to GitHub outages. This has been much more visible since Microsoft acquired GitHub. As an SRE myself, I can’t help but wonder what is going on at GitHub/Microsoft… Is it because GitHub moved to Azure and the real cause of the unreliability is that Azure can’t cope? Is it because GitHub has a ton more features that are not well built? ...

May 11, 2023 · 2 min · oschvr

Things I want as SRE/DevOps from Devs

I’ve been working as an SRE/Platform/Cloud Engineer for a while now, and lately I’ve realized that I keep repeating some questions to developers that I rarely get a straight answer to right away. These are not meant to make anyone’s life harder. Au contraire, the whole purpose of having a solid answer to this list of questions is to make everyone less worried about the probability of some high-stakes overnight failure, or a data-handling misuse that could cause big losses and, of course, a lot of unnecessary stress. ...

December 15, 2022 · 3 min · oschvr

HumanOps Mantra

I just found this precious little jewel on the internet. I’m re-posting the HumanOps mantra here, as I strongly believe in every single point made in it.

HumanOps Mantra

Humans build and fix systems.
Humans get tired and stressed, they feel happy and sad.
Systems don’t have feelings yet. They only have SLAs.
Humans need to switch off and on again.
The wellbeing of human operators impacts the reliability of systems.
Alert Fatigue == Human Fatigue
Automate as much as possible, escalate to a human as a last resort.
Document everything. Train everyone. Save time.
Kill the shame game.
Human issues are system issues.
Human health impacts business health.
Humans > systems

November 30, 2022 · 1 min · oschvr