COFF November 2022

coff@tuhs.org

3 participants
2 discussions

by Peter Jeremy

Since I haven't seen it mentioned here: According to various sources, Fred Brooks passed away on 17th November - see https://en.wikipedia.org/wiki/Fred_Brooks

-- 
Peter Jeremy


Re: DevOps/SRE [was Re: [TUHS] Re: LOC [was Re: Re: Re.: Princeton's "Unix: An Oral History": who was in the team in "The Attic"?]

by Michael Parson

(Moving to COFF, probably drifted enough from UNIX history)

On 2022-11-09 03:01, steve jenkin wrote:
>> On 9 Nov 2022, at 19:41, Dan Cross <crossd(a)gmail.com> wrote:
>>
>> To tie this back to TUHS a little bit...when did being a "sysadmin"
>> become a thing unto itself? And is it just me, or has that largely
>> been superceded by SRE (which I think of as what one used to,
>> perhaps, call a "system programmer") and DevOps, which feels like a
>> more traditional Unix-y kind of thing?
>>
>> - Dan C.
>
> In The Beginning, We were All Programmers…

<snip>

I got started in this field in the mid '90s, just as the Internet started moving from mostly EDU & military to the start of dial-up ISPs. My first job was at a small community college/satellite campus of UTexas, where my co-worker and I set up the first website for a UTexas satellite campus. I'd played with VMS and SunOS; Linux was brand new and was something we could install on a system we built out of spare parts from the closet.

At the time, my job title was "Assistant Systems Manager," where my main job was to add/remove users from the VMS system, reset stuck terminal lines, clean out the print queue, etc. Linux was very much a toy and the Linux system we installed was a playground. It was mostly myself, a few others on the team, and a few CS students that wanted to use something that looked more like Unix than VMS.

> SRE roles & as a discipline has developed, alongside DevOps, into
> managing & fault finding in large clusters of physical and virtual
> machines.

My next several years were spent dot-com hopping, as a sysadmin, mostly in IT shops where we kept the systems the company used online and working: the mail server(s), web-servers, ftp sites, database servers, NFS/CIFS, etc. My job title for most of my jobs through the mid '00s was (senior) sysadmin.
I then spent 8 years as a senior product support "engineer" at IBM (I was CAG/SWAT, for anyone that's familiar with IBM/Rational's job roles), during which time I started seeing the rise of what they eventually started calling DevOps in the early 2010s.

As the web grew bigger and bigger, and the concept of Software as a Service and so-called "Cloud" services (AWS, Azure, etc.) became more and more of a thing, the job of keeping the systems that ran those services started splitting off of IT and into their own teams. They took what they learned in IT, tried to codify some "best practices" around monitoring, automation and tooling, started using more shrink-wrapped stuff like ansible/chef/saltstack instead of the home-grown stuff we (re)wrote with each job, started forcing themselves to be part of the dev/test/deploy cycle of the products they were supporting, and someone branded the new work-flow as 'DevOps'. I've glossed over the dev side of that a bit, as they also got more and better build tools, IDEs, and for better or worse, all things git.

My current day-job is being a DevOps manager. I started here 8 years ago on the DevOps team and was promoted to manager 4 years ago.

> Never done it myself, but it’d seem the potential for screw-ups is
> now infinite and unlimited in time :)

Yup, the potential for pushing a bad config or bit of code to dozens, hundreds, or even thousands of systems with the click of a mouse or a single command line has never been higher, but only if the dev/test cycle failed to find the error (or wasn't properly followed) before someone decided to deploy.

The guys on my team are supposed to have tested their stuff in their environments before even committing it to the repo, then it spends some time in the QA/test lab before it gets pushed to production.
They're not even supposed to commit directly to the main repo; it should be done as a pull-request, and someone else at least does an eye-ball review to look for obvious mistakes, which should have been caught by the originator if they were doing proper testing in their dev environment first.

Our basic tooling is github enterprise for source, and saltstack is our config management/automation framework.

Their work-flow is supposed to basically be:

1 pull latest copy of main repo
2 branch a working set
3 make their changes
4 use something like vagrant to spin up test VMs to test their changes (some people use docker instead of vagrant/virtualbox)
5 loop over 3-4 until it works
6 commit their changes to their branch
7 pull-request to main
   a. someone else on the team does an eyeball code-review
   b. other team member performs the merge
8 cherry-pick changes to the next release branch if changes need to go in the next release, PR those picks to the release branches, same process as above for merges.
9 push changes to the test env (test env is running on the next release branch)
10 when QA clears the release, we push to prod on release day.

The developers that actually write the software offering have similar workflows for their stuff, except they have a build-system involved to compile & pkg stuff up & put the packages into the package repo, which get deployed to test (and eventually prod) with saltstack rules.

Our SRE is mostly concerned with making sure the monitoring of everything is up to snuff, the playbooks for acting on alerts are up to date, and the on-call person can follow them. We have a meeting every other week to go over the alerts & playbooks to make sure that we're keeping things up to date there. He doesn't manage the systems at all; he just makes sure all the moving pieces are properly monitored and we know how to deal with the problems as they come up.

-- 
Michael Parson
Pflugerville, TX
KF5LGQ
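
For readers who want to see the numbered work-flow above as commands, here is a minimal sketch in Python. It only builds the list of git and vagrant commands for one change and does not run anything. The function name `plan_commands`, the branch names, the `origin` remote and the `-pick` suffix are all hypothetical; the post does not say how the team names branches, opens pull requests, or applies salt states inside the test VM, so those parts are left out or marked as assumptions.

```python
import shlex
from typing import List, Optional


def plan_commands(
    branch: str,
    message: str,
    main: str = "main",                    # assumed name of the main branch
    remote: str = "origin",                # assumed remote name
    release_branch: Optional[str] = None,  # set only if the change must ship in the next release
) -> List[List[str]]:
    """Return, in order, the commands one change would go through (nothing is executed)."""
    steps = [
        ["git", "checkout", main],              # step 1: start from main...
        ["git", "pull", remote, main],          # ...and pull the latest copy
        ["git", "checkout", "-b", branch],      # step 2: branch a working set
        # step 3 (editing) happens by hand
        ["vagrant", "up"],                      # step 4: spin up test VMs; step 5 repeats 3-4
        ["git", "add", "--all"],                # step 6: commit to the working branch
        ["git", "commit", "-m", message],
        ["git", "push", "--set-upstream", remote, branch],  # step 7: PR is opened from this branch
    ]
    if release_branch is not None:
        # step 8: cherry-pick onto a branch cut from the release branch, then PR that too.
        # Assumption: the change is a single commit at the tip of the working branch.
        pick = f"{branch}-pick"
        steps += [
            ["git", "fetch", remote],
            ["git", "checkout", "-b", pick, f"{remote}/{release_branch}"],
            ["git", "cherry-pick", branch],
            ["git", "push", "--set-upstream", remote, pick],
        ]
    return steps


if __name__ == "__main__":
    # Hypothetical change, printed as shell lines for review.
    for cmd in plan_commands("fix-ntp", "Fix NTP pillar", release_branch="release-next"):
        print(shlex.join(cmd))
```

The review and merge in step 7, and the pushes to test and prod in steps 9 and 10, are deliberately absent: in the work-flow as described they are done by another team member or on release day, not by the person making the change.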

