Thursday, May 18, 2017

Site Reliability Engineering; The Book and The Practices

It’s difficult to walk into a software development organization without hearing about the discipline of Site Reliability Engineering (SRE), though you may discover that SRE means different things to different teams. The practices of Site Reliability Engineering are all well known, and successful teams practiced them before there was a collective name for them. I read Site Reliability Engineering in an attempt to get a handle on what the discipline covers and what SRE teams do. I now have a better understanding of what canonical SRE is, and the book helped explain why different organizations practice the discipline in slightly different ways. I recommend the book to those who want to learn more about deploying systems at scale, though with some caveats.

The book is very Google-centric, which is appropriate since the book is subtitled “How Google Runs Production Systems.” How well you can apply the lessons in the book to your team depends on your prior expertise and on the specific topic.

The book is a collection of articles by a variety of contributors rather than the work of a single author, and as a result the writing is a bit uneven. Some chapters do a good job of walking you through the subject area and distinguishing how what Google does could apply to your organization and tool chain. Others are Google-centric to a fault, describing internal tools and approaches with little if any reference to similar, more generally available or even open source tools, or to the tradeoffs to consider when evaluating that decision for your team.

The principles in the book are all generally valid, and you will definitely walk away from the book knowing more about the issues to consider in keeping production systems running at scale. You may or may not have a clear idea about how to implement the lessons you learned.

Google is a successful company and has solved some challenging problems, and we can learn a lot from the Google practices. It’s important to remember that Google is also unique, in terms of history and problem space, so one should consider adapting the Google Way, rather than adopting it without interpretation. This book is a great launching point for discussion, and it’s worth having a copy if you deploy systems at scale. Just don’t take it as The Way to do Site Reliability Engineering.

Friday, May 5, 2017

Designing your life: Design and Agile Tools Applied to Life.

Designing Your Life: How to Build a Well-Lived, Joyful Life by Bill Burnett and Dave Evans explains how you can apply design thinking to life choices. The true test of the information in the book is to do the exercises, embrace the process, and look at the results (which I have not yet done). Even so, I finished the book with an excellent understanding of the possibilities and the feeling that I could use the tools to better understand my goals. Participating in a mini-workshop with Bill Burnett helped to validate that the process can be very powerful. A lot of work, but powerful.

As I mentioned in an earlier Techwell article (which was based on an interview), many of the concepts here may be familiar if you are a student of agile principles. Some of the exercises are reminiscent of those in Innovation Games. One exercise is reminiscent of a career timeline exercise that Johanna Rothman has proposed, and the discussion of job searches and job descriptions is very consistent with some of what Johanna has written about the subject.

There are other elements that sounded familiar as well. The discussion of problem reframing will be familiar to those who work with software requirements, and the discussion of the qualities of a good “design team” will be familiar to anyone who has studied teams and read books such as Extraordinary Groups. Even if the concepts are familiar, there is value in seeing them applied to life design.

That the tools are familiar should not lead you to dismiss the value of the book. (After all, one could argue that much of the “agile toolbox” derives from other disciplines.) As with any toolbox, tools can have many applications, and it's useful to have a guide to how to use a tool to solve your problem with the right techniques, and to understand that there are versions of the tool that are more finely tuned to your purpose. I hesitate to say "solve your problem efficiently" because life design is neither a problem with a single solution end point (it's a process) nor simple. By applying the collection of straightforward steps in this book you can start on the path to understanding how to design a life that is congruent in all dimensions that matter.

Regarding solving as compared to exploring, the authors emphasize that the difference between engineering problems and design problems is that engineering problems are more about solving, and design problems are more about building forward. While I believe that to be a good engineer you need to understand design, I agree that there are different perspectives, and engineers don’t often switch between “explore options” and “solve a defined problem” appropriately.

This is an easy-to-read book that can be useful in helping with evaluating both the big picture and specific aspects of your life. You may even gain some insight into problems related to projects, since the tools are similar. Reading the book is valuable. Working through the exercises with a team can be more so. The book is full of exercises and is supplemented by a web site with worksheets and other resources to use. This is an easy-to-read book that can be as useful as you want it to be.
