A good Site Reliability Engineer takes notes

As engineers, we like to do more work and less note-taking, and we have all been guilty of thinking that note-taking is a burden. In this article, I share my journey from hating note-taking to becoming a consistent beneficiary of a note-taking ritual.


I was wrong. I should not compete with the machine; I should merge with it and use it for the things I'm not very good at.

In the past few months, I have been improving my relationship with writing documentation and notes in my day-to-day work as a Site Reliability Engineer. I long disliked this practice and thought of it as taxing and burdensome. I am slowly realizing the magnitude of my mistake, but I am also thrilled to introduce you to the idea so that we can reap the benefits together.

For years, my ego blinded me. I always thought (not really) that I had a sharp mind I could rely on. Over time, I came to understand that a highly available Postgres cluster is probably more reliable than my brain when it comes to storing and retrieving information. I realized that humans can build tools that compensate for their imperfections. But what's more powerful than the machine is the cyborg: the creativity of a human being merged with the engineered reliability of a process and a machine.

So what began with ego slowly dissolved into humility and a desire to integrate with the machine instead of a delusional hope to outperform it. I went from thinking that taking notes slows me down and taxes me to consistently taking notes on my day-to-day work like a ritual. Here is how that went:

Note-Taking: Applications & Impact

1. In the Meetings

I never noticed it before, but the startup (scaleup) where I work, and probably most others, suffers from a prioritization and tracking problem. Our focus is consistently hijacked by pressing or unplanned work, just like in The Phoenix Project.

Once, we set up a whole environment without a backup for the database, and it blew up in our faces. (Lesson learned; more on this story in another post.)

Only recently, after I started opening a document for every weekly meeting with brief notes on the discussion and the priority queue, did I notice that we were not finishing everything we discussed during the week, and that some of the work was forgotten because it was not recorded and properly prioritized. Every new meeting added more items on top of the ones we had already discussed.

Slowly, I started picking up what used to slip through. I remind the team of the last meeting's important actions, we discuss them briefly a second time, and we either assign resources to them or push them down the queue. The result is of course very positive: we started moving forward, we acquired more resources that we could assign to those tasks to match the velocity, and sometimes we decide that a change is far-fetched and not necessary.
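To make the carry-over idea concrete, here is a minimal sketch in Python. It is not the tool I use (a plain document is enough); the `ActionItem` fields, the priority convention, and the sample items are all hypothetical, chosen only to show how unfinished actions from one meeting can become the opening agenda of the next.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ActionItem:
    title: str
    priority: int                 # assumed convention: lower number = more urgent
    owner: Optional[str] = None   # None means nobody has been assigned yet
    done: bool = False


def carry_over(previous: List[ActionItem]) -> List[ActionItem]:
    """Return the unfinished items from the last meeting, most urgent first."""
    open_items = [item for item in previous if not item.done]
    return sorted(open_items, key=lambda item: item.priority)


# Hypothetical notes from last week's meeting.
last_week = [
    ActionItem("Add backups for the new database", priority=1),
    ActionItem("Rotate API keys", priority=3, owner="sam", done=True),
    ActionItem("Review alert thresholds", priority=2),
]

# The open items become the first agenda points of the next meeting.
agenda = carry_over(last_week)
for item in agenda:
    owner = item.owner or "unassigned"
    print(f"[P{item.priority}] {item.title} (owner: {owner})")
```

Each item on the resulting agenda then gets one of the outcomes described above: it is assigned, pushed down the queue, or dropped as unnecessary.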

And no, I am not becoming a manager or a secretary by taking notes in meetings. I take them to improve myself. I'm simply taking the actions necessary for my career to progress efficiently without much waste. See, that's the problem-solving that I care about. Caring about whether I'm a secretary is politics, and I don't like politics.

2. In The Actual Work

More often than not, a manual task isn't requested often enough for automating it to pay off. Sometimes you have to perform a manual task once or twice a month, and at that point you don't want to be searching Stack Overflow all over again for the answer that worked for you, adjusting it to your context again, and probably forgetting an important step that messes things up, resulting in a lot of back and forth before the task is successfully done. Another thing that happens to me a lot is searching the command history in my terminal: something looks familiar, I execute it, only to realize that the next or previous variation is the one that worked.

Anyway, it's a sea of uncertainty you don't want to be swimming in regularly. (IT IS TAXING)

Here is what I noticed after I started noting down my SQL snippets and context-specific command flows:

  1. Tasks take me far less time the next time they occur, because I don't have to fit solutions from the internet to my context again.
  2. I execute tasks more confidently, leaving room for uncertainty only when it's necessary.
  3. I don't feel repetitive, thinking again about something I already thought through in the past.
  4. My brain is less burdened; it's like running heavy queries only on a read replica.
  5. Overall, these kinds of notes make me fast and reliable as a Site Reliability Engineer, and they bring me closer to automation: the steps are already noted, so if the manual task becomes a frequent one, I can automate it quickly, mark it done, and move on to the next fulfilling thing.
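As an illustration of what such a note can look like, here is a small sketch in Python. The `Snippet` structure, the `find` helper, and the sample entries are assumptions made for this example, not a description of a specific tool; the point is only that each note keeps the context and the exact steps that worked, in order, so they can be found again by keyword.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Snippet:
    title: str
    context: str       # when and where this applies
    steps: List[str]   # commands or SQL, in the order that worked
    tags: Set[str] = field(default_factory=set)


def find(notes: List[Snippet], keyword: str) -> List[Snippet]:
    """Case-insensitive search over titles and tags."""
    needle = keyword.lower()
    return [
        note for note in notes
        if needle in note.title.lower()
        or needle in {tag.lower() for tag in note.tags}
    ]


# Hypothetical entries; the SQL and the command are illustrative only.
notes = [
    Snippet(
        title="Find long-running queries",
        context="Primary is slow; check before restarting anything",
        steps=[
            "SELECT pid, state, query FROM pg_stat_activity "
            "WHERE state <> 'idle' ORDER BY query_start;",
        ],
        tags={"postgres", "performance"},
    ),
    Snippet(
        title="Restart the billing worker",
        context="Queue backlog alert fired",
        steps=["kubectl rollout restart deployment/billing-worker"],
        tags={"kubernetes"},
    ),
]

for note in find(notes, "postgres"):
    print(note.title)
    print("  when:", note.context)
    for step in note.steps:
        print("  run: ", step)
```

Because the steps are already stored as an ordered list, turning a note into a script later is a small jump rather than a fresh research task.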

3. In The Incident Response

When I start investigating an incident, most of the time I check many places and establish many facts, but keep them all cluttered in my mind until I reach something decisive. Only then do I start noting everything (or everything I can remember) for the report. But by the time I have something decisive, I have forgotten all the work that led me there and I focus solely on the conclusion. I feel like I've done so much but have very little to show for it: I have no trace of all the evidence I collected and the places I looked, so it seems like the solution came in one shot.

Documenting the journey of an investigation by doing the following helped a lot:

  • Screenshots of interesting findings in graphs. Either I add arrows and labels in Excalidraw, or I put the graph with a note in the document.
  • Links to analytics in APM tools with specific time ranges to showcase a change in a graph.
  • Analytical theories and arguments for those.
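The three kinds of evidence above can be pictured as entries in a timestamped journal. The following Python sketch is only one possible shape for it, under my own assumptions: the `Finding` and `InvestigationLog` names, the three `kind` values, the file path, and the URL are all hypothetical placeholders.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Finding:
    kind: str       # assumed categories: "screenshot", "link", or "theory"
    note: str
    ref: str = ""   # file path or URL, if there is one
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class InvestigationLog:
    """An append-only journal of what was checked and what was found."""

    def __init__(self, title: str) -> None:
        self.title = title
        self.findings: List[Finding] = []

    def add(self, kind: str, note: str, ref: str = "") -> None:
        self.findings.append(Finding(kind, note, ref))

    def render(self) -> str:
        """Return the journal as plain text, oldest entry first."""
        lines = [f"Investigation: {self.title}"]
        for finding in self.findings:
            stamp = finding.at.strftime("%H:%M:%S")
            suffix = f" ({finding.ref})" if finding.ref else ""
            lines.append(f"{stamp} UTC [{finding.kind}] {finding.note}{suffix}")
        return "\n".join(lines)


# Hypothetical incident; the path and URL are placeholders.
log = InvestigationLog("Checkout latency spike")
log.add("screenshot", "p99 latency doubles right after the deploy",
        "screenshots/p99-latency.png")
log.add("link", "APM view narrowed to the spike window",
        "https://apm.example.com/checkout?range=placeholder")
log.add("theory", "Connection pool exhaustion; errors match pool timeouts")
print(log.render())
```

Writing entries as they happen keeps the path to the conclusion intact, so the final report can be assembled from the journal instead of from memory.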

When I'm stuck, I go back to my notes to find a new direction, which often leads to better paths. It also makes my thinking clearer and less cloudy. Now I prepare better incident reports and can extract actions (without forgetting or losing information) to advise the team. Thanks to note-taking, I help advance the organization's goals of improving performance and providing a more reliable service.

It also makes you a more valuable member of your team: you're not just a person who executes tasks, you're a person who can think and solve problems, and that's what we're paid for.

The Benefits: It's a HELL YES.

By now, I hope the benefits are clear:

  • You strengthen your learning when you take notes (the generation effect).
  • No more value loss: the information you worked hard for, you keep.
  • You stress less about forgetting; your noted knowledge is there when you want to come back to it.
  • You get a feeling of DONE and move to the next task more relaxed.
  • You can delegate and provide clear instructions when you need to, avoiding the inefficiency that could arise from assigning a task for the first time to a team member without enough context to perform it.
  • You spend less time looking for material to support you in your meetings or discussions.
  • You're more focused, engaged and productive.
  • You're less repetitive, less likely to feel stuck.
  • You're more efficient; your work is of higher quality and its value is more visible.

Docs and notes have become part of my ritual at work. I made a good case for it to myself: it has a big upside and almost NULL downside. I find it easier to keep this ritual alive after demonstrating its value to myself through experimentation. I hope it does the same for you.

© Abdelati Elasri 2024