Architecture Weekly Issue #112. Articles, books, and playlists on architecture and related topics. Split by sections, highlighted with complexity: 🤟 means hardcore, 👷‍♂️ is technically applicable right away, 🍼 is an introduction to the topic or an overview. Now on Telegram and Substack as well.

If you're interested in the technologies, development approaches and overall business of our little startup in the compliance field, subscribe on Patreon and Boosty, where I recently shared an article on how we added a second product to our architecture.


Backdoor to break all the internet 🍼

Fasten your seatbelts for a frightening detective story. A hacker spent 2 years gaining trust in the repo of the open-source XZ tool in order to plant a backdoor, and almost succeeded. A slight change in the performance of ssh caught the attention of one of the specialists and made them investigate what was going on. Plenty of details are inside the article.

Everything I know about the XZ backdoor
Please note: This is being updated in real time. The intent is to make sense of lots of simultaneous discoveries


Why Observability Requires a Distributed Column Store 👷‍♂️

Traditional relational databases store data in a row-based manner, so that a single row is written in one place in a file. Column databases place each column into a separate file. There is a great visualization on the main page of ClickHouse (thanks, Nik!). Find out why the column design for data storage is crucial in observability solutions and why it should also be distributed.
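
To make the row-versus-column difference concrete, here is a minimal sketch in Python. It is a toy in-memory model, not how ClickHouse or any real engine is implemented, and the event fields (`ts`, `service`, `latency_ms`) are made up for illustration.

```python
# Illustrative only: the same three telemetry events in two layouts.
# Field names are hypothetical.
rows = [
    {"ts": 1, "service": "api", "latency_ms": 120},
    {"ts": 2, "service": "db", "latency_ms": 35},
    {"ts": 3, "service": "api", "latency_ms": 80},
]

# Column layout: one list per column (a column store would keep
# each of these in its own file).
columns = {name: [row[name] for row in rows] for name in rows[0]}


def avg_latency_row_store(rows):
    # Walks over whole rows to pick out a single field, which models
    # a row store reading complete rows from disk.
    return sum(row["latency_ms"] for row in rows) / len(rows)


def avg_latency_column_store(columns):
    # Reads only the one column the query needs.
    values = columns["latency_ms"]
    return sum(values) / len(values)
```

Both functions return the same average, but the column version never looks at `ts` or `service`. Observability queries usually aggregate a few fields over a huge number of events, which is why this layout pays off there.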

Why Observability Requires a Distributed Column Store
Alex explains distributed column stores, how they work, why they’re so fast, and why that’s a fundamental requirement for observability.

#performance #observability

Delayed replication for disaster recovery with PostgreSQL 👷‍♂️

PostgreSQL has the Point-In-Time-Recovery feature, which lets you bring the state of the db back to a known point in time in the past with the help of a base backup and the WAL. At the same time, for replication purposes there is an ability to have a delayed replica. It turns out that in the case of a disaster like accidentally removing data, delayed replication can be useful as well. See how it saved GitLab in their post.
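
In PostgreSQL the delay is configured on the replica with the `recovery_min_apply_delay` setting. The sketch below is not PostgreSQL code, just a toy Python model of the idea: the `DelayedReplica` class, the record format and the timings are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DelayedReplica:
    delay: int  # seconds a WAL record must age before it is applied
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of WAL records applied so far

    def catch_up(self, wal, now):
        # Apply records in order, but only those at least `delay` seconds old.
        while self.applied < len(wal) and wal[self.applied][0] <= now - self.delay:
            _, op, key, value = wal[self.applied]
            if op == "put":
                self.data[key] = value
            elif op == "delete":
                self.data.pop(key, None)
            self.applied += 1


# Toy WAL records: (commit time in seconds, operation, key, value)
wal = [
    (0, "put", "project-1", "repo data"),
    (10, "put", "project-2", "more data"),
    (100, "delete", "project-1", None),  # the accidental deletion
]

replica = DelayedReplica(delay=60)
replica.catch_up(wal, now=110)  # the incident is noticed 10 seconds later
```

At this point the replica has applied only the two `put` records, so `project-1` is still there and can be copied back. If nobody reacts within the delay window, the delete is replayed on the replica too, so the delay has to be longer than the time it takes to notice a mistake.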

#db #disasterrecovery

On a side note

This week I opened the Business Oriented System Design Course. Luckily or unfortunately, the first cohort is already sold out. However, you can drop me a message through the form here, and I will add you to the waiting list for the next cohort or in case anybody changes their mind.


What's a distributed system? 🍼

How do you define a distributed system? Find an answer in this blog, where Karim Fanous uncovers not only the formal definition, but also the problem space of distributed systems.

What’s a distributed system?
It’s not microservices exchanging messages and storing data in a DB


Database Isolation Levels Explained 🤟

Isolation levels are a critical database feature affecting consistency and performance. They are also tricky to understand. That's why I am bringing you a two-part, 2.5-hour video which explains those levels in great detail.

#distributedsystems #db

What is Platform Engineering? 🍼

Platform Engineering replaced the term DevOps, but what does it actually mean? What does it have to do with the Theory of Constraints and Continuous Improvement? I am speaking with Anton Weiss, a Software Developer Futurist from PlanetScale.

#interview #video

Composite SLO 👷‍♂️

Availability is a frequent topic in system design. Typically a software system contains several components with their own availability guarantees. How do you calculate your availability? Nice article by Alex Ewerlöf.
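
As a rough starting point, here is the textbook probability model for combining availabilities, sketched in Python. It assumes component failures are independent, which real systems rarely satisfy, and it is not necessarily the method used in the linked article. The function names and the numbers are made up for illustration.

```python
from math import prod


def serial(*availabilities):
    # Every component must be up at the same time: multiply.
    return prod(availabilities)


def parallel(*availabilities):
    # Redundant components: the group is down only if all of them are down.
    return 1 - prod(1 - a for a in availabilities)


api, db = 0.999, 0.999

# A request needs both the API and the database.
both = serial(api, db)

# The same system with two redundant database replicas.
redundant_db = serial(api, parallel(db, db))
```

Two components at 99.9% each in series give roughly 99.8% together, so the composite is always worse than its weakest serial dependency. Adding redundancy to the database pushes the result back up, close to the availability of the API alone.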

Composite SLO
How to calculate the SLO of a complex system that is made of multiple components?


SRE at startups and smaller organizations 🍼

Site Reliability Engineering originates from Google. The problem, though, is cargo cult: even people in tiny startups try to apply practices which only make sense in much larger organizations. However, once you have launched an MVP and started getting traction, it makes sense to introduce some of the practices one by one. See more inside!

Starting SRE at startups and smaller organizations – Boost SRE work | SREpath



The brutal and unjustified war against Ukraine has been going on for 2 years already. If you want to help Ukraine directly, visit this fund.

Big thanks to Nikita, Anatoly, Oleksandr, Dima, Pavel B, Pavel, Robert, Roman, Iyri, Andrey, Lidia, Vladimir, August, Roman, Egor, Roman, Evgeniy, Nadia, Daria, Dzmitry, Mikhail, Nikita and Dmytro for supporting the newsletter. They receive early access to the articles, influence the content and participate in the closed group where we discuss the architecture problems. They also see my daily updates on all the things I am working on. Join them at Patreon or Boosty!