Architecture Weekly Issue #38. Articles, books, and playlists on architecture and related topics. Split into sections and marked by complexity: 🤟 means hardcore, 👷‍♂️ is technically applicable right away, 🍼 is an introduction to the topic or an overview. Now on Telegram as well.


It's already been 221 days since the start of Russia's crazy, brutal, and unjustified war against Ukraine. We condemn this war and want it to stop ASAP. We continue this newsletter so you can advance your skills and help the millions of Ukrainian people in any way possible.


Video version of this issue is available on YouTube. Subscribe!

Reducing Logging Cost by Two Orders of Magnitude using CLP 🤟

The more your business grows, the more you need to understand what happens within your system. Uber generates tons of analytical data, and therefore log data, each day, and that became an issue. They need to retain enough logs to understand what happens within their Spark jobs without paying too much for it. So they adopted CLP (Compressed Log Processor), which reduced the number of writes to SSD and stores the logs in a searchable form while occupying two orders of magnitude less space. More details inside!
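The article covers how CLP actually encodes logs. As a rough illustration of one general idea behind this kind of compression, storing the repeated static text of a message once and keeping only the variable values per line, here is a toy sketch. The regex, the placeholder character, and the function names are assumptions made for this example and are not CLP's real format or API.

```python
import re

# Assumption for this sketch: any token containing a digit is a variable value.
VARIABLE = re.compile(r"\S*\d\S*")
PLACEHOLDER = "\x11"  # arbitrary marker, not CLP's real encoding


def encode(line, templates):
    """Split a log line into (template id, list of variable values)."""
    values = VARIABLE.findall(line)
    template = VARIABLE.sub(PLACEHOLDER, line)
    # Each distinct template is stored once and referenced by a small integer.
    template_id = templates.setdefault(template, len(templates))
    return template_id, values


def decode(template_id, values, templates):
    """Rebuild the original line from its template and variable values."""
    by_id = {i: t for t, i in templates.items()}
    parts = by_id[template_id].split(PLACEHOLDER)
    out = [parts[0]]
    for value, rest in zip(values, parts[1:]):
        out.append(value)
        out.append(rest)
    return "".join(out)


templates = {}
first = encode("Task 12 finished in 340 ms", templates)
second = encode("Task 13 finished in 295 ms", templates)
```

Both lines share a single template, so only the numbers differ between the two encoded records. Searching for a static phrase then only needs to look at the template table rather than every line.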

#logging #costoptimization #bigdata

Reducing Logging Cost by Two Orders of Magnitude using CLP
Long, long ago, the amount of data our systems output to logs was small enough that we were able to retain all of the log files. This allowed our engineers to freely analyze the logs, say for troubleshooting our systems or improving applications. But as Uber’s business grew rapidly, the amount of da…

Complete System Design Series 🍼

I stumbled upon a good series on System Design on Medium. Naina Chaturvedi starts with an introduction to what System Design is, then proceeds to scaling, load balancing, database sharding, and more. She also has a series of posts about designing real-life services like Dropbox, Instagram, a web crawler, and many others.

Complete System Design Series — Part 1
With examples and intelligible explanations…


Distributed Architecture Concepts 🍼

Gergely Orosz is famous for his newsletter, "The Pragmatic Engineer", but it started small and gained its first peak of attention with an article on distributed architecture concepts. He explains SLAs, scaling, consistency, data durability, and other concepts he learned during his work at Uber.

Distributed architecture concepts I learned while building a large payments system
When building a large scale, highly available and distributed system, what architecture concepts do you need to use, in practice? In this post, I am summarizing ones I have found essential to learn and apply when building the payments system that powers Uber. This is a system with a load

#systemdesign #consistency

End-to-end field-level encryption for Apache Kafka Connect 👷‍♂️

In some markets and business domains, you will face strict requirements for security and privacy. If you use Apache Kafka for data processing, you may find yourself in a situation where you can't allow PII into the streaming platform. One solution is end-to-end field-level encryption. Check out this article by Hans-Peter Grahsl from Red Hat on how it can be implemented.

End-to-end field-level encryption for Apache Kafka Connect | Red Hat Developer
This article introduces end-to-end encryption for data integration scenarios built on top of Apache Kafka using Kafka Connect together with the open-source library Kryptonite for Kafka.

#security #kafka

How Block Unified on One Graph 👷‍♂️

GraphQL is a technology that lets API clients control which fields they want to query. Block (previously Square) adopted GraphQL for different parts of their systems, including data about merchants, devices, payments, and more. They decided to unify those APIs under a single federated graph. Here is the story of how they achieved that.
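To make the "clients choose the fields" idea concrete without a GraphQL server, here is a minimal sketch in plain Python. The `select_fields` helper and the sample merchant record are hypothetical and only mimic the shape of a query. A real GraphQL server parses a query document and runs resolvers instead.

```python
def select_fields(record, selection):
    """Return only the fields named in `selection`.

    `selection` maps a field name to None (take the value as is)
    or to a nested selection dict (recurse into a sub-object or list).
    """
    result = {}
    for name, sub in selection.items():
        value = record[name]
        if sub is None:
            result[name] = value
        elif isinstance(value, list):
            result[name] = [select_fields(item, sub) for item in value]
        else:
            result[name] = select_fields(value, sub)
    return result


# Hypothetical record; the field names are made up for this example.
merchant = {
    "id": "m-1",
    "name": "Corner Cafe",
    "country": "US",
    "devices": [
        {"serial": "A1", "model": "reader", "firmware": "1.2"},
        {"serial": "B2", "model": "terminal", "firmware": "3.4"},
    ],
}

# The client asks only for the name and the device serial numbers.
response = select_fields(merchant, {"name": None, "devices": {"serial": None}})
```

The response contains just the requested fields, which is the property that lets different clients share one graph without each paying for data they do not use.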

How We Unified on One Graph at Block
Using GraphQL federation to unify under one “supergraph”


No-code permissions with Kong and Permit 🍼

Kong is one of the popular API gateways out there. As with any such system, you always want to check whether an API user is allowed to call a particular API. For that, a solution called Permit can help, as it provides a platform for easy permissions management. You can now combine the two to minimize the coding effort for policy management for your API. Inside is a short guide on how to configure them together.

No-code permissions with Kong and Permit
Kong is one of the most popular API gateways out there; but managing access to API and services behind it can be quite a bit of work especially as the application evolves requiring more and more advanced permissions models (RBAC, ABAC, ReBAC, …) Enter Permit - a full stack permissions service, that…

#security #api #nocode

Are you spending too much on Kubernetes? 👷‍♂️

Kubernetes is pretty widespread in modern distributed systems. Part of the architect's job is to improve the cost efficiency of solutions, for example by not overspending on infrastructure. Denilson Nastacio from IBM wrote about six considerations on where the cost comes from within a Kubernetes cluster, which you might find useful and thought-provoking.

Are You Spending Too Much on Kubernetes?
Probably, yes.

#k8s #cost #bestpractices

How Discord supercharges network disks for low latency 🤟

Discord handles 4 billion messages each day, and it wants to be both reliable and fast. Discord is hosted on GCP, where local SSDs alone do not fully meet its requirements. Find out how they created their own write-through cache to achieve their goal.
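Discord's solution works at the disk level, and the article has the specifics. As a reminder of what "write-through" means in general, here is a minimal application-level sketch. The `WriteThroughCache` class and the plain dict standing in for the slow, durable store are assumptions for illustration only.

```python
class WriteThroughCache:
    """Fast volatile cache in front of a slower durable store."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # slow but durable (dict-like)
        self.cache = {}  # fast but may be lost at any time

    def write(self, key, value):
        # Write-through: the durable store is updated on every write,
        # so losing the cache never loses data.
        self.backing_store[key] = value
        self.cache[key] = value

    def read(self, key):
        # Serve from the fast cache when possible.
        if key in self.cache:
            return self.cache[key]
        # On a miss, fall back to the durable store and refill the cache.
        value = self.backing_store[key]
        self.cache[key] = value
        return value


durable = {}
disk = WriteThroughCache(durable)
disk.write("message:1", "hello")
disk.cache.clear()  # simulate losing the fast tier
recovered = disk.read("message:1")
```

The trade-off is that writes are only as fast as the durable store, while reads get the speed of the fast tier.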

How Discord Supercharges Network Disks for Extreme Low Latency
It’s no secret that Discord’s your place to talk; 4 billion messages sent a day have us convinced. But text only accounts for a chunk of the features that Discord supports — learn how Discord optimizes its platform to respond to the high frequency of queries for all types of content and data as quic…

#databases #architecture

StackOverflow architecture 👷‍♂️

This is an old but well-aged post on how the biggest programming (and not only programming) Q&A site works. You might expect heavy caching, microservices, and the like, yet they manage to do the job with a pretty basic setup. Read the whole story down below!

Nick Craver - Stack Overflow: The Architecture - 2016 Edition
This is #1 in a very long series of posts on Stack Overflow’s architecture. Welcome. Previous post (#0): Stack Overflow: A Technical Deconstruction. Next post...


Logging and monitoring solution for Azure exercise 👷‍♂️

A logging and monitoring solution is a must-have part of every production software system. Nearly 30% of the real-world systems running in the cloud run on Azure. In turn, Azure has a good exercise on how to build a proper monitoring solution for your Azure resources. I totally recommend going through it if you happen to work with Microsoft's cloud.

Design a solution to log and monitor Azure resources - Training
Azure Architects design and recommend logging and monitoring solutions.

#azure #cloud #training

Like the newsletter? Consider helping to run it at Patreon or Boosty. Patrons and Boosty subscribers of certain levels also get access to a private Architecture Community. Big thanks to Nikita, Anatoly, Oleksandr, Dima, Pavel and Robert for already supporting the newsletter.