Architecture Weekly Issue #41. Articles, books, and playlists on architecture and related topics. Split by sections, highlighted with complexity: 🤟 means hardcore, 👷‍♂️ is technically applicable right away, 🍼 is an introduction to the topic or an overview. Now in Telegram as well.


It's already been 249 days since Russia's crazy, brutal, and unjustified war against Ukraine began. We condemn this war and want it to stop ASAP. We continue this newsletter so you can advance your skills and help the millions of Ukrainian people in any way possible. If you want to help directly, visit this fund.


Kafka Consumer Lag Monitoring 👷‍♂️

Consumer lag is the difference between the latest offset producers have written to a partition and the offset the consumer has reached. This lag is critical when you process near real-time data: if it grows too large, the data may be stale and useless by the time it is consumed. That's why you need to monitor the consumer lag of your Kafka clusters. Sematext explains the reasons for lag and suggests their own monitoring solution to detect it.

Kafka Consumer Lag Monitoring - Sematext
Learn how to monitor Consumer Lag in Apache Kafka. Tutorial on how to calculate and avoid it with Kafka monitoring tools.
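
To make the definition concrete, here is a minimal sketch of the lag calculation (mine, not from the Sematext article). It works on plain dictionaries of offsets rather than a live cluster; the function names, the sample topic, and the choice to treat a missing commit as offset 0 are all assumptions for illustration.

```python
from typing import Dict, List, Tuple

# (topic, partition) pair used as the dictionary key
TopicPartition = Tuple[str, int]


def consumer_lag(
    end_offsets: Dict[TopicPartition, int],
    committed_offsets: Dict[TopicPartition, int],
) -> Dict[TopicPartition, int]:
    """Per-partition lag: latest written offset minus the consumer's offset."""
    lag = {}
    for tp, end in end_offsets.items():
        # Assumption: a partition without a committed offset counts from 0.
        committed = committed_offsets.get(tp, 0)
        lag[tp] = max(end - committed, 0)
    return lag


def partitions_over_threshold(
    lag: Dict[TopicPartition, int], threshold: int
) -> List[TopicPartition]:
    """Partitions whose lag exceeds the alerting threshold."""
    return sorted(tp for tp, value in lag.items() if value > threshold)


# Hypothetical sample data
end_offsets = {("orders", 0): 1500, ("orders", 1): 980}
committed = {("orders", 0): 1480, ("orders", 1): 400}

lag = consumer_lag(end_offsets, committed)
print(lag)                                  # {('orders', 0): 20, ('orders', 1): 580}
print(partitions_over_threshold(lag, 100))  # [('orders', 1)]
```

In a real setup the two dictionaries would come from the broker and the consumer group's committed offsets; the arithmetic stays the same.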

#monitoring #kafka

Message delivery and deduplication strategies 👷‍♂️

We spoke about the Outbox pattern last week. Continuing the message delivery narrative, I am sharing an article on the pitfalls of using a unique ID per message to achieve idempotency. In short, deduplication by ID requires transactionality, which can be tricky in a distributed system. More details inside.

Message delivery and deduplication strategies | SoftwareMill
Did you know that you can manage at-least-once delivery not only on the producer but also on the consumer side?
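
To show why the transaction matters, here is a small sketch of consumer-side deduplication (my own illustration, not code from the SoftwareMill article). It assumes the business data and the list of processed message IDs live in the same database, here an in-memory SQLite one; the table names, the `handle_deposit` function, and the sample message are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a dedup table and a business table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed_messages (message_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO balances VALUES ('acc-1', 0)")
    conn.commit()
    return conn


def handle_deposit(conn, message_id: str, account: str, amount: int) -> bool:
    """Apply the deposit at most once; return False for a duplicate message."""
    try:
        # One transaction: the ID record and the side effect are committed
        # together, or rolled back together if anything fails.
        with conn:
            conn.execute(
                "INSERT INTO processed_messages (message_id) VALUES (?)",
                (message_id,),
            )
            conn.execute(
                "UPDATE balances SET amount = amount + ? WHERE account = ?",
                (amount, account),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key rejected an ID we have already processed.
        return False


conn = make_db()
first = handle_deposit(conn, "msg-42", "acc-1", 100)
second = handle_deposit(conn, "msg-42", "acc-1", 100)  # redelivery of the same message
balance = conn.execute(
    "SELECT amount FROM balances WHERE account = 'acc-1'"
).fetchone()[0]
print(first, second, balance)  # True False 100
```

The sketch only works because both writes share one transaction. Once the side effect lives in another service or store, that guarantee is gone, which is exactly the tricky part mentioned above.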

#messaging #idempotency

Presto Speed Up with Alluxio Local Cache at Uber 🤟

Analytics at big companies such as Bolt or Uber is a load-heavy task. Uber runs half a million analytical queries a day against their set of Presto clusters. But even load balancing and splitting the load between on-demand and scheduled jobs does not fulfil all the requirements. Uber introduced local caching with the Alluxio Cache Library, consistent hashing for nodes, cache filters and cache metadata to solve some of the raised issues.
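
Consistent hashing is the piece that keeps a cache warm when nodes come and go, so here is a generic sketch of the technique. It is not Uber's or Presto's implementation; the `ConsistentHashRing` class, the worker names, the file keys, and the number of virtual nodes are assumptions for illustration.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Maps keys to nodes so that removing a node only remaps that node's keys."""

    def __init__(self, nodes, vnodes: int = 100):
        self._ring = []  # sorted list of (hash, node) points on the ring
        for node in nodes:
            self.add_node(node, vnodes)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node: str, vnodes: int = 100) -> None:
        # Several virtual points per node spread the load more evenly.
        for i in range(vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def node_for(self, key: str) -> str:
        if not self._ring:
            raise ValueError("ring has no nodes")
        # First ring point at or after the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self._ring, (self._hash(key),))
        return self._ring[idx % len(self._ring)][1]


ring = ConsistentHashRing(["worker-1", "worker-2", "worker-3"])
keys = [f"file-{i}" for i in range(1000)]  # hypothetical cache keys
before = {k: ring.node_for(k) for k in keys}

ring.remove_node("worker-2")
after = {k: ring.node_for(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
# Only keys that lived on the removed worker changed their node.
print(all(before[k] == "worker-2" for k in moved))  # True
```

With a naive `hash(key) % number_of_nodes` scheme, removing one worker would remap most keys and invalidate most of the cache.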

#presto #bigdata #performance

Thoughtworks Technology Radar 👷‍♂️

Thoughtworks published a new issue of the Technology Radar this week. Important highlights include the trial of Camunda, Svelte, and Kotlin Gradle DSL. Threat Modelling finally received the Adopt verdict alongside Cognitive Load for teams. Check out the full radar below!

Technology Radar | An opinionated guide to technology frontiers | Thoughtworks
The Technology Radar is an opinionated guide to technology frontiers. Read the latest here.

#techradar #radar

Dynamic Security Testing in CI/CD pipeline 🍼

Dynamic Application Security Testing (DAST) is a strategy of running an app and trying to find and exploit potential vulnerabilities. As per the 'Shift-left' approach, we want to conduct security testing as early as possible in the development lifecycle. Grab a short note by Akira Brand on strategies to integrate DAST and what approaches to avoid.

Successfully Integrating Dynamic Security Testing into Your CI/CD Pipeline
Dynamic security testing tools don't require advanced cybersecurity knowledge to operate. Integrating DAST into your CI/CD pipeline should be done in stages by focusing on the riskiest areas first.

#security #cicd

What is OpenTelemetry? 🍼

Observability is one of the important properties of a software system. OpenTelemetry is a standard and a set of libraries and software components intended to abstract how monitoring data is obtained and sent to a collection solution. Find out the architecture, principles and current state on the page.


TiDB architecture overview 🤟

At Bolt we are migrating from MySQL to TiDB (Titanium DB) for the sake of scalability and storage efficiency. Get to know what TiDB offers and how it works under the hood!

TiDB - SQL at Scale!
TiDB ("Ti" stands for Titanium) is an open-source NewSQL database that supports Hybrid Transactional and Analytical Processing (HTAP)…


A different flavor of the distributed transaction 👷‍♂️

In this talk Martin Stefanko demonstrates a Java library which can implement the Saga pattern for you in a very convenient way. Basically, if you need to perform three actions in a distributed transaction, the library can help you do so with minimal coding effort, handling the rollbacks and the necessary context. Watch the full video!
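
If the Saga pattern is new to you, here is a bare-bones sketch of the idea in Python. It is not the Java library from the talk and does not mirror its API; the `SagaStep` and `run_saga` names and the booking example are hypothetical. Each step pairs an action with a compensation, and a failure triggers the compensations of the already completed steps in reverse order.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[dict], None]        # does the work
    compensation: Callable[[dict], None]  # undoes the work


class SagaFailed(Exception):
    """Raised after a step failed and earlier steps were compensated."""


def run_saga(steps: List[SagaStep], context: dict) -> dict:
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action(context)
            completed.append(step)
        except Exception as exc:
            # Undo the completed steps, newest first.
            for done in reversed(completed):
                done.compensation(context)
            raise SagaFailed(f"step '{step.name}' failed: {exc}") from exc
    return context


log: List[str] = []


def record(event: str) -> Callable[[dict], None]:
    """Build a step function that just records what happened."""
    return lambda ctx: log.append(event)


def charge_card(ctx: dict) -> None:
    raise RuntimeError("card declined")


steps = [
    SagaStep("book_flight", record("book_flight"), record("cancel_flight")),
    SagaStep("book_hotel", record("book_hotel"), record("cancel_hotel")),
    SagaStep("charge_card", charge_card, record("refund_card")),
]

try:
    run_saga(steps, {})
except SagaFailed as err:
    print(err)  # step 'charge_card' failed: card declined

print(log)  # ['book_flight', 'book_hotel', 'cancel_hotel', 'cancel_flight']
```

A real implementation also has to persist the saga state and cope with compensations that fail, which is where a library earns its keep.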


List of Foundational distributed systems papers 🤟

Murat has been a great source of overviews of distributed systems papers. I am sharing his list of foundational papers on time, consensus and other topics which are absolutely necessary to grok distributed systems well.

Foundational distributed systems papers
I talked about the importance of reading foundational papers last week. To followup, here is my compilation of foundational papers in the d...

#distributedsystems #whitepaper

Failure Types 🍼

Dominik Tornow, whose newsletter I am subscribed to, shared his own article about transient, intermittent and permanent failures and how they can be classified in the spatial dimension, i.e. where in the system they should be handled. Good guide on handling application errors.

Handling Failures From First Principles
The post presents a blueprint for a principled failure handling strategy that guarantees correctness while maximizing the chance of success
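
As a small illustration of why the classification matters, here is a generic retry sketch (mine, not the blueprint from the article). It assumes only two hypothetical exception types: `TransientError` is worth retrying, while `PermanentError` is passed up immediately because retrying cannot help.

```python
import time


class TransientError(Exception):
    """A failure that may go away on its own, e.g. a dropped connection."""


class PermanentError(Exception):
    """A failure that will not go away by retrying, e.g. invalid input."""


def call_with_retry(operation, max_attempts=3, base_delay=0.0, sleep=time.sleep):
    """Retry transient failures with exponential backoff; let others propagate."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: hand the failure to the caller
            sleep(base_delay * 2 ** (attempt - 1))
        # PermanentError (and anything else) is not caught and propagates at once.


attempts = {"count": 0}


def flaky():
    """Hypothetical operation that fails twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("connection reset")
    return "ok"


result = call_with_retry(flaky, max_attempts=5)
print(result, attempts["count"])  # ok 3
```

The retry sits at one level of the system, and whatever it cannot recover from is escalated to the level above.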

#applicationarchitecture #errorhandling

Like the newsletter? Consider helping to run it at Patreon or Boosty. Patrons and Boosty subscribers of a certain level also get access to a private Architecture Community. Big thanks to Nikita, Anatoly, Oleksandr, Dima, Pavel and Robert for already supporting the newsletter.