Architecture Weekly Issue #50. Articles, books, and playlists on architecture and related topics. Split into sections and highlighted by complexity: 🤟 means hardcore, 👷‍♂️ is technically applicable right away, 🐼 is an introduction to the topic or an overview. Now in Telegram as well.


It's already been 319 days since Russia's crazy, brutal, and unjustified war against Ukraine began. We condemn this war and want it to stop ASAP. We continue this newsletter so you can advance your skills and help the millions of Ukrainian people in any way possible. If you want to help directly, visit this fund.



CircleCI breach ๐Ÿผ

A CI/CD pipeline literally has two jobs: build and stay secure. CircleCI evidently failed at the second, announcing that secrets stored in all pipelines may have been compromised. So if you're a client, rotate all your secrets right away, and if you haven't done it yet, implement a secret rotation policy. The news came on the 4th of January with an email to clients. Gergely Orosz has made an overview of the incident and CircleCI's communication around it; grab it here.
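Rotation is usually scriptable. Below is a hedged sketch of rotating a project's environment variables through CircleCI's public v2 API (`DELETE` then `POST` on `/project/{slug}/envvar`); the project slug, variable names, and token here are placeholders, not anything from the incident:

```python
# Sketch only: build the DELETE + POST request pairs that rotate CircleCI
# project env vars via the v2 API. Token, slug, and values are placeholders.
import json
import urllib.request

API = "https://circleci.com/api/v2"

def rotation_requests(project_slug: str, new_secrets: dict[str, str], token: str):
    """For each secret, delete the old env var and re-create it with a new value."""
    headers = {"Circle-Token": token, "Content-Type": "application/json"}
    reqs = []
    for name, value in new_secrets.items():
        url = f"{API}/project/{project_slug}/envvar"
        # Remove the (potentially leaked) old value first…
        reqs.append(urllib.request.Request(f"{url}/{name}", headers=headers, method="DELETE"))
        # …then store the freshly generated replacement.
        body = json.dumps({"name": name, "value": value}).encode()
        reqs.append(urllib.request.Request(url, data=body, headers=headers, method="POST"))
    return reqs

# Dry run: inspect the planned calls without sending anything.
planned = rotation_requests("gh/acme/shop", {"DB_PASSWORD": "new-value"}, token="dummy")
print([(r.get_method(), r.full_url) for r in planned])
```

The point is less the exact calls and more that rotation should be a repeatable script, not a one-off panic exercise.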

#cicd #security #incident

GitHub's solution for converting columns to ActiveRecord encrypted columns 👷‍♂️

As we just learned, data is better stored encrypted. But we don't always implement this right from the get-go. And once we want to, a challenge arises: how do we convert plain-text fields to encrypted ones without downtime? And if some fields were already encrypted, how do we upgrade them to better protection? Follow the GitHub blog on how they did it with their data. Multiple encryption keys, decryption rounds, and error handling inside.
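The zero-downtime part typically rests on a dual-read, encrypted-write pattern. Here's an illustrative sketch in Python (GitHub's actual code is Ruby on ActiveRecord), with a toy XOR "cipher" standing in for a real scheme, and `email_plain`/`email_encrypted` as invented column names:

```python
# Illustrative only: reads prefer the new encrypted column and fall back to
# legacy plaintext; writes always target the encrypted column, so a background
# backfill can migrate old rows while new traffic is already converted.
import base64

KEY = b"demo-key"  # placeholder; real systems use managed key material

def encrypt(plaintext: str) -> str:
    xored = bytes(b ^ KEY[i % len(KEY)] for i, b in enumerate(plaintext.encode()))
    return base64.b64encode(xored).decode()

def decrypt(ciphertext: str) -> str:
    raw = base64.b64decode(ciphertext)
    return bytes(b ^ KEY[i % len(KEY)] for i, b in enumerate(raw)).decode()

class Row:
    def __init__(self, email_plain=None, email_encrypted=None):
        self.email_plain = email_plain
        self.email_encrypted = email_encrypted

    @property
    def email(self):
        # Dual read: encrypted column first, legacy plaintext as fallback.
        if self.email_encrypted is not None:
            return decrypt(self.email_encrypted)
        return self.email_plain

    @email.setter
    def email(self, value):
        # Every write lands in the encrypted column and clears the old one.
        self.email_encrypted = encrypt(value)
        self.email_plain = None

legacy = Row(email_plain="old@example.com")
print(legacy.email)        # still readable via the plaintext fallback
legacy.email = legacy.email
print(legacy.email_plain)  # → None: the value now lives only encrypted
```

The same shape works for re-encrypting with a stronger key: read with the old key, write with the new one.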

How GitHub converts previously encrypted and unencrypted columns to ActiveRecord encrypted columns | The GitHub Blog
This post is the second part in a series about ActiveRecord::Encryption that shows how GitHub upgrades previously encrypted and unencrypted columns to ActiveRecord::Encryption.

#security #encryption

How Cloudflare runs its Kafka clusters with more than a trillion messages a day 👷‍♂️

Kafka has been used at Cloudflare for 8 years already, and their general-purpose cluster has processed over 1 trillion messages. They built several internal tools, sunset some of them in favor of better versions, and decided to share the journey in a detailed post. They show how their Connectors framework lets teams declare data transformations with a single configuration file, connecting multiple systems without repeating the same code over and over again. Nice one here.
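To make the config-driven idea concrete, here is a minimal sketch of a connector that is just a config naming a source, a chain of registered transformations, and a sink. The transform names and config shape are invented for illustration, not Cloudflare's actual framework:

```python
# Sketch: a connector is data, not code. Register reusable transforms once,
# then compose them per-pipeline from a config file.
TRANSFORMS = {
    "drop_nulls": lambda msg: {k: v for k, v in msg.items() if v is not None},
    "uppercase_keys": lambda msg: {k.upper(): v for k, v in msg.items()},
}

def run_connector(config: dict, messages: list[dict]) -> list[dict]:
    """Apply the configured transform chain to each message, in order."""
    chain = [TRANSFORMS[name] for name in config["transforms"]]
    out = []
    for msg in messages:
        for transform in chain:
            msg = transform(msg)
        out.append(msg)
    return out

config = {
    "source": "kafka://inventory",
    "transforms": ["drop_nulls", "uppercase_keys"],
    "sink": "postgres://stock",
}
print(run_connector(config, [{"sku": "a1", "note": None}]))  # → [{'SKU': 'a1'}]
```

Adding a new pipeline then means writing a new config, not new code.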


The lost art of Software Design ๐Ÿผ

Do you do a big upfront design? Or do you skip the architecture design phase entirely? Simon Brown's talk on the lost art of Software Design explains why there is a good middle ground between the two and spices it up with multiple software design kata examples.

#architecture #systemdesign

You should be reading academic papers ๐Ÿผ

For a long time, I only read blog posts. But once I started doing some system design, I quickly realized that blogs are not enough: you need to understand the underlying problems. That's what academic papers are good for. The StackOverflow Blog has a short article with further motivation to read them.


Time, clocks, and event ordering in distributed systems ๐ŸคŸ

As long as we are talking about papers, I want to remind you of a foundational paper on time, logical and physical clocks, and event ordering in distributed systems by Leslie Lamport. Have some fun with mathematical proofs!
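The paper's core mechanism fits in a few lines. A minimal sketch of a Lamport logical clock, following the rules from the paper: increment on each local event, attach the counter to outgoing messages, and on receive advance to max(local, received) + 1:

```python
# A minimal Lamport logical clock, for illustration.
class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self) -> int:
        self.time += 1
        return self.time

    def send(self) -> int:
        # Sending is itself an event; the message carries the new timestamp.
        self.time += 1
        return self.time

    def receive(self, msg_time: int) -> int:
        # Jump past both our own history and the sender's.
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
t = a.send()         # a.time = 1
b.local_event()      # b.time = 1
print(b.receive(t))  # → 2, i.e. max(1, 1) + 1
```

This gives a total order consistent with causality ("happened-before"), which is exactly what the paper's mutual-exclusion example builds on.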

#distributedsystem #paper

Leveraging CDC for real-time Inventory Data Processing ๐Ÿ‘ทโ€โ™‚๏ธ

Proper inventory management means having the appropriate amount of goods to sell. This is crucial for any e-shop and of course for grocery delivery. One option for learning about inventory updates is to emit notifications from the code that modifies the related tables. But if there are many code paths, that becomes a problem. DoorDash went with a different solution: Change Data Capture enables near-real-time updates. Check out the details inside!
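The consumer side of such a change feed is conceptually a fold over change events. A hedged sketch (the `key`/`after` event shape loosely mirrors changefeed envelopes, where a null `after` means the row was deleted, but the field names and payloads here are simplified inventions):

```python
# Sketch: maintain a near-real-time inventory view by folding change events
# into an in-memory dict, instead of sprinkling notifications across every
# code path that touches the table.
def apply_event(inventory: dict, event: dict) -> dict:
    """Fold one change event into the inventory view."""
    key = event["key"]
    if event["after"] is None:
        inventory.pop(key, None)              # upstream row deleted
    else:
        inventory[key] = event["after"]["quantity"]  # insert or update: take new state
    return inventory

events = [
    {"key": "sku-1", "after": {"quantity": 10}},
    {"key": "sku-1", "after": {"quantity": 7}},   # update overwrites
    {"key": "sku-2", "after": {"quantity": 3}},
    {"key": "sku-2", "after": None},              # delete removes
]
inv = {}
for e in events:
    apply_event(inv, e)
print(inv)  # → {'sku-1': 7}
```

Because the feed captures every write at the database level, no table-modifying code path can be forgotten.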

Leveraging CockroachDBโ€™s Change Feed for Real-Time Inventory Data Processing
In this post, we explore how DashMartโ€™s engineering team used CockroachDBโ€™s changefeed to enable real time inventory updates

#kafka #cdc

Enterprise Workloads at Scale with a Next-Gen IaC Platform 🐼

Intuit is a U.S.-based company automating tax reporting. They are also an AWS partner. Intuit was an early adopter of many cloud technologies, and CloudFormation was one of them. But running it at scale brought its own challenges, like long feedback loops and the difficulty of finding and fixing errors. In this post, they tell the story of how they contributed to several open-source projects, including the AWS Cloud Development Kit.

Running Enterprise Workloads at Scale with a Next-Gen Infrastructure-as-Code Platform
This blog post is co-authored by Brett Weaver (Distinguished Engineer) and Jerome Kuptz (Principal Engineer) at Intuit.

Configuring the VACUUM process in PostgreSQL 👷‍♂️

PostgreSQL leverages MVCC (multi-version concurrency control), a mechanism for ensuring consistency. Basically, a new version of a row does not overwrite the old one but gets appended with a higher transaction id. Over time, the old row versions add up as occupied space, and VACUUM is the operation that removes them. Read here for more about how it works and how to configure it.
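As a small hedged illustration, autovacuum can be tuned per table via storage parameters (the parameter names below are real PostgreSQL options; the table name and values are examples only). Autovacuum kicks in once dead tuples exceed `threshold + scale_factor × row count`, so lowering both makes it visit a hot table sooner:

```sql
-- Example only: vacuum a frequently updated table more eagerly.
-- With these settings, a vacuum triggers once dead tuples exceed
-- 1000 + 2% of the table's estimated row count.
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.02,
    autovacuum_vacuum_threshold    = 1000
);
```

The defaults (scale factor 0.2) can let dead tuples pile up on large, busy tables, which is exactly the situation the linked article digs into.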

Like the newsletter? Consider helping to run it on Patreon or Boosty. The funds go toward hosting and some software, like a Camo Studio license. Patrons and Boosty subscribers of a certain level also get access to a private Architecture Community. Big thanks to Nikita, Anatoly, Oleksandr, Dima, Pavel B, Pavel, Robert, Roman, Iyri, Andrey, Lidia, Vladimir, August, and Roman for already supporting the newsletter. Join them as well!