Architecture Weekly #25

Architecture Weekly Issue #25. Articles, books, and playlists on architecture and related topics. Every entry has a complexity indicator: 🤟 means hardcore, 👷‍♂️ means technically applicable right away, 🍼 means an introduction to the topic or an overview. Now on Telegram as well.

WARNING 🇺🇦

It has already been 137 days of the crazy, inhuman, unjustified war of Russia against Ukraine. We condemn this war and want it to stop ASAP. We continue this newsletter so you can advance your skills and help the millions of Ukrainian people in any way possible.

RFC and Design Doc examples 🍼

A writing culture is essential in successful technical companies because it lets teams discuss solutions properly and later understand the reasons behind the decisions made. Gergely Orosz shows examples of Request-for-Comments and Design Documents used at different companies.

Companies Using RFCs or Design Docs and Examples of These
What companies follow an RFC-like process, and what are templates and examples to get inspiration from?

Cassandra Teams 🍼

Have you ever heard of teams that warn stakeholders about the consequences of not tackling existing problems, yet still fail to convince them? And then those consequences materialize? Bill Wake describes how to avoid such situations with proper communication.

Cassandra Teams - XP123
In mythology, Cassandra was cursed to prophesy correctly, but have nobody believe her. Software teams sometimes re-enact Cassandra.

10 patterns for more resilient applications 🍼

Uwe Friedrichsen, whose articles we have already featured in our newsletter, gave a talk on 10 patterns for resilient applications. It covers downstream and upstream connections between services and the patterns that address problems such as service unavailability and faulty behaviour. Find a screencast below.

Recruiter and Jobs data migration at LinkedIn 👷‍♂️

Data migrations are always hard, especially when no downtime and no data loss are allowed, and especially at scale. LinkedIn faced this problem when combining two of its products into one. Read the story in the post.

New Recruiter & Jobs: The largest enterprise data migration at LinkedIn
Co-authors: Xiaoyang Gu, Xie Lu, and Xiaoguang Wang

Real-Time Document Check for Uber Drivers 👷‍♂️

Enabling new riders quickly requires a reliable and fast document check. Uber faced this problem and built a complete solution around it. The post covers the data flow, a description of the ML models, and the challenges.

Uber’s Real-Time Document Check
Introduction. Justification for Identity Verification. Latin America is a rich cultural region, known for its world-renowned gastronomy, its abundant biodiversity, and its welcoming population. However, socio-economic inequality has been a challenge for the region, and is generally considered a major…

Distributed Systems Algorithms 🤟

Designing a distributed system often means solving known problems, such as choosing a hash for some data or searching a database quickly. The well-known ByteByteGo newsletter describes algorithms that can help.

Algorithms you should know before you take system design interviews
I put together a list and explained why they are important. Those algorithms are not only useful for interviews but good to understand for any software engineer. One thing to keep in mind is that understanding “how those algorithms are used in real-world systems” is generally more important than the…

Circuit Breaker Pattern 👷‍♂️

Circuit Breaker is one of the most famous patterns in microservice communication. Azure Architecture Center has a collection of patterns, including Circuit Breaker. The entry covers the motivation, the states, and a reference implementation.

Circuit Breaker pattern - Azure Architecture Center
Handle faults that might take a variable amount of time to fix when connecting to a remote service or resource.
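To make the states concrete, here is a minimal sketch of a circuit breaker in Python. It is not the Azure reference implementation: the class name, the `failure_threshold` and `reset_timeout` parameters, and the injectable `clock` are all assumptions made for illustration. The breaker is closed while calls succeed, opens after a number of consecutive failures, and after a timeout lets one trial call through (half-open) to decide whether to close again.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Illustrative breaker with three states: closed, open, half_open."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open before a trial call
        self.clock = clock                          # injectable so the timeout is testable
        self.state = "closed"
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast instead of waiting on a service that is likely still down.
                raise CircuitOpenError("circuit is open")
            self.state = "half_open"  # timeout elapsed: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        self.state = "closed"
        self.failures = 0

    def _on_failure(self):
        self.failures += 1
        # A failed trial call, or too many consecutive failures, opens the circuit.
        if self.state == "half_open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()
```

In use, a remote call would be wrapped as `breaker.call(fetch_profile, user_id)`, where `fetch_profile` is a hypothetical client function; callers then handle `CircuitOpenError` with a fallback or a cached response.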

InfluxDB, Kafka and 1 million custom metrics at Hulu 👷‍♂️

Hulu is a streaming service owned by Disney. Like any large software system, it gathers and processes a lot of metrics. Find out how they use a time-series database and Kafka to handle over a million custom metrics a second.

How Hulu Uses InfluxDB and Kafka to Scale to Over 1 Million Metrics a Second
By Samir Jafferali, Senior Systems Engineer

Cryptographic failures in radio encryption 👷‍♂️

A detailed and interactive blog post on how a whole security design fails when well-known security recommendations are ignored in the way security controls are used. Read and try the demos showing how hardcoded initialization vectors, reused nonces, and leaking cryptographic verdicts lead to a compromised system.

Cryptographic failures in RF encryption allow stealing robotic devices | Cossack Labs
Stunned by losing their robotic devices, [REDACTED] learnt that they were hijacked by attackers even with communication being encrypted. Having researched its firmware and found numerous cryptographic failures, we’ve crafted a few demos on how cryptography goes wrong in real life.
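As a rough illustration of why a reused nonce is so damaging, the sketch below uses a toy stream cipher built from SHA-256 in counter mode. It is not the scheme from the Cossack Labs post and must not be used as real encryption; the key, the command strings, and all function names are hypothetical. With a fixed nonce, two messages are XORed with the same keystream, so an attacker who knows one plaintext can recover the other without ever learning the key.

```python
import hashlib


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). For illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    return xor_bytes(plaintext, toy_keystream(key, nonce, len(plaintext)))


key = b"secret device key"       # hypothetical key, unknown to the attacker
fixed_nonce = b"\x00" * 12       # the mistake: a hardcoded, reused nonce

known = b"CMD:STATUS;ID=01"      # hypothetical message the attacker can guess
secret = b"CMD:UNLOCK;ID=01"     # hypothetical message the attacker wants

c1 = encrypt(key, fixed_nonce, known)
c2 = encrypt(key, fixed_nonce, secret)

# The keystream cancels out: c1 XOR c2 == known XOR secret.
recovered = xor_bytes(xor_bytes(c1, c2), known)

# With a fresh nonce per message the same trick yields garbage.
c3 = encrypt(key, b"\x01" * 12, secret)
not_recovered = xor_bytes(xor_bytes(c1, c3), known)
```

Here `recovered` equals `secret`, while `not_recovered` does not, which is why nonces must be unique per message under the same key.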

Introduction to reliability and AMA session 👷‍♂️

Google Cloud Community held a session on service reliability. They go deep into the core principles of service reliability and the metrics used to ensure it.

How to Maximize Service Reliability with the Google Cloud Architecture Framework
Your services and applications need to be reliable to provide a great customer experience. But how can you make sure this is the case? This is the primary question we set out to answer in our latest Architecture Framework Ask Me Anything event. During this session, , Customer Engineer and Infrastr…

Brought to you by Vladimir @vvsevolodovich Ivanov and Ilya @puzan Zonov