distributed-systems.md

Facebook System Design Interview; Design Facebook NewsFeed; Design Status Search; Design Live Commenting; Design Facebook Messenger / WhatsApp; Design Instagram

HarperDB is more than just a database, and for certain users or projects, HarperDB is not serving as a database at all. How can this be possible?

The lightweight Kubernetes OS that is known as k3OS has quickly been gaining popularity in the cloud-native community as a compact and edge-focused Linux distribution that cuts the fat away from the traditional K8s distro. While k3OS is picking up steam, it is still on the bleeding edge and there is still a bit of a shortage of learning material out there for it.

Distributed consensus simulation visualization, DYOR

In a microservice architecture, you can get dependencies that impose restrictions on the services used

Outbound logistics play a critical role in a company's overall supply chain management and can significantly impact its bottom line.

The cost of microservices from a developer's perspective.

We talk about two design patterns that highlight best practices for building resilient microservices architectures at scale.

What is common between streaming movies on Netflix, searching for information on Google, buying clothes on Amazon? You interacted with data services built on distributed systems. You interact with the largest distributed system daily: the Internet.

Part One: ClickHouse Failures, by Marcel Birkner

In a world where most of the apps that we use on the internet are collaborative in nature, conflicts in data are common. Is there a way to avoid it?

It's been 2 years since I joined a Dutch EdTech company as Lead Dev and in this article I'll explain how we are transforming communication for Dutch schools.

Major companies using AI and machine learning now use federated learning – a form of machine learning that trains algorithms on a distributed set of devices.

Many products solve for global issues and load balancing but unless a platform is built from the ground up with the necessary backbones, it becomes a nightmare to manage.

One of the most terrifying parts of the current crisis is uncertainty. Uncertainty is one of the most terrifying things people can experience in general. Absolutely everyone I have spoken to is absolutely convinced that a lot of the information available is either biased, doctored or flat-out false.

What are bimodal failure modes and how to avoid them

In this tutorial, we are going to demonstrate how to deploy a distributed Node.js app at the Edge using Section's Edge Compute Platform.

In this article, we explore custodial, semi-custodial, and non-custodial staking services and review the industry's leading non-custodial protocols for ETH 2.0

How the Fluence network enables creative apps, using a surprise party planning app as an example.

This blog offers a strong rationale for using Kubernetes with large AI/ML datasets on which distributed inference is performed. Read on for more.

One of the big debates in the Genesis DAO started by DAOstack was the question of anonymity. Should people be able to make proposals and ask for budgets without providing a real identity? 

This is a condensed version of this post on the Client-Server Model. We use a client, such as a web-browser or chat app, and communicate with a single entity.

This article introduces Structured Data Management (Developer Preview) available in the latest Alluxio 2.1.0 release, a new effort to provide further benefits to SQL and structured data workloads using Alluxio. The original concept was discussed on Alluxio’s engineering blog. This article is part one of the two articles on the Structured Data Management feature my team worked on.

A very Beginner-friendly Guide to Understanding the Blockchain (Part 1: Introduction to Blockchain Technology)

Building a distributed message processing queue using Apache Kafka requires some thought. We walk through how we process thousands of large messages per second.

Last year, after a bit of wrangling and lots of editing by the fantastic Jenn Webb, O’Reilly published a discussion Mark Burgess and I had on one of his trips through the Valley as a podcast.

There's a big hole in reusability on the web. An entertaining statistic - not the most accurate but still fascinating - was generated by Simon Wardley from a Twitter poll. He calculated that basic user registration had been written over a million times. The average developer had written user registration about 5 times. I'm sure you've built it a few times yourself.

Companies Must Transform Or Else (Photo by eelnosiva on Adobe)

This article describes how engineers in the Data Service Center (DSC) at Tencent PCG (Platform and Content Business Group) leverage Alluxio to optimize analytics performance and minimize operating costs in building Tencent Beacon Growing, a real-time data analytics platform.

This article describes how Alluxio can accelerate the training of deep learning models in a hybrid cloud environment when using Intel’s Analytics Zoo open source platform, powered by oneAPI. It covers the new architecture and workflow, as well as Alluxio’s performance benefits and benchmark results. The original article can be found on Alluxio's Engineering Blog.

The global supply chain is in a gridlock. Let's fix that.

Consistency, availability, and partition tolerance are the three musketeers of distributed systems. They ensure that your system operates correctly.

In the previous article, I described the concept and design of the Structured Data Service in the Alluxio 2.1.0 release. This article will go through an example to demonstrate how it helps SQL and structured data workloads.

What makes Kafka so Fast? A Deep Dive into Kafka Storage Internals.

Instead of consumers' delivery guarantees in message queues, in this article, we're going to talk about producers' guarantees in distributed systems.

IPVFS: A lightweight version control system for files on the Interplanetary File System.

Chaos engineering is the practice of deliberately injecting an error into a system, in order to observe, in vivo, the consequences.

In the past few months we have been getting this question a lot:

Shift-right testing improves product resiliency by uncovering issues that surface under heavy user traffic and are difficult to simulate in test environments.

Every remote service that we call is eventually going to fail. No matter how reliable it is, failure is inevitable.

In this post, we will talk about the features of working with SQL, and how you can improve your database queries and speed up your app.

A distributed architecture brings in several challenges when it comes to operability and monitoring. Here, one may be dealing with tens if not hundreds of microservices, each of which may or may not have been built by the same team.

PostgreSQL replication using python and RabbitMQ for providing your database server with High Availability by easily making replicas of your master server.

Going serverless has many benefits, but it's not without its issues. Learn about the most common serverless challenges & how to overcome them.

You'll likely be asked some system design questions when interviewing at many tech companies today. Here's how to use the whiteboard to answer them effectively.

The main network is running, transactions are being sent, the wallet is working. What's next? In this article, we will consider how to maintain a network and solve its problems.

Both NoSQL databases and modern Blockchain ledgers benefit from a set of common principles. When they are both implemented for an application, a lot can be accomplished as the platforms can complement each other.

Make use of your downtime and read something good!

That dreaded system design interview. I remember the first system design question I was asked. “Design WhatsApp”, he said. I didn’t know where to start! I was a fresher. Data structures and algorithms were the only things I knew. I am sure you can guess how that interview went. Then after enough research, I made myself a checklist of components, of sorts, to navigate me through my next system design interviews. And I sh*t you not, it works!

What is Nameko? Nameko is a framework for building lightweight, highly scalable and fault-tolerant services in Python.

Having worked with Kafka for more than two years now, there are two configs whose interaction I've seen be ubiquitously confused.

In my previous article, I talked about the importance of logs and the differences between structured and unstructured logging. Logs are easy to integrate into your application and provide the ability to represent any type of data in the form of strings.

DisCO is a cooperative, feminist economic, commons-oriented and P2P way of working and an alternative to DAOs.

How to use Platypush and other open-source tools to build a notebook synchronized across multiple devices

Comparing Enterprise messaging and event streaming across different dimensions to see how they excel at solving different but related messaging problems

Bloom filters are a data structure developed by Burton Howard Bloom in 1970. You can see them as a cousin of hash tables. They also allow for efficient insert and lookup operations while occupying very little space.
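To make the insert and lookup behaviour concrete, here is a minimal Bloom filter sketch in Python. It is an illustration only, not taken from the linked article: the class name `BloomFilter`, the default sizes, and the choice of seeded SHA-256 digests as hash functions are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter (hypothetical sizes and hashing scheme)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed into bytes.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by prefixing the item with a seed byte.
        data = item.encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        # Insert: set every bit the item hashes to.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # Lookup: True means "possibly present", False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


bf = BloomFilter()
bf.add("kafka")
bf.add("raft")
print("kafka" in bf)  # an added item is always reported as present
```

The space saving comes from storing only bits rather than the items themselves. The trade-off is that a lookup for an item that was never added can occasionally return `True` (a false positive), while an added item is never reported as absent.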

In some blockchains validators are pre-defined, in others independent teams and individuals own the nodes. A game-based approach is an excellent way to choose validators wisely.

Blockchain is a term utilized to represent distributed ledger technology.

Certain industries greatly benefit from high-performing, low-latency, geo-distributed technologies.

Route traffic between microservices during development with this one simple trick that will save you setup time and, well, headache.

Imagine — You’re in a system design interview and need to pick a database to store, let’s say, order-related data in an e-commerce system. Your data is structured and needs to be consistent, but your query pattern doesn’t match with a standard relational DB’s. You need your transactions to be isolated, and atomic and all things ACID… But OMG it needs to scale infinitely like Cassandra!! So how would you decide what storage solution to choose? Well, let’s see!

When learning about blockchain consensus algorithms and distributed systems in general, you will inevitably come across terms like FLP impossibility and Byzantine fault tolerance. While there is plenty of literature on these subjects, it often suffers from a narrow focus, failing to explain the connections and relationships between them. Furthermore, much of the existing literature gives either too much or not enough technical detail; I found this to be especially true when learning about consensus algorithms like proof of stake.

With the emergence of microservices architecture, applications are developed by using a large number of smaller programs. These programs are built individually and deployed into a platform where they can scale independently. These programs communicate with each other over the network through simple Application Programming Interfaces (APIs). With the disaggregated and network-distributed nature of these applications, developers have to deal with the Fallacies of Distributed Computing as part of their application logic.

Blockchain 3.0 will be upon us very soon. With Ethereum and so many other blockchain networks fighting for this, can directed acyclic graphs be the future?

Looking to 2020 and beyond, the proportion of data produced and consumed in realtime is growing exponentially. IDC predicts that by 2025 1/3 of all data produced globally will be realtime.

A diff algorithm outputs the set of differences between two inputs. These algorithms are the basis of a number of commonly used developer tools. Yet understanding the inner workings of diff algorithms is rarely necessary to use said tools.
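As a rough illustration of what "the set of differences between two inputs" can look like, the sketch below computes a line-level diff from a longest-common-subsequence (LCS) table. This is one simple textbook approach, not necessarily the algorithm the linked article covers or the one used by any particular tool; the function name `diff` and the `(" ", "-", "+")` tagging are assumptions made for this example.

```python
def diff(a, b):
    """Return a list of (tag, item) pairs: ' ' kept, '-' removed, '+' added."""
    n, m = len(a), len(b)
    # lcs[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    # Walk the table from the start, emitting keep/remove/add operations.
    out = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append((" ", a[i]))
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            out.append(("-", a[i]))
            i += 1
        else:
            out.append(("+", b[j]))
            j += 1
    out.extend(("-", item) for item in a[i:])
    out.extend(("+", item) for item in b[j:])
    return out


for tag, line in diff(["a", "b", "c"], ["a", "c", "d"]):
    print(tag, line)
```

For the two inputs shown, the result keeps `a` and `c`, removes `b`, and adds `d`. The table costs time and memory proportional to the product of the input lengths, which is why production tools generally use more refined algorithms.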

Latency is caused by offloading processing from an app to an external server. But what if there was a solution to the monolithic common single-cloud geography?

Many industries are on the brink of the next technological revolution in record keeping. Ten years after Bitcoin made its splash, we’re seeing many inspired by some of the benefits promised by the technology outside of the money use case:

In this article we will cover the core concepts of Kafka and also will touch upon a few of the advanced topics.

In this article, we propose a blockchain network that acts as a centralized append-only distributed file system (DFS) such as Hadoop Distributed File System (HDFS) or Google File System (GFS). The potential advantages of blockchain as a distributed file system (BaaDFS) include:

This article presents the collaboration of Alibaba, Alluxio, and Nanjing University in tackling the problem of Deep Learning model training in the cloud. Various performance bottlenecks are analyzed with detailed optimizations of each component in the architecture. This content was previously published on Alluxio's Engineering Blog, featuring Alibaba Cloud Container Service Team's case study (White Paper here). Our goal was to reduce the cost and complexity of data access for Deep Learning training in a hybrid environment, which resulted in over 40% reduction in training time and cost.

This is a guest blog contributed by datasapiens’ Juraj Pohanka, Koen Michiels and Sam Gilbert. This article describes how engineers at datasapiens brought down S3 API costs by 200x by implementing Alluxio as a data orchestration layer between S3 and Presto.

Tiered Locality is a feature led by my colleague Andrew Audibert at Alluxio. This article dives into the details of how tiered locality helps provide optimized performance and lower costs. The original article was published on Alluxio’s engineering blog.