From fa8ccdb9ce8d3c666363cf669c8e3239cd174647 Mon Sep 17 00:00:00 2001 From: Daniel Lombardi Date: Mon, 18 Dec 2023 19:46:20 +0000 Subject: [PATCH 01/19] communication-protocols --- translations/README-ptbr.md | 1739 +++++++++++++++++++++++++++++++++++ 1 file changed, 1739 insertions(+) create mode 100644 translations/README-ptbr.md diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md new file mode 100644 index 0000000..f98abbf --- /dev/null +++ b/translations/README-ptbr.md @@ -0,0 +1,1739 @@ +

【 👨🏻‍💻 YouTube | 📮 Newsletter 】
# System Design 101

Explaining complex systems using visuals and simple terms.

Whether you are preparing for a System Design Interview or simply want to understand how systems work under the hood, we hope this repository helps you.

# Table of Contents

- [Communication Protocols](#communication-protocols)
  - [REST API vs. GraphQL](#rest-api-vs-graphql)
  - [How does gRPC work?](#how-does-grpc-work)
  - [What is a webhook?](#what-is-a-webhook)
  - [How to improve API performance?](#how-to-improve-api-performance)
  - [HTTP 1.0 -\> HTTP 1.1 -\> HTTP 2.0 -\> HTTP 3.0 (QUIC)](#http-10---http-11---http-20---http-30-quic)
  - [SOAP vs REST vs GraphQL vs RPC](#soap-vs-rest-vs-graphql-vs-rpc)
  - [Code First vs. API First](#code-first-vs-api-first)
  - [HTTP status codes](#http-status-codes)
  - [What does API gateway do?](#what-does-api-gateway-do)
  - [How do we design effective and safe APIs?](#how-do-we-design-effective-and-safe-apis)
  - [TCP/IP encapsulation](#tcpip-encapsulation)
  - [Why is Nginx called a “reverse” proxy?](#why-is-nginx-called-a-reverse-proxy)
  - [What are the common load-balancing algorithms?](#what-are-the-common-load-balancing-algorithms)
  - [URL, URI, URN - Do you know the differences?](#url-uri-urn---do-you-know-the-differences)
- [CI/CD](#cicd)
  - [CI/CD Pipeline Explained in Simple Terms](#cicd-pipeline-explained-in-simple-terms)
  - [Netflix Tech Stack (CI/CD Pipeline)](#netflix-tech-stack-cicd-pipeline)
- [Architecture patterns](#architecture-patterns)
  - [MVC, MVP, MVVM, MVVM-C, and VIPER](#mvc-mvp-mvvm-mvvm-c-and-viper)
  - [18 Key Design Patterns Every Developer Should Know](#18-key-design-patterns-every-developer-should-know)
- [Database](#database)
  - [A nice cheat sheet of different databases in cloud services](#a-nice-cheat-sheet-of-different-databases-in-cloud-services)
  - [8 Data Structures That Power 
Your Databases](#8-data-structures-that-power-your-databases) + - [How is an SQL statement executed in the database?](#how-is-an-sql-statement-executed-in-the-database) + - [CAP theorem](#cap-theorem) + - [Types of Memory and Storage](#types-of-memory-and-storage) + - [Visualizing a SQL query](#visualizing-a-sql-query) + - [SQL language](#sql-language) +- [Cache](#cache) + - [Data is cached everywhere](#data-is-cached-everywhere) + - [Why is Redis so fast?](#why-is-redis-so-fast) + - [How can Redis be used?](#how-can-redis-be-used) + - [Top caching strategies](#top-caching-strategies) +- [Microservice architecture](#microservice-architecture) + - [What does a typical microservice architecture look like?](#what-does-a-typical-microservice-architecture-look-like) + - [Microservice Best Practices](#microservice-best-practices) + - [What tech stack is commonly used for microservices?](#what-tech-stack-is-commonly-used-for-microservices) + - [Why is Kafka fast](#why-is-kafka-fast) +- [Payment systems](#payment-systems) + - [How to learn payment systems?](#how-to-learn-payment-systems) + - [Why is the credit card called “the most profitable product in banks”? How does VISA/Mastercard make money?](#why-is-the-credit-card-called-the-most-profitable-product-in-banks-how-does-visamastercard-make-money) + - [How does VISA work when we swipe a credit card at a merchant’s shop?](#how-does-visa-work-when-we-swipe-a-credit-card-at-a-merchants-shop) + - [Payment Systems Around The World Series (Part 1): Unified Payments Interface (UPI) in India](#payment-systems-around-the-world-series-part-1-unified-payments-interface-upi-in-india) +- [DevOps](#devops) + - [DevOps vs. SRE vs. Platform Engineering. What is the difference?](#devops-vs-sre-vs-platform-engineering-what-is-the-difference) + - [What is k8s (Kubernetes)?](#what-is-k8s-kubernetes) + - [Docker vs. Kubernetes. 
Which one should we use?](#docker-vs-kubernetes-which-one-should-we-use) + - [How does Docker work?](#how-does-docker-work) +- [GIT](#git) + - [How Git Commands work](#how-git-commands-work) + - [How does Git Work?](#how-does-git-work) + - [Git merge vs. Git rebase](#git-merge-vs-git-rebase) +- [Cloud Services](#cloud-services) + - [A nice cheat sheet of different cloud services (2023 edition)](#a-nice-cheat-sheet-of-different-cloud-services-2023-edition) + - [What is cloud native?](#what-is-cloud-native) +- [Developer productivity tools](#developer-productivity-tools) + - [Visualize JSON files](#visualize-json-files) + - [Automatically turn code into architecture diagrams](#automatically-turn-code-into-architecture-diagrams) +- [Linux](#linux) + - [Linux file system explained](#linux-file-system-explained) + - [18 Most-used Linux Commands You Should Know](#18-most-used-linux-commands-you-should-know) +- [Security](#security) + - [How does HTTPS work?](#how-does-https-work) + - [Oauth 2.0 Explained With Simple Terms.](#oauth-20-explained-with-simple-terms) + - [Top 4 Forms of Authentication Mechanisms](#top-4-forms-of-authentication-mechanisms) + - [Session, cookie, JWT, token, SSO, and OAuth 2.0 - what are they?](#session-cookie-jwt-token-sso-and-oauth-20---what-are-they) + - [How to store passwords safely in the database and how to validate a password?](#how-to-store-passwords-safely-in-the-database-and-how-to-validate-a-password) + - [Explaining JSON Web Token (JWT) to a 10 year old Kid](#explaining-json-web-token-jwt-to-a-10-year-old-kid) + - [How does Google Authenticator (or other types of 2-factor authenticators) work?](#how-does-google-authenticator-or-other-types-of-2-factor-authenticators-work) +- [Real World Case Studies](#real-world-case-studies) + - [Netflix's Tech Stack](#netflixs-tech-stack) + - [Twitter Architecture 2022](#twitter-architecture-2022) + - [Evolution of Airbnb’s microservice architecture over the past 15 
years](#evolution-of-airbnbs-microservice-architecture-over-the-past-15-years)
  - [Monorepo vs. Microrepo.](#monorepo-vs-microrepo)
  - [How will you design the Stack Overflow website?](#how-will-you-design-the-stack-overflow-website)
  - [Why did Amazon Prime Video monitoring move from serverless to monolithic? How can it save 90% cost?](#why-did-amazon-prime-video-monitoring-move-from-serverless-to-monolithic-how-can-it-save-90-cost)
  - [How does Disney Hotstar capture 5 Billion Emojis during a tournament?](#how-does-disney-hotstar-capture-5-billion-emojis-during-a-tournament)
  - [How Discord Stores Trillions Of Messages](#how-discord-stores-trillions-of-messages)
  - [How do video live streamings work on YouTube, TikTok live, or Twitch?](#how-do-video-live-streamings-work-on-youtube-tiktok-live-or-twitch)

## Communication Protocols

Architecture styles define how the different components of an application programming interface (API) interact with one another. As a result, they ensure efficiency, reliability, and ease of integration with other systems by providing a standard approach to designing and building APIs. These are the most widely used styles:

+ +

- SOAP:

    Mature, comprehensive, XML-based

    Best for enterprise applications

- RESTful:

    Popular, easy to implement, HTTP methods

    Ideal for web services

- GraphQL:

    Query language, request specific data

    Reduces network overhead, faster responses

- gRPC:

    Modern, high-performance, Protocol Buffers

    Suitable for microservices architectures

- WebSocket:

    Real-time, bidirectional, persistent connections

    Perfect for low-latency data exchange

- Webhook:

    Event-driven, HTTP callbacks, asynchronous

    Notifies systems when an event occurs


### REST API vs. GraphQL

When it comes to API design, REST and GraphQL each have their own strengths and weaknesses.

The diagram below shows a quick comparison between REST and GraphQL.

+ +

+ +REST + +- Uses standard HTTP methods like GET, POST, PUT, DELETE for CRUD operations. +- Works well when you need simple, uniform interfaces between separate services/applications. +- Caching strategies are straightforward to implement. +- The downside is it may require multiple roundtrips to assemble related data from separate endpoints. + +GraphQL + +- Provides a single endpoint for clients to query for precisely the data they need. +- Clients specify the exact fields required in nested queries, and the server returns optimized payloads containing just those fields. +- Supports Mutations for modifying data and Subscriptions for real-time notifications. +- Great for aggregating data from multiple sources and works well with rapidly evolving frontend requirements. +- However, it shifts complexity to the client side and can allow abusive queries if not properly safeguarded +- Caching strategies can be more complicated than REST. + +The best choice between REST and GraphQL depends on the specific requirements of the application and development team. GraphQL is a good fit for complex or frequently changing frontend needs, while REST suits applications where simple and consistent contracts are preferred. + +Neither API approach is a silver bullet. Carefully evaluating requirements and tradeoffs is important to pick the right style. Both REST and GraphQL are valid options for exposing data and powering modern applications. + + +### How does gRPC work? + +RPC (Remote Procedure Call) is called “**remote**” because it enables communications between remote services when services are deployed to different servers under microservice architecture. From the user’s point of view, it acts like a local function call. + +The diagram below illustrates the overall data flow for **gRPC**. + +
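To make the GraphQL point above concrete, that clients name exactly the fields they need, here is a minimal field-selection sketch in Python. It is not a real GraphQL server; the user record and field names are invented for illustration.

```python
# A toy illustration of GraphQL-style field selection: the client names
# exactly the fields it wants, and the server returns only those fields.
# All data below is invented for illustration.

FULL_USER = {  # what a REST endpoint like GET /users/42 might return in full
    "id": 42,
    "name": "Alice",
    "email": "alice@example.com",
    "address": {"city": "Lisbon", "zip": "1000-001"},
    "orders": [{"id": 1, "total": 9.99}, {"id": 2, "total": 19.99}],
}

def select_fields(data: dict, selection: dict) -> dict:
    """Return only the requested fields; nested dicts follow the nested selection."""
    result = {}
    for field, sub in selection.items():
        value = data[field]
        result[field] = select_fields(value, sub) if isinstance(sub, dict) else value
    return result

# A "query" resembling: { user { name address { city } } }
payload = select_fields(FULL_USER, {"name": True, "address": {"city": True}})
print(payload)  # only the requested fields come back
```

A REST endpoint would typically return the whole record above, which is the "multiple roundtrips vs. over-fetching" trade-off in a nutshell.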

+ +

+ +Step 1: A REST call is made from the client. The request body is usually in JSON format. + +Steps 2 - 4: The order service (gRPC client) receives the REST call, transforms it, and makes an RPC call to the payment service. gRPC encodes the **client stub** into a binary format and sends it to the low-level transport layer. + +Step 5: gRPC sends the packets over the network via HTTP2. Because of binary encoding and network optimizations, gRPC is said to be 5X faster than JSON. + +Steps 6 - 8: The payment service (gRPC server) receives the packets from the network, decodes them, and invokes the server application. + +Steps 9 - 11: The result is returned from the server application, and gets encoded and sent to the transport layer. + +Steps 12 - 14: The order service receives the packets, decodes them, and sends the result to the client application. + +### What is a webhook? + +The diagram below shows a comparison between polling and Webhook.  + +
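Part of the speed claim above comes from compact binary encoding. As a rough sketch (this is not the actual Protocol Buffers wire format, just the idea), the same payment record can be packed into a fixed binary layout and compared with its JSON form:

```python
import json
import struct

# Pack the same record as JSON text and as a fixed binary layout.
# The field layout below is invented purely to illustrate the size difference.
order_id, amount_cents, currency = 1234567, 9999, b"USD"

as_json = json.dumps(
    {"order_id": order_id, "amount_cents": amount_cents, "currency": "USD"}
).encode()

# unsigned 32-bit order id + unsigned 32-bit amount + 3-byte currency code
as_binary = struct.pack("<II3s", order_id, amount_cents, currency)

print(len(as_json), len(as_binary))  # the binary form is far smaller
```

Binary framing also avoids the cost of parsing text, which is where the rest of the gRPC speedup comes from.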

+ +

+ +Assume we run an eCommerce website. The clients send orders to the order service via the API gateway, which goes to the payment service for payment transactions. The payment service then talks to an external payment service provider (PSP) to complete the transactions.  + +There are two ways to handle communications with the external PSP.  + +**1. Short polling**  + +After sending the payment request to the PSP, the payment service keeps asking the PSP about the payment status. After several rounds, the PSP finally returns with the status.  + +Short polling has two drawbacks:  +* Constant polling of the status requires resources from the payment service.  +* The External service communicates directly with the payment service, creating security vulnerabilities.  + +**2. Webhook**  + +We can register a webhook with the external service. It means: call me back at a certain URL when you have updates on the request. When the PSP has completed the processing, it will invoke the HTTP request to update the payment status. + +In this way, the programming paradigm is changed, and the payment service doesn’t need to waste resources to poll the payment status anymore. + +What if the PSP never calls back? We can set up a housekeeping job to check payment status every hour. + +Webhooks are often referred to as reverse APIs or push APIs because the server sends HTTP requests to the client. We need to pay attention to 3 things when using a webhook: + +1. We need to design a proper API for the external service to call. +2. We need to set up proper rules in the API gateway for security reasons. +3. We need to register the correct URL at the external service. + +### How to improve API performance? + +The diagram below shows 5 common tricks to improve API performance. + +
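The webhook registration described above ("call me back at this URL") can be sketched in-process. A real PSP would POST to a registered HTTPS URL; in this toy version the "URL" is just a Python callable, and the PSP class and payload shape are invented for illustration.

```python
# In-process sketch of the webhook idea: the payment service registers a
# callback instead of polling the PSP for status.
class FakePSP:
    def __init__(self):
        self._webhooks = []

    def register_webhook(self, callback):
        self._webhooks.append(callback)

    def complete_payment(self, payment_id, status):
        # When processing finishes, the PSP notifies every registered listener.
        for hook in self._webhooks:
            hook({"payment_id": payment_id, "status": status})

received = []
psp = FakePSP()
psp.register_webhook(received.append)   # "register our callback URL"
psp.complete_payment("pay_123", "succeeded")
print(received)
```

Note how the payment service does no work between registration and the callback, which is exactly the resource saving over short polling.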

+ +

Pagination

This is a common optimization when the size of the result is large. The results are streamed back to the client to improve service responsiveness.

Asynchronous Logging

Synchronous logging deals with the disk for every call and can slow down the system. Asynchronous logging sends logs to a lock-free buffer first and returns immediately. The logs are flushed to disk periodically. This significantly reduces I/O overhead.

Caching

We can store frequently accessed data in a cache. The client queries the cache first instead of visiting the database directly. If there is a cache miss, the client queries the database. Caches like Redis store data in memory, so data access is much faster than from the database.

Payload Compression

Requests and responses can be compressed (using gzip, etc.) so that the transmitted data size is much smaller. This speeds up uploads and downloads.

Connection Pool

When accessing resources, we often need to load data from the database. Opening and closing DB connections adds significant overhead, so we should connect to the DB via a pool of open connections. The connection pool is responsible for managing the connection lifecycle.

### HTTP 1.0 -> HTTP 1.1 -> HTTP 2.0 -> HTTP 3.0 (QUIC)

What problem does each generation of HTTP solve?

The diagram below illustrates the key features.
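The payload-compression trick above is easy to demonstrate with Python's standard-library `gzip` on a repetitive JSON response (the record contents are made up for illustration):

```python
import gzip
import json

# Compress a repetitive JSON response and compare sizes.
response = json.dumps(
    [{"id": i, "status": "shipped", "warehouse": "LIS-01"} for i in range(200)]
).encode()

compressed = gzip.compress(response)
print(len(response), len(compressed))  # the compressed body is much smaller
```

In practice the client advertises `Accept-Encoding: gzip` and the server sets `Content-Encoding: gzip`; the compression itself is exactly this call.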

+ +

+ +- HTTP 1.0 was finalized and fully documented in 1996. Every request to the same server requires a separate TCP connection. + +- HTTP 1.1 was published in 1997. A TCP connection can be left open for reuse (persistent connection), but it doesn’t solve the HOL (head-of-line) blocking issue. + + HOL blocking - when the number of allowed parallel requests in the browser is used up, subsequent requests need to wait for the former ones to complete. + +- HTTP 2.0 was published in 2015. It addresses HOL issue through request multiplexing, which eliminates HOL blocking at the application layer, but HOL still exists at the transport (TCP) layer. + + As you can see in the diagram, HTTP 2.0 introduced the concept of HTTP “streams”: an abstraction that allows multiplexing different HTTP exchanges onto the same TCP connection. Each stream doesn’t need to be sent in order. + +- HTTP 3.0 first draft was published in 2020. It is the proposed successor to HTTP 2.0. It uses QUIC instead of TCP for the underlying transport protocol, thus removing HOL blocking in the transport layer. + +QUIC is based on UDP. It introduces streams as first-class citizens at the transport layer. QUIC streams share the same QUIC connection, so no additional handshakes and slow starts are required to create new ones, but QUIC streams are delivered independently such that in most cases packet loss affecting one stream doesn't affect others. + +### SOAP vs REST vs GraphQL vs RPC + +The diagram below illustrates the API timeline and API styles comparison. + +Over time, different API architectural styles are released. Each of them has its own patterns of standardizing data exchange. + +You can check out the use cases of each style in the diagram. + +

+ +

+ + +### Code First vs. API First + +The diagram below shows the differences between code-first development and API-first development. Why do we want to consider API first design? + +

+ +

- Microservices increase system complexity: we have separate services serving different functions of the system. While this kind of architecture facilitates decoupling and segregation of duty, we need to handle the various communications among services.

It is better to think through the system's complexity before writing the code and to carefully define the boundaries of the services.

- Separate functional teams need to speak the same language, and each dedicated functional team is responsible only for its own components and services. It is recommended that the organization speak the same language via API design.

We can mock requests and responses to validate the API design before writing code.

- Improved software quality and developer productivity: since we have ironed out most of the uncertainties when the project starts, the overall development process is smoother, and software quality is greatly improved.

Developers are happy about the process as well because they can focus on functional development instead of negotiating sudden changes.

The possibility of having surprises toward the end of the project lifecycle is reduced.

Because we have designed the API first, tests can be designed while the code is being developed. In a way, we also get TDD (Test-Driven Development) when using API-first development.

### HTTP status codes

+ +

The response codes for HTTP are divided into five categories:

- Informational (100-199)
- Success (200-299)
- Redirection (300-399)
- Client Error (400-499)
- Server Error (500-599)

### What does API gateway do?

The diagram below shows the details.
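Returning to the status-code categories above: the category follows directly from the first digit of the code, which a few lines of Python can make explicit (the function name is our own):

```python
# Map an HTTP status code to its category, following the five standard ranges.
def status_category(code: int) -> str:
    categories = {
        1: "Informational",
        2: "Success",
        3: "Redirection",
        4: "Client Error",
        5: "Server Error",
    }
    if code // 100 not in categories:
        raise ValueError(f"not a standard HTTP status code: {code}")
    return categories[code // 100]

print(status_category(200), status_category(404), status_category(503))
```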

+ +

Step 1 - The client sends an HTTP request to the API gateway.

Step 2 - The API gateway parses and validates the attributes in the HTTP request.

Step 3 - The API gateway performs allow-list/deny-list checks.

Step 4 - The API gateway talks to an identity provider for authentication and authorization.

Step 5 - Rate-limiting rules are applied to the request. If it is over the limit, the request is rejected.

Steps 6 and 7 - Now that the request has passed the basic checks, the API gateway finds the relevant service to route to by path matching.

Step 8 - The API gateway transforms the request into the appropriate protocol and sends it to the backend microservices.

Steps 9-12 - The API gateway can handle errors properly, and it deals with faults that take a longer time to recover from (circuit breaking). It can also leverage the ELK (Elasticsearch-Logstash-Kibana) stack for logging and monitoring. We sometimes cache data in the API gateway.

### How do we design effective and safe APIs?

The diagram below shows typical API designs with a shopping cart example.
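The gateway steps above (validation, deny-list check, rate limiting, path-based routing) can be sketched as a single pipeline. The routes, limits, and request shape below are invented for illustration; a real gateway would also do auth, protocol translation, and circuit breaking.

```python
from collections import defaultdict

# Toy gateway: each check either rejects the request or passes it along.
ROUTES = {"/orders": "order-service", "/payments": "payment-service"}
DENY_LIST = {"10.0.0.99"}
RATE_LIMIT = 3  # max requests per client in this toy example
_counts = defaultdict(int)

def handle(request: dict) -> str:
    if "path" not in request or "client_ip" not in request:
        return "400 Bad Request"            # attribute validation
    if request["client_ip"] in DENY_LIST:
        return "403 Forbidden"              # allow-list/deny-list check
    _counts[request["client_ip"]] += 1
    if _counts[request["client_ip"]] > RATE_LIMIT:
        return "429 Too Many Requests"      # rate limiting
    for prefix, service in ROUTES.items():  # path-based routing
        if request["path"].startswith(prefix):
            return f"routed to {service}"
    return "404 Not Found"

print(handle({"path": "/orders/15", "client_ip": "203.0.113.7"}))
```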

+ +

+ + +Note that API design is not just URL path design. Most of the time, we need to choose the proper resource names, identifiers, and path patterns. It is equally important to design proper HTTP header fields or to design effective rate-limiting rules within the API gateway. + +### TCP/IP encapsulation + +How is data sent over the network? Why do we need so many layers in the OSI model? + +

+ +

The diagram below shows how data is encapsulated and de-encapsulated when transmitted over the network.

Step 1: When Device A sends data to Device B over the network via the HTTP protocol, an HTTP header is first added at the application layer.

Step 2: Then a TCP or a UDP header is added to the data, which is encapsulated into TCP segments at the transport layer. The header contains the source port, destination port, and sequence number.

Step 3: The segments are then encapsulated with an IP header at the network layer. The IP header contains the source/destination IP addresses.

Step 4: The IP datagram gets a MAC header at the data link layer, with source/destination MAC addresses.

Step 5: The encapsulated frames are sent to the physical layer and transmitted over the network as binary bits.

Steps 6-10: When Device B receives the bits from the network, it performs the de-encapsulation process, which is the reverse of the encapsulation process. The headers are removed layer by layer, and eventually Device B can read the data.

We need layers in the network model because each layer focuses on its own responsibilities. Each layer can rely on the headers for processing instructions and does not need to know the meaning of the data from the previous layer.

### Why is Nginx called a “reverse” proxy?

The diagram below shows the differences between a **forward proxy** and a **reverse proxy**.
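The layer-by-layer wrapping described above can be imitated with bytes. Real TCP/IP headers carry many more fields; the 2-byte length-prefixed layout here is invented purely to show encapsulation and its reverse:

```python
import struct

# Toy encapsulation: wrap data with simplified "headers" layer by layer,
# then peel them off in reverse order on the receiving side.
def wrap(payload: bytes, header: bytes) -> bytes:
    return struct.pack("!H", len(header)) + header + payload

def unwrap(packet: bytes) -> tuple[bytes, bytes]:
    (hlen,) = struct.unpack("!H", packet[:2])
    return packet[2 : 2 + hlen], packet[2 + hlen :]   # (header, rest)

data = b"GET /index.html"
segment = wrap(data, b"TCP src=5432 dst=80")        # transport layer
datagram = wrap(segment, b"IP 10.0.0.1>10.0.0.2")   # network layer
frame = wrap(datagram, b"MAC aa:bb>cc:dd")          # data link layer

# Device B de-encapsulates in reverse order:
for expected in (b"MAC", b"IP", b"TCP"):
    header, frame = unwrap(frame)
    assert header.startswith(expected)
print(frame)  # the original application data
```

Each layer only inspects its own header and hands the rest upward, which is exactly why the layered model works.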

+ +

+ +A forward proxy is a server that sits between user devices and the internet. + +A forward proxy is commonly used for: + +1. Protecting clients +2. Circumventing browsing restrictions +3. Blocking access to certain content + +A reverse proxy is a server that accepts a request from the client, forwards the request to web servers, and returns the results to the client as if the proxy server had processed the request. + +A reverse proxy is good for: + +1. Protecting servers +2. Load balancing +3. Caching static contents +4. Encrypting and decrypting SSL communications + +### What are the common load-balancing algorithms? + +The diagram below shows 6 common algorithms. + +

+ +

- Static Algorithms

1. Round robin

    The client requests are sent to different service instances in sequential order. The services are usually required to be stateless.

2. Sticky round-robin

    This is an improvement on the round-robin algorithm. If Alice's first request goes to service A, the following requests also go to service A.

3. Weighted round-robin

    The admin can specify the weight for each service. Services with a higher weight handle more requests than others.

4. Hash

    This algorithm applies a hash function to the IP or URL of the incoming requests. The requests are routed to the relevant instances based on the hash function result.

- Dynamic Algorithms

5. Least connections

    A new request is sent to the service instance with the fewest concurrent connections.

6. Least response time

    A new request is sent to the service instance with the fastest response time.

### URL, URI, URN - Do you know the differences?

The diagram below shows a comparison of URL, URI, and URN.
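Two of the algorithms above, round robin (static) and least connections (dynamic), fit in a few lines each. The instance names and connection counts are invented for illustration:

```python
# Round robin: pick instances in sequential order; `counter` is the
# zero-based index of the incoming request.
def round_robin(instances: list, counter: int) -> str:
    return instances[counter % len(instances)]

# Least connections: pick the instance with the fewest concurrent connections.
def least_connections(active: dict) -> str:
    return min(active, key=active.get)

instances = ["service-a", "service-b", "service-c"]
print([round_robin(instances, i) for i in range(4)])   # wraps around after c
print(least_connections({"service-a": 5, "service-b": 2, "service-c": 7}))
```

Round robin needs no feedback from the backends, which is what makes it "static"; least connections needs live connection counts, which is what makes it "dynamic".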

+ +

+ +- URI + +URI stands for Uniform Resource Identifier. It identifies a logical or physical resource on the web. URL and URN are subtypes of URI. URL locates a resource, while URN names a resource. + +A URI is composed of the following parts: +scheme:[//authority]path[?query][#fragment] + +- URL + +URL stands for Uniform Resource Locator, the key concept of HTTP. It is the address of a unique resource on the web. It can be used with other protocols like FTP and JDBC. + +- URN + +URN stands for Uniform Resource Name. It uses the urn scheme. URNs cannot be used to locate a resource. A simple example given in the diagram is composed of a namespace and a namespace-specific string. + +If you would like to learn more detail on the subject, I would recommend [W3C’s clarification](https://www.w3.org/TR/uri-clarification/). + +## CI/CD + +### CI/CD Pipeline Explained in Simple Terms + +
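The `scheme:[//authority]path[?query][#fragment]` structure above can be observed directly with Python's standard-library `urllib.parse` (the example URL is made up):

```python
from urllib.parse import urlsplit

# Split a URL into the URI components: scheme://authority/path?query#fragment
parts = urlsplit("https://example.com:443/products/electronics?page=2#reviews")
print(parts.scheme)    # https
print(parts.netloc)    # example.com:443  (the authority)
print(parts.path)      # /products/electronics
print(parts.query)     # page=2
print(parts.fragment)  # reviews
```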

+ +

+ +Section 1 - SDLC with CI/CD + +The software development life cycle (SDLC) consists of several key stages: development, testing, deployment, and maintenance. CI/CD automates and integrates these stages to enable faster and more reliable releases. + +When code is pushed to a git repository, it triggers an automated build and test process. End-to-end (e2e) test cases are run to validate the code. If tests pass, the code can be automatically deployed to staging/production. If issues are found, the code is sent back to development for bug fixing. This automation provides fast feedback to developers and reduces the risk of bugs in production. + +Section 2 - Difference between CI and CD + +Continuous Integration (CI) automates the build, test, and merge process. It runs tests whenever code is committed to detect integration issues early. This encourages frequent code commits and rapid feedback. + +Continuous Delivery (CD) automates release processes like infrastructure changes and deployment. It ensures software can be released reliably at any time through automated workflows. CD may also automate the manual testing and approval steps required before production deployment. + +Section 3 - CI/CD Pipeline + +A typical CI/CD pipeline has several connected stages: +- The developer commits code changes to the source control +- CI server detects changes and triggers the build +- Code is compiled, and tested (unit, integration tests) +- Test results reported to the developer +- On success, artifacts are deployed to staging environments +- Further testing may be done on staging before release +- CD system deploys approved changes to production + +### Netflix Tech Stack (CI/CD Pipeline) + +

+ +

Planning: Netflix Engineering uses JIRA for planning and Confluence for documentation.

Coding: Java is the primary programming language for the backend services, while other languages are used for different use cases.

Build: Gradle is mainly used for building, and Gradle plugins are built to support various use cases.

Packaging: The package and its dependencies are packed into an Amazon Machine Image (AMI) for release.

Testing: Netflix's testing culture emphasizes testing in production, with a focus on building chaos engineering tools.

Deployment: Netflix uses its self-built Spinnaker for canary rollout deployments.

Monitoring: The monitoring metrics are centralized in Atlas, and Kayenta is used to detect anomalies.

Incident report: Incidents are dispatched according to priority, and PagerDuty is used for incident handling.

## Architecture patterns

### MVC, MVP, MVVM, MVVM-C, and VIPER

These architecture patterns are among the most commonly used in app development, whether on iOS or Android platforms. Developers have introduced them to overcome the limitations of earlier patterns. So, how do they differ?

+ +

+ +- MVC, the oldest pattern, dates back almost 50 years +- Every pattern has a "view" (V) responsible for displaying content and receiving user input +- Most patterns include a "model" (M) to manage business data +- "Controller," "presenter," and "view-model" are translators that mediate between the view and the model ("entity" in the VIPER pattern) + +### 18 Key Design Patterns Every Developer Should Know + +Patterns are reusable solutions to common design problems, resulting in a smoother, more efficient development process. They serve as blueprints for building better software structures. These are some of the most popular patterns: + +

+ +

+ +- Abstract Factory: Family Creator - Makes groups of related items. +- Builder: Lego Master - Builds objects step by step, keeping creation and appearance separate. +- Prototype: Clone Maker - Creates copies of fully prepared examples. +- Singleton: One and Only - A special class with just one instance. +- Adapter: Universal Plug - Connects things with different interfaces. +- Bridge: Function Connector - Links how an object works to what it does. +- Composite: Tree Builder - Forms tree-like structures of simple and complex parts. +- Decorator: Customizer - Adds features to objects without changing their core. +- Facade: One-Stop-Shop - Represents a whole system with a single, simplified interface. +- Flyweight: Space Saver - Shares small, reusable items efficiently. +- Proxy: Stand-In Actor - Represents another object, controlling access or actions. +- Chain of Responsibility: Request Relay - Passes a request through a chain of objects until handled. +- Command: Task Wrapper - Turns a request into an object, ready for action. +- Iterator: Collection Explorer - Accesses elements in a collection one by one. +- Mediator: Communication Hub - Simplifies interactions between different classes. +- Memento: Time Capsule - Captures and restores an object's state. +- Observer: News Broadcaster - Notifies classes about changes in other objects. +- Visitor: Skillful Guest - Adds new operations to a class without altering it. + +## Database + +### A nice cheat sheet of different databases in cloud services + +
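To ground one entry from the list above, here is a minimal Observer ("News Broadcaster") in Python: subscribers register with a subject and are notified on every change. The class and data are invented for illustration.

```python
# Minimal Observer pattern: the subject keeps a list of subscriber callbacks
# and notifies all of them whenever something is published.
class Newsletter:
    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, article: str):
        for notify in self._subscribers:
            notify(article)

inbox_a, inbox_b = [], []
news = Newsletter()
news.subscribe(inbox_a.append)
news.subscribe(inbox_b.append)
news.publish("System Design 101 updated")
print(inbox_a, inbox_b)
```

The subject never knows what its observers do with the notification, which is the decoupling the pattern exists to provide.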

+ +

Choosing the right database for your project is a complex task. The many database options, each suited to distinct use cases, can quickly lead to decision fatigue.

We hope this cheat sheet provides high-level direction to pinpoint the right service that aligns with your project's needs and helps you avoid potential pitfalls.

Note: Google has limited documentation for their database use cases. Even though we did our best to look at what was available and arrive at the best option, some of the entries may not be perfectly accurate.

### 8 Data Structures That Power Your Databases

The answer will vary depending on your use case. Data can be indexed in memory or on disk. Similarly, data formats vary: numbers, strings, geographic coordinates, and so on. The system might be write-heavy or read-heavy. All of these factors affect your choice of database index format.

+ +

+ +The following are some of the most popular data structures used for indexing data: + +- Skiplist: a common in-memory index type. Used in Redis +- Hash index: a very common implementation of the “Map” data structure (or “Collection”) +- SSTable: immutable on-disk “Map” implementation +- LSM tree: Skiplist + SSTable. High write throughput +- B-tree: disk-based solution. Consistent read/write performance +- Inverted index: used for document indexing. Used in Lucene +- Suffix tree: for string pattern search +- R-tree: multi-dimension search, such as finding the nearest neighbor + +### How is an SQL statement executed in the database? + +The diagram below shows the process. Note that the architectures for different databases are different, the diagram demonstrates some common designs. + +

+ +

Step 1 - A SQL statement is sent to the database via a transport layer protocol (e.g., TCP).

Step 2 - The SQL statement is sent to the command parser, where it goes through syntactic and semantic analysis, and a query tree is generated afterward.

Step 3 - The query tree is sent to the optimizer, which creates an execution plan.

Step 4 - The execution plan is sent to the executor, which retrieves data according to the plan.

Step 5 - Access methods provide the data-fetching logic required for execution, retrieving data from the storage engine.

Step 6 - Access methods decide whether the SQL statement is read-only. If the query is read-only (a SELECT statement), it is passed to the buffer manager for further processing. The buffer manager looks for the data in the cache or in the data files.

Step 7 - If the statement is an UPDATE or INSERT, it is passed to the transaction manager for further processing.

Step 8 - During a transaction, the data is in lock mode. This is guaranteed by the lock manager. It also ensures the transaction's ACID properties.

### CAP theorem

The CAP theorem is one of the most famous terms in computer science, but I bet different developers have different understandings of it. Let's examine what it is and why it can be confusing.
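The parse-plan-execute pipeline above is internal to the database, but SQLite (bundled with Python) lets us peek at the optimizer's output with `EXPLAIN QUERY PLAN`. The table and rows are invented for illustration, and the exact plan text varies by SQLite version:

```python
import sqlite3

# SQLite runs the same pipeline internally: parse -> optimize -> execute.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("Ana",), ("Bo",)])

# Ask the optimizer how it would execute the query (plan wording varies).
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM users WHERE id = 1"
).fetchall()
print(plan)

# Then actually run it through the executor.
rows = conn.execute("SELECT name FROM users WHERE id = 1").fetchall()
print(rows)
```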

+ +

+ +CAP theorem states that a distributed system can't provide more than two of these three guarantees simultaneously. + +**Consistency**: consistency means all clients see the same data at the same time no matter which node they connect to. + +**Availability**: availability means any client that requests data gets a response even if some of the nodes are down. + +**Partition Tolerance**: a partition indicates a communication break between two nodes. Partition tolerance means the system continues to operate despite network partitions. + +The “2 of 3” formulation can be useful, **but this simplification could be misleading**. + +1. Picking a database is not easy. Justifying our choice purely based on the CAP theorem is not enough. For example, companies don't choose Cassandra for chat applications simply because it is an AP system. There is a list of good characteristics that make Cassandra a desirable option for storing chat messages. We need to dig deeper. + +2. “CAP prohibits only a tiny part of the design space: perfect availability and consistency in the presence of partitions, which are rare”. Quoted from the paper: CAP Twelve Years Later: How the “Rules” Have Changed. + +3. The theorem is about 100% availability and consistency. A more realistic discussion would be the trade-offs between latency and consistency when there is no network partition. See PACELC theorem for more details. + +**Is the CAP theorem actually useful?** + +I think it is still useful as it opens our minds to a set of tradeoff discussions, but it is only part of the story. We need to dig deeper when picking the right database. + +### Types of Memory and Storage + +

+ +

+ + +### Visualizing a SQL query + +

+ +

+ +SQL statements are executed by the database system in several steps, including: + +- Parsing the SQL statement and checking its validity +- Transforming the SQL into an internal representation, such as relational algebra +- Optimizing the internal representation and creating an execution plan that utilizes index information +- Executing the plan and returning the results + +The execution of SQL is highly complex and involves many considerations, such as: + +- The use of indexes and caches +- The order of table joins +- Concurrency control +- Transaction management + +### SQL language + +In 1986, SQL (Structured Query Language) became a standard. Over the next 40 years, it became the dominant language for relational database management systems. Reading the latest standard (ANSI SQL 2016) can be time-consuming. How can I learn it? + +
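One practical way to learn is to run the main statement families against an embedded database. Below is a sketch using Python's built-in sqlite3 module (the table and data are made up; note that SQLite does not implement DCL statements such as GRANT/REVOKE, which belong to server databases):

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
cur = conn.cursor()

# DDL: define the schema
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# DML: insert rows (sqlite3 opens a transaction implicitly)
cur.execute("INSERT INTO users (name) VALUES (?)", ("Alice",))
cur.execute("INSERT INTO users (name) VALUES (?)", ("Bob",))
conn.commit()                        # TCL: COMMIT makes the inserts durable

cur.execute("UPDATE users SET name = 'Mallory'")
conn.rollback()                      # TCL: ROLLBACK discards the update

# DQL: query the data back
rows = cur.execute("SELECT name FROM users ORDER BY id").fetchall()
print(rows)                          # [('Alice',), ('Bob',)]
```

Running each family yourself like this makes the categories in the diagram much easier to remember than reading the standard cover to cover.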

+ +

There are 5 components of the SQL language:

- DDL: data definition language, such as CREATE, ALTER, DROP
- DQL: data query language, such as SELECT
- DML: data manipulation language, such as INSERT, UPDATE, DELETE
- DCL: data control language, such as GRANT, REVOKE
- TCL: transaction control language, such as COMMIT, ROLLBACK

As a backend engineer, you may need to know most of it. As a data analyst, you may only need a good understanding of DQL. Select the topics that are most relevant to you.

## Cache

### Data is cached everywhere

This diagram illustrates where we cache data in a typical architecture.

+ +

There are **multiple layers** along the flow.

1. Client apps: HTTP responses can be cached by the browser. We request data over HTTP for the first time, and it is returned with an expiry policy in the HTTP header; we request data again, and the client app tries to retrieve the data from the browser cache first.
2. CDN: CDN caches static web resources. The clients can retrieve data from a CDN node nearby.
3. Load Balancer: The load balancer can cache resources as well.
4. Messaging infra: Message brokers store messages on disk first, and then consumers retrieve them at their own pace. Depending on the retention policy, the data is cached in Kafka clusters for a period of time.
5. Services: There are multiple layers of cache in a service. If the data is not cached in the CPU cache, the service will try to retrieve the data from memory. Sometimes the service has a second-level cache to store data on disk.
6. Distributed Cache: A distributed cache like Redis holds key-value pairs for multiple services in memory. It provides much better read/write performance than the database.
7. Full-text Search: We sometimes need a full-text search engine like Elasticsearch for document search or log search. A copy of the data is indexed in the search engine as well.
8. Database: Even in the database, we have different levels of caches:

- WAL (write-ahead log): data is written to the WAL first, before the B-tree index is built
- Buffer pool: a memory area allocated to cache query results
- Materialized view: pre-computed query results stored in database tables for better query performance
- Transaction log: records all the transactions and database updates
- Replication log: records the replication state in a database cluster

### Why is Redis so fast?

There are 3 main reasons, as shown in the diagram below.

+ +

1. Redis is a RAM-based data store. RAM access is at least 1,000 times faster than random disk access.
2. Redis leverages I/O multiplexing and a single-threaded execution loop for execution efficiency.
3. Redis leverages several efficient lower-level data structures.

Question: Another popular in-memory store is Memcached. Do you know the differences between Redis and Memcached?

You might have noticed the style of this diagram is different from my previous posts. Please let me know which one you prefer.

### How can Redis be used?

+ +

There is more to Redis than just caching.

Redis can be used in a variety of scenarios, as shown in the diagram.

- Session

  We can use Redis to share user session data among different services.

- Cache

  We can use Redis to cache objects or pages, especially for hotspot data.

- Distributed lock

  We can use a Redis string to acquire locks among distributed services.

- Counter

  We can count how many likes or reads an article gets.

- Rate limiter

  We can apply a rate limiter to certain user IPs.

- Global ID generator

  We can use an auto-incrementing Redis integer (INCR) to generate global IDs.

- Shopping cart

  We can use a Redis hash to represent the key-value pairs in a shopping cart.

- Calculate user retention

  We can use a Bitmap to record daily user logins and calculate user retention.

- Message queue

  We can use a List as a message queue.

- Ranking

  We can use a ZSet (sorted set) to rank articles.

### Top caching strategies

Designing large-scale systems usually requires careful consideration of caching.
Below are five caching strategies that are frequently utilized.
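One frequently used strategy, cache-aside, fits in a few lines. In this sketch a plain dict stands in for Redis and `db` stands in for the real database; both names are placeholders:

```python
cache = {}                             # stands in for Redis
db = {"user:1": {"name": "Alice"}}     # stands in for the database

def get_user(key):
    # 1. Look in the cache first
    if key in cache:
        return cache[key]
    # 2. On a miss, read from the database...
    value = db[key]
    # 3. ...and populate the cache for the next reader
    cache[key] = value
    return value

get_user("user:1")          # miss: reads the DB and fills the cache
print(get_user("user:1"))   # hit: served from the cache
```

The application, not the cache, owns the logic here; the other strategies in the diagram move parts of this read/write logic into the cache layer itself.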

+ +

+ + + +## Microservice architecture + +### What does a typical microservice architecture look like? + +

+ +

+ + +The diagram below shows a typical microservice architecture. + +- Load Balancer: This distributes incoming traffic across multiple backend services. +- CDN (Content Delivery Network): CDN is a group of geographically distributed servers that hold static content for faster delivery. The clients look for content in CDN first, then progress to backend services. +- API Gateway: This handles incoming requests and routes them to the relevant services. It talks to the identity provider and service discovery. +- Identity Provider: This handles authentication and authorization for users. +- Service Registry & Discovery: Microservice registration and discovery happen in this component, and the API gateway looks for relevant services in this component to talk to. +- Management: This component is responsible for monitoring the services. +- Microservices: Microservices are designed and deployed in different domains. Each domain has its own database. The API gateway talks to the microservices via REST API or other protocols, and the microservices within the same domain talk to each other using RPC (Remote Procedure Call). + +Benefits of microservices: + +- They can be quickly designed, deployed, and horizontally scaled. +- Each domain can be independently maintained by a dedicated team. +- Business requirements can be customized in each domain and better supported, as a result. + +### Microservice Best Practices + +A picture is worth a thousand words: 9 best practices for developing microservices. + +

+ +

When we develop microservices, we need to follow these best practices:

1. Use separate data storage for each microservice
2. Keep code at a similar level of maturity
3. Use a separate build for each microservice
4. Assign each microservice a single responsibility
5. Deploy into containers
6. Design stateless services
7. Adopt domain-driven design
8. Design micro frontends
9. Orchestrate microservices

### What tech stack is commonly used for microservices?

Below you will find a diagram showing the microservice tech stack, both for the development phase and for production.

+ +

▶️ 𝐏𝐫𝐞-𝐏𝐫𝐨𝐝𝐮𝐜𝐭𝐢𝐨𝐧

- Define API - This establishes a contract between frontend and backend. We can use Postman or OpenAPI for this.
- Development - Node.js or React is popular for frontend development, and Java/Python/Go for backend development. We also need to change the configurations in the API gateway according to the API definitions.
- Continuous Integration - JUnit and Jenkins for automated testing. The code is packaged into a Docker image and deployed as microservices.

▶️ 𝐏𝐫𝐨𝐝𝐮𝐜𝐭𝐢𝐨𝐧

- Load Balancer and CDN - Nginx is a common choice for load balancing. Cloudflare provides the CDN (Content Delivery Network).
- API Gateway - We can use Spring Boot for the gateway, and Eureka/Zookeeper for service discovery.
- Hosting - The microservices are deployed on clouds. We have options among AWS, Microsoft Azure, and Google Cloud (GCP).
- Cache and Full-text Search - Redis is a common choice for caching key-value pairs. Elasticsearch is used for full-text search.
- Communications - For services to talk to each other, we can use messaging infrastructure like Kafka, or RPC.
- Persistence - We can use MySQL or PostgreSQL for a relational database, and Amazon S3 for an object store. We can also use Cassandra as a wide-column store if necessary.
- Management & Monitoring - To manage so many microservices, common Ops tools include Prometheus, Elastic Stack, and Kubernetes.

### Why is Kafka fast

There are many design decisions that contributed to Kafka’s performance. In this post, we’ll focus on two. We think these two carried the most weight.
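One of the two, zero copy, is exposed to applications as the sendfile() system call on Linux. A minimal sketch with Python's os.sendfile (the file names are made up, and the destination here is a regular file rather than the network socket Kafka would actually write to):

```python
import os

# Write some bytes to a "log segment" (hypothetical file name)
with open("segment.log", "wb") as f:
    f.write(b"x" * 1024)

src = os.open("segment.log", os.O_RDONLY)
dst = os.open("copy.log", os.O_WRONLY | os.O_CREAT)

# The kernel moves bytes from the page cache to the destination directly;
# the data never passes through this process's user-space buffers.
sent = os.sendfile(dst, src, 0, 1024)
os.close(src)
os.close(dst)
print(sent)   # 1024
```

Compare this single call with the read-then-write loop of a normal copy, which is exactly the extra hop the diagram's "without zero-copy" path illustrates.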

+ +

1. The first is Kafka’s reliance on sequential I/O.
2. The second design choice that gives Kafka its performance advantage is its focus on efficiency: the zero-copy principle.

The diagram illustrates how the data is transmitted between producer and consumer, and what zero-copy means.

- Steps 1.1 - 1.3: The producer writes data to the disk
- Step 2: The consumer reads data without zero-copy

2.1 The data is loaded from disk to the OS cache

2.2 The data is copied from the OS cache to the Kafka application

2.3 The Kafka application copies the data into the socket buffer

2.4 The data is copied from the socket buffer to the network card

2.5 The network card sends the data out to the consumer

- Step 3: The consumer reads data with zero-copy

3.1 The data is loaded from disk to the OS cache

3.2 The OS cache directly copies the data to the network card via the sendfile() system call

3.3 The network card sends the data out to the consumer

Zero copy is a shortcut that saves the multiple data copies between the application context and the kernel context.

## Payment systems

### How to learn payment systems?

+ +

+ +### Why is the credit card called “the most profitable product in banks”? How does VISA/Mastercard make money? + +The diagram below shows the economics of the credit card payment flow. + +
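The economics reduce to a simple subtraction. A toy calculation using the example numbers from this section ($100 purchase, $2 merchant discount fee, $1.75 interchange fee):

```python
purchase = 100.00
merchant_discount_fee = 2.00   # set by the acquiring bank with the merchant
interchange_fee = 1.75         # set by the card network, paid to the issuer

# Whatever the issuer does not take, the acquirer keeps as its markup
acquirer_markup = merchant_discount_fee - interchange_fee

merchant_receives = purchase - merchant_discount_fee
print(acquirer_markup)         # 0.25
print(merchant_receives)       # 98.0
```

On top of this, both banks pay the card network its assessments and usage fees, as described below.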

+ +

+ +1.  The cardholder pays a merchant $100 to buy a product. + +2. The merchant benefits from the use of the credit card with higher sales volume and needs to compensate the issuer and the card network for providing the payment service. The acquiring bank sets a fee with the merchant, called the “merchant discount fee.” + +3 - 4. The acquiring bank keeps $0.25 as the acquiring markup, and $1.75 is paid to the issuing bank as the interchange fee. The merchant discount fee should cover the interchange fee. + + The interchange fee is set by the card network because it is less efficient for each issuing bank to negotiate fees with each merchant. + +5.  The card network sets up the network assessments and fees with each bank, which pays the card network for its services every month. For example, VISA charges a 0.11% assessment, plus a $0.0195 usage fee, for every swipe. + +6.  The cardholder pays the issuing bank for its services. + +Why should the issuing bank be compensated? + +- The issuer pays the merchant even if the cardholder fails to pay the issuer. +- The issuer pays the merchant before the cardholder pays the issuer. +- The issuer has other operating costs, including managing customer accounts, providing statements, fraud detection, risk management, clearing & settlement, etc. + +### How does VISA work when we swipe a credit card at a merchant’s shop? + +

+ +

+ + +VISA, Mastercard, and American Express act as card networks for the clearing and settling of funds. The card acquiring bank and the card issuing bank can be – and often are – different. If banks were to settle transactions one by one without an intermediary, each bank would have to settle the transactions with all the other banks. This is quite inefficient. + +The diagram below shows VISA’s role in the credit card payment process. There are two flows involved. Authorization flow happens when the customer swipes the credit card. Capture and settlement flow happens when the merchant wants to get the money at the end of the day. + +- Authorization Flow + +Step 0: The card issuing bank issues credit cards to its customers. + +Step 1: The cardholder wants to buy a product and swipes the credit card at the Point of Sale (POS) terminal in the merchant’s shop. + +Step 2: The POS terminal sends the transaction to the acquiring bank, which has provided the POS terminal. + +Steps 3 and 4: The acquiring bank sends the transaction to the card network, also called the card scheme. The card network sends the transaction to the issuing bank for approval. + +Steps 4.1, 4.2 and 4.3: The issuing bank freezes the money if the transaction is approved. The approval or rejection is sent back to the acquirer, as well as the POS terminal. + +- Capture and Settlement Flow + +Steps 1 and 2: The merchant wants to collect the money at the end of the day, so they hit ”capture” on the POS terminal. The transactions are sent to the acquirer in batch. The acquirer sends the batch file with transactions to the card network. + +Step 3: The card network performs clearing for the transactions collected from different acquirers, and sends the clearing files to different issuing banks. + +Step 4: The issuing banks confirm the correctness of the clearing files, and transfer money to the relevant acquiring banks. + +Step 5: The acquiring bank then transfers money to the merchant’s bank. 
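Clearing nets mutually offsetting obligations between banks so that fewer transfers are needed. A toy two-bank netting calculation (the amounts are made up):

```python
# Gross obligations accumulated from the day's transactions (made-up amounts)
a_owes_b = 500.0   # bank A must pay bank B
b_owes_a = 300.0   # bank B must pay bank A

# Netting replaces two transfers with a single one
net = a_owes_b - b_owes_a
payer, payee = ("A", "B") if net > 0 else ("B", "A")
print(payer, payee, abs(net))   # A B 200.0
```

With hundreds of banks, this netting is what makes a central card network so much more efficient than pairwise settlement.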
In Step 3, clearing is a process in which mutually offsetting transactions are netted, so the total number of transactions is reduced. In the process, the card network takes on the burden of talking to each bank and receives service fees in return.

### Payment Systems Around The World Series (Part 1): Unified Payments Interface (UPI) in India

What’s UPI? UPI is an instant real-time payment system developed by the National Payments Corporation of India.

It accounts for 60% of digital retail transactions in India today.

UPI = payment markup language + standard for interoperable payments

+ +

+ + +## DevOps + +### DevOps vs. SRE vs. Platform Engineering. What is the difference? + +The concepts of DevOps, SRE, and Platform Engineering have emerged at different times and have been developed by various individuals and organizations. + +

+ +

DevOps as a concept was introduced in 2009 by Patrick Debois and Andrew Shafer at the Agile conference. They sought to bridge the gap between software development and operations by promoting a collaborative culture and shared responsibility for the entire software development lifecycle.

SRE, or Site Reliability Engineering, was pioneered by Google in the early 2000s to address operational challenges in managing large-scale, complex systems. Google developed SRE practices and tools, such as the Borg cluster management system and the Monarch monitoring system, to improve the reliability and efficiency of their services.

Platform Engineering is a more recent concept, building on the foundation of SRE. The precise origins of Platform Engineering are less clear, but it is generally understood to be an extension of DevOps and SRE practices, with a focus on delivering a comprehensive platform for product development that supports the entire business perspective.

It's worth noting that while these concepts emerged at different times, they are all related to the broader trend of improving collaboration, automation, and efficiency in software development and operations.

### What is k8s (Kubernetes)?

K8s is a container orchestration system. It is used for container deployment and management. Its design is greatly impacted by Google’s internal system Borg.
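Kubernetes is driven declaratively: you submit a YAML object to the cluster, and the control plane works to make reality match it. A minimal Deployment sketch (the name, labels, and image are placeholders, not from this repository):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app            # placeholder name
spec:
  replicas: 3               # the control plane keeps 3 pods running
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: web
          image: nginx:1.25 # placeholder image
          ports:
            - containerPort: 80
```

Submitting this to the API server is what triggers the scheduler and controllers described below: the scheduler places the three pods onto nodes, and the controller manager replaces any pod that dies.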

+ +

A k8s cluster consists of a set of worker machines, called nodes, that run containerized applications. Every cluster has at least one worker node.

The worker node(s) host the Pods that are the components of the application workload. The control plane manages the worker nodes and the Pods in the cluster. In production environments, the control plane usually runs across multiple computers, and a cluster usually runs multiple nodes, providing fault tolerance and high availability.

- Control Plane Components

1. API Server

   The API server talks to all the components in the k8s cluster. All the operations on pods are executed by talking to the API server.

2. Scheduler

   The scheduler watches for newly created pods and assigns them to nodes.

3. Controller Manager

   The controller manager runs the controllers, including the Node Controller, Job Controller, EndpointSlice Controller, and ServiceAccount Controller.

4. etcd

   etcd is a key-value store used as Kubernetes' backing store for all cluster data.

- Nodes

1. Pods

   A pod is a group of containers and is the smallest unit that k8s administers. Each pod has a single IP address that is shared by every container within it.

2. Kubelet

   An agent that runs on each node in the cluster. It ensures containers are running in a Pod.

3. Kube Proxy

   Kube-proxy is a network proxy that runs on each node in your cluster. It routes traffic coming into a node from the service and forwards requests for work to the correct containers.

### Docker vs. Kubernetes. Which one should we use?

+ +

What is Docker?

Docker is an open-source platform that allows you to package, distribute, and run applications in isolated containers. It focuses on containerization, providing lightweight environments that encapsulate applications and their dependencies.

What is Kubernetes?

Kubernetes, often referred to as K8s, is an open-source container orchestration platform. It provides a framework for automating the deployment, scaling, and management of containerized applications across a cluster of nodes.

How are the two different from each other?

Docker: Docker operates at the individual container level on a single operating system host. You must manage each host manually, and setting up networks, security policies, and storage for multiple related containers can be complex.

Kubernetes: Kubernetes operates at the cluster level. It manages multiple containerized applications across multiple hosts, providing automation for tasks like load balancing, scaling, and ensuring the desired state of applications.

In short, Docker focuses on containerization and running containers on individual hosts, while Kubernetes specializes in managing and orchestrating containers at scale across a cluster of hosts.

### How does Docker work?

The diagram below shows the architecture of Docker and how it works when we run “docker build”, “docker pull” and “docker run”.

+ +

There are 3 components in the Docker architecture:

- Docker client

  The Docker client talks to the Docker daemon.

- Docker host

  The Docker daemon listens for Docker API requests and manages Docker objects such as images, containers, networks, and volumes.

- Docker registry

  A Docker registry stores Docker images. Docker Hub is a public registry that anyone can use.

Let’s take the “docker run” command as an example.

1. Docker pulls the image from the registry.
2. Docker creates a new container.
3. Docker allocates a read-write filesystem to the container.
4. Docker creates a network interface to connect the container to the default network.
5. Docker starts the container.

## GIT

### How Git Commands work

To begin with, it's essential to identify where our code is stored. The common assumption is that there are only two locations - one on a remote server like GitHub and the other on our local machine. However, this isn't entirely accurate. Git maintains three local storage areas on our machine, which means that our code can be found in four places:

+ +

+ + +- Working directory: where we edit files +- Staging area: a temporary location where files are kept for the next commit +- Local repository: contains the code that has been committed +- Remote repository: the remote server that stores the code + +Most Git commands primarily move files between these four locations. + +### How does Git Work? + +The diagram below shows the Git workflow. + +

+ +

+ + +Git is a distributed version control system. + +Every developer maintains a local copy of the main repository and edits and commits to the local copy. + +The commit is very fast because the operation doesn’t interact with the remote repository. + +If the remote repository crashes, the files can be recovered from the local repositories. + +### Git merge vs. Git rebase + +What are the differences? + +

+ +

+ + +When we **merge changes** from one Git branch to another, we can use ‘git merge’ or ‘git rebase’. The diagram below shows how the two commands work. + +**Git merge** + +This creates a new commit G’ in the main branch. G’ ties the histories of both main and feature branches. + +Git merge is **non-destructive**. Neither the main nor the feature branch is changed. + +**Git rebase** + +Git rebase moves the feature branch histories to the head of the main branch. It creates new commits E’, F’, and G’ for each commit in the feature branch. + +The benefit of rebase is that it has a linear **commit history**. + +Rebase can be dangerous if “the golden rule of git rebase” is not followed. + +**The Golden Rule of Git Rebase** + +Never use it on public branches! + +## Cloud Services + +### A nice cheat sheet of different cloud services (2023 edition) + +

+ +

+ + +### What is cloud native? + +Below is a diagram showing the evolution of architecture and processes since the 1980s. + +

+ +

Organizations can build and run scalable applications on public, private, and hybrid clouds using cloud native technologies.

This means the applications are designed to leverage cloud features, so they are resilient to load and easy to scale.

Cloud native includes 4 aspects:

1. Development process

   This has progressed from waterfall to agile to DevOps.

2. Application architecture

   The architecture has gone from monolithic to microservices. Each service is designed to be small and adaptive to the limited resources in cloud containers.

3. Deployment & packaging

   Applications used to be deployed on physical servers. Then around 2000, applications that were not sensitive to latency were usually deployed on virtual servers. With cloud native, applications are packaged into Docker images and deployed in containers.

4. Application infrastructure

   Applications are massively deployed on cloud infrastructure instead of self-hosted servers.

## Developer productivity tools

### Visualize JSON files

Nested JSON files are hard to read.

**JsonCrack** generates graph diagrams from JSON files and makes them easy to read.

Additionally, the generated diagrams can be downloaded as images.
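The gist of such a tool can be seen in a toy version that prints nested JSON as an indented tree (JsonCrack itself renders real graph diagrams; this is only a sketch, and the sample document is made up):

```python
import json

doc = json.loads('{"user": {"name": "Alice", "pets": ["cat", "dog"]}}')

def tree(node, indent=0):
    """Flatten a parsed JSON value into indented lines, one node per line."""
    lines = []
    pad = "  " * indent
    if isinstance(node, dict):
        for key, value in node.items():
            lines.append(f"{pad}{key}")
            lines += tree(value, indent + 1)
    elif isinstance(node, list):
        for item in node:
            lines += tree(item, indent + 1)
    else:
        lines.append(f"{pad}{node}")
    return lines

print("\n".join(tree(doc)))
```

Even this crude layout makes the nesting visible at a glance, which is the core problem these visualizers solve.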

+ +

+ + +### Automatically turn code into architecture diagrams + +

+ +

+ + +What does it do? + +- Draw the cloud system architecture in Python code. +- Diagrams can also be rendered directly inside the Jupyter Notebooks. +- No design tools are needed. +- Supports the following providers: AWS, Azure, GCP, Kubernetes, Alibaba Cloud, Oracle Cloud, etc. + +[Github repo](https://github.com/mingrammer/diagrams) + +## Linux + +### Linux file system explained + +

+ +

The Linux file system used to resemble an unorganized town where individuals constructed their houses wherever they pleased. However, in 1994, the Filesystem Hierarchy Standard (FHS) was introduced to bring order to the Linux file system.

By implementing a standard like the FHS, software can ensure a consistent layout across various Linux distributions. Nonetheless, not all Linux distributions strictly adhere to this standard. They often incorporate their own unique elements or cater to specific requirements.

To become proficient in this standard, you can begin by exploring it. Use commands such as "cd" for navigation and "ls" for listing directory contents. Imagine the file system as a tree, starting from the root (/). With time, it will become second nature to you, transforming you into a skilled Linux administrator.

### 18 Most-used Linux Commands You Should Know

Linux commands are instructions for interacting with the operating system. They help manage files, directories, system processes, and many other aspects of the system. You need to become familiar with these commands in order to navigate and maintain Linux-based systems efficiently and effectively.

The diagram below shows popular Linux commands:
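A few of these commands in combination, as a quick taste (all paths are throwaway examples created and removed on the spot):

```shell
mkdir -p demo/logs                          # mkdir: create directories
echo "error: disk full" > demo/logs/app.log
cp demo/logs/app.log demo/logs/app.bak      # cp: copy a file
grep -c "disk full" demo/logs/app.log       # grep: count matching lines, prints 1
find demo -name "*.bak"                     # find: locate files by name
du -sh demo                                 # du: estimate disk usage
rm -r demo                                  # rm: clean up the scratch directory
```

Chaining small commands like this is the everyday workflow the list below supports.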

+ +

- ls - List files and directories
- cd - Change the current directory
- mkdir - Create a new directory
- rm - Remove files or directories
- cp - Copy files or directories
- mv - Move or rename files or directories
- chmod - Change file or directory permissions
- grep - Search for a pattern in files
- find - Search for files and directories
- tar - Manipulate tarball archive files
- vi - Edit files with a text editor
- cat - Display the contents of files
- top - Display processes and resource usage
- ps - Display process information
- kill - Terminate a process by sending a signal
- du - Estimate file space usage
- ifconfig - Configure network interfaces
- ping - Test network connectivity between hosts

## Security

### How does HTTPS work?

Hypertext Transfer Protocol Secure (HTTPS) is an extension of the Hypertext Transfer Protocol (HTTP). HTTPS transmits encrypted data using Transport Layer Security (TLS). If the data is hijacked online, all the hijacker gets is binary code.
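The end state of the TLS handshake is both sides holding the same session key and encrypting symmetrically in both directions. A toy illustration of that idea (this is not a real TLS cipher; the keystream is derived with HMAC, and the key and message are made up):

```python
import hashlib
import hmac

def keystream(key: bytes, n: int) -> bytes:
    """Derive n pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hmac.new(key, counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:n]

def xor_cipher(key: bytes, data: bytes) -> bytes:
    # XOR with the keystream; applying it twice restores the plaintext
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

session_key = b"negotiated-session-key"           # both sides hold this
ciphertext = xor_cipher(session_key, b"hello over TLS")
plaintext = xor_cipher(session_key, ciphertext)   # the same key decrypts
print(plaintext)                                  # b'hello over TLS'
```

Because the same key works in both directions, neither side carries the heavy asymmetric math for the bulk of the session, which is the trade-off this section explains.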

+ +

+ + +How is the data encrypted and decrypted? + +Step 1 - The client (browser) and the server establish a TCP connection. + +Step 2 - The client sends a “client hello” to the server. The message contains a set of necessary encryption algorithms (cipher suites) and the latest TLS version it can support. The server responds with a “server hello” so the browser knows whether it can support the algorithms and TLS version. + +The server then sends the SSL certificate to the client. The certificate contains the public key, host name, expiry dates, etc. The client validates the certificate. + +Step 3 - After validating the SSL certificate, the client generates a session key and encrypts it using the public key. The server receives the encrypted session key and decrypts it with the private key. + +Step 4 - Now that both the client and the server hold the same session key (symmetric encryption), the encrypted data is transmitted in a secure bi-directional channel. + +Why does HTTPS switch to symmetric encryption during data transmission? There are two main reasons: + +1. Security: The asymmetric encryption goes only one way. This means that if the server tries to send the encrypted data back to the client, anyone can decrypt the data using the public key. + +2. Server resources: The asymmetric encryption adds quite a lot of mathematical overhead. It is not suitable for data transmissions in long sessions. + +### Oauth 2.0 Explained With Simple Terms. + +OAuth 2.0 is a powerful and secure framework that allows different applications to securely interact with each other on behalf of users without sharing sensitive credentials. + +

+ +

+ +The entities involved in OAuth are the User, the Server, and the Identity Provider (IDP). + +What Can an OAuth Token Do? + +When you use OAuth, you get an OAuth token that represents your identity and permissions. This token can do a few important things: + +Single Sign-On (SSO): With an OAuth token, you can log into multiple services or apps using just one login, making life easier and safer. + +Authorization Across Systems: The OAuth token allows you to share your authorization or access rights across various systems, so you don't have to log in separately everywhere. + +Accessing User Profile: Apps with an OAuth token can access certain parts of your user profile that you allow, but they won't see everything. + +Remember, OAuth 2.0 is all about keeping you and your data safe while making your online experiences seamless and hassle-free across different applications and services. + +### Top 4 Forms of Authentication Mechanisms + +

+ +

+ +1. SSH Keys: + + Cryptographic keys are used to access remote systems and servers securely + +1. OAuth Tokens: + + Tokens that provide limited access to user data on third-party applications + +1. SSL Certificates: + + Digital certificates ensure secure and encrypted communication between servers and clients + +1. Credentials: + + User authentication information is used to verify and grant access to various systems and services + +### Session, cookie, JWT, token, SSO, and OAuth 2.0 - what are they? + +These terms are all related to user identity management. When you log into a website, you declare who you are (identification). Your identity is verified (authentication), and you are granted the necessary permissions (authorization). Many solutions have been proposed in the past, and the list keeps growing. + +

+ +

From simple to complex, here is my understanding of user identity management:

- WWW-Authenticate is the most basic method. The browser asks you for a username and password. Because it gives no control over the login life cycle, it is seldom used today.

- Session-cookie gives finer control over the login life cycle. The server maintains session storage, and the browser keeps the ID of the session. A cookie usually only works with browsers and is not mobile-app friendly.

- To address the compatibility issue, a token can be used. The client sends the token to the server, and the server validates the token. The downside is that the token needs to be encrypted and decrypted, which may be time-consuming.

- JWT is a standard way of representing tokens. The information can be verified and trusted because it is digitally signed. Since a JWT contains the signature, there is no need to save session information on the server side.

- By using SSO (single sign-on), you can sign on once and log in to multiple websites. It uses a CAS (central authentication service) to maintain cross-site information.

- By using OAuth 2.0, you can authorize one website to access your information on another website.

### How to store passwords safely in the database and how to validate a password?
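Storing a salted, slowly hashed password and validating it later can be sketched with Python's standard library. PBKDF2 is chosen here as the slow hash, and the iteration count and example password are illustrative:

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = os.urandom(16)   # unique random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest     # both are stored in the database

def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    # Recompute the hash with the stored salt...
    h1 = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    # ...and compare in constant time to avoid timing leaks
    return hmac.compare_digest(h1, stored)

salt, stored = hash_password("hunter2")
print(verify_password("hunter2", salt, stored))   # True
print(verify_password("wrong", salt, stored))     # False
```

Because every password gets its own salt, two users with the same password end up with different hashes, which is what defeats precomputed rainbow tables.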

+ +

**Things NOT to do**

- Storing passwords in plain text is not a good idea because anyone with internal access can see them.

- Storing password hashes directly is not sufficient because it is prone to precomputation attacks, such as rainbow tables.

- To mitigate precomputation attacks, we salt the passwords.

**What is salt?**

According to OWASP guidelines, “a salt is a unique, randomly generated string that is added to each password as part of the hashing process”.

**How to store a password and salt?**

1. Because the salt is unique, the hash result is unique to each password.
2. The password can be stored in the database in the format: hash(password + salt). The salt is stored alongside it.

**How to validate a password?**

To validate a password, it can go through the following process:

1. A client enters the password.
2. The system fetches the corresponding salt from the database.
3. The system appends the salt to the password and hashes it. Let’s call the hashed value H1.
4. The system compares H1 and H2, where H2 is the hash stored in the database. If they are the same, the password is valid.

### Explaining JSON Web Token (JWT) to a 10 year old Kid
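A JWT can be built by hand with nothing more than base64url encoding and an HMAC signature. A sketch using the HS256 scheme (the secret and the claims are made up):

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> bytes:
    # JWT uses URL-safe base64 without padding
    return base64.urlsafe_b64encode(data).rstrip(b"=")

secret = b"shared-secret"   # made-up signing key

header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
payload = b64url(json.dumps({"name": "Alice", "age": 10}).encode())

# The signature seals header and payload together
signing_input = header + b"." + payload
signature = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
token = b".".join([header, payload, signature]).decode()

# The receiver recomputes the signature; any tampering breaks the seal
h, p, s = token.split(".")
expected = b64url(hmac.new(secret, f"{h}.{p}".encode(), hashlib.sha256).digest())
print(hmac.compare_digest(s.encode(), expected))   # True
```

Changing a single character of the payload makes the recomputed signature disagree, which is exactly the tamper-evidence described in the story below.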

+ +

+ +Imagine you have a special box called a JWT. Inside this box, there are three parts: a header, a payload, and a signature. + +The header is like the label on the outside of the box. It tells us what type of box it is and how it's secured. It's usually written in a format called JSON, which is just a way to organize information using curly braces { } and colons : . + +The payload is like the actual message or information you want to send. It could be your name, age, or any other data you want to share. It's also written in JSON format, so it's easy to understand and work with. +Now, the signature is what makes the JWT secure. It's like a special seal that only the sender knows how to create. The signature is created using a secret code, kind of like a password. This signature ensures that nobody can tamper with the contents of the JWT without the sender knowing about it. + +When you want to send the JWT to a server, you put the header, payload, and signature inside the box. Then you send it over to the server. The server can easily read the header and payload to understand who you are and what you want to do. + +### How does Google Authenticator (or other types of 2-factor authenticators) work? + +Google Authenticator is commonly used for logging into our accounts when 2-factor authentication is enabled. How does it guarantee security? + +Google Authenticator is a software-based authenticator that implements a two-step verification service. The diagram below provides detail. + +
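The 6-digit code comes from the TOTP (Time-based One-Time Password) algorithm of RFC 6238: HMAC the current 30-second window counter with the shared secret, then truncate. A sketch (the secret below is the RFC's published test vector, not a real key):

```python
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, timestamp: int, step: int = 30, digits: int = 6) -> str:
    counter = timestamp // step                       # 30-second time window
    msg = struct.pack(">Q", counter)                  # 8-byte big-endian counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                        # dynamic truncation
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

secret = b"12345678901234567890"        # RFC 6238 test secret
print(totp(secret, 59))                 # '287082' (RFC 6238 test vector)
print(totp(secret, int(time.time())))   # a fresh code every 30 seconds
```

Because client and server run the same function over the same secret and clock, they arrive at the same code without ever sending it over the network ahead of time.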

+ +

+ +

There are two stages involved:

- Stage 1 - The user enables Google two-step verification.
- Stage 2 - The user uses the authenticator for logging in, etc.

Let’s look at these stages.

**Stage 1**

Steps 1 and 2: Bob opens the web page to enable two-step verification. The front end requests a secret key. The authentication service generates the secret key for Bob and stores it in the database.

Step 3: The authentication service returns a URI to the front end. The URI is composed of a key issuer, username, and secret key. The URI is displayed in the form of a QR code on the web page.

Step 4: Bob then uses Google Authenticator to scan the generated QR code. The secret key is stored in the authenticator.

**Stage 2**

Steps 1 and 2: Bob wants to log into a website with Google two-step verification. For this, he needs the password. Every 30 seconds, Google Authenticator generates a 6-digit password using the TOTP (Time-based One-Time Password) algorithm. Bob uses the password to enter the website.

Steps 3 and 4: The frontend sends the password Bob enters to the backend for authentication. The authentication service reads the secret key from the database and generates a 6-digit password using the same TOTP algorithm as the client.

Step 5: The authentication service compares the two passwords generated by the client and the server, and returns the comparison result to the frontend. Bob can proceed with the login process only if the two passwords match.

Is this authentication mechanism safe?

- Can the secret key be obtained by others?

  We need to make sure the secret key is transmitted using HTTPS. The authenticator client and the database store the secret key, and we need to make sure the secret keys are encrypted.

- Can the 6-digit password be guessed by hackers?

  No. The password has 6 digits, so there are 1 million potential combinations. Plus, the password changes every 30 seconds. If hackers wanted to guess the password within 30 seconds, they would need to try roughly 33,000 combinations per second (1,000,000 ÷ 30).


## Real World Case Studies

### Netflix's Tech Stack

This post is based on research from many Netflix engineering blogs and open-source projects. If you come across any inaccuracies, please feel free to inform us.

+ +

+ +

**Mobile and web**: Netflix has adopted Swift and Kotlin to build native mobile apps. For its web application, it uses React.

**Frontend/server communication**: Netflix uses GraphQL.

**Backend services**: Netflix relies on ZUUL, Eureka, the Spring Boot framework, and other technologies.

**Databases**: Netflix utilizes EVCache, Cassandra, CockroachDB, and other databases.

**Messaging/streaming**: Netflix employs Apache Kafka and Apache Flink for messaging and streaming purposes.

**Video storage**: Netflix uses S3 and Open Connect for video storage.

**Data processing**: Netflix utilizes Flink and Spark for data processing, which is then visualized using Tableau. Redshift is used for processing structured data warehouse information.

**CI/CD**: Netflix employs various tools such as JIRA, Confluence, PagerDuty, Jenkins, Gradle, Chaos Monkey, Spinnaker, Atlas, and more for CI/CD processes.

### Twitter Architecture 2022

Yes, this is the real Twitter architecture. It was posted by Elon Musk and redrawn by us for better readability.

+ +

+ + +### Evolution of Airbnb’s microservice architecture over the past 15 years + +Airbnb’s microservice architecture went through 3 main stages. + +

+ +

+ +

Monolith (2008 - 2017)

Airbnb began as a simple marketplace for hosts and guests. It was built as a Ruby on Rails application - the monolith.

What’s the challenge?

- Confusing team ownership + unowned code
- Slow deployment

Microservices (2017 - 2020)

Microservices aim to solve those challenges. In the microservice architecture, key services include:

- Data fetching service
- Business logic data service
- Write workflow service
- UI aggregation service
- Each service had one owning team

What’s the challenge?

Hundreds of services and dependencies were difficult for humans to manage.

Micro + macroservices (2020 - present)

This is what Airbnb is working on now. The micro and macroservice hybrid model focuses on the unification of APIs.

### Monorepo vs. Microrepo.

Which is the best? Why do different companies choose different options?

+ +

+ +

Monorepo isn't new; Linux and Windows were both created using Monorepo. To improve scalability and build speed, Google developed a dedicated internal toolchain to scale it faster, along with strict code-quality standards to keep it consistent.

Amazon and Netflix are major ambassadors of the Microservice philosophy. This approach naturally separates the service code into separate repositories. It scales faster but can lead to governance pain points later on.

Within Monorepo, each service is a folder, and every folder has a BUILD config and OWNERS permission control. Every service member is responsible for their own folder.

On the other hand, in Microrepo, each service has its own repository, with the build config and permissions typically set for the entire repository.

In Monorepo, dependencies are shared across the entire codebase regardless of your business, so when there's a version upgrade, every codebase upgrades its version at the same time.

In Microrepo, dependencies are controlled within each repository. Businesses choose when to upgrade their versions based on their own schedules.

Monorepo has a standard for check-ins. Google's code review process is famously known for setting a high bar, ensuring a coherent quality standard for Monorepo, regardless of the business.

Microrepo can either set its own standard or adopt a shared standard by incorporating best practices. It can scale faster for the business, but code quality can vary from repository to repository.

Google engineers built Bazel, and Meta built Buck. There are other open-source tools available, including Nx, Lerna, and others.

Over the years, Microrepo has gained broader tool support, including Maven and Gradle for Java, NPM for NodeJS, and CMake for C/C++, among others.

### How will you design the Stack Overflow website?
+ +

If your answer is on-premise servers and a monolith (at the bottom of the following image), you would likely fail the interview, but that's how it is built in reality!

+ +

+ +

**What people think it should look like**

The interviewer is probably expecting something like the top portion of the picture.

- Microservice is used to decompose the system into small components.
- Each service has its own database. Use cache heavily.
- The service is sharded.
- The services talk to each other asynchronously through message queues.
- The service is implemented using Event Sourcing with CQRS.
- Showing off knowledge in distributed systems such as eventual consistency, CAP theorem, etc.

**What it actually is**

Stack Overflow serves all the traffic with only 9 on-premise web servers, and it’s a monolith! It has its own servers and does not run on the cloud.

This is contrary to all our popular beliefs these days.

### Why did Amazon Prime Video monitoring move from serverless to monolithic? How can it save 90% cost?

The diagram below shows the architecture comparison before and after the migration.

+ +

+ +

What is Amazon Prime Video Monitoring Service?

The Prime Video service needs to monitor the quality of thousands of live streams. The monitoring tool automatically analyzes the streams in real time and identifies quality issues like block corruption, video freeze, and sync problems. This is an important process for customer satisfaction.

There are 3 steps: media converter, defect detector, and real-time notification.

- What is the problem with the old architecture?

  The old architecture was based on AWS Lambda, which was good for building services quickly. However, it was not cost-effective when running the architecture at a high scale. The two most expensive operations were:

1. The orchestration workflow - AWS Step Functions charge users by state transitions, and the orchestration performs multiple state transitions every second.

2. Data passing between distributed components - the intermediate data is stored in Amazon S3 so that the next stage can download it. The download can be costly when the volume is high.

- Monolithic architecture saves 90% cost

  A monolithic architecture is designed to address the cost issues. There are still 3 components, but the media converter and defect detector are deployed in the same process, saving the cost of passing data over the network. Surprisingly, this deployment architecture change led to 90% cost savings!

This is an interesting and unique case study because microservices have become a go-to and fashionable choice in the tech industry. It's good to see that we are having more discussions about evolving the architecture and having more honest discussions about its pros and cons. Decomposing components into distributed microservices comes with a cost.

- What did Amazon leaders say about this?

  Amazon CTO Werner Vogels: “Building **evolvable software systems** is a strategy, not a religion. And revisiting your architecture with an open mind is a must.”

Ex-Amazon VP of Sustainability Adrian Cockcroft: “The Prime Video team had followed a path I call **Serverless First**…I don’t advocate **Serverless Only**”.

### How does Disney Hotstar capture 5 Billion Emojis during a tournament?
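The heart of the pipeline the diagram describes is a small idea: bucket each emoji event into a short tumbling time window and count. Here is a toy stand-in for that aggregation step — not Hotstar's actual Spark code, just the windowing logic, with the 2-second interval mirroring the configurable one discussed below:

```python
from collections import Counter, defaultdict

def aggregate(events, window_secs=2):
    # Tumbling-window aggregation: each (timestamp, emoji) event falls into
    # exactly one window, identified by timestamp // window_secs.
    windows = defaultdict(Counter)
    for ts, emoji in events:
        windows[int(ts // window_secs)][emoji] += 1
    return {w: dict(c) for w, c in windows.items()}

counts = aggregate([(0.1, "fire"), (0.9, "fire"), (1.5, "thumbsup"), (2.3, "fire")])
# window 0 covers [0s, 2s); window 1 covers [2s, 4s)
assert counts == {0: {"fire": 2, "thumbsup": 1}, 1: {"fire": 1}}
```

Shrinking `window_secs` delivers emojis to viewers sooner but produces more output batches to compute and fan out — the same trade-off noted in step 3.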

+ +

+ +

1. Clients send emojis through standard HTTP requests. You can think of Golang Service as a typical Web Server. Golang is chosen because it supports concurrency well. Threads in Golang are lightweight.

2. Since the write volume is very high, Kafka (message queue) is used as a buffer.

3. Emoji data are aggregated by a stream processing service called Spark. It aggregates data every 2 seconds, which is configurable. There is a trade-off to be made based on the interval. A shorter interval means emojis are delivered to other clients faster, but it also means more computing resources are needed.

4. Aggregated data is written to another Kafka.

5. The PubSub consumers pull aggregated emoji data from Kafka.

6. Emojis are delivered to other clients in real-time through the PubSub infrastructure. The PubSub infrastructure is interesting. Hotstar considered the following protocols: Socketio, NATS, MQTT, and gRPC, and settled on MQTT.

A similar design is adopted by LinkedIn, which streams a million likes per second.

### How Discord Stores Trillions Of Messages

The diagram below shows the evolution of message storage at Discord:

+ +

+ +

MongoDB ➡️ Cassandra ➡️ ScyllaDB

In 2015, the first version of Discord was built on top of a single MongoDB replica. Around Nov 2015, MongoDB stored 100 million messages and the RAM couldn’t hold the data and index any longer. The latency became unpredictable. Message storage needed to be moved to another database. Cassandra was chosen.

In 2017, Discord had 12 Cassandra nodes and stored billions of messages.

At the beginning of 2022, it had 177 nodes with trillions of messages. At this point, latency was unpredictable, and maintenance operations became too expensive to run.

There are several reasons for the issue:

- Cassandra uses the LSM tree for the internal data structure. The reads are more expensive than the writes. There can be many concurrent reads on a server with hundreds of users, resulting in hotspots.
- Maintaining clusters, such as compacting SSTables, impacts performance.
- Garbage collection pauses would cause significant latency spikes.

ScyllaDB is a Cassandra-compatible database written in C++. Discord redesigned its architecture to have a monolithic API, a data service written in Rust, and ScyllaDB-based storage.

The p99 read latency in ScyllaDB is 15ms, compared to 40-125ms in Cassandra. The p99 write latency is 5ms, compared to 5-70ms in Cassandra.

### How do video live streamings work on YouTube, TikTok live, or Twitch?

Live streaming differs from regular streaming because the video content is sent via the internet in real-time, usually with a latency of just a few seconds.

The diagram below explains what happens behind the scenes to make this possible.
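One client-side piece of this machinery can be shown in code right away: given the "bitrate ladder" of renditions that the encoder produces, the player picks the best quality its measured bandwidth can sustain. The ladder values and safety margin below are invented for illustration:

```python
# A hypothetical "bitrate ladder": the same segment encoded at several
# qualities, as (height_pixels, bitrate_kbps) pairs, lowest first.
LADDER = [(240, 400), (480, 1200), (720, 2800), (1080, 5000)]

def pick_rendition(measured_kbps, safety=0.8):
    # Pick the highest-quality rendition whose bitrate fits within a safety
    # margin of the measured bandwidth; always fall back to the lowest one.
    usable = measured_kbps * safety
    best = LADDER[0]
    for rendition in LADDER:
        if rendition[1] <= usable:
            best = rendition
    return best

assert pick_rendition(7000) == (1080, 5000)   # fast connection -> 1080p
assert pick_rendition(1600) == (480, 1200)    # 1600 * 0.8 = 1280 kbps -> 480p
assert pick_rendition(300) == (240, 400)      # always serve something
```

Real players (HLS/DASH) re-run a decision like this for every few-second segment, which is why the stream must be segmented and encoded at multiple bitrates in steps 2-4 below.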

+ +

+ +

Step 1: The raw video data is captured by a microphone and camera. The data is sent to the server side.

Step 2: The video data is compressed and encoded. For example, the compression algorithm separates the background and other video elements. After compression, the video is encoded to standards such as H.264. The size of the video data is much smaller after this step.

Step 3: The encoded data is divided into smaller segments, usually seconds in length, so it takes much less time to download or stream.

Step 4: The segmented data is sent to the streaming server. The streaming server needs to support different devices and network conditions. This is called ‘Adaptive Bitrate Streaming.’ This means we need to produce multiple files at different bitrates in steps 2 and 3.

Step 5: The live streaming data is pushed to edge servers supported by a CDN (Content Delivery Network). Millions of viewers can watch the video from an edge server nearby. The CDN significantly lowers data transmission latency.

Step 6: The viewers’ devices decode and decompress the video data and play the video in a video player.

Steps 7 and 8: If the video needs to be stored for replay, the encoded data is sent to a storage server, and viewers can request a replay from it later.

Standard protocols for live streaming include:

- RTMP (Real-Time Messaging Protocol): This was originally developed by Macromedia to transmit data between a Flash player and a server. Now it is used for streaming video data over the internet. Note that video conferencing applications like Skype use RTC (Real-Time Communication) protocols for lower latency.
- HLS (HTTP Live Streaming): It requires H.264 or H.265 encoding. Apple devices accept only the HLS format.
- DASH (Dynamic Adaptive Streaming over HTTP): DASH does not support Apple devices.
- Both HLS and DASH support adaptive bitrate streaming.

## License

This work is licensed under CC BY-NC-ND 4.0

From 62d3277fcff07d6254945280b2c9c659ee6572b7 Mon Sep 17 00:00:00 2001 From: Daniel Lombardi Date: Tue, 19 Dec 2023 01:17:53 +0000 Subject: [PATCH 02/19] REST API vs. GraphQL --- translations/README-ptbr.md | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md index f98abbf..8acc938 100644 --- a/translations/README-ptbr.md +++ b/translations/README-ptbr.md @@ -129,7 +129,7 @@ Estilos de arquiteturas definem como os diferentes componentes de uma interface - GraphQL: - Linguagem de Busca, requisita dados específicos + Linguagem de Consulta, requisita dados específicos Reduz sobrecarga de rede, respostas mais rápidas @@ -164,23 +164,23 @@ O diagrama abaixo mostra uma rápida comparação entre REST e GraphQL. REST -- Uses standard HTTP methods like GET, POST, PUT, DELETE for CRUD operations. -- Works well when you need simple, uniform interfaces between separate services/applications. -- Caching strategies are straightforward to implement. -- The downside is it may require multiple roundtrips to assemble related data from separate endpoints. +- Utiliza métodos HTTP padrões como GET, POST, PUT, DELETE para operações CRUD. +- Funciona bem quando você precisa de uma interface simples e uniforme entre serviços/aplicações separadas. +- Estratégias de cache são de simples implementação. +- O lado negativo é que pode levar diversas viagens de ida-e-volta para montar os dados relacionados de endpoints separados. GraphQL -- Provides a single endpoint for clients to query for precisely the data they need. -- Clients specify the exact fields required in nested queries, and the server returns optimized payloads containing just those fields. -- Supports Mutations for modifying data and Subscriptions for real-time notifications. -- Great for aggregating data from multiple sources and works well with rapidly evolving frontend requirements. 
-- However, it shifts complexity to the client side and can allow abusive queries if not properly safeguarded
-- Caching strategies can be more complicated than REST.

+- Fornece um único endpoint para os clientes consultarem precisamente os dados de que precisam.
+- Os clientes especificam os campos exatos necessários em consultas aninhadas, e o servidor retorna cargas otimizadas contendo apenas esses campos.
+- Suporta Mutations para modificar dados e Subscriptions para notificações em tempo real.
+- Ótimo para agregar dados de diversas fontes e se adapta bem a requisitos de frontend que evoluem rapidamente.
+- No entanto, isso transfere a complexidade para o lado do cliente e pode permitir consultas abusivas se não forem devidamente protegidas.
+- Estratégias de cache podem ser mais complicadas do que em REST.

-The best choice between REST and GraphQL depends on the specific requirements of the application and development team. GraphQL is a good fit for complex or frequently changing frontend needs, while REST suits applications where simple and consistent contracts are preferred.
+A melhor escolha entre REST e GraphQL depende dos requisitos específicos da aplicação e do time de desenvolvimento. GraphQL é uma boa opção para necessidades de frontend complexas ou que mudam frequentemente, enquanto REST é adequado para aplicações onde contratos simples e consistentes são preferidos.

-Neither API approach is a silver bullet. Carefully evaluating requirements and tradeoffs is important to pick the right style. Both REST and GraphQL are valid options for exposing data and powering modern applications.
+Nenhuma abordagem dessas APIs é uma solução milagrosa. Avaliar cuidadosamente os requisitos e as compensações é importante para escolher o estilo certo. Tanto REST quanto GraphQL são opções válidas para expor dados e impulsionar aplicações modernas.

### How does gRPC work?
From f574a084efa29e3ec98438058ae3dd26d9108c81 Mon Sep 17 00:00:00 2001
From: Daniel Lombardi
Date: Tue, 19 Dec 2023 01:46:40 +0000
Subject: [PATCH 03/19] gRPC

---
 translations/README-ptbr.md | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md
index 8acc938..8477e44 100644
--- a/translations/README-ptbr.md
+++ b/translations/README-ptbr.md
@@ -26,7 +26,7 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou

- [Protocólos de Comunicação](#protocolos-de-comunicacao)
  - [REST API vs. GraphQL](#rest-api-vs-graphql)
-  - [How does gRPC work?](#how-does-grpc-work)
+  - [Como o gRPC funciona?](#como-o-grpc-funciona)
  - [What is a webhook?](#what-is-a-webhook)
@@ -183,27 +183,27 @@ A melhor escolha entre REST e GraphQL depende nos requisitos específicos da apl

Nenhuma abordagem dessas APIs é uma solução milagrosa. Avaliar cuidadosamente os requisitos e as compensações é importante para escolher o estilo certo. Tanto REST quanto GraphQL são opções válidas para expor dados e impulsionar aplicações modernas.

-### How does gRPC work?
+### Como o gRPC funciona?

-RPC (Remote Procedure Call) is called “**remote**” because it enables communications between remote services when services are deployed to different servers under microservice architecture. From the user’s point of view, it acts like a local function call.
+RPC (Chamada de Procedimento Remoto, _Remote Procedure Call_) é chamada de "**remota**" pois habilita a comunicação entre serviços remotos quando estes são implantados em servidores diferentes sob a arquitetura de microsserviços. Do ponto de vista do usuário, ela age como uma chamada de função local.

-The diagram below illustrates the overall data flow for **gRPC**.
+O diagrama abaixo ilustra o fluxo geral de dados para o **gRPC**.
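Antes do diagrama, um esboço simples da intuição por trás do Passo 5: a codificação binária com esquema fixo (como a do Protocol Buffers) tende a ser bem mais compacta do que JSON textual. O exemplo abaixo é hipotético — não é o formato real do Protobuf, apenas ilustra a diferença de tamanho usando a biblioteca padrão do Python:

```python
import json
import struct

# Um pedido hipotético: (id do pedido, id do cliente, valor em centavos)
pedido = (12345, 67890, 1999)

como_json = json.dumps(
    {"order_id": 12345, "customer_id": 67890, "amount_cents": 1999}
).encode("utf-8")

# ">IIH" = dois inteiros de 4 bytes + um inteiro de 2 bytes, big-endian:
# o "esquema" é conhecido pelos dois lados, então os nomes dos campos
# não precisam trafegar pela rede.
como_binario = struct.pack(">IIH", *pedido)

assert len(como_binario) == 10
assert len(como_binario) < len(como_json)  # a forma binária é bem mais compacta
```

Somado ao HTTP/2 (multiplexação e compressão de cabeçalhos), é essa compactação que sustenta a afirmação de desempenho citada no Passo 5.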

-Step 1: A REST call is made from the client. The request body is usually in JSON format. +Passo 1: Uma chamada REST parte do cliente. O corpo da requisição é geralmetne em formato JSON. -Steps 2 - 4: The order service (gRPC client) receives the REST call, transforms it, and makes an RPC call to the payment service. gRPC encodes the **client stub** into a binary format and sends it to the low-level transport layer. +Passo 2 - 4: O Serviço de Pedidos (_Order Service_, que é o cliente gRPC) recebe a chamada REST, a transforma e realiza uma chamada RPC para o Serviço de Pagamentos (_Payment Service_). O gRPC codifica o stub do cliente em um formato binário e o envia para a camada de transporte de baixo nível. -Step 5: gRPC sends the packets over the network via HTTP2. Because of binary encoding and network optimizations, gRPC is said to be 5X faster than JSON. +Passp 5: O gRPC envia os pacotes pela rede via HTTP2. Por conta da codificação binária e otimizações de rede, o gRPC é dito ser 5x mais rápido que JSON. -Steps 6 - 8: The payment service (gRPC server) receives the packets from the network, decodes them, and invokes the server application. +Passo 6 - 8: O Serviço de Pagamentos (_Payment Service_, servidor gRPC) recebe os pacotes da rede, os decodifica e invoca a aplicação do servidor. -Steps 9 - 11: The result is returned from the server application, and gets encoded and sent to the transport layer. +Passos 9 - 11: O resultado é retornado pela aplicação do servidor, codificado e enviado para a camada de transporte. -Steps 12 - 14: The order service receives the packets, decodes them, and sends the result to the client application. +Passos 12 - 14: O Serviço de Pedidos (_Order Service_, cliente gRPC) recebe os pacotes, os decodifica e envia o resultado para a aplicação cliente. ### What is a webhook? 
From b67e020eaa314d18de06f4d4e4755b6501843e1a Mon Sep 17 00:00:00 2001 From: Daniel Lombardi Date: Tue, 19 Dec 2023 17:33:56 -0300 Subject: [PATCH 04/19] =?UTF-8?q?Communications=20Protocol=20Done=20?= =?UTF-8?q?=E2=9C=85?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Daniel Lombardi --- translations/README-ptbr.md | 1101 +++++++++++++++++------------------ 1 file changed, 533 insertions(+), 568 deletions(-) diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md index 8477e44..e2bd4ca 100644 --- a/translations/README-ptbr.md +++ b/translations/README-ptbr.md @@ -24,92 +24,95 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou -- [Protocólos de Comunicação](#protocolos-de-comunicacao) - - [REST API vs. GraphQL](#rest-api-vs-graphql) - - [Como o gRPC funciona?](#como-o-grpc-funciona) - - [What is a webhook?](#what-is-a-webhook) - - [How to improve API performance?](#how-to-improve-api-performance) - - [HTTP 1.0 -\> HTTP 1.1 -\> HTTP 2.0 -\> HTTP 3.0 (QUIC)](#http-10---http-11---http-20---http-30-quic) - - [SOAP vs REST vs GraphQL vs RPC](#soap-vs-rest-vs-graphql-vs-rpc) - - [Code First vs. 
API First](#code-first-vs-api-first) - - [HTTP status codes](#http-status-codes) - - [What does API gateway do?](#what-does-api-gateway-do) - - [How do we design effective and safe APIs?](#how-do-we-design-effective-and-safe-apis) - - [TCP/IP encapsulation](#tcpip-encapsulation) - - [Why is Nginx called a “reverse” proxy?](#why-is-nginx-called-a-reverse-proxy) - - [What are the common load-balancing algorithms?](#what-are-the-common-load-balancing-algorithms) - - [URL, URI, URN - Do you know the differences?](#url-uri-urn---do-you-know-the-differences) -- [CI/CD](#cicd) - - [CI/CD Pipeline Explained in Simple Terms](#cicd-pipeline-explained-in-simple-terms) - - [Netflix Tech Stack (CI/CD Pipeline)](#netflix-tech-stack-cicd-pipeline) -- [Architecture patterns](#architecture-patterns) - - [MVC, MVP, MVVM, MVVM-C, and VIPER](#mvc-mvp-mvvm-mvvm-c-and-viper) - - [18 Key Design Patterns Every Developer Should Know](#18-key-design-patterns-every-developer-should-know) -- [Database](#database) - - [A nice cheat sheet of different databases in cloud services](#a-nice-cheat-sheet-of-different-databases-in-cloud-services) - - [8 Data Structures That Power Your Databases](#8-data-structures-that-power-your-databases) - - [How is an SQL statement executed in the database?](#how-is-an-sql-statement-executed-in-the-database) - - [CAP theorem](#cap-theorem) - - [Types of Memory and Storage](#types-of-memory-and-storage) - - [Visualizing a SQL query](#visualizing-a-sql-query) - - [SQL language](#sql-language) -- [Cache](#cache) - - [Data is cached everywhere](#data-is-cached-everywhere) - - [Why is Redis so fast?](#why-is-redis-so-fast) - - [How can Redis be used?](#how-can-redis-be-used) - - [Top caching strategies](#top-caching-strategies) -- [Microservice architecture](#microservice-architecture) - - [What does a typical microservice architecture look like?](#what-does-a-typical-microservice-architecture-look-like) - - [Microservice Best Practices](#microservice-best-practices) 
- - [What tech stack is commonly used for microservices?](#what-tech-stack-is-commonly-used-for-microservices) - - [Why is Kafka fast](#why-is-kafka-fast) -- [Payment systems](#payment-systems) - - [How to learn payment systems?](#how-to-learn-payment-systems) - - [Why is the credit card called “the most profitable product in banks”? How does VISA/Mastercard make money?](#why-is-the-credit-card-called-the-most-profitable-product-in-banks-how-does-visamastercard-make-money) - - [How does VISA work when we swipe a credit card at a merchant’s shop?](#how-does-visa-work-when-we-swipe-a-credit-card-at-a-merchants-shop) - - [Payment Systems Around The World Series (Part 1): Unified Payments Interface (UPI) in India](#payment-systems-around-the-world-series-part-1-unified-payments-interface-upi-in-india) -- [DevOps](#devops) - - [DevOps vs. SRE vs. Platform Engineering. What is the difference?](#devops-vs-sre-vs-platform-engineering-what-is-the-difference) - - [What is k8s (Kubernetes)?](#what-is-k8s-kubernetes) - - [Docker vs. Kubernetes. Which one should we use?](#docker-vs-kubernetes-which-one-should-we-use) - - [How does Docker work?](#how-does-docker-work) -- [GIT](#git) - - [How Git Commands work](#how-git-commands-work) - - [How does Git Work?](#how-does-git-work) - - [Git merge vs. 
Git rebase](#git-merge-vs-git-rebase) -- [Cloud Services](#cloud-services) - - [A nice cheat sheet of different cloud services (2023 edition)](#a-nice-cheat-sheet-of-different-cloud-services-2023-edition) - - [What is cloud native?](#what-is-cloud-native) -- [Developer productivity tools](#developer-productivity-tools) - - [Visualize JSON files](#visualize-json-files) - - [Automatically turn code into architecture diagrams](#automatically-turn-code-into-architecture-diagrams) -- [Linux](#linux) - - [Linux file system explained](#linux-file-system-explained) - - [18 Most-used Linux Commands You Should Know](#18-most-used-linux-commands-you-should-know) -- [Security](#security) - - [How does HTTPS work?](#how-does-https-work) - - [Oauth 2.0 Explained With Simple Terms.](#oauth-20-explained-with-simple-terms) - - [Top 4 Forms of Authentication Mechanisms](#top-4-forms-of-authentication-mechanisms) - - [Session, cookie, JWT, token, SSO, and OAuth 2.0 - what are they?](#session-cookie-jwt-token-sso-and-oauth-20---what-are-they) - - [How to store passwords safely in the database and how to validate a password?](#how-to-store-passwords-safely-in-the-database-and-how-to-validate-a-password) - - [Explaining JSON Web Token (JWT) to a 10 year old Kid](#explaining-json-web-token-jwt-to-a-10-year-old-kid) - - [How does Google Authenticator (or other types of 2-factor authenticators) work?](#how-does-google-authenticator-or-other-types-of-2-factor-authenticators-work) -- [Real World Case Studies](#real-world-case-studies) - - [Netflix's Tech Stack](#netflixs-tech-stack) - - [Twitter Architecture 2022](#twitter-architecture-2022) - - [Evolution of Airbnb’s microservice architecture over the past 15 years](#evolution-of-airbnbs-microservice-architecture-over-the-past-15-years) - - [Monorepo vs. 
Microrepo.](#monorepo-vs-microrepo) - - [How will you design the Stack Overflow website?](#how-will-you-design-the-stack-overflow-website) - - [Why did Amazon Prime Video monitoring move from serverless to monolithic? How can it save 90% cost?](#why-did-amazon-prime-video-monitoring-move-from-serverless-to-monolithic-how-can-it-save-90-cost) - - [How does Disney Hotstar capture 5 Billion Emojis during a tournament?](#how-does-disney-hotstar-capture-5-billion-emojis-during-a-tournament) - - [How Discord Stores Trillions Of Messages](#how-discord-stores-trillions-of-messages) - - [How do video live streamings work on YouTube, TikTok live, or Twitch?](#how-do-video-live-streamings-work-on-youtube-tiktok-live-or-twitch) +- [System Design 101](#system-design-101) +- [Tabela de Conteúdos](#tabela-de-conteúdos) + - [Protocólos de Comunicação](#protocólos-de-comunicação) + - [REST API vs. GraphQL](#rest-api-vs-graphql) + - [Como o gRPC funciona?](#como-o-grpc-funciona) + - [O que é um webhook?](#o-que-é-um-webhook) + - [Como melhorar a performance de uma API?](#como-melhorar-a-performance-de-uma-api) + - [HTTP 1.0 -\> HTTP 1.1 -\> HTTP 2.0 -\> HTTP 3.0 (QUIC)](#http-10---http-11---http-20---http-30-quic) + - [SOAP vs REST vs GraphQL vs RPC](#soap-vs-rest-vs-graphql-vs-rpc) + - [Code First vs. 
API First](#code-first-vs-api-first)
+    - [Códigos de status HTTP](#códigos-de-status-http)
+    - [O que faz um API gateway?](#o-que-faz-um-api-gatway)
+    - [Como projetar APIs seguras e efetivas?](#como-projetar-apis-seguras-e-efetivas)
+    - [Encapsulação TCP/IP](#encapsulação-tcpip)
+    - [Por que o Nginx é chamado de um proxy "reverso"?](#por-que-o-nginx-é-chamado-de-uma-proxy-reversa)
+    - [Quais são os algoritmos de distribuição de carga comuns?](#quais-são-os-algoritmos-de-distribuição-de-carga-comuns)
+    - [URL, URI, URN - Você sabe a diferença?](#url-uri-urn---você-sabe-a-diferênça)
+  - [CI/CD](#cicd)
+    - [CI/CD Pipeline Explained in Simple Terms](#cicd-pipeline-explained-in-simple-terms)
+    - [Netflix Tech Stack (CI/CD Pipeline)](#netflix-tech-stack-cicd-pipeline)
+  - [Architecture patterns](#architecture-patterns)
+    - [MVC, MVP, MVVM, MVVM-C, and VIPER](#mvc-mvp-mvvm-mvvm-c-and-viper)
+    - [18 Key Design Patterns Every Developer Should Know](#18-key-design-patterns-every-developer-should-know)
+  - [Database](#database)
+    - [A nice cheat sheet of different databases in cloud services](#a-nice-cheat-sheet-of-different-databases-in-cloud-services)
+    - [8 Data Structures That Power Your Databases](#8-data-structures-that-power-your-databases)
+    - [How is an SQL statement executed in the database?](#how-is-an-sql-statement-executed-in-the-database)
+    - [CAP theorem](#cap-theorem)
+    - [Types of Memory and Storage](#types-of-memory-and-storage)
+    - [Visualizing a SQL query](#visualizing-a-sql-query)
+    - [SQL language](#sql-language)
+  - [Cache](#cache)
+    - [Data is cached everywhere](#data-is-cached-everywhere)
+    - [Why is Redis so fast?](#why-is-redis-so-fast)
+    - [How can Redis be used?](#how-can-redis-be-used)
+    - [Top caching strategies](#top-caching-strategies)
+  - [Microservice architecture](#microservice-architecture)
+    - [What does a typical microservice architecture look like?](#what-does-a-typical-microservice-architecture-look-like)
+    - [Microservice Best
Practices](#microservice-best-practices) + - [What tech stack is commonly used for microservices?](#what-tech-stack-is-commonly-used-for-microservices) + - [Why is Kafka fast](#why-is-kafka-fast) + - [Payment systems](#payment-systems) + - [How to learn payment systems?](#how-to-learn-payment-systems) + - [Why is the credit card called “the most profitable product in banks”? How does VISA/Mastercard make money?](#why-is-the-credit-card-called-the-most-profitable-product-in-banks-how-does-visamastercard-make-money) + - [How does VISA work when we swipe a credit card at a merchant’s shop?](#how-does-visa-work-when-we-swipe-a-credit-card-at-a-merchants-shop) + - [Payment Systems Around The World Series (Part 1): Unified Payments Interface (UPI) in India](#payment-systems-around-the-world-series-part-1-unified-payments-interface-upi-in-india) + - [DevOps](#devops) + - [DevOps vs. SRE vs. Platform Engineering. What is the difference?](#devops-vs-sre-vs-platform-engineering-what-is-the-difference) + - [What is k8s (Kubernetes)?](#what-is-k8s-kubernetes) + - [Docker vs. Kubernetes. Which one should we use?](#docker-vs-kubernetes-which-one-should-we-use) + - [How does Docker work?](#how-does-docker-work) + - [GIT](#git) + - [How Git Commands work](#how-git-commands-work) + - [How does Git Work?](#how-does-git-work) + - [Git merge vs. 
Git rebase](#git-merge-vs-git-rebase)
- [Cloud Services](#cloud-services)
  - [A nice cheat sheet of different cloud services (2023 edition)](#a-nice-cheat-sheet-of-different-cloud-services-2023-edition)
  - [What is cloud native?](#what-is-cloud-native)
- [Developer productivity tools](#developer-productivity-tools)
  - [Visualize JSON files](#visualize-json-files)
  - [Automatically turn code into architecture diagrams](#automatically-turn-code-into-architecture-diagrams)
- [Linux](#linux)
  - [Linux file system explained](#linux-file-system-explained)
  - [18 Most-used Linux Commands You Should Know](#18-most-used-linux-commands-you-should-know)
- [Security](#security)
  - [How does HTTPS work?](#how-does-https-work)
  - [Oauth 2.0 Explained With Simple Terms.](#oauth-20-explained-with-simple-terms)
  - [Top 4 Forms of Authentication Mechanisms](#top-4-forms-of-authentication-mechanisms)
  - [Session, cookie, JWT, token, SSO, and OAuth 2.0 - what are they?](#session-cookie-jwt-token-sso-and-oauth-20---what-are-they)
  - [How to store passwords safely in the database and how to validate a password?](#how-to-store-passwords-safely-in-the-database-and-how-to-validate-a-password)
  - [Explaining JSON Web Token (JWT) to a 10 year old Kid](#explaining-json-web-token-jwt-to-a-10-year-old-kid)
  - [How does Google Authenticator (or other types of 2-factor authenticators) work?](#how-does-google-authenticator-or-other-types-of-2-factor-authenticators-work)
- [Real World Case Studies](#real-world-case-studies)
  - [Netflix's Tech Stack](#netflixs-tech-stack)
  - [Twitter Architecture 2022](#twitter-architecture-2022)
  - [Evolution of Airbnb’s microservice architecture over the past 15 years](#evolution-of-airbnbs-microservice-architecture-over-the-past-15-years)
  - [Monorepo vs.
Microrepo.](#monorepo-vs-microrepo)
  - [How will you design the Stack Overflow website?](#how-will-you-design-the-stack-overflow-website)
  - [Why did Amazon Prime Video monitoring move from serverless to monolithic? How can it save 90% cost?](#why-did-amazon-prime-video-monitoring-move-from-serverless-to-monolithic-how-can-it-save-90-cost)
  - [How does Disney Hotstar capture 5 Billion Emojis during a tournament?](#how-does-disney-hotstar-capture-5-billion-emojis-during-a-tournament)
  - [How Discord Stores Trillions Of Messages](#how-discord-stores-trillions-of-messages)
  - [How do video live streamings work on YouTube, TikTok live, or Twitch?](#how-do-video-live-streamings-work-on-youtube-tiktok-live-or-twitch)
- [License](#license)

## Protocolos de Comunicação

Estilos de arquitetura definem como os diferentes componentes de uma interface de programação de aplicações (API, _Application Programming Interface_) interagem entre si. Como resultado, eles garantem eficiência, confiabilidade e facilidade de integração com outros sistemas, proporcionando uma abordagem padrão para projetar e construir APIs. Aqui estão os estilos mais utilizados:

Notifica o sistema sobre a ocorrência de um evento

### REST API vs. GraphQL

Quando se trata de design de APIs, REST e GraphQL têm suas forças e fraquezas.

A melhor escolha entre REST e GraphQL depende dos requisitos específicos da aplicação.

Nenhuma abordagem dessas APIs é uma solução milagrosa. Avaliar cuidadosamente os requisitos e as compensações é importante para escolher o estilo certo. Tanto REST quanto GraphQL são opções válidas para expor dados e impulsionar aplicações modernas.

### Como o gRPC funciona?

RPC (Chamada de Procedimento Remoto, _Remote Procedure Call_) é chamada de "**remota**" porque habilita a comunicação entre serviços quando estes são implantados em servidores diferentes sob a arquitetura de microsserviços. Do ponto de vista do usuário, ela age como uma chamada de função local.

Passos 9 - 11: O resultado é retornado pela aplicação do servidor, codificado e enviado à camada de transporte.

Passos 12 - 14: O Serviço de Pedidos (_Order Service_, cliente gRPC) recebe os pacotes, os decodifica e envia o resultado para a aplicação cliente.

### O que é um webhook?

O diagrama abaixo mostra uma comparação entre polling e webhook.

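Para concretizar o lado "webhook" do diagrama, segue um esboço mínimo de um receptor em Python. Os nomes (`pagamentos`, `handle_psp_callback`, os campos `id_pedido` e `status`) são suposições ilustrativas, não parte do diagrama; em produção, essa função estaria exposta na URL registrada junto ao PSP.

```python
import json

# Armazenamento em memória apenas para ilustração; em produção,
# seria o banco de dados do Serviço de Pagamento.
pagamentos = {"pedido-123": "PENDENTE"}

def handle_psp_callback(corpo_http: str) -> int:
    """Trata o POST que o PSP faz na URL registrada (o webhook).

    Retorna o código de status HTTP com que responderíamos ao PSP.
    """
    try:
        evento = json.loads(corpo_http)
        id_pedido = evento["id_pedido"]
        status = evento["status"]
    except (json.JSONDecodeError, KeyError):
        return 400  # payload malformado

    if id_pedido not in pagamentos:
        return 404  # pedido desconhecido

    pagamentos[id_pedido] = status
    return 200

# O PSP "nos chama de volta" com a atualização, em vez de sermos
# nós a perguntar repetidamente (polling):
codigo = handle_psp_callback('{"id_pedido": "pedido-123", "status": "PAGO"}')
# codigo == 200 e pagamentos["pedido-123"] == "PAGO"
```

Repare que a iniciativa da comunicação se inverte: o servidor externo envia a requisição HTTP, e o nosso serviço apenas reage.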
Assuma que estamos rodando um website de eCommerce. Os clientes enviam pedidos para o Serviço de Pedidos pelo API gateway, que aciona o Serviço de Pagamento para as transações de pagamento. O Serviço de Pagamento, então, fala com um provedor de serviços de pagamento (PSP) externo para completar as transações.

Há duas formas de lidar com as comunicações com o PSP externo.

**1. Polling Curto (Short Polling)**

Depois de enviar a solicitação de pagamento ao PSP, o Serviço de Pagamento fica perguntando ao PSP sobre o status do pagamento. Depois de algumas rodadas, o PSP finalmente retorna o status.

O polling curto possui duas desvantagens:

- Realizar polling constante do status consome recursos do Serviço de Pagamento.
- O serviço externo se comunica diretamente com o Serviço de Pagamento, criando vulnerabilidades de segurança.

**2. Webhook**

Nós podemos registrar um webhook com o serviço externo. Isso significa: me avise nesta URL quando você tiver alguma atualização sobre a requisição.
Quando o PSP completar o processamento, ele irá invocar uma requisição HTTP para atualizar o status do pagamento.

Desta forma, o paradigma de programação muda, e o Serviço de Pagamento não precisa mais desperdiçar recursos fazendo polling do status de pagamento.

E se o PSP nunca nos retornar? Podemos configurar uma tarefa de manutenção para checar o status do pagamento a cada hora.

Webhooks são frequentemente chamados de APIs reversas ou push APIs, já que o servidor envia requisições HTTP para o cliente. Precisamos prestar atenção a 3 pontos ao utilizar um webhook:

1. Precisamos projetar uma API adequada para ser chamada pelo serviço externo.
2. Precisamos configurar regras adequadas no API gateway por razões de segurança.
3. Precisamos registrar a URL correta no serviço externo.

### Como melhorar a performance de uma API?

O diagrama abaixo mostra 5 truques comuns para melhorar a performance de uma API.

Paginação

Esta é uma otimização comum quando o tamanho do resultado é grande. Os resultados são transmitidos de volta para o cliente para melhorar a responsividade do serviço.

Logging Assíncrono

O logging síncrono acessa o disco a cada chamada e pode deixar o sistema lento. O logging assíncrono envia os logs primeiro para um buffer sem bloqueio (lock-free) e retorna imediatamente. Os logs são gravados no disco periodicamente. Isso reduz significativamente a sobrecarga de E/S (Entrada/Saída).

Caching

Nós podemos armazenar dados frequentemente acessados em um cache. O cliente pode consultar o cache primeiro, em vez de ir direto ao banco de dados. Se ocorrer um cache miss, o cliente consulta o banco. Caches como o Redis armazenam dados em memória, tornando o acesso aos dados muito mais rápido do que no banco de dados.

Compressão da Carga Útil

As solicitações e respostas podem ser comprimidas (usando gzip etc.) para que os dados transmitidos tenham um tamanho muito menor. Isso acelera o upload e o download.
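Um exemplo rápido da compressão de carga útil, usando o módulo `gzip` da biblioteca padrão do Python (os dados são ilustrativos):

```python
import gzip
import json

# Uma resposta JSON repetitiva, como uma lista paginada de produtos.
resposta = json.dumps(
    [{"id": i, "nome": "produto", "preco": 9.99} for i in range(1000)]
).encode("utf-8")

comprimida = gzip.compress(resposta)

# Texto repetitivo comprime muito bem; o corpo transmitido fica bem menor.
print(len(resposta), len(comprimida))
assert len(comprimida) < len(resposta)

# O cliente descomprime e recupera exatamente os mesmos bytes.
assert gzip.decompress(comprimida) == resposta
```

Em HTTP, isso corresponde ao cabeçalho `Content-Encoding: gzip`, negociado via `Accept-Encoding`.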
Pool de Conexões

Ao acessar recursos, frequentemente precisamos carregar dados do banco de dados. Abrir e fechar conexões com o banco adiciona uma sobrecarga significativa. Portanto, devemos nos conectar ao banco por meio de um pool de conexões abertas. O pool de conexões é responsável por gerenciar o ciclo de vida das conexões.

### HTTP 1.0 -> HTTP 1.1 -> HTTP 2.0 -> HTTP 3.0 (QUIC)

Quais problemas são resolvidos por cada geração do HTTP?

O diagrama abaixo ilustra as principais características.

- HTTP 1.0 foi finalizado e totalmente documentado em 1996. Cada solicitação ao mesmo servidor requer uma conexão TCP separada.

- HTTP 1.1 foi publicado em 1997. Uma conexão TCP pode ser deixada aberta para reutilização (conexão persistente), mas isso não resolve o problema do bloqueio HOL (head-of-line).

  Bloqueio HOL - quando o número de solicitações paralelas permitidas no navegador se esgota, as solicitações subsequentes precisam aguardar a conclusão das anteriores.

- HTTP 2.0 foi publicado em 2015. Ele aborda o problema HOL por meio da multiplexação de solicitações, o que elimina o bloqueio HOL na camada de aplicação, mas o HOL ainda existe na camada de transporte (TCP).

  Como você pode ver no diagrama, o HTTP 2.0 introduziu o conceito de "streams" HTTP: uma abstração que permite multiplexar diferentes trocas HTTP na mesma conexão TCP. Os streams não precisam ser enviados em ordem.
- HTTP 3.0 teve seu primeiro rascunho publicado em 2020. Ele é o sucessor proposto do HTTP 2.0. Utiliza o QUIC em vez do TCP como protocolo de transporte subjacente, removendo assim o bloqueio HOL na camada de transporte.

  O QUIC é baseado no UDP. Ele introduz streams como cidadãos de primeira classe na camada de transporte. Os streams QUIC compartilham a mesma conexão QUIC, portanto não são necessários handshakes adicionais nem slow starts para criar novos streams; além disso, os streams QUIC são entregues de forma independente, de modo que, na maioria dos casos, a perda de pacotes que afeta um stream não afeta os outros.

### SOAP vs REST vs GraphQL vs RPC

O diagrama abaixo ilustra a linha do tempo das APIs e uma comparação dos estilos de API.

Ao longo do tempo, diferentes estilos arquiteturais de API foram lançados. Cada um deles tem seus próprios padrões para padronizar a troca de dados.

Você pode conferir os casos de uso de cada estilo no diagrama.

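Para dar uma ideia concreta da diferença entre os estilos, o trecho abaixo mostra como a mesma operação — buscar um usuário — costuma ser expressa em cada um. As rotas, nomes de serviço e campos são hipotéticos, apenas para ilustração:

```python
# REST: o recurso é identificado pela URL; o verbo HTTP dá a semântica.
rest = ("GET", "/users/123")

# RPC (ex.: gRPC): chama-se um procedimento remoto pelo nome.
rpc = ("UserService.GetUser", {"id": 123})

# GraphQL: o cliente descreve exatamente os campos que quer receber.
graphql = """
query {
  user(id: 123) {
    name
    email
  }
}
"""

# SOAP: um envelope XML carrega a operação e os parâmetros.
soap = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body><GetUser><id>123</id></GetUser></soap:Body>
</soap:Envelope>"""

print(rest)
print(rpc)
```

Repare como o "contrato" muda: URL e verbo no REST, nome de procedimento no RPC, consulta declarativa no GraphQL e envelope XML no SOAP.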
### Code First vs. API First

O diagrama abaixo mostra as diferenças entre o desenvolvimento Code First e o desenvolvimento API First. Por que queremos considerar o design API First?

- Microsserviços aumentam a complexidade do sistema, e temos serviços separados para servir diferentes funções do sistema. Embora esse tipo de arquitetura facilite o desacoplamento e a segregação de responsabilidades, é necessário lidar com as diversas comunicações entre os serviços.

É melhor analisar a complexidade do sistema antes de escrever o código e definir cuidadosamente os limites dos serviços.

- Equipes funcionais separadas precisam falar a mesma linguagem, e as equipes funcionais dedicadas são responsáveis apenas por seus próprios componentes e serviços. Recomenda-se que a organização fale a mesma linguagem por meio do design de API.

Podemos simular (mock) solicitações e respostas para validar o design da API antes de escrever o código.

- Melhora a qualidade do software e a produtividade do desenvolvedor. Como resolvemos a maioria das incertezas no início do projeto, o processo de desenvolvimento geral é mais suave, e a qualidade do software é significativamente aprimorada.
Os desenvolvedores ficam satisfeitos com o processo, pois podem se concentrar no desenvolvimento funcional em vez de negociar mudanças repentinas.

A possibilidade de surpresas no final do ciclo de vida do projeto é reduzida.

Como projetamos a API primeiro, os testes podem ser planejados enquanto o código é desenvolvido. De certa forma, também temos TDD (_Test Driven Design_) ao usar o desenvolvimento API First.

### Códigos de status HTTP

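As categorias mostradas no diagrama podem ser derivadas diretamente da centena do código de status; um esboço simples:

```python
def categoria_http(codigo: int) -> str:
    """Classifica um código de status HTTP pela sua centena."""
    categorias = {
        1: "Informativo",
        2: "Sucesso",
        3: "Redirecionamento",
        4: "Erro do Cliente",
        5: "Erro do Servidor",
    }
    centena = codigo // 100
    if centena not in categorias:
        raise ValueError(f"código HTTP inválido: {codigo}")
    return categorias[centena]

print(categoria_http(200))  # Sucesso
print(categoria_http(404))  # Erro do Cliente
print(categoria_http(503))  # Erro do Servidor
```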
Os códigos de resposta do HTTP são divididos em cinco categorias:

Informativos (100-199)
Sucesso (200-299)
Redirecionamento (300-399)
Erro do Cliente (400-499)
Erro do Servidor (500-599)

### O que faz um API gateway?

O diagrama abaixo mostra os detalhes.

Passo 1 - O cliente envia uma requisição HTTP para o API gateway.

Passo 2 - O API gateway analisa e valida os atributos na requisição HTTP.

Passo 3 - O API gateway realiza verificações de lista de permissões/bloqueios (allow-list/deny-list).

Passo 4 - O API gateway se comunica com um provedor de identidade para autenticação e autorização.

Passo 5 - As regras de limitação de taxa (rate limiting) são aplicadas à requisição. Se estiver acima do limite, a requisição é rejeitada.

Passos 6 e 7 - Agora que a requisição passou pelas verificações básicas, o API gateway encontra o serviço relevante para onde rotear por meio de correspondência de caminho (path matching).

Passo 8 - O API gateway transforma a requisição no protocolo apropriado e a envia aos microsserviços de backend.

Passos 9-12: O API gateway pode lidar adequadamente com erros e tratar falhas quando o erro leva mais tempo para se recuperar (circuit break). Também pode usar a pilha ELK (Elastic-Logstash-Kibana) para logging e monitoramento. Às vezes, armazenamos dados em cache no API gateway.
### Como projetar APIs seguras e efetivas?

O diagrama abaixo mostra designs típicos de API com um exemplo de carrinho de compras.

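Como ilustração do tipo de design mostrado no diagrama, segue um esboço hipotético de rotas para o carrinho de compras — coleções como substantivos no plural e identificadores no caminho (os caminhos exatos são uma suposição, não a API do diagrama):

```python
# Esboço de design de rotas para um carrinho de compras.
rotas = [
    ("POST",   "/carts",                     "cria um carrinho"),
    ("GET",    "/carts/{cartId}",            "consulta um carrinho"),
    ("POST",   "/carts/{cartId}/items",      "adiciona um item ao carrinho"),
    ("DELETE", "/carts/{cartId}/items/{id}", "remove um item do carrinho"),
]

# Convenção: o recurso é um substantivo; a ação vem do verbo HTTP.
for verbo, caminho, descricao in rotas:
    print(f"{verbo:6} {caminho:28} # {descricao}")
```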
+Observe que o design da API não se resume apenas ao design do caminho do URL. Na maioria das vezes, precisamos escolher nomes apropriados para os recursos, identificadores e padrões de caminho. É igualmente importante projetar campos de cabeçalho HTTP adequados ou criar regras eficazes de limitação de taxa dentro do API gateway. -Note that API design is not just URL path design. Most of the time, we need to choose the proper resource names, identifiers, and path patterns. It is equally important to design proper HTTP header fields or to design effective rate-limiting rules within the API gateway. +### Encapsulação TCP/IP -### TCP/IP encapsulation +Como os dados são enviados pela rede? Por que nós precisamos de tantas camadas no modelo OSI? -How is data sent over the network? Why do we need so many layers in the OSI model? +O diagrama abaixo mostra como dados são encapsulados e desencapsulados ao trasmitidos pela rede.

Passo 1: Quando o Dispositivo A envia dados para o Dispositivo B pela rede via protocolo HTTP, primeiro é adicionado um cabeçalho HTTP na camada de aplicação.

Passo 2: Em seguida, um cabeçalho TCP ou UDP é adicionado aos dados. Eles são encapsulados em segmentos TCP na camada de transporte. O cabeçalho contém a porta de origem, a porta de destino e o número de sequência.

Passo 3: Os segmentos são então encapsulados com um cabeçalho IP na camada de rede. O cabeçalho IP contém os endereços IP de origem/destino.

Passo 4: Ao datagrama IP é adicionado um cabeçalho MAC na camada de enlace de dados, com os endereços MAC de origem/destino.

Passo 5: Os quadros encapsulados são enviados para a camada física e transmitidos pela rede em bits binários.

Passos 6-10: Quando o Dispositivo B recebe os bits da rede, ele realiza o processo de desencapsulamento, que é o processo reverso do encapsulamento.
Os cabeçalhos são removidos camada por camada e, por fim, o Dispositivo B pode ler os dados.

Precisamos de camadas no modelo de rede porque cada camada se concentra em suas próprias responsabilidades. Cada camada pode depender dos cabeçalhos para obter instruções de processamento e não precisa conhecer o significado dos dados da camada anterior.

### Por que o Nginx é chamado de um proxy "reverso"?

O diagrama abaixo mostra as diferenças entre um proxy de encaminhamento (𝐟𝐨𝐫𝐰𝐚𝐫𝐝 𝐩𝐫𝐨𝐱𝐲) e um proxy reverso (𝐫𝐞𝐯𝐞𝐫𝐬𝐞 𝐩𝐫𝐨𝐱𝐲).

Um proxy de encaminhamento é um servidor que fica entre os dispositivos dos usuários e a internet.

Um proxy de encaminhamento é comumente utilizado para:

1. Proteger clientes
2. Contornar restrições de navegação
3. Bloquear acesso a conteúdos específicos

Um proxy reverso é um servidor que aceita uma requisição do cliente, encaminha a requisição aos servidores web e retorna os resultados ao cliente como se o próprio servidor proxy tivesse processado a requisição.

Um proxy reverso é bom para:

1. Proteger servidores
2. Distribuição de carga (load balancing)
3. Cachear conteúdo estático
4. Encriptar e decriptar comunicações SSL

### Quais são os algoritmos de distribuição de carga comuns?

O diagrama abaixo mostra 6 algoritmos comuns.

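Dois dos algoritmos do diagrama cabem em poucas linhas de Python; as instâncias e os nomes abaixo são hipotéticos, apenas para ilustrar o comportamento:

```python
import hashlib
from itertools import cycle

instancias = ["servico-A", "servico-B", "servico-C"]

# Round robin: as requisições são distribuídas em ordem sequencial.
rodizio = cycle(instancias)
ordem = [next(rodizio) for _ in range(4)]
print(ordem)  # ['servico-A', 'servico-B', 'servico-C', 'servico-A']

# Hash: a mesma chave (IP ou URL da requisição) cai sempre na mesma instância.
def escolher_por_hash(chave: str) -> str:
    digest = hashlib.sha256(chave.encode("utf-8")).digest()
    return instancias[digest[0] % len(instancias)]

# Determinístico: a mesma chave sempre resulta na mesma instância.
assert escolher_por_hash("10.0.0.1") == escolher_por_hash("10.0.0.1")
```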
- Algoritmos estáticos

1. Round robin

   As requisições dos clientes são enviadas para diferentes instâncias de serviço em ordem sequencial. Geralmente, exige-se que os serviços sejam stateless.

2. Round robin pegajoso (_sticky_)

   Esta é uma melhoria do algoritmo round robin. Se a primeira requisição de Alice vai para o serviço A, as requisições seguintes também vão para o serviço A.

3. Round robin ponderado (_weighted_)

   O administrador pode especificar um peso para cada serviço. Os que tiverem um peso maior lidam com mais requisições que os outros.

4. Hash

   Este algoritmo aplica uma função hash (de dispersão) no IP ou na URL das requisições recebidas. As requisições são roteadas para as instâncias relevantes com base no resultado da função hash.

- Algoritmos dinâmicos

5. Menos conexões (_least connections_)

   Uma nova requisição é enviada para a instância de serviço com o menor número de conexões simultâneas.

6. Menor tempo de resposta (_least response time_)

   Uma nova requisição é enviada para a instância de serviço com o tempo de resposta mais rápido.

### URL, URI, URN - Você sabe a diferença?
O diagrama abaixo mostra uma comparação de URL, URI e URN.

- URI

URI significa Identificador Uniforme de Recurso (_Uniform Resource Identifier_). Ele identifica um recurso lógico ou físico na web. URL e URN são subtipos de URI. A URL localiza um recurso, enquanto o URN nomeia um recurso.

Um URI é composto das seguintes partes:
scheme:[//authority]path[?query][#fragment]

- URL

URL significa Localizador Uniforme de Recurso (_Uniform Resource Locator_), o conceito-chave do HTTP. É o endereço de um recurso único na web. Pode ser usado com outros protocolos, como FTP e JDBC.

- URN

URN significa Nome Uniforme de Recurso (_Uniform Resource Name_). Ele usa o esquema urn. URNs não podem ser usados para localizar um recurso. Um exemplo simples, fornecido no diagrama, é composto por um namespace e uma string específica do namespace.

Se você deseja obter mais detalhes sobre o assunto, eu recomendaria a [explicação do W3C](https://www.w3.org/TR/uri-clarification/).
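As partes de um URI descritas acima podem ser inspecionadas com o módulo `urllib.parse` da biblioteca padrão do Python (a URL do exemplo é hipotética):

```python
from urllib.parse import urlparse

uri = "https://example.com:8080/caminho/recurso?chave=valor#secao"
partes = urlparse(uri)

print(partes.scheme)    # scheme    -> https
print(partes.netloc)    # authority -> example.com:8080
print(partes.path)      # path      -> /caminho/recurso
print(partes.query)     # query     -> chave=valor
print(partes.fragment)  # fragment  -> secao
```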
## CI/CD

Continuous Delivery (CD) automates release processes like infrastructure changes

Section 3 - CI/CD Pipeline

A typical CI/CD pipeline has several connected stages:

- The developer commits code changes to the source control
- CI server detects changes and triggers the build
- Code is compiled, and tested (unit, integration tests)

Planning: Netflix Engineering uses JIRA for planning and Confluence for documentation.

Coding: Java is the primary programming language for the backend service, while other languages are used for different use cases.

Build: Gradle is mainly used for building, and Gradle plugins are built to support various use cases.

Packaging: Package and dependencies are packed into an Amazon Machine Image (AMI) for release.

Testing: Testing emphasizes the production culture's focus on building chaos tools.

Deployment: Netflix uses its self-built Spinnaker for canary rollout deployment.

Monitoring: The monitoring metrics are centralized in Atlas, and Kayenta is used to detect anomalies.

Incident report: Incidents are dispatched according to priority, and PagerDuty is used for incident handling.

## Architecture patterns

### MVC, MVP, MVVM, MVVM-C, and VIPER
These architecture patterns are among the most commonly used in app development, whether on iOS or Android platforms. Developers have introduced them to overcome the limitations of earlier patterns. So, how do they differ?

- MVC, the oldest pattern, dates back almost 50 years
- Every pattern has a "view" (V) responsible for displaying content and receiving user input
- Most patterns include a "model" (M) to manage business data
- "Controller," "presenter," and "view-model" are translators that mediate between the view and the model (the "entity" in the VIPER pattern)

### 18 Key Design Patterns Every Developer Should Know

Patterns are reusable solutions to common design problems, resulting in a smoother, more efficient development process. They serve as blueprints for building better software structures. These are some of the most popular patterns:

- Abstract Factory: Family Creator - Makes groups of related items.
- Builder: Lego Master - Builds objects step by step, keeping creation and appearance separate.
- Prototype: Clone Maker - Creates copies of fully prepared examples.
- Singleton: One and Only - A special class with just one instance.
- Adapter: Universal Plug - Connects things with different interfaces.
- Bridge: Function Connector - Links how an object works to what it does.
- Composite: Tree Builder - Forms tree-like structures of simple and complex parts.
- Decorator: Customizer - Adds features to objects without changing their core.
- Facade: One-Stop Shop - Represents a whole system with a single, simplified interface.
- Flyweight: Space Saver - Shares small, reusable items efficiently.
- Proxy: Stand-In Actor - Represents another object, controlling access or actions.
- Chain of Responsibility: Request Relay - Passes a request through a chain of objects until handled.
- Command: Task Wrapper - Turns a request into an object, ready for action.
- Iterator: Collection Explorer - Accesses elements in a collection one by one.
- Mediator: Communication Hub - Simplifies interactions between different classes.
- Memento: Time Capsule - Captures and restores an object's state.
- Observer: News Broadcaster - Notifies classes about changes in other objects.
- Visitor: Skillful Guest - Adds new operations to a class without altering it.

## Database

### A nice cheat sheet of different databases in cloud services

Choosing the right database for your project is a complex task. The many database options, each suited to distinct use cases, can quickly lead to decision fatigue.

We hope this cheat sheet provides high-level direction to pinpoint the right service that aligns with your project's needs and avoid potential pitfalls.

Note: Google has limited documentation for their database use cases. Even though we did our best to review what was available and arrive at the best option, some of the entries may not be entirely accurate.

### 8 Data Structures That Power Your Databases

The answer will vary depending on your use case. Data can be indexed in memory or on disk. Similarly, data formats vary: numbers, strings, geographic coordinates, and so on. The system might be write-heavy or read-heavy. All of these factors affect your choice of database index format.
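To make the point-lookup vs. range-scan tradeoff concrete, here is a toy sketch (hypothetical classes, not any real engine's code) of the two simplest index shapes: a hash map, which answers exact-key queries in O(1) but cannot scan a key range, and a sorted structure in the spirit of an SSTable or B-tree leaf level, which pays O(log n) per lookup but supports range queries:

```python
import bisect

class HashIndex:
    """Point lookups only: great for `key = value` queries."""
    def __init__(self):
        self._map = {}

    def put(self, key, value):
        self._map[key] = value

    def get(self, key):
        return self._map.get(key)  # O(1) average, but no ordered scan

class SortedIndex:
    """Keeps keys ordered (like an SSTable or B-tree leaf level),
    so it can also answer range queries."""
    def __init__(self):
        self._keys, self._vals = [], []

    def put(self, key, value):
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._vals[i] = value            # overwrite existing key
        else:
            self._keys.insert(i, key)        # keep keys sorted
            self._vals.insert(i, value)

    def get(self, key):
        i = bisect.bisect_left(self._keys, key)   # O(log n) search
        if i < len(self._keys) and self._keys[i] == key:
            return self._vals[i]

    def range(self, lo, hi):
        """All (key, value) pairs with lo <= key <= hi."""
        i = bisect.bisect_left(self._keys, lo)
        j = bisect.bisect_right(self._keys, hi)
        return list(zip(self._keys[i:j], self._vals[i:j]))
```

A hash index cannot implement `range` without scanning every key, which is exactly why write-heavy stores layer sorted structures (LSM trees) on top and read-mostly stores favor B-trees.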

The following are some of the most popular data structures used for indexing data:

- Skiplist: a common in-memory index type. Used in Redis
- Hash index: a very common implementation of the "Map" data structure (or "Collection")
- SSTable: immutable on-disk "Map" implementation
- LSM tree: Skiplist + SSTable. High write throughput
- B-tree: disk-based solution. Consistent read/write performance
- Inverted index: used for document indexing. Used in Lucene
- Suffix tree: for string pattern search
- R-tree: multi-dimensional search, such as finding the nearest neighbor

### How is an SQL statement executed in the database?

The diagram below shows the process. Note that the architectures for different databases differ; the diagram demonstrates some common designs.

Step 1 - A SQL statement is sent to the database via a transport layer protocol (e.g. TCP).

Step 2 - The SQL statement is sent to the command parser, where it goes through syntactic and semantic analysis, and a query tree is generated afterward.

Step 3 - The query tree is sent to the optimizer. The optimizer creates an execution plan.

Step 4 - The execution plan is sent to the executor. The executor retrieves data by carrying out the plan.

Step 5 - Access methods provide the data-fetching logic required for execution, retrieving data from the storage engine.

Step 6 - Access methods decide whether the SQL statement is read-only. If the query is read-only (a SELECT statement), it is passed to the buffer manager for further processing. The buffer manager looks for the data in the cache or data files.

Step 7 - If the statement is an UPDATE or INSERT, it is passed to the transaction manager for further processing.

Step 8 - During a transaction, the data is in lock mode. This is guaranteed by the lock manager. It also ensures the transaction's ACID properties.

### CAP theorem

The CAP theorem is one of the most famous terms in computer science, but I bet different developers have different understandings. Let's examine what it is and why it can be confusing.

CAP theorem states that a distributed system can't provide more than two of these three guarantees simultaneously.

**Consistency**: consistency means all clients see the same data at the same time, no matter which node they connect to.

**Availability**: availability means any client that requests data gets a response, even if some of the nodes are down.

**Partition Tolerance**: a partition indicates a communication break between two nodes. Partition tolerance means the system continues to operate despite network partitions.

The "2 of 3" formulation can be useful, **but this simplification could be misleading**.

I think it is still useful as it opens our minds to a set of tradeoff discussions.
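The tradeoff can be illustrated with a toy two-replica store (a sketch with made-up class names, not a real system): during a partition, a CP design rejects requests it cannot confirm on both replicas, while an AP design keeps answering, possibly with stale data.

```python
class Replica:
    def __init__(self):
        self.value = None

class TwoNodeStore:
    """Toy store: 'cp' mode refuses reads/writes during a partition,
    'ap' mode serves whatever the local replica has."""
    def __init__(self, mode):
        self.mode = mode                      # "cp" or "ap"
        self.a, self.b = Replica(), Replica()
        self.partitioned = False

    def write(self, value):
        if self.partitioned and self.mode == "cp":
            raise RuntimeError("unavailable: cannot replicate to node B")
        self.a.value = value                  # local replica always written
        if not self.partitioned:
            self.b.value = value              # replication only when connected

    def read_from_b(self):
        if self.partitioned and self.mode == "cp":
            raise RuntimeError("unavailable: cannot confirm freshness")
        return self.b.value                   # may be stale in "ap" mode
```

With `mode="ap"` a write during a partition is accepted and a later read from the other replica returns the old value; with `mode="cp"` the same operations raise instead of returning possibly-stale data. That is the whole theorem in miniature: once the partition exists, you choose which property to give up.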

### Visualizing a SQL query

SQL statements are executed by the database system in several steps, including:

- Parsing the SQL statement and checking its validity
- Transforming the SQL into an internal representation, such as relational algebra
- Optimizing the internal representation and creating an execution plan that utilizes index information
- Executing the plan and returning the results

The execution of SQL is highly complex and involves many considerations, such as:

- The use of indexes and caches
- The order of table joins
- Concurrency control
- Transaction management

### SQL language

In 1986, SQL (Structured Query Language) became a standard. Over the next 40 years, it became the dominant language for relational database management systems. Reading the latest standard (ANSI SQL 2016) can be time-consuming. How can I learn it?

There are 5 components of the SQL language:

- DDL: data definition language, such as CREATE, ALTER, DROP
- DQL: data query language, such as SELECT
- DML: data manipulation language, such as INSERT, UPDATE, DELETE
- DCL: data control language, such as GRANT, REVOKE
- TCL: transaction control language, such as COMMIT, ROLLBACK

As a backend engineer, you may need to know most of it. As a data analyst, you may need a good understanding of DQL. Select the topics that are most relevant to you.

## Cache

This diagram illustrates where we cache data in a typical architecture.

There are **multiple layers** along the flow.

1. Client apps: HTTP responses can be cached by the browser. We request data over HTTP for the first time, and it is returned with an expiry policy in the HTTP header; when we request the data again, the client app tries to retrieve it from the browser cache first.
2. CDN: The CDN caches static web resources. Clients can retrieve data from a nearby CDN node.
3. Load Balancer: The load balancer can cache resources as well.
4. Messaging infra: Message brokers store messages on disk first, and consumers retrieve them at their own pace.
5. Services: There are multiple layers of cache in a service. If the data is not cached in the CPU cache, the service will try to retrieve it from memory. Sometimes the service has a second-level cache to store data on disk.
6. Distributed Cache: A distributed cache like Redis holds key-value pairs for multiple services in memory. It provides much better read/write performance than the database.
7. Full-text Search: we sometimes need full-text search engines like Elasticsearch for document search or log search. A copy of the data is indexed in the search engine as well.
8. Database: Even in the database, we have different levels of caches:

   - WAL (Write-ahead Log): data is written to the WAL first, before building the B-tree index
   - Bufferpool: a memory area allocated to cache query results
   - Materialized View: pre-compute query results and store them in database tables for better query performance
   - Transaction log: records all the transactions and database updates
   - Replication Log: used to record the replication state in a database cluster

### Why is Redis so fast?

There are 3 main reasons as shown in the diagram below.

1. Redis is a RAM-based data store. RAM access is at least 1000 times faster than random disk access.
2. Redis leverages IO multiplexing and a single-threaded execution loop for execution efficiency.
3. Redis leverages several efficient lower-level data structures.

You might have noticed the style of this diagram is different from my previous posts.

There is more to Redis than just caching.

Redis can be used in a variety of scenarios, as shown in the diagram.

- Session

  We can use Redis to share user session data among different services.

- Cache

  We can use Redis to cache objects or pages, especially for hotspot data.

- Distributed lock

  We can use a Redis string to acquire locks among distributed services.

- Counter

  We can count how many likes or how many reads an article has.

- Rate limiter

  We can apply a rate limiter for certain user IPs.

- Global ID generator

  We can use a Redis Int for global IDs.

- Shopping cart

  We can use a Redis Hash to represent the key-value pairs in a shopping cart.

- Calculate user retention

  We can use Bitmap to represent daily user logins and calculate user retention.

- Message queue

  We can use List for a message queue.

- Ranking

  We can use ZSet to sort the articles.

### Top caching strategies

Designing large-scale systems usually requires careful consideration of caching. Below are five caching strategies that are frequently utilized.
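The most common of these, cache-aside, looks roughly like this in application code (a Python sketch in which a plain dict stands in for Redis, and `load_from_db` is a hypothetical loader): the application checks the cache first, falls back to the database on a miss, and then populates the cache.

```python
import time

cache = {}           # stands in for Redis: key -> (value, expires_at)
TTL_SECONDS = 60

def load_from_db(key):
    """Hypothetical database call; replace with a real query."""
    return f"row-for-{key}"

def get(key):
    entry = cache.get(key)
    if entry is not None:
        value, expires_at = entry
        if time.monotonic() < expires_at:
            return value                  # cache hit
        del cache[key]                    # expired: treat as a miss
    value = load_from_db(key)             # cache miss: read the database
    cache[key] = (value, time.monotonic() + TTL_SECONDS)
    return value
```

Invalidation on writes is the hard part of cache-aside: a common choice is to delete the cached key after updating the database, rather than updating the cache in place, so a subsequent read repopulates it with fresh data.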

## Microservice architecture

### What does a typical microservice architecture look like?

The diagram below shows a typical microservice architecture.

- Load Balancer: This distributes incoming traffic across multiple backend services.
- CDN (Content Delivery Network): A CDN is a group of geographically distributed servers that hold static content for faster delivery. The clients look for content in the CDN first, then fall back to the backend services.
- API Gateway: This handles incoming requests and routes them to the relevant services. It talks to the identity provider and service discovery.
- Identity Provider: This handles authentication and authorization for users.
- Service Registry & Discovery: Microservice registration and discovery happen in this component, and the API gateway looks up the relevant services in this component to talk to.
- Management: This component is responsible for monitoring the services.
- Microservices: Microservices are designed and deployed in different domains. Each domain has its own database. The API gateway talks to the microservices via REST API or other protocols, and the microservices within the same domain talk to each other using RPC (Remote Procedure Call).

A picture is worth a thousand words: 9 best practices for developing microservices.

When we develop microservices, we need to follow these best practices:

1. Use separate data storage for each microservice
2. Keep code at a similar level of maturity
3. Use a separate build for each microservice
4. Assign each microservice a single responsibility
5. Deploy into containers
6. Design stateless services
7. Adopt domain-driven design
8. Design micro frontends
9. Orchestrate microservices

### What tech stack is commonly used for microservices?

Below you will find a diagram showing the microservice tech stack, both for the development phase and for production.

▶️ 𝐏𝐫𝐞-𝐏𝐫𝐨𝐝𝐮𝐜𝐭𝐢𝐨𝐧

- Define API - This establishes a contract between frontend and backend. We can use Postman or OpenAPI for this.

▶️ 𝐏𝐫𝐨𝐝𝐮𝐜𝐭𝐢𝐨𝐧

- Nginx is a common choice for load balancers. Cloudflare provides CDN (Content Delivery Network).
- API Gateway - We can use Spring Boot for the gateway, and use Eureka/Zookeeper for service discovery.
- The microservices are deployed on clouds. We have options among AWS, Microsoft Azure, or Google GCP.
- Cache and Full-text Search - Redis is a common choice for caching key-value pairs. Elasticsearch is used for full-text search.
- Communications - For services to talk to each other, we can use messaging infra such as Kafka, or RPC.
- Persistence - We can use MySQL or PostgreSQL for a relational database, and Amazon S3 for an object store. We can also use Cassandra for a wide-column store if necessary.
- Management & Monitoring - To manage so many microservices, common Ops tools include Prometheus, Elastic Stack, and Kubernetes.

### Why is Kafka fast?

There are many design decisions that contributed to Kafka's performance. In this post, we'll focus on two.

1. The first one is Kafka's reliance on Sequential I/O.
2. The second design choice that gives Kafka its performance advantage is its focus on efficiency: the zero-copy principle.

The diagram illustrates how the data is transmitted between producer and consumer, and what zero-copy means.
- Step 1.1 - 1.3: Producer writes data to the disk
- Step 2: Consumer reads data without zero-copy

  2.1 The data is loaded from disk to OS cache

  2.2 The data is copied from OS cache to the Kafka application

  2.3 The Kafka application copies the data into the socket buffer

  2.4 The data is copied from the socket buffer to the network card

  2.5 The network card sends the data out to the consumer

- Step 3: Consumer reads data with zero-copy

  3.1 The data is loaded from disk to OS cache

  3.2 OS cache directly copies the data to the network card via the sendfile() command

  3.3 The network card sends the data out to the consumer

Zero copy is a shortcut that saves the multiple data copies between application context and kernel context.

## Payment systems

### Why is the credit card called “the most profitable product in banks”? How does VISA/Mastercard make money?

The diagram below shows the economics of the credit card payment flow.

1. The cardholder pays a merchant $100 to buy a product.

2. The merchant benefits from the use of the credit card with higher sales volume and needs to compensate the issuer and the card network for providing the payment service. The acquiring bank sets a fee with the merchant, called the “merchant discount fee.”

3-4. The acquiring bank keeps $0.25 as the acquiring markup, and $1.75 is paid to the issuing bank as the interchange fee. The merchant discount fee should cover the interchange fee.

The interchange fee is set by the card network because it is less efficient for each issuing bank to negotiate fees with each merchant.

5. The card network sets up the network assessments and fees with each bank, which pays the card network for its services every month. For example, VISA charges a 0.11% assessment, plus a $0.0195 usage fee, for every swipe.

Why should the issuing bank be compensated?

- The issuer pays the merchant even if the cardholder fails to pay the issuer.
- The issuer pays the merchant before the cardholder pays the issuer.
- The issuer has other operating costs, including managing customer accounts, providing statements, fraud detection, risk management, clearing & settlement, etc.

### How does VISA work when we swipe a credit card at a merchant’s shop?

VISA, Mastercard, and American Express act as card networks for the clearing and settling of funds. The card acquiring bank and the card issuing bank can be – and often are – different. If banks were to settle transactions one by one without an intermediary, each bank would have to settle the transactions with all the other banks. This is quite inefficient.

The diagram below shows VISA’s role in the credit card payment process. There are two flows involved. The authorization flow happens when the customer swipes the credit card. The capture and settlement flow happens when the merchant wants to get the money at the end of the day.

- Authorization Flow

Step 0: The card issuing bank issues credit cards to its customers.

Step 1: The cardholder wants to buy a product and swipes the credit card at the Point of Sale (POS) terminal in the merchant’s shop.

Step 2: The POS terminal sends the transaction to the acquiring bank, which has provided the POS terminal.

Steps 3 and 4: The acquiring bank sends the transaction to the card network, also called the card scheme. The card network sends the transaction to the issuing bank for approval.

Steps 4.1, 4.2 and 4.3: The issuing bank freezes the money if the transaction is approved. The approval or rejection is sent back to the acquirer, as well as to the POS terminal.
- Capture and Settlement Flow

Steps 1 and 2: The merchant wants to collect the money at the end of the day, so they hit “capture” on the POS terminal. The transactions are sent to the acquirer in a batch. The acquirer sends the batch file with the transactions to the card network.

Step 3: The card network performs clearing for the transactions collected from the different acquirers, and sends the clearing files to the different issuing banks. Clearing is a process in which mutually offsetting transactions are netted, so the total number of transactions is reduced.

Step 4: The issuing banks confirm the correctness of the clearing files, and transfer money to the relevant acquiring banks.

Step 5: The acquiring bank then transfers money to the merchant’s bank.

In the process, the card network takes on the burden of talking to each bank and receives service fees in return.

### Payment Systems Around The World Series (Part 1): Unified Payments Interface (UPI) in India

What’s UPI? UPI is an instant real-time payment system developed by the National Payments Corporation of India. It accounts for 60% of digital retail transactions in India today.

UPI = payment markup language + standard for interoperable payments
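The fee figures quoted in the credit card economics section above can be checked with a few lines of arithmetic (a sketch using only the example numbers from the text; real fee schedules vary by card type, merchant category, and region):

```python
# Example $100 purchase, using the figures quoted in the text.
transaction = 100.00

interchange_fee = 1.75      # paid to the issuing bank
acquiring_markup = 0.25     # kept by the acquiring bank
# The merchant discount fee must at least cover both.
merchant_discount_fee = interchange_fee + acquiring_markup

# What the merchant actually receives after fees:
merchant_receives = transaction - merchant_discount_fee

# VISA's per-swipe take from the banks: 0.11% assessment + $0.0195 usage fee.
network_assessment = transaction * 0.0011 + 0.0195

print(f"merchant receives:  ${merchant_receives:.2f}")    # $98.00
print(f"network assessment: ${network_assessment:.4f}")   # $0.1295
```

The striking part of the economics is the split: on this example swipe the issuer's $1.75 interchange dwarfs the network's roughly $0.13, which is why the text calls the credit card the issuer's product and the network merely the rail.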

## DevOps

### DevOps vs. SRE vs. Platform Engineering. What is the difference?

The concepts of DevOps, SRE, and Platform Engineering emerged at different times and have been developed by various individuals and organizations.

DevOps as a concept was introduced in 2009 by Patrick Debois and Andrew Shafer at the Agile conference. They sought to bridge the gap between software development and operations by promoting a collaborative culture and shared responsibility for the entire software development lifecycle.

SRE, or Site Reliability Engineering, was pioneered by Google in the early 2000s to address operational challenges in managing large-scale, complex systems. Google developed SRE practices and tools, such as the Borg cluster management system and the Monarch monitoring system, to improve the reliability and efficiency of its services.

Platform Engineering is a more recent concept, building on the foundation of SRE. Its precise origins are less clear, but it is generally understood to be an extension of DevOps and SRE practices, with a focus on delivering a comprehensive platform for product development that supports the entire business perspective.
-It's worth noting that while these concepts emerged at different times. They are all related to the broader trend of improving collaboration, automation, and efficiency in software development and operations. +It's worth noting that while these concepts emerged at different times. They are all related to the broader trend of improving collaboration, automation, and efficiency in software development and operations. ### What is k8s (Kubernetes)? @@ -1046,124 +1032,121 @@ The worker node(s) host the Pods that are the components of the application work 1. API Server - The API server talks to all the components in the k8s cluster. All the operations on pods are executed by talking to the API server. + The API server talks to all the components in the k8s cluster. All the operations on pods are executed by talking to the API server. 2. Scheduler - The scheduler watches pod workloads and assigns loads on newly created pods. + The scheduler watches pod workloads and assigns loads on newly created pods. 3. Controller Manager - The controller manager runs the controllers, including Node Controller, Job Controller, EndpointSlice Controller, and ServiceAccount Controller. + The controller manager runs the controllers, including Node Controller, Job Controller, EndpointSlice Controller, and ServiceAccount Controller. 4. Etcd - - etcd is a key-value store used as Kubernetes' backing store for all cluster data. + + etcd is a key-value store used as Kubernetes' backing store for all cluster data. - Nodes 1. Pods - A pod is a group of containers and is the smallest unit that k8s administers. Pods have a single IP address applied to every container within the pod. + A pod is a group of containers and is the smallest unit that k8s administers. Pods have a single IP address applied to every container within the pod. 2. Kubelet - An agent that runs on each node in the cluster. It ensures containers are running in a Pod. + An agent that runs on each node in the cluster. 
It ensures containers are running in a Pod.

3. Kube Proxy

-   Kube-proxy is a network proxy that runs on each node in your cluster. It routes traffic coming into a node from the service. It forwards requests for work to the correct containers.
+   Kube-proxy is a network proxy that runs on each node in your cluster. It routes Service traffic arriving at the node and forwards each request to the correct container.

-### Docker vs. Kubernetes. Which one should we use?
+### Docker vs. Kubernetes. Which one should we use?

+What is Docker?

-What is Docker ?

-Docker is an open-source platform that allows you to package, distribute, and run applications in isolated containers. It focuses on containerization, providing lightweight environments that encapsulate applications and their dependencies.
+Docker is an open-source platform that allows you to package, distribute, and run applications in isolated containers. It focuses on containerization, providing lightweight environments that encapsulate applications and their dependencies.

-What is Kubernetes ?
+What is Kubernetes?

-Kubernetes, often referred to as K8s, is an open-source container orchestration platform. It provides a framework for automating the deployment, scaling, and management of containerized applications across a cluster of nodes.
+Kubernetes, often referred to as K8s, is an open-source container orchestration platform. It provides a framework for automating the deployment, scaling, and management of containerized applications across a cluster of nodes.

-How are both different from each other ?
+How are they different from each other?

-Docker: Docker operates at the individual container level on a single operating system host.
+Docker: Docker operates at the individual container level on a single operating system host.

-You must manually manage each host and setting up networks, security policies, and storage for multiple related containers can be complex.
+You must manually manage each host, and setting up networks, security policies, and storage for multiple related containers can be complex.

-Kubernetes: Kubernetes operates at the cluster level.
It manages multiple containerized applications across multiple hosts, providing automation for tasks like load balancing, scaling, and ensuring the desired state of applications. -In short, Docker focuses on containerization and running containers on individual hosts, while Kubernetes specializes in managing and orchestrating containers at scale across a cluster of hosts. +In short, Docker focuses on containerization and running containers on individual hosts, while Kubernetes specializes in managing and orchestrating containers at scale across a cluster of hosts. -### How does Docker work? +### How does Docker work? -The diagram below shows the architecture of Docker and how it works when we run “docker build”, “docker pull” -and “docker run”. +The diagram below shows the architecture of Docker and how it works when we run “docker build”, “docker pull” +and “docker run”.

-There are 3 components in Docker architecture: +There are 3 components in Docker architecture: -- Docker client - - The docker client talks to the Docker daemon. +- Docker client -- Docker host + The docker client talks to the Docker daemon. - The Docker daemon listens for Docker API requests and manages Docker objects such as images, containers, networks, and volumes. +- Docker host -- Docker registry + The Docker daemon listens for Docker API requests and manages Docker objects such as images, containers, networks, and volumes. - A Docker registry stores Docker images. Docker Hub is a public registry that anyone can use. +- Docker registry -Let’s take the “docker run” command as an example. + A Docker registry stores Docker images. Docker Hub is a public registry that anyone can use. - 1. Docker pulls the image from the registry. - 1. Docker creates a new container. - 1. Docker allocates a read-write filesystem to the container. - 1. Docker creates a network interface to connect the container to the default network. - 1. Docker starts the container. +Let’s take the “docker run” command as an example. + +1. Docker pulls the image from the registry. +1. Docker creates a new container. +1. Docker allocates a read-write filesystem to the container. +1. Docker creates a network interface to connect the container to the default network. +1. Docker starts the container. ## GIT ### How Git Commands work -To begin with, it's essential to identify where our code is stored. The common assumption is that there are only two locations - one on a remote server like Github and the other on our local machine. However, this isn't entirely accurate. Git maintains three local storages on our machine, which means that our code can be found in four places: +To begin with, it's essential to identify where our code is stored. The common assumption is that there are only two locations - one on a remote server like Github and the other on our local machine. 
However, this isn't entirely accurate. Git maintains three local storages on our machine, which means that our code can be found in four places:

+- Working directory: where we edit files +- Staging area: a temporary location where files are kept for the next commit +- Local repository: contains the code that has been committed +- Remote repository: the remote server that stores the code -- Working directory: where we edit files -- Staging area: a temporary location where files are kept for the next commit -- Local repository: contains the code that has been committed -- Remote repository: the remote server that stores the code - -Most Git commands primarily move files between these four locations. +Most Git commands primarily move files between these four locations. ### How does Git Work? -The diagram below shows the Git workflow. +The diagram below shows the Git workflow.

+Git is a distributed version control system. -Git is a distributed version control system. - -Every developer maintains a local copy of the main repository and edits and commits to the local copy. +Every developer maintains a local copy of the main repository and edits and commits to the local copy. -The commit is very fast because the operation doesn’t interact with the remote repository. +The commit is very fast because the operation doesn’t interact with the remote repository. -If the remote repository crashes, the files can be recovered from the local repositories. +If the remote repository crashes, the files can be recovered from the local repositories. ### Git merge vs. Git rebase @@ -1173,7 +1156,6 @@ What are the differences?

- When we **merge changes** from one Git branch to another, we can use ‘git merge’ or ‘git rebase’. The diagram below shows how the two commands work. **Git merge** @@ -1202,36 +1184,35 @@ Never use it on public branches!

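The contrast can also be sketched as a toy commit-history model in Python (plain dictionaries standing in for commits; purely illustrative, not real Git internals):

```python
# Toy commit-history model: each commit is a dict with an id and parent ids.
# It only illustrates the merge/rebase contrast above, not how Git is implemented.

def merge(main, feature):
    """git merge: keep both histories and add one merge commit with two parents."""
    merge_commit = {"id": "M", "parents": [main[-1]["id"], feature[-1]["id"]]}
    return main + [merge_commit]

def rebase(main, feature, base_id):
    """git rebase: replay the feature commits on top of main as brand-new commits."""
    history = list(main)
    for commit in feature:
        if commit["id"] == base_id:               # skip the shared ancestor
            continue
        history.append({"id": commit["id"] + "'",  # same change, NEW commit id
                        "parents": [history[-1]["id"]]})
    return history

base = {"id": "A", "parents": []}
main = [base, {"id": "B", "parents": ["A"]}]                      # A -> B
feature = [base, {"id": "C", "parents": ["A"]},
                 {"id": "D", "parents": ["C"]}]                   # A -> C -> D

merged = merge(main, feature)         # ends with M, whose parents are B and D
rebased = rebase(main, feature, "A")  # linear: A -> B -> C' -> D'
```

The merge result keeps a two-parent commit `M`, preserving the branch topology; the rebase result is linear, but `C'` and `D'` are new commits, which is exactly why rebasing a branch that others have already pulled rewrites shared history.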
- ### What is cloud native? -Below is a diagram showing the evolution of architecture and processes since the 1980s. +Below is a diagram showing the evolution of architecture and processes since the 1980s.

-Organizations can build and run scalable applications on public, private, and hybrid clouds using cloud native technologies. +Organizations can build and run scalable applications on public, private, and hybrid clouds using cloud native technologies. -This means the applications are designed to leverage cloud features, so they are resilient to load and easy to scale. +This means the applications are designed to leverage cloud features, so they are resilient to load and easy to scale. -Cloud native includes 4 aspects: +Cloud native includes 4 aspects: -1. Development process +1. Development process - This has progressed from waterfall to agile to DevOps. + This has progressed from waterfall to agile to DevOps. -2. Application Architecture +2. Application Architecture - The architecture has gone from monolithic to microservices. Each service is designed to be small, adaptive to the limited resources in cloud containers. + The architecture has gone from monolithic to microservices. Each service is designed to be small, adaptive to the limited resources in cloud containers. -3. Deployment & packaging +3. Deployment & packaging - The applications used to be deployed on physical servers. Then around 2000, the applications that were not sensitive to latency were usually deployed on virtual servers. With cloud native applications, they are packaged into docker images and deployed in containers. + The applications used to be deployed on physical servers. Then around 2000, the applications that were not sensitive to latency were usually deployed on virtual servers. With cloud native applications, they are packaged into docker images and deployed in containers. -4. Application infrastructure +4. Application infrastructure - The applications are massively deployed on cloud infrastructure instead of self-hosted servers. + The applications are massively deployed on cloud infrastructure instead of self-hosted servers. 
## Developer productivity tools @@ -1247,21 +1228,19 @@ Additionally, the generated diagrams can be downloaded as images.

- ### Automatically turn code into architecture diagrams

- What does it do? - Draw the cloud system architecture in Python code. - Diagrams can also be rendered directly inside the Jupyter Notebooks. -- No design tools are needed. -- Supports the following providers: AWS, Azure, GCP, Kubernetes, Alibaba Cloud, Oracle Cloud, etc. - +- No design tools are needed. +- Supports the following providers: AWS, Azure, GCP, Kubernetes, Alibaba Cloud, Oracle Cloud, etc. + [Github repo](https://github.com/mingrammer/diagrams) ## Linux @@ -1277,56 +1256,54 @@ The Linux file system used to resemble an unorganized town where individuals con By implementing a standard like the FHS, software can ensure a consistent layout across various Linux distributions. Nonetheless, not all Linux distributions strictly adhere to this standard. They often incorporate their own unique elements or cater to specific requirements. To become proficient in this standard, you can begin by exploring. Utilize commands such as "cd" for navigation and "ls" for listing directory contents. Imagine the file system as a tree, starting from the root (/). With time, it will become second nature to you, transforming you into a skilled Linux administrator. -### 18 Most-used Linux Commands You Should Know +### 18 Most-used Linux Commands You Should Know -Linux commands are instructions for interacting with the operating system. They help manage files, directories, system processes, and many other aspects of the system. You need to become familiar with these commands in order to navigate and maintain Linux-based systems efficiently and effectively. +Linux commands are instructions for interacting with the operating system. They help manage files, directories, system processes, and many other aspects of the system. You need to become familiar with these commands in order to navigate and maintain Linux-based systems efficiently and effectively. -This diagram below shows popular Linux commands: +This diagram below shows popular Linux commands:

- -- ls - List files and directories -- cd - Change the current directory -- mkdir - Create a new directory -- rm - Remove files or directories -- cp - Copy files or directories -- mv - Move or rename files or directories -- chmod - Change file or directory permissions -- grep - Search for a pattern in files -- find - Search for files and directories -- tar - manipulate tarball archive files -- vi - Edit files using text editors -- cat - display the content of files -- top - Display processes and resource usage -- ps - Display processes information -- kill - Terminate a process by sending a signal -- du - Estimate file space usage -- ifconfig - Configure network interfaces -- ping - Test network connectivity between hosts +- ls - List files and directories +- cd - Change the current directory +- mkdir - Create a new directory +- rm - Remove files or directories +- cp - Copy files or directories +- mv - Move or rename files or directories +- chmod - Change file or directory permissions +- grep - Search for a pattern in files +- find - Search for files and directories +- tar - manipulate tarball archive files +- vi - Edit files using text editors +- cat - display the content of files +- top - Display processes and resource usage +- ps - Display processes information +- kill - Terminate a process by sending a signal +- du - Estimate file space usage +- ifconfig - Configure network interfaces +- ping - Test network connectivity between hosts ## Security ### How does HTTPS work? -Hypertext Transfer Protocol Secure (HTTPS) is an extension of the Hypertext Transfer Protocol (HTTP.) HTTPS transmits encrypted data using Transport Layer Security (TLS.) If the data is hijacked online, all the hijacker gets is binary code. +Hypertext Transfer Protocol Secure (HTTPS) is an extension of the Hypertext Transfer Protocol (HTTP.) HTTPS transmits encrypted data using Transport Layer Security (TLS.) If the data is hijacked online, all the hijacker gets is binary code.

- How is the data encrypted and decrypted? Step 1 - The client (browser) and the server establish a TCP connection. Step 2 - The client sends a “client hello” to the server. The message contains a set of necessary encryption algorithms (cipher suites) and the latest TLS version it can support. The server responds with a “server hello” so the browser knows whether it can support the algorithms and TLS version. -The server then sends the SSL certificate to the client. The certificate contains the public key, host name, expiry dates, etc. The client validates the certificate. +The server then sends the SSL certificate to the client. The certificate contains the public key, host name, expiry dates, etc. The client validates the certificate. -Step 3 - After validating the SSL certificate, the client generates a session key and encrypts it using the public key. The server receives the encrypted session key and decrypts it with the private key. +Step 3 - After validating the SSL certificate, the client generates a session key and encrypts it using the public key. The server receives the encrypted session key and decrypts it with the private key. Step 4 - Now that both the client and the server hold the same session key (symmetric encryption), the encrypted data is transmitted in a secure bi-directional channel. @@ -1336,49 +1313,49 @@ Why does HTTPS switch to symmetric encryption during data transmission? There ar 2. Server resources: The asymmetric encryption adds quite a lot of mathematical overhead. It is not suitable for data transmissions in long sessions. -### Oauth 2.0 Explained With Simple Terms. +### Oauth 2.0 Explained With Simple Terms. -OAuth 2.0 is a powerful and secure framework that allows different applications to securely interact with each other on behalf of users without sharing sensitive credentials. 
+OAuth 2.0 is a powerful and secure framework that allows different applications to securely interact with each other on behalf of users without sharing sensitive credentials.

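As a deliberately simplified sketch of that idea — an in-memory stand-in for a real identity provider, with every name and credential below hypothetical — the application only ever holds a token scoped to what the user granted, never the password:

```python
import secrets

# Hypothetical in-memory identity provider (IDP): the only party holding passwords.
USERS = {"alice": "correct-horse-battery"}   # username -> password (demo only)
TOKENS = {}                                  # opaque token -> (username, scope)

def authorize(username, password, scope):
    """The user authenticates with the IDP directly and grants a scope."""
    if USERS.get(username) != password:
        raise PermissionError("bad credentials")
    token = secrets.token_urlsafe(16)        # opaque; short-lived in real systems
    TOKENS[token] = (username, frozenset(scope))
    return token                             # this is all the application receives

def fetch_profile(token):
    """A resource server returns only the fields covered by the granted scope."""
    username, scope = TOKENS[token]
    profile = {"name": username.title(), "email": f"{username}@example.com"}
    return {field: value for field, value in profile.items() if field in scope}

token = authorize("alice", "correct-horse-battery", scope={"name"})
profile = fetch_profile(token)               # contains the name, not the email
```

Real OAuth 2.0 adds redirects, authorization codes, token expiry, and refresh tokens on top of this skeleton, but the core contract is the same: credentials stay with the identity provider, and the token carries only the granted permissions.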
-The entities involved in OAuth are the User, the Server, and the Identity Provider (IDP). +The entities involved in OAuth are the User, the Server, and the Identity Provider (IDP). -What Can an OAuth Token Do? +What Can an OAuth Token Do? -When you use OAuth, you get an OAuth token that represents your identity and permissions. This token can do a few important things: +When you use OAuth, you get an OAuth token that represents your identity and permissions. This token can do a few important things: -Single Sign-On (SSO): With an OAuth token, you can log into multiple services or apps using just one login, making life easier and safer. +Single Sign-On (SSO): With an OAuth token, you can log into multiple services or apps using just one login, making life easier and safer. -Authorization Across Systems: The OAuth token allows you to share your authorization or access rights across various systems, so you don't have to log in separately everywhere. +Authorization Across Systems: The OAuth token allows you to share your authorization or access rights across various systems, so you don't have to log in separately everywhere. -Accessing User Profile: Apps with an OAuth token can access certain parts of your user profile that you allow, but they won't see everything. +Accessing User Profile: Apps with an OAuth token can access certain parts of your user profile that you allow, but they won't see everything. Remember, OAuth 2.0 is all about keeping you and your data safe while making your online experiences seamless and hassle-free across different applications and services. -### Top 4 Forms of Authentication Mechanisms +### Top 4 Forms of Authentication Mechanisms

-1. SSH Keys: - - Cryptographic keys are used to access remote systems and servers securely +1. SSH Keys: -1. OAuth Tokens: + Cryptographic keys are used to access remote systems and servers securely - Tokens that provide limited access to user data on third-party applications +1. OAuth Tokens: -1. SSL Certificates: - - Digital certificates ensure secure and encrypted communication between servers and clients + Tokens that provide limited access to user data on third-party applications -1. Credentials: +1. SSL Certificates: - User authentication information is used to verify and grant access to various systems and services + Digital certificates ensure secure and encrypted communication between servers and clients + +1. Credentials: + + User authentication information is used to verify and grant access to various systems and services ### Session, cookie, JWT, token, SSO, and OAuth 2.0 - what are they? @@ -1402,25 +1379,24 @@ From simple to complex, here is my understanding of user identity management: - By using OAuth 2.0, you can authorize one website to access your information on another website. -### How to store passwords safely in the database and how to validate a password? +### How to store passwords safely in the database and how to validate a password?

- **Things NOT to do**

- Storing passwords in plain text is not a good idea because anyone with internal access can see them.

-- Storing password hashes directly is not sufficient because it is pruned to precomputation attacks, such as rainbow tables.
+- Storing password hashes directly is not sufficient because it is prone to precomputation attacks, such as rainbow tables.

-- To mitigate precomputation attacks, we salt the passwords.
+- To mitigate precomputation attacks, we salt the passwords.

**What is salt?**

According to OWASP guidelines, “a salt is a unique, randomly generated string that is added to each password as part of the hashing process”.

**How to store a password and salt?**

1. the hash result is unique to each password.
@@ -1433,7 +1409,7 @@ To validate a password, it can go through the following process:

1. A client enters the password.
1. The system fetches the corresponding salt from the database.
1. The system appends the salt to the password and hashes it. Let’s call the hashed value H1.
-1. The system compares H1 and H2, where H2 is the hash stored in the database. If they are the same, the password is valid.
+1. The system compares H1 and H2, where H2 is the hash stored in the database. If they are the same, the password is valid.

### Explaining JSON Web Token (JWT) to a 10 year old Kid

@@ -1453,48 +1429,45 @@ When you want to send the JWT to a server, you put the header, payload, and signature

### How does Google Authenticator (or other types of 2-factor authenticators) work?

Google Authenticator is commonly used for logging into our accounts when 2-factor authentication is enabled. How does it guarantee security?

-Google Authenticator is a software-based authenticator that implements a two-step verification service. The diagram below provides detail.
+Google Authenticator is a software-based authenticator that implements a two-step verification service. The diagram below provides detail.

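The 6-digit codes in this scheme come from the TOTP algorithm. A stdlib-only sketch, assuming the RFC 6238 defaults (HMAC-SHA-1, 30-second time step, base32-encoded secret):

```python
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32, at=None, digits=6, step=30):
    """Compute a time-based one-time password (RFC 6238 defaults)."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int(time.time() if at is None else at) // step   # 30-second window
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                                 # dynamic truncation
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 6238 test secret ("12345678901234567890" in base32); at time T=59 the
# 8-digit reference value from the RFC's test vectors is 94287082.
SECRET = "GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ"
print(totp(SECRET, at=59, digits=8))   # -> 94287082
```

The authenticator app and the authentication service both run exactly this computation on the shared secret; the server simply recomputes the code for the current time window and compares it with what the user typed.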
- There are two stages involved: -- Stage 1 - The user enables Google two-step verification. +- Stage 1 - The user enables Google two-step verification. - Stage 2 - The user uses the authenticator for logging in, etc. Let’s look at these stages. - + **Stage 1** Steps 1 and 2: Bob opens the web page to enable two-step verification. The front end requests a secret key. The authentication service generates the secret key for Bob and stores it in the database. - + Step 3: The authentication service returns a URI to the front end. The URI is composed of a key issuer, username, and secret key. The URI is displayed in the form of a QR code on the web page. - + Step 4: Bob then uses Google Authenticator to scan the generated QR code. The secret key is stored in the authenticator. **Stage 2** Steps 1 and 2: Bob wants to log into a website with Google two-step verification. For this, he needs the password. Every 30 seconds, Google Authenticator generates a 6-digit password using TOTP (Time-based One Time Password) algorithm. Bob uses the password to enter the website. - + Steps 3 and 4: The frontend sends the password Bob enters to the backend for authentication. The authentication service reads the secret key from the database and generates a 6-digit password using the same TOTP algorithm as the client. - + Step 5: The authentication service compares the two passwords generated by the client and the server, and returns the comparison result to the frontend. Bob can proceed with the login process only if the two passwords match. - -Is this authentication mechanism safe? -- Can the secret key be obtained by others? +Is this authentication mechanism safe? - We need to make sure the secret key is transmitted using HTTPS. The authenticator client and the database store the secret key, and we need to make sure the secret keys are encrypted. +- Can the secret key be obtained by others? -- Can the 6-digit password be guessed by hackers? - - No. 
The password has 6 digits, so the generated password has 1 million potential combinations. Plus, the password changes every 30 seconds. If hackers want to guess the password in 30 seconds, they need to enter 30,000 combinations per second. + We need to make sure the secret key is transmitted using HTTPS. The authenticator client and the database store the secret key, and we need to make sure the secret keys are encrypted. +- Can the 6-digit password be guessed by hackers? + No. The password has 6 digits, so the generated password has 1 million potential combinations. Plus, the password changes every 30 seconds. If hackers want to guess the password in 30 seconds, they need to enter 30,000 combinations per second. -## Real World Case Studies +## Real World Case Studies ### Netflix's Tech Stack @@ -1522,30 +1495,28 @@ This post is based on research from many Netflix engineering blogs and open-sour ### Twitter Architecture 2022 -Yes, this is the real Twitter architecture. It is posted by Elon Musk and redrawn by us for better readability. +Yes, this is the real Twitter architecture. It is posted by Elon Musk and redrawn by us for better readability.

- ### Evolution of Airbnb’s microservice architecture over the past 15 years -Airbnb’s microservice architecture went through 3 main stages. +Airbnb’s microservice architecture went through 3 main stages.

- Monolith (2008 - 2017) -Airbnb began as a simple marketplace for hosts and guests. This is built in a Ruby on Rails application - the monolith. +Airbnb began as a simple marketplace for hosts and guests. This is built in a Ruby on Rails application - the monolith. What’s the challenge? - Confusing team ownership + unowned code -- Slow deployment +- Slow deployment Microservices (2017 - 2020) @@ -1565,35 +1536,34 @@ Micro + macroservices (2020 - present) This is what Airbnb is working on now. The micro and macroservice hybrid model focuses on the unification of APIs. -### Monorepo vs. Microrepo. +### Monorepo vs. Microrepo. -Which is the best? Why do different companies choose different options? +Which is the best? Why do different companies choose different options?

+Monorepo isn't new; Linux and Windows were both created using Monorepo. To improve scalability and build speed, Google developed its internal dedicated toolchain to scale it faster and strict coding quality standards to keep it consistent. -Monorepo isn't new; Linux and Windows were both created using Monorepo. To improve scalability and build speed, Google developed its internal dedicated toolchain to scale it faster and strict coding quality standards to keep it consistent. +Amazon and Netflix are major ambassadors of the Microservice philosophy. This approach naturally separates the service code into separate repositories. It scales faster but can lead to governance pain points later on. -Amazon and Netflix are major ambassadors of the Microservice philosophy. This approach naturally separates the service code into separate repositories. It scales faster but can lead to governance pain points later on. +Within Monorepo, each service is a folder, and every folder has a BUILD config and OWNERS permission control. Every service member is responsible for their own folder. -Within Monorepo, each service is a folder, and every folder has a BUILD config and OWNERS permission control. Every service member is responsible for their own folder. +On the other hand, in Microrepo, each service is responsible for its repository, with the build config and permissions typically set for the entire repository. -On the other hand, in Microrepo, each service is responsible for its repository, with the build config and permissions typically set for the entire repository. +In Monorepo, dependencies are shared across the entire codebase regardless of your business, so when there's a version upgrade, every codebase upgrades their version. -In Monorepo, dependencies are shared across the entire codebase regardless of your business, so when there's a version upgrade, every codebase upgrades their version. +In Microrepo, dependencies are controlled within each repository. 
Businesses choose when to upgrade their versions based on their own schedules. -In Microrepo, dependencies are controlled within each repository. Businesses choose when to upgrade their versions based on their own schedules. +Monorepo has a standard for check-ins. Google's code review process is famously known for setting a high bar, ensuring a coherent quality standard for Monorepo, regardless of the business. -Monorepo has a standard for check-ins. Google's code review process is famously known for setting a high bar, ensuring a coherent quality standard for Monorepo, regardless of the business. +Microrepo can either set its own standard or adopt a shared standard by incorporating the best practices. It can scale faster for business, but the code quality might be a bit different. +Google engineers built Bazel, and Meta built Buck. There are other open-source tools available, including Nx, Lerna, and others. -Microrepo can either set its own standard or adopt a shared standard by incorporating the best practices. It can scale faster for business, but the code quality might be a bit different. -Google engineers built Bazel, and Meta built Buck. There are other open-source tools available, including Nx, Lerna, and others. +Over the years, Microrepo has had more supported tools, including Maven and Gradle for Java, NPM for NodeJS, and CMake for C/C++, among others. -Over the years, Microrepo has had more supported tools, including Maven and Gradle for Java, NPM for NodeJS, and CMake for C/C++, among others. - -### How will you design the Stack Overflow website? +### How will you design the Stack Overflow website? If your answer is on-premise servers and monolith (on the bottom of the following image), you would likely fail the interview, but that's how it is built in reality! @@ -1601,7 +1571,6 @@ If your answer is on-premise servers and monolith (on the bottom of the followin

- **What people think it should look like** The interviewer is probably expecting something like the top portion of the picture. @@ -1617,42 +1586,41 @@ The interviewer is probably expecting something like the top portion of the pict Stack Overflow serves all the traffic with only 9 on-premise web servers, and it’s on monolith! It has its own servers and does not run on the cloud. -This is contrary to all our popular beliefs these days. +This is contrary to all our popular beliefs these days. ### Why did Amazon Prime Video monitoring move from serverless to monolithic? How can it save 90% cost? -The diagram below shows the architecture comparison before and after the migration. +The diagram below shows the architecture comparison before and after the migration.

+What is Amazon Prime Video Monitoring Service? -What is Amazon Prime Video Monitoring Service? +Prime Video service needs to monitor the quality of thousands of live streams. The monitoring tool automatically analyzes the streams in real time and identifies quality issues like block corruption, video freeze, and sync problems. This is an important process for customer satisfaction. -Prime Video service needs to monitor the quality of thousands of live streams. The monitoring tool automatically analyzes the streams in real time and identifies quality issues like block corruption, video freeze, and sync problems. This is an important process for customer satisfaction. +There are 3 steps: media converter, defect detector, and real-time notification. -There are 3 steps: media converter, defect detector, and real-time notification. +- What is the problem with the old architecture? -- What is the problem with the old architecture? + The old architecture was based on Amazon Lambda, which was good for building services quickly. However, it was not cost-effective when running the architecture at a high scale. The two most expensive operations are: - The old architecture was based on Amazon Lambda, which was good for building services quickly. However, it was not cost-effective when running the architecture at a high scale. The two most expensive operations are: +1. The orchestration workflow - AWS step functions charge users by state transitions and the orchestration performs multiple state transitions every second. -1. The orchestration workflow - AWS step functions charge users by state transitions and the orchestration performs multiple state transitions every second. +2. Data passing between distributed components - the intermediate data is stored in Amazon S3 so that the next stage can download. The download can be costly when the volume is high. -2. 
Data passing between distributed components - the intermediate data is stored in Amazon S3 so that the next stage can download. The download can be costly when the volume is high. +- Monolithic architecture saves 90% cost -- Monolithic architecture saves 90% cost + A monolithic architecture is designed to address the cost issues. There are still 3 components, but the media converter and defect detector are deployed in the same process, saving the cost of passing data over the network. Surprisingly, this approach to deployment architecture change led to 90% cost savings! - A monolithic architecture is designed to address the cost issues. There are still 3 components, but the media converter and defect detector are deployed in the same process, saving the cost of passing data over the network. Surprisingly, this approach to deployment architecture change led to 90% cost savings! +This is an interesting and unique case study because microservices have become a go-to and fashionable choice in the tech industry. It's good to see that we are having more discussions about evolving the architecture and having more honest discussions about its pros and cons. Decomposing components into distributed microservices comes with a cost. -This is an interesting and unique case study because microservices have become a go-to and fashionable choice in the tech industry. It's good to see that we are having more discussions about evolving the architecture and having more honest discussions about its pros and cons. Decomposing components into distributed microservices comes with a cost. +- What did Amazon leaders say about this? -- What did Amazon leaders say about this? - - Amazon CTO Werner Vogels: “Building **evolvable software systems** is a strategy, not a religion. And revisiting your architecture with an open mind is a must.” + Amazon CTO Werner Vogels: “Building **evolvable software systems** is a strategy, not a religion. 
And revisiting your architecture with an open mind is a must.” -Ex Amazon VP Sustainability Adrian Cockcroft: “The Prime Video team had followed a path I call **Serverless First**…I don’t advocate **Serverless Only**”. +Ex Amazon VP Sustainability Adrian Cockcroft: “The Prime Video team had followed a path I call **Serverless First**…I don’t advocate **Serverless Only**”. ### How does Disney Hotstar capture 5 Billion Emojis during a tournament? @@ -1660,73 +1628,70 @@ Ex Amazon VP Sustainability Adrian Cockcroft: “The Prime Video team had follow

- 1. Clients send emojis through standard HTTP requests. You can think of Golang Service as a typical Web Server. Golang is chosen because it supports concurrency well. Threads in Golang are lightweight. 2. Since the write volume is very high, Kafka (message queue) is used as a buffer. 3. Emoji data are aggregated by a streaming processing service called Spark. It aggregates data every 2 seconds, which is configurable. There is a trade-off to be made based on the interval. A shorter interval means emojis are delivered to other clients faster but it also means more computing resources are needed. -4. Aggregated data is written to another Kafka. +4. Aggregated data is written to another Kafka. -5. The PubSub consumers pull aggregated emoji data from Kafka. +5. The PubSub consumers pull aggregated emoji data from Kafka. 6. Emojis are delivered to other clients in real-time through the PubSub infrastructure. The PubSub infrastructure is interesting. Hotstar considered the following protocols: Socketio, NATS, MQTT, and gRPC, and settled with MQTT. - + A similar design is adopted by LinkedIn which streams a million likes/sec. -### How Discord Stores Trillions Of Messages +### How Discord Stores Trillions Of Messages -The diagram below shows the evolution of message storage at Discord: +The diagram below shows the evolution of message storage at Discord:

+MongoDB ➡️ Cassandra ➡️ ScyllaDB -MongoDB ➡️ Cassandra ➡️ ScyllaDB - -In 2015, the first version of Discord was built on top of a single MongoDB replica. Around Nov 2015, MongoDB stored 100 million messages and the RAM couldn’t hold the data and index any longer. The latency became unpredictable. Message storage needs to be moved to another database. Cassandra was chosen. +In 2015, the first version of Discord was built on top of a single MongoDB replica. Around Nov 2015, MongoDB stored 100 million messages and the RAM couldn’t hold the data and index any longer. The latency became unpredictable. Message storage needs to be moved to another database. Cassandra was chosen. -In 2017, Discord had 12 Cassandra nodes and stored billions of messages. +In 2017, Discord had 12 Cassandra nodes and stored billions of messages. -At the beginning of 2022, it had 177 nodes with trillions of messages. At this point, latency was unpredictable, and maintenance operations became too expensive to run. +At the beginning of 2022, it had 177 nodes with trillions of messages. At this point, latency was unpredictable, and maintenance operations became too expensive to run. -There are several reasons for the issue: +There are several reasons for the issue: -- Cassandra uses the LSM tree for the internal data structure. The reads are more expensive than the writes. There can be many concurrent reads on a server with hundreds of users, resulting in hotspots. -- Maintaining clusters, such as compacting SSTables, impacts performance. -- Garbage collection pauses would cause significant latency spikes +- Cassandra uses the LSM tree for the internal data structure. The reads are more expensive than the writes. There can be many concurrent reads on a server with hundreds of users, resulting in hotspots. +- Maintaining clusters, such as compacting SSTables, impacts performance. 
+- Garbage collection pauses would cause significant latency spikes -ScyllaDB is a Cassandra-compatible database written in C++. Discord redesigned its architecture to have a monolithic API, a data service written in Rust, and ScyllaDB-based storage. +ScyllaDB is a Cassandra-compatible database written in C++. Discord redesigned its architecture to have a monolithic API, a data service written in Rust, and ScyllaDB-based storage. -The p99 read latency in ScyllaDB is 15ms compared to 40-125ms in Cassandra. The p99 write latency is 5ms compared to 5-70ms in Cassandra. +The p99 read latency in ScyllaDB is 15ms compared to 40-125ms in Cassandra. The p99 write latency is 5ms compared to 5-70ms in Cassandra. ### How do video live streamings work on YouTube, TikTok live, or Twitch? - + Live streaming differs from regular streaming because the video content is sent via the internet in real-time, usually with a latency of just a few seconds. - + The diagram below explains what happens behind the scenes to make this possible.

- Step 1: The raw video data is captured by a microphone and camera. The data is sent to the server side. - + Step 2: The video data is compressed and encoded. For example, the compressing algorithm separates the background and other video elements. After compression, the video is encoded to standards such as H.264. The size of the video data is much smaller after this step. - + Step 3: The encoded data is divided into smaller segments, usually seconds in length, so it takes much less time to download or stream. - + Step 4: The segmented data is sent to the streaming server. The streaming server needs to support different devices and network conditions. This is called ‘Adaptive Bitrate Streaming.’ This means we need to produce multiple files at different bitrates in steps 2 and 3. - -Step 5: The live streaming data is pushed to edge servers supported by CDN (Content Delivery Network.) Millions of viewers can watch the video from an edge server nearby. CDN significantly lowers data transmission latency. - + +Step 5: The live streaming data is pushed to edge servers supported by CDN (Content Delivery Network.) Millions of viewers can watch the video from an edge server nearby. CDN significantly lowers data transmission latency. + Step 6: The viewers’ devices decode and decompress the video data and play the video in a video player. - + Steps 7 and 8: If the video needs to be stored for replay, the encoded data is sent to a storage server, and viewers can request a replay from it later. - + Standard protocols for live streaming include: - RTMP (Real-Time Messaging Protocol): This was originally developed by Macromedia to transmit data between a Flash player and a server. Now it is used for streaming video data over the internet. Note that video conferencing applications like Skype use RTC (Real-Time Communication) protocol for lower latency. 
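The segmenting and adaptive-bitrate ideas in steps 3 and 4 can be sketched in a few lines of Python. This is a simplified illustration only, not a real packager: the `Segment` type and the function names are invented for the example, and real systems (HLS/DASH) segment encoded byte streams, not durations.

```python
# Hypothetical sketch of steps 3 and 4: split a stream into short segments,
# then produce one rendition per bitrate for adaptive bitrate streaming.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Segment:
    index: int         # position of the segment in the stream
    duration_s: int    # segment length in seconds
    bitrate_kbps: int  # bitrate of this rendition

def segment_stream(total_duration_s: int, segment_len_s: int,
                   bitrate_kbps: int) -> List[Segment]:
    """Step 3: divide the encoded stream into short segments."""
    segments, start, index = [], 0, 0
    while start < total_duration_s:
        dur = min(segment_len_s, total_duration_s - start)
        segments.append(Segment(index, dur, bitrate_kbps))
        start += dur
        index += 1
    return segments

def make_renditions(total_duration_s: int, segment_len_s: int,
                    bitrates_kbps: List[int]) -> Dict[int, List[Segment]]:
    """Step 4: produce multiple renditions at different bitrates."""
    return {b: segment_stream(total_duration_s, segment_len_s, b)
            for b in bitrates_kbps}

renditions = make_renditions(total_duration_s=10, segment_len_s=4,
                             bitrates_kbps=[800, 2500, 6000])
for bitrate, segs in sorted(renditions.items()):
    print(bitrate, [s.duration_s for s in segs])  # e.g. 800 [4, 4, 2]
```

The player can then switch between the 800/2500/6000 kbps renditions segment by segment as network conditions change, which is the core of adaptive bitrate streaming.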
From c5997e4f61d4b3b9fb79d69dec9a3da918c351c8 Mon Sep 17 00:00:00 2001 From: Daniel Lombardi Date: Wed, 20 Dec 2023 08:50:41 -0300 Subject: [PATCH 05/19] =?UTF-8?q?DevOps=20Done=20=E2=9C=85?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Daniel Lombardi --- translations/README-ptbr.md | 54 ++++++++++++++++++------------------- 1 file changed, 27 insertions(+), 27 deletions(-) diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md index e2bd4ca..f05cf20 100644 --- a/translations/README-ptbr.md +++ b/translations/README-ptbr.md @@ -42,8 +42,8 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou - [Quais são os algoritmos de distribuição de carga comuns?](#quais-são-os-algoritmos-de-distribuição-de-carga-comuns) - [URL, URI, URN - Você sabe a diferênça?](#url-uri-urn---você-sabe-a-diferênça) - [CI/CD](#cicd) - - [CI/CD Pipeline Explained in Simple Terms](#cicd-pipeline-explained-in-simple-terms) - - [Netflix Tech Stack (CI/CD Pipeline)](#netflix-tech-stack-cicd-pipeline) + - [Pipeline CI/CD Explicado em Termos Simples](#pipeline-cicd-explicado-em-termos-simples) + - [Netflix Stack Tecnológico (Pipeline de CI/CD)](#netflix-stack-tecnológico-pipeline-de-cicd) - [Architecture patterns](#architecture-patterns) - [MVC, MVP, MVVM, MVVM-C, and VIPER](#mvc-mvp-mvvm-mvvm-c-and-viper) - [18 Key Design Patterns Every Developer Should Know](#18-key-design-patterns-every-developer-should-know) @@ -489,57 +489,57 @@ Se você deseja obter mais detalhes sobre o assunto, eu recomendaria a [explica ## CI/CD -### CI/CD Pipeline Explained in Simple Terms +### Pipeline CI/CD Explicado em Termos Simples

-Section 1 - SDLC with CI/CD +Seção 1 - SDLC com CI/CD -The software development life cycle (SDLC) consists of several key stages: development, testing, deployment, and maintenance. CI/CD automates and integrates these stages to enable faster and more reliable releases. +O ciclo de vida de desenvolvimento de software (SDLC, _Software Development Life Cycle_) consiste em várias etapas-chave: desenvolvimento, teste, implantação e manutenção. CI/CD automatiza e integra essas etapas para possibilitar lançamentos mais rápidos e confiáveis. -When code is pushed to a git repository, it triggers an automated build and test process. End-to-end (e2e) test cases are run to validate the code. If tests pass, the code can be automatically deployed to staging/production. If issues are found, the code is sent back to development for bug fixing. This automation provides fast feedback to developers and reduces the risk of bugs in production. +Quando o código é enviado para um repositório Git, isso aciona um processo automatizado de compilação e teste. Casos de teste de ponta a ponta (end-to-end ou e2e) são executados para validar o código. Se os testes são bem-sucedidos, o código pode ser implantado automaticamente no ambiente de preparo/produção. Se problemas são identificados, o código é enviado de volta para o desenvolvimento para correção de bugs. Essa automação proporciona um feedback rápido aos desenvolvedores e reduz o risco de erros em produção. -Section 2 - Difference between CI and CD +Seção 2 - Diferença entre CI e CD -Continuous Integration (CI) automates the build, test, and merge process. It runs tests whenever code is committed to detect integration issues early. +Integração Contínua (CI, _Continuous Integration_) automatiza o processo de compilação, teste e mesclagem (merge). Executa testes sempre que código é comitado, para detectar problemas de integração precocemente.
Isso encoraja commits frequentes e feedback rápido. -Continuous Delivery (CD) automates release processes like infrastructure changes and deployment. It ensures software can be released reliably at any time through automated workflows. CD may also automate the manual testing and approval steps required before production deployment. +Entrega Contínua (CD, Continuous Delivery) automatiza processos de lançamento como mudanças de infraestrutura e implantação. Garante que o software possa ser lançado de maneira confiável a qualquer momento por meio de fluxos de trabalho automatizados. A CD também pode automatizar etapas de teste manual e aprovação necessárias antes da implantação em produção. -Section 3 - CI/CD Pipeline +Seção 3 - Pipeline de CI/CD -A typical CI/CD pipeline has several connected stages: +Um pipeline típico de CI/CD tem vários estágios conectados: -- The developer commits code changes to the source control -- CI server detects changes and triggers the build -- Code is compiled, and tested (unit, integration tests) -- Test results reported to the developer -- On success, artifacts are deployed to staging environments -- Further testing may be done on staging before release -- CD system deploys approved changes to production +- O desenvolvedor comita mudanças de código para o controle de versão +- O servidor de CI detecta as mudanças e dá início à compilação +- O código é compilado e testado (unitário e de integração) +- Os resultados são reportados ao desenvolvedor +- Em caso de sucesso, artefatos são implantados no ambiente de preparo (staging) +- Testes adicionais podem ser realizados no ambiente de preparo antes do lançamento +- O sistema de CD lança mudanças aprovadas para produção -### Netflix Tech Stack (CI/CD Pipeline) +### Netflix Stack Tecnológico (Pipeline de CI/CD)

-Planning: Netflix Engineering uses JIRA for planning and Confluence for documentation. +Planejamento: A Engenharia da Netflix utiliza o JIRA para planejamento e o Confluence para documentação. -Coding: Java is the primary programming language for the backend service, while other languages are used for different use cases. +Codificação: Java é a linguagem de programação principal para o serviço backend, enquanto outras linguagens são utilizadas para diferentes casos de uso. -Build: Gradle is mainly used for building, and Gradle plugins are built to support various use cases. +Compilação: Gradle é principalmente utilizado para compilação, e plugins do Gradle são construídos para suportar vários casos de uso. -Packaging: Package and dependencies are packed into an Amazon Machine Image (AMI) for release. +Empacotamento: O pacote e suas dependências são empacotados em uma Imagem de Máquina Amazon (AMI, _Amazon Machine Image_) para lançamento. -Testing: Testing emphasizes the production culture's focus on building chaos tools. +Testes: Os testes enfatizam o foco da cultura de produção na construção de ferramentas de caos. -Deployment: Netflix uses its self-built Spinnaker for canary rollout deployment. +Implantação: A Netflix utiliza sua própria ferramenta Spinnaker para implantações canário (canary rollout). -Monitoring: The monitoring metrics are centralized in Atlas, and Kayenta is used to detect anomalies. +Monitoramento: As métricas de monitoramento são centralizadas no Atlas, e o Kayenta é utilizado para detectar anomalias. -Incident report: Incidents are dispatched according to priority, and PagerDuty is used for incident handling. +Relatório de Incidentes: Incidentes são despachados de acordo com a prioridade, e o PagerDuty é utilizado para o tratamento de incidentes.
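A ideia de implantação canário descrita acima (o Spinnaker faz o rollout e o Kayenta compara métricas) pode ser esboçada de forma bem simplificada. Os nomes de funções e o limite de tolerância abaixo são hipotéticos, apenas para ilustrar a decisão de promover ou reverter; a análise real do Kayenta é bem mais sofisticada:

```python
# Esboço hipotético de canário: uma pequena fração do tráfego vai para a
# versão nova; se a taxa de erros ficar próxima da versão estável, promove.
import random

def route(canary_fraction: float) -> str:
    """Decide se uma requisição vai para a versão 'canary' ou 'stable'."""
    return "canary" if random.random() < canary_fraction else "stable"

def evaluate_canary(canary_errors: int, canary_total: int,
                    stable_errors: int, stable_total: int,
                    tolerance: float = 0.01) -> str:
    """Compara taxas de erro e decide entre promover ou reverter."""
    canary_rate = canary_errors / canary_total
    stable_rate = stable_errors / stable_total
    return "promote" if canary_rate <= stable_rate + tolerance else "rollback"

# 0.2% de erros no canário vs. 0.15% no estável: dentro da tolerância
print(evaluate_canary(canary_errors=2, canary_total=1000,
                      stable_errors=15, stable_total=10000))  # promote
```

Na prática, a comparação usaria várias métricas (latência, erros, saturação) coletadas de um sistema como o Atlas, e a fração de tráfego do canário cresceria gradualmente antes da promoção completa.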
## Architecture patterns From c368e32c7e1713a269177ec5cc2f407eda2de63b Mon Sep 17 00:00:00 2001 From: Daniel Lombardi Date: Wed, 20 Dec 2023 09:19:57 -0300 Subject: [PATCH 06/19] =?UTF-8?q?Architecture=20Patterns=20Done=20?= =?UTF-8?q?=E2=9C=85?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Daniel Lombardi --- translations/README-ptbr.md | 60 ++++++++++++++++++------------------- 1 file changed, 30 insertions(+), 30 deletions(-) diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md index f05cf20..0954ea0 100644 --- a/translations/README-ptbr.md +++ b/translations/README-ptbr.md @@ -44,9 +44,9 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou - [CI/CD](#cicd) - [Pipeline CI/CD Explicado em Termos Simples](#pipeline-cicd-explicado-em-termos-simples) - [Netflix Stack Tecnológico (Pipeline de CI/CD)](#netflix-stack-tecnológico-pipeline-de-cicd) - - [Architecture patterns](#architecture-patterns) - - [MVC, MVP, MVVM, MVVM-C, and VIPER](#mvc-mvp-mvvm-mvvm-c-and-viper) - - [18 Key Design Patterns Every Developer Should Know](#18-key-design-patterns-every-developer-should-know) + - [Padrões de Arquitetura](#padrões-de-arquitetura) + - [MVC, MVP, MVVM, MVVM-C, e VIPER](#mvc-mvp-mvvm-mvvm-c-e-viper) + - [18 Padrões de Design Essenciais Que Todo Desenvolvedor Deve Conhecer](#18-padrões-de-design-essenciais-que-todo-desenvolvedor-deve-conhecer) - [Database](#database) - [A nice cheat sheet of different databases in cloud services](#a-nice-cheat-sheet-of-different-databases-in-cloud-services) - [8 Data Structures That Power Your Databases](#8-data-structures-that-power-your-databases) @@ -541,47 +541,47 @@ Monitoramento: As métricas de monitoramento são centralizadas no Atlas, e o Ka Relatório de Incidentes: Incidentes são despachados de acordo com a prioridade, e o PagerDuty é utilizado para o tratamento de incidentes.
-## Architecture patterns +## Padrões de Arquitetura -### MVC, MVP, MVVM, MVVM-C, and VIPER +### MVC, MVP, MVVM, MVVM-C, e VIPER -These architecture patterns are among the most commonly used in app development, whether on iOS or Android platforms. Developers have introduced them to overcome the limitations of earlier patterns. So, how do they differ? +Esses padrões de arquitetura estão entre os mais comumente utilizados no desenvolvimento de aplicativos, seja nas plataformas iOS ou Android. Os desenvolvedores os introduziram para superar as limitações de padrões anteriores. Então, como eles diferem?

-- MVC, the oldest pattern, dates back almost 50 years -- Every pattern has a "view" (V) responsible for displaying content and receiving user input -- Most patterns include a "model" (M) to manage business data -- "Controller," "presenter," and "view-model" are translators that mediate between the view and the model ("entity" in the VIPER pattern) +- MVC (Modelo-Visão-Controle, _Model View Controller_), o padrão mais antigo, tem quase 50 anos +- Cada padrão possui uma "visão", _view_ (V) responsável por exibir conteúdo e receber entrada do usuário +- A maioria dos padrões inclui um "modelo", _model_ (M) para gerenciar dados de negócio +- "Controller", "presenter" e "view-model" são tradutores que atuam como mediadores entre a "view" e o "model" (ou "entity" no padrão VIPER). -### 18 Key Design Patterns Every Developer Should Know +### 18 Padrões de Design Essenciais Que Todo Desenvolvedor Deve Conhecer -Patterns are reusable solutions to common design problems, resulting in a smoother, more efficient development process. They serve as blueprints for building better software structures. These are some of the most popular patterns: +Padrões são soluções reutilizáveis para problemas comuns de design, resultando em um processo de desenvolvimento mais fluido e eficiente. Eles servem como modelos para construir estruturas de software melhores. Aqui estão alguns dos padrões mais populares:

-- Abstract Factory: Family Creator - Makes groups of related items. -- Builder: Lego Master - Builds objects step by step, keeping creation and appearance separate. -- Prototype: Clone Maker - Creates copies of fully prepared examples. -- Singleton: One and Only - A special class with just one instance. -- Adapter: Universal Plug - Connects things with different interfaces. -- Bridge: Function Connector - Links how an object works to what it does. -- Composite: Tree Builder - Forms tree-like structures of simple and complex parts. -- Decorator: Customizer - Adds features to objects without changing their core. -- Facade: One-Stop-Shop - Represents a whole system with a single, simplified interface. -- Flyweight: Space Saver - Shares small, reusable items efficiently. -- Proxy: Stand-In Actor - Represents another object, controlling access or actions. -- Chain of Responsibility: Request Relay - Passes a request through a chain of objects until handled. -- Command: Task Wrapper - Turns a request into an object, ready for action. -- Iterator: Collection Explorer - Accesses elements in a collection one by one. -- Mediator: Communication Hub - Simplifies interactions between different classes. -- Memento: Time Capsule - Captures and restores an object's state. -- Observer: News Broadcaster - Notifies classes about changes in other objects. -- Visitor: Skillful Guest - Adds new operations to a class without altering it. +- Fábrica Abstrata (_Abstract Factory_): Criador de Famílias - Cria grupos de itens relacionados. +- Construtor (_Builder_): Mestre Lego - Constrói objetos passo a passo, mantendo a criação e a aparência separadas. +- Protótipo (_Prototype_): Criador de Clones - Cria cópias de exemplos totalmente preparados. +- Singleton: Único e Exclusivo - Uma classe especial com apenas uma instância. +- Adaptador (_Adapter_): Plugue Universal - Conecta coisas com interfaces diferentes. +- Ponte (_Bridge_): Conector Funcional - Liga como um objeto funciona ao que ele faz.
+- Compósito (_Composite_): Construtor de Árvores - Forma estruturas semelhantes a árvores com partes simples e complexas. +- Decorador (_Decorator_): Customizador - Adiciona funcionalidade a objetos sem alterar seu núcleo. +- Fachada (_Facade_): Tudo em Um - Representa um sistema inteiro com uma única interface simplificada. +- Peso Mosca (_Flyweight_): Economizador de Espaço - Compartilha itens pequenos e reutilizáveis de maneira eficiente. +- Proxy: Ator Substituto - Representa outro objeto, controlando acesso ou ações. +- Cadeia de Responsabilidades (_Chain of Responsibility_): Relé de Requisições - Passa uma requisição por uma cadeia de objetos até que seja tratada. +- Comando (_Command_): Envelopador de Tarefas - Transforma uma solicitação em um objeto, pronto para atuar. +- Iterador (_Iterator_): Explorador de Coleções - Acessa elementos em uma coleção, um a um. +- Mediador (_Mediator_): Central de Comunicações - Simplifica interações entre classes distintas. +- Lembrança (_Memento_): Cápsula do Tempo - Captura e restaura o estado de um objeto. +- Observador (_Observer_): Emissora de Notícias - Notifica classes sobre mudanças em outros objetos. +- Visitante (_Visitor_): Hóspede Habilidoso - Adiciona novas operações a uma classe sem alterá-la.
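Para tornar um desses padrões concreto, segue um esboço mínimo do Observador (Observer) em Python. A classe `Newsletter` e os nomes são hipotéticos, apenas para ilustrar a ideia da "emissora de notícias":

```python
# Esboço mínimo do padrão Observador: o sujeito mantém uma lista de
# inscritos e notifica todos quando algo muda.
class Newsletter:
    def __init__(self):
        self._subscribers = []  # observadores registrados

    def subscribe(self, callback):
        """Registra um observador (aqui, uma função de callback)."""
        self._subscribers.append(callback)

    def publish(self, article):
        """Notifica todos os observadores sobre o novo artigo."""
        for callback in self._subscribers:
            callback(article)

received = []
news = Newsletter()
news.subscribe(lambda a: received.append(f"email: {a}"))
news.subscribe(lambda a: received.append(f"push: {a}"))
news.publish("System Design 101")
print(received)  # ['email: System Design 101', 'push: System Design 101']
```

O ponto central do padrão é o desacoplamento: `Newsletter` não conhece os detalhes de quem a observa, e novos canais de notificação podem ser adicionados sem alterar a classe.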
## Database From 96b1fdc87f92b98a89aea962be0e1ee15252fda3 Mon Sep 17 00:00:00 2001 From: Daniel Lombardi Date: Wed, 20 Dec 2023 21:05:44 -0300 Subject: [PATCH 07/19] CAP --- translations/README-ptbr.md | 78 ++++++++++++++++++------------------- 1 file changed, 39 insertions(+), 39 deletions(-) diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md index 0954ea0..f966a75 100644 --- a/translations/README-ptbr.md +++ b/translations/README-ptbr.md @@ -47,11 +47,11 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou - [Padrões de Arquitetura](#padrões-de-arquitetura) - [MVC, MVP, MVVM, MVVM-C, e VIPER](#mvc-mvp-mvvm-mvvm-c-e-viper) - [18 Padrões de Design Essenciais Que Todo Desenvolvedor Deve Conhecer](#18-padrões-de-design-essenciais-que-todo-desenvolvedor-deve-conhecer) - - [Database](#database) - - [A nice cheat sheet of different databases in cloud services](#a-nice-cheat-sheet-of-different-databases-in-cloud-services) - - [8 Data Structures That Power Your Databases](#8-data-structures-that-power-your-databases) - - [How is an SQL statement executed in the database?](#how-is-an-sql-statement-executed-in-the-database) - - [CAP theorem](#cap-theorem) + - [Bancos de Dados](#bancos-de-dados) + - [Um guia prático de diferentes bancos de dados em serviços de nuvem](#um-guia-prático-de-diferentes-bancos-de-dados-em-serviços-de-nuvem) + - [8 Estruturas de Dados que Impulsionam seus Bancos de Dados](#8-estruturas-de-dados-que-impulsionam-seus-bancos-de-dados) + - [Como um comando SQL é executado no Banco de Dados?](#como-um-comando-sql-é-executado-no-banco-de-dados) + - [Teorema CAP](#teorema-cap) - [Types of Memory and Storage](#types-of-memory-and-storage) - [Visualizing a SQL query](#visualizing-a-sql-query) - [SQL language](#sql-language) @@ -583,82 +583,82 @@ Padrões são soluções reutilizáveis para problemas comuns de design, resulta - Observador (_Observer_): Emissora de Notícias - Notifica classes sobre mudanças em outros
objetos. - Visitante (_Visitor_): Hóspede Habilidoso - Adiciona novas operações a uma classe sem alterá-la. -## Database +## Bancos de Dados -### A nice cheat sheet of different databases in cloud services +### Um guia prático de diferentes bancos de dados em serviços de nuvem

-Choosing the right database for your project is a complex task. Many database options, each suited to distinct use cases, can quickly lead to decision fatigue. +Escolher o banco de dados correto para o seu projeto é uma tarefa complexa. Muitas opções de bancos de dados, cada uma adequada a casos de uso distintos, podem rapidamente levar à fadiga de decisões. -We hope this cheat sheet provides high-level direction to pinpoint the right service that aligns with your project's needs and avoid potential pitfalls. +Esperamos que este guia prático forneça direcionamento de alto nível para identificar o serviço correto que esteja alinhado com as necessidades do seu projeto e evite possíveis ciladas. -Note: Google has limited documentation for their database use cases. Even though we did our best to look at what was available and arrived at the best option, some of the entries may need to be more accurate. +Nota: O Google possui documentação limitada para os casos de uso de seus bancos de dados. Mesmo que tenhamos feito o nosso melhor para examinar o que estava disponível e chegar à melhor opção, algumas das entradas podem não ser totalmente precisas. -### 8 Data Structures That Power Your Databases +### 8 Estruturas de Dados que Impulsionam seus Bancos de Dados -The answer will vary depending on your use case. Data can be indexed in memory or on disk. Similarly, data formats vary, such as numbers, strings, geographic coordinates, etc. The system might be write-heavy or read-heavy. All of these factors affect your choice of database index format. +A resposta irá variar dependendo do seu caso de uso. Dados podem ser indexados em memória ou em disco. Similarmente, os formatos dos dados variam, como números, strings, coordenadas geográficas etc. O sistema pode ser intensivo em escrita (write-heavy) ou intensivo em leitura (read-heavy). Todos esses fatores afetam a escolha do formato de índice do banco de dados.

-The following are some of the most popular data structures used for indexing data: +A seguir estão algumas das estruturas de dados mais populares usadas para indexar dados: -- Skiplist: a common in-memory index type. Used in Redis -- Hash index: a very common implementation of the “Map” data structure (or “Collection”) -- SSTable: immutable on-disk “Map” implementation -- LSM tree: Skiplist + SSTable. High write throughput -- B-tree: disk-based solution. Consistent read/write performance -- Inverted index: used for document indexing. Used in Lucene -- Suffix tree: for string pattern search -- R-tree: multi-dimension search, such as finding the nearest neighbor +- Skiplist: um tipo comum de índice em memória. Usado no Redis +- Índice de hash: uma implementação muito comum da estrutura de dados "Mapa" (ou "Coleção") +- SSTable: implementação em disco e imutável do "Mapa" +- Árvore LSM: Skiplist + SSTable. Alta taxa de gravação +- B-tree: solução baseada em disco. Desempenho de leitura/gravação consistente +- Índice invertido: usado para indexação de documentos. Usado no Lucene +- Árvore de sufixos: para pesquisa de padrões em strings +- R-tree: pesquisa multidimensional, como encontrar o vizinho mais próximo -### How is an SQL statement executed in the database? +### Como um comando SQL é executado no Banco de Dados? -The diagram below shows the process. Note that the architectures for different databases are different, the diagram demonstrates some common designs. +O diagrama abaixo demonstra o processo. Note que a arquitetura de diferentes bancos são diferentes, o diagrama apresenta alguns designs comuns.

-Step 1 - A SQL statement is sent to the database via a transport layer protocol (e.g.TCP). +Passo 1 - Uma instrução SQL é enviada para o banco de dados por meio de um protocolo de camada de transporte (por exemplo, TCP). -Step 2 - The SQL statement is sent to the command parser, where it goes through syntactic and semantic analysis, and a query tree is generated afterward. +Passo 2 - A instrução SQL é enviada ao analisador (_parser_) de comandos, onde passa por análise sintática e semântica, e em seguida, uma árvore de consulta é gerada. -Step 3 - The query tree is sent to the optimizer. The optimizer creates an execution plan. +Passo 3 - A árvore de consulta é enviada ao otimizador. O otimizador cria um plano de execução. -Step 4 - The execution plan is sent to the executor. The executor retrieves data from the execution. +Passo 4 - O plano de execução é enviado ao executor. O executor recupera os dados da execução. -Step 5 - Access methods provide the data fetching logic required for execution, retrieving data from the storage engine. +Passo 5 - Métodos de acesso fornecem a lógica de recuperação de dados necessária para a execução, recuperando dados do mecanismo de armazenamento (_storage engine_). -Step 6 - Access methods decide whether the SQL statement is read-only. If the query is read-only (SELECT statement), it is passed to the buffer manager for further processing. The buffer manager looks for the data in the cache or data files. +Passo 6 - Os métodos de acesso decidem se a instrução SQL é somente leitura. Se a consulta for somente leitura (instrução SELECT), ela é enviada para o gerenciador de buffer para processamento adicional. O gerenciador de buffer procura os dados no cache ou nos arquivos de dados. -Step 7 - If the statement is an UPDATE or INSERT, it is passed to the transaction manager for further processing. +Passo 7 - Se a instrução for um UPDATE ou INSERT, ela é enviada para o gerenciador de transações para processamento adicional. 
-Step 8 - During a transaction, the data is in lock mode. This is guaranteed by the lock manager. It also ensures the transaction’s ACID properties. +Passo 8 - Durante uma transação, os dados estão em modo de bloqueio. Isso é garantido pelo gerenciador de bloqueio. Ele também assegura as propriedades ACID da transação. -### CAP theorem +### Teorema CAP -The CAP theorem is one of the most famous terms in computer science, but I bet different developers have different understandings. Let’s examine what it is and why it can be confusing. +O Teorema CAP é um dos termos mais famosos na ciência da computação, mas aposto que desenvolvedores diferentes têm interpretações diferentes. Vamos examinar o que é e por que pode ser confuso.

-CAP theorem states that a distributed system can't provide more than two of these three guarantees simultaneously. +O teorema CAP afirma que um sistema distribuído não pode fornecer mais do que duas destas três garantias simultaneamente. -**Consistency**: consistency means all clients see the same data at the same time no matter which node they connect to. +**Consistência**: consistência significa que todos os clientes enxergam os mesmos dados ao mesmo tempo, não importando em qual nó eles se conectam. -**Availability**: availability means any client that requests data gets a response even if some of the nodes are down. +**Disponibilidade**: disponibilidade significa que qualquer cliente que realizar uma requisição de dados terá uma resposta, mesmo que alguns nós estejam fora do ar. -**Partition Tolerance**: a partition indicates a communication break between two nodes. Partition tolerance means the system continues to operate despite network partitions. +**Tolerância a Partições**: uma partição indica uma quebra na comunicação entre dois nós. Tolerância a partições significa que o sistema continua em operação apesar de partições de rede. -The “2 of 3” formulation can be useful, **but this simplification could be misleading**. +A formulação "2 de 3" pode ser útil, **mas essa simplificação pode ser enganosa**. -1. Picking a database is not easy. Justifying our choice purely based on the CAP theorem is not enough. For example, companies don't choose Cassandra for chat applications simply because it is an AP system. There is a list of good characteristics that make Cassandra a desirable option for storing chat messages. We need to dig deeper. +1. Escolher um banco de dados não é fácil. Justificar sua escolha puramente no teorema CAP não é o suficiente. Por exemplo, companhias não escolhem Cassandra para aplicativos de chat simplesmente por ser um sistema AP. Há uma lista de características que tornam o Cassandra uma boa opção para armazenamento de mensagens de chat.
Precisamos cavar mais fundo. 2. “CAP prohibits only a tiny part of the design space: perfect availability and consistency in the presence of partitions, which are rare”. Quoted from the paper: CAP Twelve Years Later: How the “Rules” Have Changed. From ac1393fdd1d05216fe1709f0d92ad705e6bd6d56 Mon Sep 17 00:00:00 2001 From: Daniel Lombardi Date: Wed, 20 Dec 2023 21:14:24 -0300 Subject: [PATCH 08/19] CAP theorem Done Signed-off-by: Daniel Lombardi --- translations/README-ptbr.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md index f966a75..71d320a 100644 --- a/translations/README-ptbr.md +++ b/translations/README-ptbr.md @@ -660,13 +660,13 @@ A formulação "2 de 3" pode ser útil, **mas essa simplificação pode ser enga 1. Escolher um banco de dados não é fácil. Justificar sua escolha puramente no teorema CAP não é o suficiente. Por exemplo, companhias não escolhem Cassandra para aplicativos de chat simplesmente por ser um sistema AP. Há uma lista de características que tornam o Cassandra uma boa opção para armazenamento de mensagens de chat. Precisamos cavar mais fundo. -2. “CAP prohibits only a tiny part of the design space: perfect availability and consistency in the presence of partitions, which are rare”. Quoted from the paper: CAP Twelve Years Later: How the “Rules” Have Changed. +2. "CAP proíbe apenas uma pequena parte do espaço de design: disponibilidade e consistência perfeitas na presença de partições, que são raras". Citado do artigo: CAP Twelve Years Later: How the "Rules" Have Changed. -3. The theorem is about 100% availability and consistency. A more realistic discussion would be the trade-offs between latency and consistency when there is no network partition. See PACELC theorem for more details. +3. O teorema trata de disponibilidade e consistência de 100%. Uma discussão mais realista envolveria as compensações entre latência e consistência quando não há partição de rede.
Consulte o teorema PACELC para obter mais detalhes.

-**Is the CAP theorem actually useful?**
+**O teorema CAP é de fato útil?**

-I think it is still useful as it opens our minds to a set of tradeoff discussions, but it is only part of the story. We need to dig deeper when picking the right database.
+Acredito que ainda é útil por abrir nossas mentes para um conjunto de discussões sobre compensações, mas é apenas uma parte da história. Precisamos nos aprofundar para escolher o banco de dados correto.

### Types of Memory and Storage

From b6cc4654ffdcea30509a66ddd03733fb61913967 Mon Sep 17 00:00:00 2001
From: Daniel Lombardi
Date: Wed, 20 Dec 2023 21:26:16 -0300
Subject: [PATCH 09/19] Databases Done

Signed-off-by: Daniel Lombardi
---
 translations/README-ptbr.md | 48 +++++++++++++++++++------------------
 1 file changed, 25 insertions(+), 23 deletions(-)

diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md
index 71d320a..f64de02 100644
--- a/translations/README-ptbr.md
+++ b/translations/README-ptbr.md
@@ -52,9 +52,9 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou
 - [8 Estruturas de Dados que Impulsionam seus Bancos de Dados](#8-estruturas-de-dados-que-impulsionam-seus-bancos-de-dados)
 - [Como um comando SQL é executado no Banco de Dados?](#como-um-comando-sql-é-executado-no-banco-de-dados)
 - [Teorema CAP](#teorema-cap)
-  - [Types of Memory and Storage](#types-of-memory-and-storage)
-  - [Visualizing a SQL query](#visualizing-a-sql-query)
-  - [SQL language](#sql-language)
+  - [Tipos de Memória e Armazenamento](#tipos-de-memória-e-armazenamento)
+  - [Visualizando uma consulta SQL](#visualizando-uma-consulta-sql)
+  - [Linguagem SQL](#linguagem-sql)
 - [Cache](#cache)
  - [Data is cached everywhere](#data-is-cached-everywhere)
  - [Why is Redis so fast?](#why-is-redis-so-fast)
  - [How can Redis be used?](#how-can-redis-be-used)
  - [Top caching strategies](#top-caching-strategies)
@@ -668,49 +668,51 @@ A formulação "2 de 3" pode ser útil, **mas essa simplificação pode ser enga
 Acredito que ainda é útil por abrir nossas mentes
para um conjunto de discussões sobre compensações, mas é apenas uma parte da história. Precisamos nos aprofundar para escolher o banco de dados correto.

-### Types of Memory and Storage
+### Tipos de Memória e Armazenamento

-### Visualizing a SQL query +### Visualizando uma consulta SQL

-SQL statements are executed by the database system in several steps, including: +As instruções SQL são executadas pelo sistema do banco de dados em várias etapas, incluindo: -- Parsing the SQL statement and checking its validity -- Transforming the SQL into an internal representation, such as relational algebra -- Optimizing the internal representation and creating an execution plan that utilizes index information -- Executing the plan and returning the results +- Analisando a instrução SQL e verificando sua validade +- Transformando o SQL em uma representação interna, como álgebra relacional +- Otimizando a representação interna e criando um plano de execução que utiliza informações de índices +- Executando o plano e retornando os resultados -The execution of SQL is highly complex and involves many considerations, such as: +A execução do SQL é altamente complexa e envolve muitas considerações, tais como: -- The use of indexes and caches -- The order of table joins -- Concurrency control -- Transaction management +- O uso de índices e caches +- A ordem de junções de tabelas +- Controle de concorrência +- Gerenciamento de transações -### SQL language +### Linguagem SQL In 1986, SQL (Structured Query Language) became a standard. Over the next 40 years, it became the dominant language for relational database management systems. Reading the latest standard (ANSI SQL 2016) can be time-consuming. How can I learn it? +Em 1986, SQL (Linguagem de Busca Estruturada, _Structured Query Language_) se tornou um padrão. Ao longo os próximos 40 anos, ela se tornou a linguagem dominante para sistemas de manuseamento de bancos de dados relacionais. Ler o último padrão (ANSI SQL 2016) pode ser demorado. Como posso aprenê-lo? +

-There are 5 components of the SQL language: +Há 5 componentes da linguagem SQL: -- DDL: data definition language, such as CREATE, ALTER, DROP -- DQL: data query language, such as SELECT -- DML: data manipulation language, such as INSERT, UPDATE, DELETE -- DCL: data control language, such as GRANT, REVOKE -- TCL: transaction control language, such as COMMIT, ROLLBACK +- DDL: linguagem de definição de dados, como CREATE, ALTER, DROP +- DQL: linguagem de consulta de dados, como SELECT +- DML: linguagem de manipulação de dados, como INSERT, UPDATE, DELETE +- DCL: linguagem de controle de dados, como GRANT, REVOKE +- TCL: linguagem de controle de transações, como COMMIT, ROLLBACK -For a backend engineer, you may need to know most of it. As a data analyst, you may need to have a good understanding of DQL. Select the topics that are most relevant to you. +Para um engenheiro de backend, pode ser necessário saber a maior parte deles. Como um analista de dados, é importante ter uma boa noção do DQL. Selecione os tópicos que são mais relevantes para você. 
## Cache

From 848c3e5b7cf60e38fed6b47c93cfc116bd5bcbf7 Mon Sep 17 00:00:00 2001
From: Daniel Lombardi
Date: Wed, 20 Dec 2023 22:18:45 -0300
Subject: [PATCH 10/19] Cache Done

Signed-off-by: Daniel Lombardi
---
 translations/README-ptbr.md | 101 ++++++++++++++++++------------------
 1 file changed, 51 insertions(+), 50 deletions(-)

diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md
index f64de02..e91466c 100644
--- a/translations/README-ptbr.md
+++ b/translations/README-ptbr.md
@@ -56,10 +56,10 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou
 - [Visualizando uma consulta SQL](#visualizando-uma-consulta-sql)
 - [Linguagem SQL](#linguagem-sql)
 - [Cache](#cache)
-  - [Data is cached everywhere](#data-is-cached-everywhere)
-  - [Why is Redis so fast?](#why-is-redis-so-fast)
-  - [How can Redis be used?](#how-can-redis-be-used)
-  - [Top caching strategies](#top-caching-strategies)
+  - [Dados são armazenados em cache em toda parte](#dados-são-armazenados-em-cache-em-toda-parte)
+  - [Por que o Redis é tão rápido?](#por-que-o-redis-é-tão-rápido)
+  - [Como o Redis pode ser utilizado?](#como-o-redis-pode-ser-utilizado)
+  - [Principais Estratégias de Cache](#principais-estratégias-de-cache)
 - [Microservice architecture](#microservice-architecture)
 - [What does a typical microservice architecture look like?](#what-does-a-typical-microservice-architecture-look-like)
 - [Microservice Best Practices](#microservice-best-practices)
 - [What tech stack is commonly used for microservices?](#what-tech-stack-is-commonly-used-for-microservices)
@@ -716,101 +716,102 @@ Para um engenheiro de backend, pode ser necessário saber a maior parte deles. C

## Cache

-### Data is cached everywhere
+### Dados são armazenados em cache em toda parte

-This diagram illustrates where we cache data in a typical architecture.
+Este diagrama ilustra onde armazenamos dados em cache em uma arquitetura típica.

-There are **multiple layers** along the flow. +Existem múltiplas camadas ao longo do fluxo. -1. Client apps: HTTP responses can be cached by the browser. We request data over HTTP for the first time, and it is returned with an expiry policy in the HTTP header; we request data again, and the client app tries to retrieve the data from the browser cache first. -2. CDN: CDN caches static web resources. The clients can retrieve data from a CDN node nearby. -3. Load Balancer: The load Balancer can cache resources as well. -4. Messaging infra: Message brokers store messages on disk first, and then consumers retrieve them at their own pace. Depending on the retention policy, the data is cached in Kafka clusters for a period of time. -5. Services: There are multiple layers of cache in a service. If the data is not cached in the CPU cache, the service will try to retrieve the data from memory. Sometimes the service has a second-level cache to store data on disk. -6. Distributed Cache: Distributed cache like Redis holds key-value pairs for multiple services in memory. It provides much better read/write performance than the database. -7. Full-text Search: we sometimes need to use full-text searches like Elastic Search for document search or log search. A copy of data is indexed in the search engine as well. -8. Database: Even in the database, we have different levels of caches: +1. Aplicativos cliente: As respostas HTTP podem ser armazenadas em cache pelo navegador. Solicitamos dados pela primeira vez por meio do HTTP, e eles são retornados com uma política de expiração no cabeçalho HTTP; solicitamos os dados novamente, e o aplicativo cliente tenta recuperar os dados primeiro do cache do navegador. -- WAL(Write-ahead Log): data is written to WAL first before building the B tree index -- Bufferpool: A memory area allocated to cache query results +2. CDN: CDN (Rede de distribuição de Conteúdos, _Content Delivery Network_) cacha recursos web estáticos. 
Os clientes podem recuperar dados de um nó de CDN próximo.
+3. Distribuidor de Cargas: O distribuidor de cargas também pode armazenar recursos em cache.
+4. Infraestrutura de mensagens: Corretores de Mensagens (_Message Brokers_) armazenam mensagens primeiramente no disco, e depois os consumidores as recuperam em seu próprio ritmo. Dependendo da política de retenção, os dados ficam armazenados em cache nos clusters Kafka por um período de tempo.
+5. Serviços: Há múltiplas camadas de cache em um serviço. Se os dados não estiverem no cache da CPU, o serviço tentará recuperá-los da memória. Às vezes o serviço tem uma segunda camada de cache para armazenar dados em disco.
+6. Cache Distribuído: Um cache distribuído como o Redis armazena pares chave-valor para múltiplos serviços em memória. Ele provê performance de leitura/escrita muito melhor que o banco de dados.
+7. Pesquisa de Texto Completo (_Full-text Search_): às vezes precisamos usar pesquisa de texto completo, como o Elasticsearch, para buscas em documentos ou logs. Uma cópia dos dados também é indexada na ferramenta de busca.
+8. Banco de Dados: Até mesmo nos bancos de dados, temos diferentes níveis de cache.
+
+- WAL (Write-ahead Log): os dados são gravados primeiro no WAL antes de construir o índice B-tree.
+- Bufferpool: Uma área de memória alocada para armazenar em cache os resultados das consultas
-- Materialized View: Pre-compute query results and store them in the database tables for better query performance
+- Visualização Materializada: Pré-calcular os resultados da consulta e armazená-los nas tabelas do banco de dados para melhorar o desempenho das consultas
-- Transaction log: record all the transactions and database updates
-- Replication Log: used to record the replication state in a database cluster
+- Log de Transação: registrar todas as transações e atualizações do banco de dados
+- Log de Replicação: usado para registrar o estado de replicação em um cluster de banco de dados.

-### Why is Redis so fast?
+### Por que o Redis é tão rápido? -There are 3 main reasons as shown in the diagram below. +Há 3 razões principais, como mostrado no diagrama abaixo.

-1. Redis is a RAM-based data store. RAM access is at least 1000 times faster than random disk access. -2. Redis leverages IO multiplexing and single-threaded execution loop for execution efficiency. -3. Redis leverages several efficient lower-level data structures. +1. Redis é um armazenamento baseado em memória RAM. O acesso à RAM é pelo menos 1000 vezes mais rápido que um acesso aleatório ao disco. +2. O Redis alavanca multipleximento de IO e um loop single-threaded para eficácia de execução. +3. O Redis utiliza algumas estruturas eficázes de baixo nível. -Question: Another popular in-memory store is Memcached. Do you know the differences between Redis and Memcached? +Pergunta: Outro armazenamento em memória popular é o Memcached. Você sabe as diferenças entre Redis e Memcached? -You might have noticed the style of this diagram is different from my previous posts. Please let me know which one you prefer. +Você pode ter notado que o estilo deste diagrama é diferente dos posts anteriores. Por favor me avise sobre qual você prefere. -### How can Redis be used? +### Como o Redis pode ser utilizado?

-There is more to Redis than just caching. +Há mais no redis do que apenas caching. -Redis can be used in a variety of scenarios as shown in the diagram. +O Redis pode ser utilizado em uma variedade de cenários, conforme mostrado no diagrama. -- Session +- Sessão - We can use Redis to share user session data among different services. + Nós podemos utilizar o Redis para compartilhar a sessão do usuário entre diferentes serviços. - Cache - We can use Redis to cache objects or pages, especially for hotspot data. + Nós podemos utilizar o Redis para cachar objetos ou páginas, especialmente para dados quentes (acessados com frequência alta, _hotspot data_). -- Distributed lock +- Trava distribuída - We can use a Redis string to acquire locks among distributed services. + Podemos usar o tipo de dado String do Redis para adquirir uma trava (_lock_) dentre serviços distribuídos. -- Counter +- Contador - We can count how many likes or how many reads for articles. + Podemos contar quantos likes ou quantas leituras em um artigo. -- Rate limiter +- Limitador de Taxa - We can apply a rate limiter for certain user IPs. + Podemos aplicar um limitador de taxa para determinados IPs de usuários. -- Global ID generator +- Gerador de ID Global - We can use Redis Int for global ID. + Podemos usar o tipo de dado Int no Redis para gerar IDs globais. -- Shopping cart +- Carrinho de Compras - We can use Redis Hash to represent key-value pairs in a shopping cart. + Podemos usar o tipo de dado Hash no Redis para representar pares chave-valor em um carrinho de compras. -- Calculate user retention +- Calcular Retenção de Usuários - We can use Bitmap to represent the user login daily and calculate user retention. + Podemos usar o tipo de dado Bitmap no Redis para representar os logins diários dos usuários e calcular a retenção de usuários. -- Message queue +- Fila de Mensagens - We can use List for a message queue. + Podemos usar o tipo de dado Lista no Redis para uma fila de mensagens. 
- Ranking

- We can use ZSet to sort the articles.
+ Podemos usar o tipo de dado ZSet (conjunto ordenado) no Redis para classificar os artigos.

-### Top caching strategies
+### Principais Estratégias de Cache

-Designing large-scale systems usually requires careful consideration of caching.
-Below are five caching strategies that are frequently utilized.
+Projetar sistemas de larga escala geralmente requer consideração cuidadosa de caching. Abaixo estão cinco estratégias de caching frequentemente utilizadas.
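A mais comum dessas estratégias, cache-aside, pode ser esboçada assim (o "banco de dados" é um simples dicionário, apenas para ilustrar o fluxo de leitura e invalidação; nomes hipotéticos):

```python
cache = {}
banco_de_dados = {"usuario:42": {"nome": "Ana"}}  # simula o banco de dados

def ler_com_cache_aside(chave):
    """Cache-aside: a aplicação consulta o cache primeiro;
    em caso de miss, lê do banco e preenche o cache."""
    if chave in cache:                 # cache hit
        return cache[chave]
    valor = banco_de_dados.get(chave)  # cache miss: vai ao banco
    if valor is not None:
        cache[chave] = valor           # popula o cache para as próximas leituras
    return valor

def escrever(chave, valor):
    """Na escrita, atualiza o banco e invalida a entrada do cache."""
    banco_de_dados[chave] = valor
    cache.pop(chave, None)

print(ler_com_cache_aside("usuario:42"))  # miss: busca no banco
print(ler_com_cache_aside("usuario:42"))  # hit: vem do cache
```

Invalidar na escrita (em vez de atualizar o cache) evita servir dados obsoletos quando duas escritas concorrem pela mesma chave.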

From 058ff52f53150cba450fb9a51bdae98aaef84977 Mon Sep 17 00:00:00 2001 From: Daniel Lombardi Date: Thu, 21 Dec 2023 11:30:26 -0300 Subject: [PATCH 11/19] Microsservices Done Signed-off-by: Daniel Lombardi --- translations/README-ptbr.md | 122 +++++++++++++++++++----------------- 1 file changed, 64 insertions(+), 58 deletions(-) diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md index e91466c..e3de6c2 100644 --- a/translations/README-ptbr.md +++ b/translations/README-ptbr.md @@ -60,11 +60,12 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou - [Por que o Redis é tão rápido?](#por-que-o-redis-é-tão-rápido) - [Como o Redis pode ser utilizado?](#como-o-redis-pode-ser-utilizado) - [Principais Estratégias de Cache](#principais-estratégias-de-cache) - - [Microservice architecture](#microservice-architecture) - - [What does a typical microservice architecture look like?](#what-does-a-typical-microservice-architecture-look-like) - - [Microservice Best Practices](#microservice-best-practices) - - [What tech stack is commonly used for microservices?](#what-tech-stack-is-commonly-used-for-microservices) + - [Arquiteturas de Microsserviços](#arquiteturas-de-microsserviços) + - [Como é uma arquitetura típica de microsserviços?](#como-é-uma-arquitetura-típica-de-microsserviços) + - [Melhores Práticas em Microsserviços](#melhores-práticas-em-microsserviços) + - [Qual pilha tecnológica é comumente utilizada para microsserviços?](#qual-pilha-tecnológica-é-comumente-utilizada-para-microsserviços) - [Why is Kafka fast](#why-is-kafka-fast) + - [Por quê Kafka é tão rápido](#por-quê-kafka-é-tão-rápido) - [Payment systems](#payment-systems) - [How to learn payment systems?](#how-to-learn-payment-systems) - [Why is the credit card called “the most profitable product in banks”? 
How does VISA/Mastercard make money?](#why-is-the-credit-card-called-the-most-profitable-product-in-banks-how-does-visamastercard-make-money) @@ -817,53 +818,54 @@ Projetar sistemas de larga escala geralmente requer consideração cuidadosa de

-## Microservice architecture +## Arquiteturas de Microsserviços -### What does a typical microservice architecture look like? +### Como é uma arquitetura típica de microsserviços? + +O diagrama abaixo mostra uma arquitetura típica de microsservissos.

-The diagram below shows a typical microservice architecture. - -- Load Balancer: This distributes incoming traffic across multiple backend services. -- CDN (Content Delivery Network): CDN is a group of geographically distributed servers that hold static content for faster delivery. The clients look for content in CDN first, then progress to backend services. -- API Gateway: This handles incoming requests and routes them to the relevant services. It talks to the identity provider and service discovery. -- Identity Provider: This handles authentication and authorization for users. -- Service Registry & Discovery: Microservice registration and discovery happen in this component, and the API gateway looks for relevant services in this component to talk to. -- Management: This component is responsible for monitoring the services. -- Microservices: Microservices are designed and deployed in different domains. Each domain has its own database. The API gateway talks to the microservices via REST API or other protocols, and the microservices within the same domain talk to each other using RPC (Remote Procedure Call). +- Distribuidor de Cargas: Isso distribui tráfego de entrada para multiplos serviços de backend. +- CDN (Rede de Distribuição de Serviços, _Content Delivery Network_): CDN é um grupo de servidores distribuídos geograficamente que armazenam conteúdos estáticos para entrega mais rápida. O cliente procura por conteúdo primeiro no CDN, e apenas depois para os serviços de backend. +- API Gateway: Isso lida com as solicitações recebidas e as direciona para os serviços relevantes. Ele se comunica com o provedor de identidade e descoberta de serviços. +- Provedor de identidade (_Identity Provider_): Isso lida com autenticação e autorização para os usuários. +- Registro e Descoberta de Serviços: O registro e a descoberta de microsserviços ocorrem neste componente, e o API Gateway procura por serviços relevantes neste componente para se comunicar. 
+- Gerenciamento: Este componente é responsável por monitorar os serviços.
-- Management: This component is responsible for monitoring the services.
-- Microservices: Microservices are designed and deployed in different domains. Each domain has its own database. The API gateway talks to the microservices via REST API or other protocols, and the microservices within the same domain talk to each other using RPC (Remote Procedure Call).
+- Microsserviços: Microsserviços são projetados e implantados em diferentes domínios. Cada domínio tem seu próprio banco de dados. O API Gateway se comunica com os microsserviços por meio de API REST ou outros protocolos, e os microsserviços dentro do mesmo domínio se comunicam entre si usando RPC (Chamada de Procedimento Remoto).

-Benefits of microservices:
+Benefícios de microsserviços:

-- They can be quickly designed, deployed, and horizontally scaled.
-- Each domain can be independently maintained by a dedicated team.
-- Business requirements can be customized in each domain and better supported, as a result.
+- Podem ser rapidamente projetados, implantados e escalados horizontalmente.
+- Cada domínio pode ser mantido independentemente por uma equipe dedicada.
+- Os requisitos de negócios podem ser personalizados em cada domínio e, como resultado, melhor suportados.

-### Microservice Best Practices
+### Melhores Práticas em Microsserviços

-A picture is worth a thousand words: 9 best practices for developing microservices.
+Uma imagem vale por mil palavras: 9 melhores práticas para desenvolver microsserviços.

-When we develop microservices, we need to follow the following best practices: +Quando desenvolvemos microsserviços, nós precisamos seguir as seguintes melhores práticas: -1. Use separate data storage for each microservice -2. Keep code at a similar level of maturity -3. Separate build for each microservice -4. Assign each microservice with a single responsibility -5. Deploy into containers -6. Design stateless services -7. Adopt domain-driven design -8. Design micro frontend -9. Orchestrating microservices +1. Utilize armazenamento de dados separado para cada microsserviço +2. Mantenha o código em um nível semelhante de maturidade +3. Faça compilação separada para cada microsserviço +4. Atribua a cada microsserviço uma única responsabilidade +5. Implante em containers +6. Projete serviços sem estado +7. Adote o design orientado por domínio +8. Projete micro frontends +9. Orquestre os microsserviços -### What tech stack is commonly used for microservices? +### Qual pilha tecnológica é comumente utilizada para microsserviços? -Below you will find a diagram showing the microservice tech stack, both for the development phase and for production. +Abaixo você irá encontrar um diagrama mostrando a pilha tecnológica de microsserviços, tanto para fase de desenvolvimento como para produção.

@@ -871,53 +873,57 @@ Below you will find a diagram showing the microservice tech stack, both for the ▶️ 𝐏𝐫𝐞-𝐏𝐫𝐨𝐝𝐮𝐜𝐭𝐢𝐨𝐧 -- Define API - This establishes a contract between frontend and backend. We can use Postman or OpenAPI for this. -- Development - Node.js or react is popular for frontend development, and java/python/go for backend development. Also, we need to change the configurations in the API gateway according to API definitions. -- Continuous Integration - JUnit and Jenkins for automated testing. The code is packaged into a Docker image and deployed as microservices. +- Definir a API - Isso estabelece o contrato entre frontend e backend. Nós podemos utilizar Postman ou OpenAPI pra isso. +- Desenvolvimento - Node.js ou react são populares para desenvolvimento frontend, e java/python/go para desenvolvimento backend. Além disso, nós precisamos mudar as configurações no API Gateway de acordo com as definições da API. +- Integração Contínua - Junit e Jenkins para testes automatizados. O código é empacotado em uma imagem Docker e implantado como microsserviços. ▶️ 𝐏𝐫𝐨𝐝𝐮𝐜𝐭𝐢𝐨𝐧 -- NGinx is a common choice for load balancers. Cloudflare provides CDN (Content Delivery Network). -- API Gateway - We can use spring boot for the gateway, and use Eureka/Zookeeper for service discovery. +- NGinx é uma escolha comum para distribuidor de carga. A Cloudflare providencia um CDN (Content Delivery Network). +- API Gateway - Nós podemos utilizar o spring boot para o gateway, e usar o Eureka/Zookeeper para descobrimento de serviços. +- Os microsserviços são implantados em clouds. Nós temos opções como AWS, Microsoft Azure ou Google GCP. +- Cache and Busca - The microservices are deployed on clouds. We have options among AWS, Microsoft Azure, or Google GCP. - Cache and Full-text Search - Redis is a common choice for caching key-value pairs. Elasticsearch is used for full-text search. -- Communications - For services to talk to each other, we can use messaging infra Kafka or RPC. 
-- Persistence - We can use MySQL or PostgreSQL for a relational database, and Amazon S3 for object store. We can also use Cassandra for the wide-column store if necessary.
-- Management & Monitoring - To manage so many microservices, the common Ops tools include Prometheus, Elastic Stack, and Kubernetes.
+- Cache e Busca de Texto Completo - Redis é uma escolha comum para armazenamento em cache de pares chave-valor. Elasticsearch é utilizado para busca de texto completo.
+- Comunicações - Para os serviços se comunicarem uns com os outros, podemos utilizar infraestrutura de mensageria, como Kafka, ou RPC (Chamada de Procedimento Remoto, _Remote Procedure Call_).
+- Persistência - Podemos utilizar MySQL ou PostgreSQL para bancos de dados relacionais e Amazon S3 para armazenamento de objetos. Também podemos utilizar Cassandra para armazenamento wide-column (coluna larga) se necessário.
+- Gerenciamento & Monitoramento - Para gerenciar tantos microsserviços, as ferramentas comuns incluem Prometheus, Elastic Stack e Kubernetes.

 ### Why is Kafka fast

-There are many design decisions that contributed to Kafka’s performance. In this post, we’ll focus on two. We think these two carried the most weight.
+### Por que o Kafka é tão rápido
+
+Houve muitas decisões de design que contribuíram para a performance do Kafka. Neste post, vamos focar em duas. Acreditamos que estas duas tiveram o maior impacto.
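Um esboço (hipotético, em Python) da primeira dessas decisões — E/S sequencial: um segmento de partição é essencialmente um log append-only, escrito e lido em sequência, nunca reescrito no meio:

```python
import tempfile

# Esboço ilustrativo: um "segmento de partição" é apenas um arquivo append-only
segmento = tempfile.NamedTemporaryFile(delete=False)
offsets = []  # posição inicial de cada mensagem no arquivo

for msg in [b"pedido criado", b"pedido pago", b"pedido enviado"]:
    offsets.append(segmento.tell())                    # offset da próxima escrita sequencial
    segmento.write(len(msg).to_bytes(4, "big") + msg)  # tamanho (4 bytes) + payload
segmento.close()

# Um consumidor lê a partir de um offset, também de forma sequencial
with open(segmento.name, "rb") as f:
    f.seek(offsets[1])
    tam = int.from_bytes(f.read(4), "big")
    print(f.read(tam))  # b'pedido pago'
```

Escritas sempre no fim do arquivo aproveitam o padrão de acesso em que discos (e o cache do SO) têm melhor desempenho.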

-1. The first one is Kafka’s reliance on Sequential I/O. -2. The second design choice that gives Kafka its performance advantage is its focus on efficiency: zero copy principle. +1. A primeira é a dependência do Kafka em E/S (I/O) sequencial. +2. A segunda escolha de design que confere ao Kafka sua vantagem de desempenho é seu foco na eficiência: o princípio de cópia zero. -The diagram illustrates how the data is transmitted between producer and consumer, and what zero-copy means. +O diagrama ilustra como o dado transmitido entre produtor e consumidor e o que zero-copy significa. -- Step 1.1 - 1.3: Producer writes data to the disk -- Step 2: Consumer reads data without zero-copy +- Passo 1.1 - 1.3: O Produtos escreve dado no disco +- Passo 2: O Consumidor lê dados sem zero-copy - 2.1 The data is loaded from disk to OS cache + 2.1 O dado é carregado do disco para o cache do SO - 2.2 The data is copied from OS cache to Kafka application + 2.2 O dado é copiado do cache do SO para a aplicação (o próprio Kafka) - 2.3 Kafka application copies the data into the socket buffer + 2.3 A aplicação kafka copia o dado para o buffer do socket - 2.4 The data is copied from socket buffer to network card + 2.4 O dado é copiado do buffer do socket para a placa de rede - 2.5 The network card sends data out to the consumer + 2.5 A placa de rede envia o dado para o consumidor -- Step 3: Consumer reads data with zero-copy +- Passo 3: Consumidor lê o dado com zero-copy - 3.1: The data is loaded from disk to OS cache - 3.2 OS cache directly copies the data to the network card via sendfile() command - 3.3 The network card sends data out to the consumer + 3.1 O dado é carregado do disco ao cache do SO + 3.2 Cache do SO diretamente copia o dado da placa de rede com o comando sendfile() + 3.3 A placa de rede envia o dado para o consumidor -Zero copy is a shortcut to save the multiple data copies between application context and kernel context. 
+Zero-copy é um atalho para economizar as múltiplas cópias de dados entre o contexto da aplicação e o contexto do kernel.

## Payment systems

From b6ec988cab884d2a00f58b9ababcd50814ae1380 Mon Sep 17 00:00:00 2001
From: Daniel Lombardi
Date: Thu, 21 Dec 2023 12:13:24 -0300
Subject: [PATCH 12/19] Payment Systems Done

Signed-off-by: Daniel Lombardi
---
 translations/README-ptbr.md | 81 ++++++++++++++++++-------------------
 1 file changed, 39 insertions(+), 42 deletions(-)

diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md
index e3de6c2..2e665bd 100644
--- a/translations/README-ptbr.md
+++ b/translations/README-ptbr.md
@@ -64,13 +64,12 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou
 - [Como é uma arquitetura típica de microsserviços?](#como-é-uma-arquitetura-típica-de-microsserviços)
 - [Melhores Práticas em Microsserviços](#melhores-práticas-em-microsserviços)
 - [Qual pilha tecnológica é comumente utilizada para microsserviços?](#qual-pilha-tecnológica-é-comumente-utilizada-para-microsserviços)
-  - [Why is Kafka fast](#why-is-kafka-fast)
 - [Por que o Kafka é tão rápido](#por-que-o-kafka-é-tão-rápido)
-  - [Payment systems](#payment-systems)
-  - [How to learn payment systems?](#how-to-learn-payment-systems)
-  - [Why is the credit card called “the most profitable product in banks”?
How does VISA/Mastercard make money?](#why-is-the-credit-card-called-the-most-profitable-product-in-banks-how-does-visamastercard-make-money)
-  - [How does VISA work when we swipe a credit card at a merchant’s shop?](#how-does-visa-work-when-we-swipe-a-credit-card-at-a-merchants-shop)
-  - [Payment Systems Around The World Series (Part 1): Unified Payments Interface (UPI) in India](#payment-systems-around-the-world-series-part-1-unified-payments-interface-upi-in-india)
+  - [Sistemas de Pagamento](#sistemas-de-pagamento)
+  - [Como aprender sistemas de pagamento?](#como-aprender-sistemas-de-pagamento)
+  - [Por que o cartão de crédito é chamado de "o produto mais lucrativo para os bancos"? Como a VISA/Mastercard ganham dinheiro?](#por-que-o-cartão-de-crédito-é-chamado-de-o-produto-mais-lucrativo-para-os-bancos-como-a-visamastercard-ganham-dinheiro)
+  - [Como a VISA funciona quando nós passamos o cartão de crédito em uma loja?](#como-a-visa-funciona-quando-nós-passamos-o-cartão-de-crédito-em-uma-loja)
+  - [Série de Sistemas de Pagamento ao Redor do Mundo (Parte 1): Interface Unificada de Pagamentos (UPI, _Unified Payments Interface_) na Índia](#série-de-sistemas-de-pagamento-ao-redor-do-mundo-parte-1-interface-unificada-de-pagamentos-upi-unified-payments-interface-na-índia)
 - [DevOps](#devops)
 - [DevOps vs. SRE vs. Platform Engineering. What is the difference?](#devops-vs-sre-vs-platform-engineering-what-is-the-difference)
 - [What is k8s (Kubernetes)?](#what-is-k8s-kubernetes)
@@ -889,8 +888,6 @@ Abaixo você irá encontrar um diagrama mostrando a pilha tecnológica de micros
 - Persistência - Podemos utilizar MySQL ou PostgreSQL para bancos de dados relacionais e Amazon S3 para armazenamento de objetos. Também podemos utilizar Cassandra para armazenamento wide-column (coluna larga) se necessário.
 - Gerenciamento & Monitoramento - Para gerenciar tantos microsserviços, as ferramentas comuns incluem Prometheus, Elastic Stack e Kubernetes.
-### Why is Kafka fast
-
 ### Por que o Kafka é tão rápido

 Houve muitas decisões de design que contribuíram para a performance do Kafka. Neste post, vamos focar em duas. Acreditamos que estas duas tiveram o maior impacto.
@@ -925,83 +922,83 @@ O diagrama ilustra como o dado é transmitido entre produtor e consumidor e o que z
 Zero-copy é um atalho para economizar as múltiplas cópias de dados entre o contexto da aplicação e o contexto do kernel.

-## Payment systems
+## Sistemas de Pagamento

-### How to learn payment systems?
+### Como aprender sistemas de pagamento?

-### Why is the credit card called “the most profitable product in banks”? How does VISA/Mastercard make money? +### Por que o cartão de crédito é chamado de "o produto mais lucrativo para os bancos"? Como que a VISA/Mastercard ganham dinheiro? -The diagram below shows the economics of the credit card payment flow. +O diagrama abaixo mostra a economia do fluxo de pagamento com cartão de crédito.

-1.  The cardholder pays a merchant $100 to buy a product. +1.  O titular do cartão paga $100 a um comerciante para comprar um produto. -2. The merchant benefits from the use of the credit card with higher sales volume and needs to compensate the issuer and the card network for providing the payment service. The acquiring bank sets a fee with the merchant, called the “merchant discount fee.” +2. O comerciante se beneficia do uso do cartão de crédito com um volume de vendas mais alto e precisa compensar o emissor e a rede de cartões por fornecer o serviço de pagamento. O banco adquirente estabelece uma taxa com o comerciante, chamada "taxa de desconto do comerciante". -3 - 4. The acquiring bank keeps $0.25 as the acquiring markup, and $1.75 is paid to the issuing bank as the interchange fee. The merchant discount fee should cover the interchange fee. +3 - 4. O banco adquirente retém $0,25 como a margem adquirente, e $1,75 é pago ao banco emissor como taxa de intercâmbio. A taxa de desconto do comerciante deve cobrir a taxa de intercâmbio. -The interchange fee is set by the card network because it is less efficient for each issuing bank to negotiate fees with each merchant. +A taxa de intercâmbio é estabelecida pela rede de cartões, pois seria menos eficiente para cada banco emissor negociar taxas com cada comerciante individualmente. -5.  The card network sets up the network assessments and fees with each bank, which pays the card network for its services every month. For example, VISA charges a 0.11% assessment, plus a $0.0195 usage fee, for every swipe. +5.  A rede de cartões estabelece as avaliações e taxas de rede com cada banco, que paga à rede de cartões por seus serviços a cada mês. Por exemplo, a VISA cobra uma avaliação de 0,11%, além de uma taxa de uso de $0,0195, para cada transação. -6.  The cardholder pays the issuing bank for its services. +6.  O titular do cartão paga ao banco emissor pelos seus serviços. -Why should the issuing bank be compensated? 
+Por que o banco emissor precisa ser compensado? -- The issuer pays the merchant even if the cardholder fails to pay the issuer. -- The issuer pays the merchant before the cardholder pays the issuer. -- The issuer has other operating costs, including managing customer accounts, providing statements, fraud detection, risk management, clearing & settlement, etc. +- O emissor paga ao comerciante mesmo se o titular do cartão deixar de pagar ao emissor. +- O emissor paga ao comerciante antes mesmo de o titular do cartão pagar ao emissor. +- O emissor tem outros custos operacionais, incluindo a gestão de contas do cliente, emissão de extratos, detecção de fraudes, gestão de riscos, compensação & liquidação, entre outros. -### How does VISA work when we swipe a credit card at a merchant’s shop? +### Como a VISA funciona quando nós passamos o cartão de crédito em uma loja?

-VISA, Mastercard, and American Express act as card networks for the clearing and settling of funds. The card acquiring bank and the card issuing bank can be – and often are – different. If banks were to settle transactions one by one without an intermediary, each bank would have to settle the transactions with all the other banks. This is quite inefficient. +VISA, Mastercard e American Express atuam como redes de cartões para a compensação e liquidação de fundos. O banco adquirente e o banco emissor do cartão podem ser - e muitas vezes são - diferentes. Se os bancos fossem liquidar as transações individualmente, sem um intermediário, cada banco teria que liquidar as transações com todos os outros bancos. Isso seria bastante ineficiente. -The diagram below shows VISA’s role in the credit card payment process. There are two flows involved. Authorization flow happens when the customer swipes the credit card. Capture and settlement flow happens when the merchant wants to get the money at the end of the day. +O diagrama abaixo mostra o papel da VISA no processo de pagamento com cartão de crédito. Existem dois fluxos envolvidos. O fluxo de autorização ocorre quando o cliente passa o cartão de crédito. O fluxo de captura e liquidação ocorre quando o comerciante deseja receber o dinheiro no final do dia. -- Authorization Flow +- Fluxo de autorização -Step 0: The card issuing bank issues credit cards to its customers. +Passo 0: O banco emissor do cartão emite cartões de crédito para seus clientes. -Step 1: The cardholder wants to buy a product and swipes the credit card at the Point of Sale (POS) terminal in the merchant’s shop. +Passo 1: O titular do cartão deseja comprar um produto e passa o cartão de crédito no terminal de ponto de venda (POS, _Point of Sale_) na loja do comerciante. -Step 2: The POS terminal sends the transaction to the acquiring bank, which has provided the POS terminal. 
+Passo 2: O terminal POS envia a transação para o banco adquirente, que forneceu o terminal POS. -Steps 3 and 4: The acquiring bank sends the transaction to the card network, also called the card scheme. The card network sends the transaction to the issuing bank for approval. +Passos 3 e 4: O banco adquirente envia a transação para a rede de cartões, também chamada de esquema de cartão. A rede de cartões envia a transação para o banco emissor para aprovação. -Steps 4.1, 4.2 and 4.3: The issuing bank freezes the money if the transaction is approved. The approval or rejection is sent back to the acquirer, as well as the POS terminal. +Passos 4.1, 4.2 e 4.3: O banco emissor reserva o dinheiro se a transação for aprovada. A aprovação ou rejeição é enviada de volta para o adquirente, assim como para o terminal POS. -- Capture and Settlement Flow +- Fluxo de Captura e Liquidação -Steps 1 and 2: The merchant wants to collect the money at the end of the day, so they hit ”capture” on the POS terminal. The transactions are sent to the acquirer in batch. The acquirer sends the batch file with transactions to the card network. +Passos 1 e 2: O comerciante deseja receber o dinheiro no final do dia, então ele aciona a "captura" no terminal POS. As transações são enviadas em lote para o adquirente. O adquirente envia o arquivo em lote com as transações para a rede de cartões. -Step 3: The card network performs clearing for the transactions collected from different acquirers, and sends the clearing files to different issuing banks. +Passo 3: A rede de cartões realiza a compensação para as transações coletadas de diferentes adquirentes e envia os arquivos de compensação para diferentes bancos emissores. -Step 4: The issuing banks confirm the correctness of the clearing files, and transfer money to the relevant acquiring banks. +Passo 4: Os bancos emissores confirmam a exatidão dos arquivos de compensação e transferem dinheiro para os respectivos bancos adquirentes. 
-Step 5: The acquiring bank then transfers money to the merchant’s bank. +Passo 5: O banco adquirente, então, transfere dinheiro para o banco do comerciante. -Step 4: The card network clears up the transactions from different acquiring banks. Clearing is a process in which mutual offset transactions are netted, so the number of total transactions is reduced. +Passo 4: A rede de cartões compensa as transações de diferentes bancos adquirentes. A compensação é um processo no qual transações que se anulam mutuamente são liquidadas pelo valor líquido, reduzindo assim o número total de transações. -In the process, the card network takes on the burden of talking to each bank and receives service fees in return. +No processo, a rede de cartões assume o encargo de se comunicar com cada banco, recebendo, em troca, taxas de serviço. -### Payment Systems Around The World Series (Part 1): Unified Payments Interface (UPI) in India +### Série de Sistemas de Pagamento ao Redor do Mundo (Parte 1): Interface Unificada de Pagamentos (UPI, _Unified Payments Interface_) na Índia -What’s UPI? UPI is an instant real-time payment system developed by the National Payments Corporation of India. +O que é UPI? UPI é um sistema de pagamento instantâneo em tempo real desenvolvido pela National Payments Corporation of India. -It accounts for 60% of digital retail transactions in India today. +Atualmente, representa 60% das transações digitais no varejo na Índia. -UPI = payment markup language + standard for interoperable payments +UPI = linguagem de marcação de pagamento + padrão para pagamentos interoperáveis

From e0b71a4d276698751eaaa1210dde44d97d5fbaed Mon Sep 17 00:00:00 2001 From: Daniel Lombardi Date: Thu, 21 Dec 2023 13:11:21 -0300 Subject: [PATCH 13/19] DevOps Done Signed-off-by: Daniel Lombardi --- translations/README-ptbr.md | 103 ++++++++++++++++++------------------ 1 file changed, 51 insertions(+), 52 deletions(-) diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md index 2e665bd..df875bf 100644 --- a/translations/README-ptbr.md +++ b/translations/README-ptbr.md @@ -71,10 +71,10 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou - [Como a VISA funciona quando nós passamos o cartão de crédito em uma loja?](#como-a-visa-funciona-quando-nós-passamos-o-cartão-de-crédito-em-uma-loja) - [Série de Sistemas de Pagamento ao Redor do Mundo (Parte 1): Interface Unificada de Pagamentos (UPI, _Unified Payments Interface_) na Índia](#série-de-sistemas-de-pagamento-ao-redor-do-mundo-parte-1-interface-unificada-de-pagamentos-upi-unified-payments-interface-na-índia) - [DevOps](#devops) - - [DevOps vs. SRE vs. Platform Engineering. What is the difference?](#devops-vs-sre-vs-platform-engineering-what-is-the-difference) - - [What is k8s (Kubernetes)?](#what-is-k8s-kubernetes) - - [Docker vs. Kubernetes. Which one should we use?](#docker-vs-kubernetes-which-one-should-we-use) - - [How does Docker work?](#how-does-docker-work) + - [DevOps vs. SRE vs. Platform Engineering. Qual a diferença?](#devops-vs-sre-vs-platform-engineering-qual-a-diferença) + - [O que é k8s (Kubernetes)?](#o-que-é-k8s-kubernetes) + - [Docker vs. Kubernetes. Qual eu deveria usar?](#docker-vs-kubernetes-qual-eu-deveria-usar) + - [Como o Docker funciona?](#como-o-docker-funciona) - [GIT](#git) - [How Git Commands work](#how-git-commands-work) - [How does Git Work?](#how-does-git-work) @@ -1006,120 +1006,119 @@ UPI = linguagem de marcação de pagamento + padrão para pagamentos interoperá ## DevOps -### DevOps vs. SRE vs. Platform Engineering. 
What is the difference? +### DevOps vs. SRE vs. Platform Engineering. Qual a diferença? -The concepts of DevOps, SRE, and Platform Engineering have emerged at different times and have been developed by various individuals and organizations. +Os conceitos de DevOps, SRE (Engenharia de Confiabilidade de Sites, _Site Reliability Engineering_) e Platform Engineering (Engenharia de Plataforma) surgiram em momentos diferentes e foram desenvolvidos por diversos indivíduos e organizações.

-DevOps as a concept was introduced in 2009 by Patrick Debois and Andrew Shafer at the Agile conference. They sought to bridge the gap between software development and operations by promoting a collaborative culture and shared responsibility for the entire software development lifecycle. +O conceito de DevOps foi introduzido em 2009 por Patrick Debois e Andrew Shafer na conferência Agile (_Agile conference_). Eles buscaram reduzir a lacuna entre o desenvolvimento de software e as operações, promovendo uma cultura colaborativa e responsabilidade compartilhada por todo o ciclo de vida do desenvolvimento de software. -SRE, or Site Reliability Engineering, was pioneered by Google in the early 2000s to address operational challenges in managing large-scale, complex systems. Google developed SRE practices and tools, such as the Borg cluster management system and the Monarch monitoring system, to improve the reliability and efficiency of their services. +O SRE, ou Engenharia de Confiabilidade de Sites, teve o Google como pioneiro no início dos anos 2000, para lidar com desafios operacionais no gerenciamento de sistemas complexos em grande escala. O Google desenvolveu práticas e ferramentas de SRE, como o sistema de gerenciamento de clusters Borg e o sistema de monitoramento Monarch, para aprimorar a confiabilidade e eficiência de seus serviços. -Platform Engineering is a more recent concept, building on the foundation of SRE engineering. The precise origins of Platform Engineering are less clear, but it is generally understood to be an extension of the DevOps and SRE practices, with a focus on delivering a comprehensive platform for product development that supports the entire business perspective. +A Engenharia de Plataformas é um conceito mais recente, construindo sobre a base da engenharia SRE. 
As origens precisas da Engenharia de Plataformas são menos claras, mas geralmente é entendida como uma extensão das práticas DevOps e SRE, com foco em fornecer uma plataforma abrangente para o desenvolvimento de produtos que suporta toda a perspectiva do negócio. -It's worth noting that while these concepts emerged at different times. They are all related to the broader trend of improving collaboration, automation, and efficiency in software development and operations. +Vale ressaltar que, embora esses conceitos tenham surgido em momentos diferentes, todos estão relacionados à tendência mais ampla de aprimorar a colaboração, automação e eficiência no desenvolvimento e operações de software. -### What is k8s (Kubernetes)? +### O que é k8s (Kubernetes)? -K8s is a container orchestration system. It is used for container deployment and management. Its design is greatly impacted by Google’s internal system Borg. +O K8s é um sistema de orquestração de contêineres. Ele é usado para implantação e gerenciamento de contêineres. Seu design é fortemente influenciado pelo sistema interno do Google chamado Borg.
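Na prática, as cargas de trabalho do Kubernetes são descritas de forma declarativa em YAML. Um esboço mínimo de manifesto de Pod (o nome e a imagem abaixo são hipotéticos, apenas ilustrativos):

```yaml
# Esboço de manifesto de Pod (valores hipotéticos)
apiVersion: v1
kind: Pod
metadata:
  name: exemplo-web        # nome hipotético do Pod
spec:
  containers:
    - name: web
      image: nginx:1.25    # imagem de exemplo
      ports:
        - containerPort: 80
```

Aplicado com `kubectl apply -f pod.yaml`, o plano de controle agenda esse Pod em um dos nós worker descritos a seguir.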

-A k8s cluster consists of a set of worker machines, called nodes, that run containerized applications. Every cluster has at least one worker node. +Um cluster K8s consiste em um conjunto de máquinas worker (trabalhadoras, secundárias), chamadas de nós (_nodes_), que rodam aplicações containerizadas. Todo cluster tem pelo menos um nó worker. -The worker node(s) host the Pods that are the components of the application workload. The control plane manages the worker nodes and the Pods in the cluster. In production environments, the control plane usually runs across multiple computers, and a cluster usually runs multiple nodes, providing fault tolerance and high availability. +O(s) nó(s) worker(s) hospedam os Pods que são os componentes da carga de trabalho da aplicação. O plano de controle (_control plane_) gerencia os nós worker e os Pods no cluster. Em ambientes de produção, o plano de controle geralmente é executado em vários computadores, e um cluster geralmente executa vários nós, proporcionando tolerância a falhas e alta disponibilidade. -- Control Plane Components +- Componentes do Plano de Controle -1. API Server +1. Servidor da API - The API server talks to all the components in the k8s cluster. All the operations on pods are executed by talking to the API server. + O servidor da API se comunica com todos os componentes do cluster k8s. Todas as operações nos pods são executadas por meio de comunicação com o servidor da API. -2. Scheduler +2. Escalonador (_Scheduler_) - The scheduler watches pod workloads and assigns loads on newly created pods. + O escalonador (_scheduler_) observa cargas de trabalho nos pods e aloca cargas em pods recém-criados. -3. Controller Manager +3. Gerenciador de Controladores - The controller manager runs the controllers, including Node Controller, Job Controller, EndpointSlice Controller, and ServiceAccount Controller. 
+ O gerenciador de controladores executa os controladores, incluindo Node Controller, Job Controller, EndpointSlice Controller e ServiceAccount Controller. 4. Etcd - etcd is a key-value store used as Kubernetes' backing store for all cluster data. + etcd é um armazenamento de chave-valor usado como armazenamento principal do Kubernetes para todos os dados do cluster. -- Nodes +- Nós 1. Pods - A pod is a group of containers and is the smallest unit that k8s administers. Pods have a single IP address applied to every container within the pod. + Um pod é um grupo de contêineres e é a menor unidade administrada pelo Kubernetes. Os Pods têm um único endereço IP aplicado a cada contêiner dentro do pod. 2. Kubelet - An agent that runs on each node in the cluster. It ensures containers are running in a Pod. + Um agente que é executado em cada nó no cluster. Ele garante que os contêineres estejam em execução em um Pod. 3. Kube Proxy - Kube-proxy is a network proxy that runs on each node in your cluster. It routes traffic coming into a node from the service. It forwards requests for work to the correct containers. + O Kube-proxy é um proxy de rede que é executado em cada nó do seu cluster. Ele direciona o tráfego que entra em um nó proveniente do serviço e encaminha solicitações de trabalho para os contêineres corretos. -### Docker vs. Kubernetes. Which one should we use? +### Docker vs. Kubernetes. Qual eu deveria usar?

-What is Docker ? +O que é Docker? -Docker is an open-source platform that allows you to package, distribute, and run applications in isolated containers. It focuses on containerization, providing lightweight environments that encapsulate applications and their dependencies. +O Docker é uma plataforma de código aberto que permite empacotar, distribuir e executar aplicativos em contêineres isolados. Ele se concentra na containerização, fornecendo ambientes leves que encapsulam aplicativos e suas dependências. -What is Kubernetes ? +O que é Kubernetes? -Kubernetes, often referred to as K8s, is an open-source container orchestration platform. It provides a framework for automating the deployment, scaling, and management of containerized applications across a cluster of nodes. +O Kubernetes, frequentemente referido como K8s, é uma plataforma de orquestração de contêineres de código aberto. Ele fornece um framework para automatizar a implantação, escalabilidade e gerenciamento de aplicativos em contêineres em um cluster de nós. -How are both different from each other ? +Como eles diferem entre si? -Docker: Docker operates at the individual container level on a single operating system host. +Docker: O Docker opera no nível individual do contêiner em um único sistema operacional hospedeiro. -You must manually manage each host and setting up networks, security policies, and storage for multiple related containers can be complex. +É necessário gerenciar manualmente cada hospedeiro, e configurar redes, políticas de segurança e armazenamento para vários contêineres relacionados pode ser complexo. -Kubernetes: Kubernetes operates at the cluster level. 
Ele gerencia múltiplos aplicativos em contêineres em vários hospedeiros, proporcionando automação para tarefas como balanceamento de carga, escalabilidade e garantia do estado desejado dos aplicativos. -In short, Docker focuses on containerization and running containers on individual hosts, while Kubernetes specializes in managing and orchestrating containers at scale across a cluster of hosts. +Em resumo, o Docker foca na containerização e na execução de contêineres em hospedeiros individuais, enquanto o Kubernetes se especializa em gerenciar e orquestrar contêineres em escala, em um cluster de hospedeiros. -### How does Docker work? +### Como o Docker funciona? -The diagram below shows the architecture of Docker and how it works when we run “docker build”, “docker pull” -and “docker run”. +O diagrama abaixo mostra a arquitetura do Docker e como ela funciona quando executamos os comandos "docker build", "docker pull" e "docker run".

-There are 3 components in Docker architecture: +Há 3 componentes na arquitetura do Docker: -- Docker client +- Cliente Docker - The docker client talks to the Docker daemon. + O cliente Docker se comunica com o daemon do Docker. -- Docker host +- Hospedeiro Docker (_host_) - The Docker daemon listens for Docker API requests and manages Docker objects such as images, containers, networks, and volumes. + O daemon do Docker escuta requisições da API do Docker e gerencia objetos do Docker, como imagens, contêineres, redes e volumes. -- Docker registry +- Registro do Docker (_registry_) - A Docker registry stores Docker images. Docker Hub is a public registry that anyone can use. + Um registro do Docker armazena imagens do Docker. O Docker Hub é um registro público que qualquer pessoa pode utilizar. -Let’s take the “docker run” command as an example. +Vamos tomar o comando "docker run" como exemplo. -1. Docker pulls the image from the registry. -1. Docker creates a new container. -1. Docker allocates a read-write filesystem to the container. -1. Docker creates a network interface to connect the container to the default network. -1. Docker starts the container. +1. O Docker puxa a imagem do registro. +2. O Docker cria um novo contêiner. +3. O Docker aloca um sistema de arquivos de leitura-escrita para o contêiner. +4. O Docker cria uma interface de rede para conectar o contêiner à rede padrão. +5. O Docker inicializa o contêiner. 
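Os cinco passos acima podem ser esboçados como comandos equivalentes. Este é um esboço hipotético e simplificado (a imagem `alpine` é apenas um exemplo), que requer o Docker instalado para ser de fato executado:

```shell
#!/bin/sh
# Esboço: decomposição aproximada do que o "docker run" faz por baixo dos panos.
# Hipotético e simplificado; executa de verdade apenas com o Docker instalado.
executa_equivalente() {
  imagem="$1"
  docker pull "$imagem"            # 1. puxa a imagem do registro (Docker Hub por padrão)
  id=$(docker create "$imagem")    # 2-3. cria o contêiner e sua camada de leitura-escrita
                                   # 4. no "create", o contêiner já é conectado à rede padrão "bridge"
  docker start "$id"               # 5. inicializa o contêiner
  echo "$id"                       # devolve o id do contêiner criado
}
# Uso (hipotético): executa_equivalente alpine
```

O `docker run` encadeia exatamente essas etapas em um único comando, por conveniência.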
## GIT From 230cc3a4dd4b0c63c0f3127d6bfd35fc33316e55 Mon Sep 17 00:00:00 2001 From: Daniel Lombardi Date: Thu, 21 Dec 2023 15:10:25 -0300 Subject: [PATCH 14/19] Git Done Signed-off-by: Daniel Lombardi --- translations/README-ptbr.md | 50 +++++++++++++++++++------------------ 1 file changed, 26 insertions(+), 24 deletions(-) diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md index df875bf..1980210 100644 --- a/translations/README-ptbr.md +++ b/translations/README-ptbr.md @@ -76,8 +76,8 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou - [Docker vs. Kubernetes. Qual eu deveria usar?](#docker-vs-kubernetes-qual-eu-deveria-usar) - [Como o Docker funciona?](#como-o-docker-funciona) - [GIT](#git) - - [How Git Commands work](#how-git-commands-work) - - [How does Git Work?](#how-does-git-work) + - [Como Comandos do Git funcionam](#como-comandos-do-git-funcionam) + - [Como o Git funciona?](#como-o-git-funciona) - [Git merge vs. Git rebase](#git-merge-vs-git-rebase) - [Cloud Services](#cloud-services) - [A nice cheat sheet of different cloud services (2023 edition)](#a-nice-cheat-sheet-of-different-cloud-services-2023-edition) @@ -1122,64 +1122,66 @@ Vamos tomar o comando "docker run" como exemplo. ## GIT -### How Git Commands work +### Como Comandos do Git funcionam -To begin with, it's essential to identify where our code is stored. The common assumption is that there are only two locations - one on a remote server like Github and the other on our local machine. However, this isn't entirely accurate. Git maintains three local storages on our machine, which means that our code can be found in four places: +Para começar, é essencial identificar onde nosso código está armazenado. A suposição comum é que só existem duas localidades - uma em um servidor remoto como Github e a outra em nossa máquina local. No entanto, isso não é totalmente preciso. 
O Git mantém três armazenamentos locais na nossa máquina, o que significa que nosso código pode estar em quatro lugares:

-- Working directory: where we edit files -- Staging area: a temporary location where files are kept for the next commit -- Local repository: contains the code that has been committed -- Remote repository: the remote server that stores the code +- Diretório de Trabalho (_Working Directory_): onde editamos arquivos +- Área de Ensaio (_Staging Area_): um local temporário onde arquivos são mantidos para o próximo commit +- Repositório Local: contém o código que foi confirmado (_committed_) +- Repositório Remoto (_Remote_): o servidor remoto que armazena o código -Most Git commands primarily move files between these four locations. +A maioria dos comandos Git apenas movimenta arquivos entre essas quatro localidades. -### How does Git Work? +### Como o Git funciona? -The diagram below shows the Git workflow. +O diagrama abaixo mostra o fluxo de trabalho do Git.

-Git is a distributed version control system. +Git é um sistema de controle de versões distribuído. -Every developer maintains a local copy of the main repository and edits and commits to the local copy. +Cada desenvolvedor mantém uma cópia local do repositório principal, na qual edita e confirma (comita) as alterações. -The commit is very fast because the operation doesn’t interact with the remote repository. +O _commit_ é muito rápido, pois a operação não interage com o repositório remoto. -If the remote repository crashes, the files can be recovered from the local repositories. +Se o repositório remoto falhar, os arquivos podem ser recuperados a partir dos repositórios locais. ### Git merge vs. Git rebase -What are the differences? +Quais as diferenças? + +Quando nós **mesclamos (merge) alterações** de um branch (ramo) Git para outro, nós podemos usar ‘git merge’ ou ‘git rebase’. O diagrama abaixo mostra como os dois comandos funcionam.

-When we **merge changes** from one Git branch to another, we can use ‘git merge’ or ‘git rebase’. The diagram below shows how the two commands work. - **Git merge** -This creates a new commit G’ in the main branch. G’ ties the histories of both main and feature branches. +Isso cria um novo commit G' no branch principal. G' une as histórias tanto do branch principal quanto do branch de funcionalidade (_feature branch_). -Git merge is **non-destructive**. Neither the main nor the feature branch is changed. +O merge do Git é **não destrutivo**. Nem o branch principal nem o branch de funcionalidade são alterados. **Git rebase** -Git rebase moves the feature branch histories to the head of the main branch. It creates new commits E’, F’, and G’ for each commit in the feature branch. -The benefit of rebase is that it has a linear **commit history**. +O rebase do Git move as histórias do branch de funcionalidade para o topo do branch principal. Ele cria novos commits E', F' e G' para cada commit no branch de funcionalidade. + +A vantagem do rebase é que ele resulta em um **histórico de commits** linear. -Rebase can be dangerous if “the golden rule of git rebase” is not followed. +O rebase pode ser perigoso se "a regra de ouro do git rebase" não for seguida. -**The Golden Rule of Git Rebase** +**A Regra de Ouro do Git Rebase** -Never use it on public branches! +Nunca o utilize em branches públicos! 
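A diferença pode ser vista em um experimento mínimo num repositório descartável (esboço; assume o Git instalado, e os nomes de arquivos, branch e identidade são hipotéticos):

```shell
#!/bin/sh
# Esboço: observando o efeito do "git rebase" em um repositório descartável.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q
git config user.email "voce@exemplo.com"   # identidade local (hipotética)
git config user.name "Voce"
principal=$(git symbolic-ref --short HEAD)  # "main" ou "master", conforme a versão

echo a > a.txt && git add a.txt && git commit -qm "A"   # commits no branch principal
echo b > b.txt && git add b.txt && git commit -qm "B"
git checkout -q -b feature                              # branch de funcionalidade a partir de B
echo e > e.txt && git add e.txt && git commit -qm "E"
git checkout -q "$principal"
echo c > c.txt && git add c.txt && git commit -qm "C"   # o principal avança em paralelo

git checkout -q feature
git rebase -q "$principal"      # reescreve E como E' sobre C (um "git merge" criaria,
                                # em vez disso, um commit de mesclagem)
contagem=$(git rev-list --count HEAD)            # 4 commits: A, B, C, E'
mensagens=$(git log --format=%s | tr '\n' ' ')   # histórico linear: E C B A
echo "$mensagens"
```

Como o histórico resultante é linear (E' em cima de C), nenhum commit de mesclagem é necessário; em compensação, os commits do branch de funcionalidade foram reescritos, daí a regra de ouro acima.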
## Cloud Services From 6ed124adf1392e89da8644f363b278650e962ba8 Mon Sep 17 00:00:00 2001 From: Daniel Lombardi Date: Thu, 21 Dec 2023 15:26:06 -0300 Subject: [PATCH 15/19] Cloud Services Done Signed-off-by: Daniel Lombardi --- translations/README-ptbr.md | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md index 1980210..9ad3e17 100644 --- a/translations/README-ptbr.md +++ b/translations/README-ptbr.md @@ -79,9 +79,9 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou - [Como Comandos do Git funcionam](#como-comandos-do-git-funcionam) - [Como o Git funciona?](#como-o-git-funciona) - [Git merge vs. Git rebase](#git-merge-vs-git-rebase) - - [Cloud Services](#cloud-services) - - [A nice cheat sheet of different cloud services (2023 edition)](#a-nice-cheat-sheet-of-different-cloud-services-2023-edition) - - [What is cloud native?](#what-is-cloud-native) + - [Serviços Cloud](#serviços-cloud) + - [Um guia prático útil de diferentes serviços em nuvem (edição 2023).](#um-guia-prático-útil-de-diferentes-serviços-em-nuvem-edição-2023) + - [O que é cloud native?](#o-que-é-cloud-native) - [Developer productivity tools](#developer-productivity-tools) - [Visualize JSON files](#visualize-json-files) - [Automatically turn code into architecture diagrams](#automatically-turn-code-into-architecture-diagrams) @@ -1183,43 +1183,43 @@ O rebase pode ser perigoso se "a regra de ouro do git rebase" não for seguida. Nunca utilize ele em branches públicos! -## Cloud Services +## Serviços Cloud -### A nice cheat sheet of different cloud services (2023 edition) +### Um guia prático útil de diferentes serviços em nuvem (edição 2023).

-### What is cloud native? +### O que é cloud native? -Below is a diagram showing the evolution of architecture and processes since the 1980s. +Abaixo está um diagrama mostrando a evolução de arquiteturas e processos desde os anos 1980.

-Organizations can build and run scalable applications on public, private, and hybrid clouds using cloud native technologies. +Organizações podem construir e rodar aplicações escaláveis em clouds públicas, privadas e híbridas utilizando tecnologias cloud native. -This means the applications are designed to leverage cloud features, so they are resilient to load and easy to scale. +Isso significa que as aplicações são projetadas para aproveitar características da cloud, tornando-as resilientes a carga e fáceis de escalar. -Cloud native includes 4 aspects: +Cloud native inclui 4 aspectos: -1. Development process +1. Processo de Desenvolvimento - This has progressed from waterfall to agile to DevOps. + Isso evoluiu do modelo waterfall para o ágil e, posteriormente, para o DevOps. -2. Application Architecture +2. Arquitetura de Aplicação - The architecture has gone from monolithic to microservices. Each service is designed to be small, adaptive to the limited resources in cloud containers. + A arquitetura passou de monolítica para microsserviços. Cada serviço é projetado para ser pequeno e adaptado aos recursos limitados dos contêineres na cloud. -3. Deployment & packaging +3. Implantação & Empacotamento - The applications used to be deployed on physical servers. Then around 2000, the applications that were not sensitive to latency were usually deployed on virtual servers. With cloud native applications, they are packaged into docker images and deployed in containers. + As aplicações costumavam ser implantadas em servidores físicos. Então, por volta do ano 2000, as aplicações que não eram sensíveis à latência passaram a ser implantadas, em geral, em servidores virtuais. Com aplicações cloud native, elas são empacotadas em imagens Docker e implantadas em contêineres. -4. Application infrastructure +4. Infraestrutura de Aplicação - The applications are massively deployed on cloud infrastructure instead of self-hosted servers. 
+ As aplicações são implantadas em massa em infraestrutura de nuvem em vez de servidores auto-hospedados. ## Developer productivity tools From ad6076d1cebb8ee73995dcbbe700d6a0b485a257 Mon Sep 17 00:00:00 2001 From: Daniel Lombardi Date: Thu, 21 Dec 2023 16:58:29 -0300 Subject: [PATCH 16/19] Productivity Tools Done Signed-off-by: Daniel Lombardi --- translations/README-ptbr.md | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md index 9ad3e17..2f1ecb9 100644 --- a/translations/README-ptbr.md +++ b/translations/README-ptbr.md @@ -82,9 +82,9 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou - [Serviços Cloud](#serviços-cloud) - [Um guia prático útil de diferentes serviços em nuvem (edição 2023).](#um-guia-prático-útil-de-diferentes-serviços-em-nuvem-edição-2023) - [O que é cloud native?](#o-que-é-cloud-native) - - [Developer productivity tools](#developer-productivity-tools) - - [Visualize JSON files](#visualize-json-files) - - [Automatically turn code into architecture diagrams](#automatically-turn-code-into-architecture-diagrams) + - [Ferramentas de produtividade para desenvolvedores](#ferramentas-de-produtividade-para-desenvolvedores) + - [Visualizar arquivos JSON](#visualizar-arquivos-json) + - [Transformar código em diagramas de arquitetura de forma automática](#transformar-código-em-diagramas-de-arquitetura-de-forma-automática) - [Linux](#linux) - [Linux file system explained](#linux-file-system-explained) - [18 Most-used Linux Commands You Should Know](#18-most-used-linux-commands-you-should-know) @@ -1221,32 +1221,32 @@ Cloud native inclui 4 aspectos: As aplicações são implantadas em massa em infraestrutura de nuvem em vez de servidores auto-hospedados. 
-## Developer productivity tools +## Ferramentas de produtividade para desenvolvedores -### Visualize JSON files +### Visualizar arquivos JSON -Nested JSON files are hard to read. +Arquivos JSON aninhados podem ser difíceis de ler. -**JsonCrack** generates graph diagrams from JSON files and makes them easy to read. +**JsonCrack** gera diagramas de grafo a partir de arquivos JSON e os torna fáceis de ler. -Additionally, the generated diagrams can be downloaded as images. +Além disso, os diagramas gerados podem ser baixados como imagens.
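Na linha de comando, sem uma ferramenta gráfica, uma alternativa simples para inspecionar um JSON aninhado é reindentá-lo. Um esboço, assumindo Python 3 disponível (o JSON abaixo é um exemplo hipotético):

```shell
#!/bin/sh
# Esboço: reindentando um JSON aninhado para leitura rápida no terminal.
# Assume o Python 3 instalado; o documento JSON é hipotético.
json='{"usuario":{"nome":"ana","enderecos":[{"cidade":"SP"}]}}'
legivel=$(printf '%s' "$json" | python3 -m json.tool)   # reindenta cada chave em sua própria linha
printf '%s\n' "$legivel"
```

Para estruturas grandes, porém, um diagrama de grafo como o do JsonCrack continua sendo bem mais legível do que texto indentado.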

-### Automatically turn code into architecture diagrams +### Transformar código em diagramas de arquitetura de forma automática

-What does it do? +O que ele faz? -- Draw the cloud system architecture in Python code. -- Diagrams can also be rendered directly inside the Jupyter Notebooks. -- No design tools are needed. -- Supports the following providers: AWS, Azure, GCP, Kubernetes, Alibaba Cloud, Oracle Cloud, etc. +- Desenha a arquitetura do sistema cloud em código Python. +- Diagramas também podem ser renderizados diretamente dentro de Jupyter Notebooks. +- Nenhuma ferramenta de design é necessária. +- Suporta os seguintes fornecedores: AWS, Azure, GCP, Kubernetes, Alibaba Cloud, Oracle Cloud etc. [Github repo](https://github.com/mingrammer/diagrams) From 7f872254941cfb4c670139559c21607c63411839 Mon Sep 17 00:00:00 2001 From: Daniel Lombardi Date: Thu, 21 Dec 2023 17:16:39 -0300 Subject: [PATCH 17/19] Linux Done Signed-off-by: Daniel Lombardi --- translations/README-ptbr.md | 53 ++++++++++++++++++------------------- 1 file changed, 26 insertions(+), 27 deletions(-) diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md index 2f1ecb9..4516c6f 100644 --- a/translations/README-ptbr.md +++ b/translations/README-ptbr.md @@ -86,8 +86,8 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou - [Visualizar arquivos JSON](#visualizar-arquivos-json) - [Transformar código em diagramas de arquitetura de forma automática](#transformar-código-em-diagramas-de-arquitetura-de-forma-automática) - [Linux](#linux) - - [Linux file system explained](#linux-file-system-explained) - - [18 Most-used Linux Commands You Should Know](#18-most-used-linux-commands-you-should-know) + - [Sistema de Arquivos do Linux explicado](#sistema-de-arquivos-do-linux-explicado) + - [18 Comandos Linux Mais Utilizados que Você Deve Conhecer](#18-comandos-linux-mais-utilizados-que-você-deve-conhecer) - [Security](#security) - [How does HTTPS work?](#how-does-https-work) - [Oauth 2.0 Explained With Simple Terms.](#oauth-20-explained-with-simple-terms) @@ -1252,45 +1252,44 @@ O que ele faz? 
## Linux -### Linux file system explained +### Sistema de Arquivos do Linux explicado

-The Linux file system used to resemble an unorganized town where individuals constructed their houses wherever they pleased. However, in 1994, the Filesystem Hierarchy Standard (FHS) was introduced to bring order to the Linux file system. +O sistema de arquivos do Linux costumava se assemelhar a uma cidade desorganizada onde indivíduos construíam suas casas onde queriam. No entanto, em 1994, o Padrão de Hierarquia do Sistema de Arquivos (FHS, _Filesystem Hierarchy Standard_) foi introduzido para trazer ordem ao sistema de arquivos Linux. -By implementing a standard like the FHS, software can ensure a consistent layout across various Linux distributions. Nonetheless, not all Linux distributions strictly adhere to this standard. They often incorporate their own unique elements or cater to specific requirements. -To become proficient in this standard, you can begin by exploring. Utilize commands such as "cd" for navigation and "ls" for listing directory contents. Imagine the file system as a tree, starting from the root (/). With time, it will become second nature to you, transforming you into a skilled Linux administrator. +Ao implementar um padrão como o FHS, o software pode garantir um layout consistente em várias distribuições do Linux. No entanto, nem todas as distribuições do Linux aderem estritamente a esse padrão. Elas frequentemente incorporam elementos exclusivos ou atendem a requisitos específicos. Para se tornar proficiente nesse padrão, você pode começar explorando. Utilize comandos como "cd" para navegação e "ls" para listar os conteúdos de um diretório. Imagine o sistema de arquivos como uma árvore, começando pela raiz (/). Com o tempo, isso se tornará algo natural para você, transformando-o num administrador Linux habilidoso. -### 18 Most-used Linux Commands You Should Know +### 18 Comandos Linux Mais Utilizados que Você Deve Conhecer -Linux commands are instructions for interacting with the operating system.
They help manage files, directories, system processes, and many other aspects of the system. You need to become familiar with these commands in order to navigate and maintain Linux-based systems efficiently and effectively. +Comandos Linux são instruções para interagir com o sistema operacional. Eles ajudam a gerenciar arquivos, diretórios, processos do sistema e vários outros aspectos do sistema. Você precisa se tornar familiar com estes comandos para navegar e manter um sistema baseado em Linux de forma eficiente e eficaz. -This diagram below shows popular Linux commands: +O diagrama abaixo mostra comandos Linux populares:

-- ls - List files and directories -- cd - Change the current directory -- mkdir - Create a new directory -- rm - Remove files or directories -- cp - Copy files or directories -- mv - Move or rename files or directories -- chmod - Change file or directory permissions -- grep - Search for a pattern in files -- find - Search for files and directories -- tar - manipulate tarball archive files -- vi - Edit files using text editors -- cat - display the content of files -- top - Display processes and resource usage -- ps - Display processes information -- kill - Terminate a process by sending a signal -- du - Estimate file space usage -- ifconfig - Configure network interfaces -- ping - Test network connectivity between hosts +- ls - Lista arquivos e diretórios +- cd - Troca o diretório corrente +- mkdir - Cria um novo diretório +- rm - Remove arquivos ou diretórios +- cp - Copia arquivos ou diretórios +- mv - Move ou renomeia arquivos ou diretórios +- chmod - Muda permissões de arquivos ou diretórios +- grep - Busca por um padrão em arquivos +- find - Busca por arquivos e diretórios +- tar - Manipula arquivos tarball +- vi - Edita arquivos usando editores de texto +- cat - Imprime o conteúdo de arquivos +- top - Imprime processos e utilização de recursos +- ps - Imprime informações de processos +- kill - Termina um processo enviando um sinal +- du - Estima a utilização de espaço de arquivos +- ifconfig - Configura interfaces de rede +- ping - Testa conectividade via rede entre hospedeiros (hosts) ## Security From 933d30618a68e478c6b62124af6e83f2d9d3cee5 Mon Sep 17 00:00:00 2001 From: Daniel Lombardi Date: Thu, 21 Dec 2023 18:50:20 -0300 Subject: [PATCH 18/19] Security Done Signed-off-by: Daniel Lombardi --- translations/README-ptbr.md | 181 ++++++++++++++++++------------------ 1 file changed, 92 insertions(+), 89 deletions(-) diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md index 4516c6f..5bcadd9 100644 ---
a/translations/README-ptbr.md +++ b/translations/README-ptbr.md @@ -88,14 +88,14 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou - [Linux](#linux) - [Sistema de Arquivos do Linux explicado](#sistema-de-arquivos-do-linux-explicado) - [18 Comandos Linux Mais Utilizados que Você Deve Conhecer](#18-comandos-linux-mais-utilizados-que-você-deve-conhecer) - - [Security](#security) - - [How does HTTPS work?](#how-does-https-work) - - [Oauth 2.0 Explained With Simple Terms.](#oauth-20-explained-with-simple-terms) - - [Top 4 Forms of Authentication Mechanisms](#top-4-forms-of-authentication-mechanisms) - - [Session, cookie, JWT, token, SSO, and OAuth 2.0 - what are they?](#session-cookie-jwt-token-sso-and-oauth-20---what-are-they) - - [How to store passwords safely in the database and how to validate a password?](#how-to-store-passwords-safely-in-the-database-and-how-to-validate-a-password) - - [Explaining JSON Web Token (JWT) to a 10 year old Kid](#explaining-json-web-token-jwt-to-a-10-year-old-kid) - - [How does Google Authenticator (or other types of 2-factor authenticators) work?](#how-does-google-authenticator-or-other-types-of-2-factor-authenticators-work) + - [Segurança](#segurança) + - [Como o HTTPS funciona?](#como-o-https-funciona) + - [Oauth 2.0 Explicado com Termos Simples.](#oauth-20-explicado-com-termos-simples) + - [Principais 4 Formas de Mecanismos de Autenticação](#principais-4-formas-de-mecanismos-de-autenticação) + - [Sessão, cookie, JWT, token, SSO, e OAuth 2.0 - o que são?](#sessão-cookie-jwt-token-sso-e-oauth-20---o-que-são) + - [Como armazenar senhas de forma segura em bancos de dados e como validá-las?](#como-armazenar-senhas-de-forma-segura-em-bancos-de-dados-e-como-validá-las) + - [Explicando JSON Web Token (JWT) para uma criança de 10 anos de idade](#explicando-json-web-token-jwt-para-uma-criança-de-10-anos-de-idade) + - [Como o Autenticador do Google (ou outros tipos de autenticadores de 2-fatores) 
funciona?](#como-o-autenticador-do-google-ou-outros-tipos-de-autenticadores-de-2-fatores-funciona) - [Real World Case Studies](#real-world-case-studies) - [Netflix's Tech Stack](#netflixs-tech-stack) - [Twitter Architecture 2022](#twitter-architecture-2022) @@ -1291,187 +1291,190 @@ O diagrama abaixo mostra comandos Linux populares: - ifconfig - Configura interfaces de rede - ping - Testa conectividade via rede entre hospedeiros (hosts) -## Security +## Segurança -### How does HTTPS work? +### Como o HTTPS funciona? -Hypertext Transfer Protocol Secure (HTTPS) is an extension of the Hypertext Transfer Protocol (HTTP.) HTTPS transmits encrypted data using Transport Layer Security (TLS.) If the data is hijacked online, all the hijacker gets is binary code. +Protocolo de Transferência de Hipertexto Seguro (HTTPS, _Hypertext Transfer Protocol Secure_) é uma extensão do Protocolo de Transferência de Hipertexto (HTTP). HTTPS transmite dados encriptados utilizando Segurança na Camada de Transporte (TLS, _Transport Layer Security_). Se os dados forem sequestrados online, tudo que o sequestrador obtém é código binário.

-How is the data encrypted and decrypted? +Como os dados são encriptados e decriptados? -Step 1 - The client (browser) and the server establish a TCP connection. +Passo 1 - O cliente (navegador) e o servidor estabelecem uma conexão TCP. -Step 2 - The client sends a “client hello” to the server. The message contains a set of necessary encryption algorithms (cipher suites) and the latest TLS version it can support. The server responds with a “server hello” so the browser knows whether it can support the algorithms and TLS version. +Passo 2 - O cliente envia um "client hello" para o servidor. A mensagem contém um conjunto de algoritmos de criptografia necessários (suítes de cifras) e a última versão do TLS que ele pode suportar. O servidor responde com um "server hello" para que o navegador saiba se pode suportar os algoritmos e a versão do TLS. -The server then sends the SSL certificate to the client. The certificate contains the public key, host name, expiry dates, etc. The client validates the certificate. +O servidor então envia o certificado SSL para o cliente. O certificado contém a chave pública, hostname, data de expiração etc. O cliente valida o certificado. -Step 3 - After validating the SSL certificate, the client generates a session key and encrypts it using the public key. The server receives the encrypted session key and decrypts it with the private key. +Passo 3 - Após a validação do certificado SSL, o cliente gera a chave de sessão e a encripta utilizando a chave pública. O servidor recebe a chave de sessão encriptada e a decripta com a chave privada. -Step 4 - Now that both the client and the server hold the same session key (symmetric encryption), the encrypted data is transmitted in a secure bi-directional channel. +Passo 4 - Agora que tanto o cliente como o servidor possuem a mesma chave de sessão (criptografia simétrica), os dados encriptados são transmitidos em um canal bidirecional seguro.
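Os Passos 3 e 4 podem ser ilustrados com um esboço didático em Python: RSA de livro-texto com números minúsculos e um "cifrador" XOR hipotético, apenas para mostrar a ideia da criptografia híbrida — nada disso é seguro nem corresponde ao TLS real:

```python
import hashlib
import secrets

# RSA de livro-texto (APENAS didático, números minúsculos):
# p = 61, q = 53 -> n = 3233, phi = 3120, e = 17, d = 2753
N, E, D = 3233, 17, 2753          # (N, E) é a chave pública; D é a privada

# Passo 3 - o cliente gera uma chave de sessão e a encripta com a chave pública
session_key = secrets.randbelow(N - 2) + 2   # inteiro pequeno, só para o exemplo
encrypted_key = pow(session_key, E, N)       # enviado ao servidor

# O servidor decripta a chave de sessão com a chave privada
server_session_key = pow(encrypted_key, D, N)
assert server_session_key == session_key

# Passo 4 - ambos derivam um fluxo de bytes da mesma chave de sessão
def keystream(key: int, length: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key.to_bytes(8, "big") + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key: int, data: bytes) -> bytes:
    # XOR com o fluxo derivado: a mesma função encripta e decripta (simétrico)
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

msg = b"dados do canal seguro"
wire = xor_cipher(session_key, msg)          # encriptado pelo cliente
print(xor_cipher(server_session_key, wire))  # decriptado pelo servidor
```

No TLS real, a criptografia assimétrica usa chaves de milhares de bits e o canal simétrico usa cifras como AES-GCM ou ChaCha20-Poly1305; o esboço serve apenas para visualizar a troca de chave seguida do canal simétrico.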
-Why does HTTPS switch to symmetric encryption during data transmission? There are two main reasons: +Por que HTTPS troca para criptografia simétrica durante a transmissão de dados? Há duas razões principais: -1. Security: The asymmetric encryption goes only one way. This means that if the server tries to send the encrypted data back to the client, anyone can decrypt the data using the public key. +1. Segurança: A criptografia assimétrica funciona em apenas um sentido. Isso significa que se o servidor tentar enviar dados criptografados de volta para o cliente, qualquer pessoa consegue decriptar os dados utilizando a chave pública. -2. Server resources: The asymmetric encryption adds quite a lot of mathematical overhead. It is not suitable for data transmissions in long sessions. +2. Recursos do Servidor: A criptografia assimétrica adiciona uma carga matemática significativa. Não é adequada para transmissões de dados em sessões longas. -### Oauth 2.0 Explained With Simple Terms. +### Oauth 2.0 Explicado com Termos Simples. -OAuth 2.0 is a powerful and secure framework that allows different applications to securely interact with each other on behalf of users without sharing sensitive credentials. +OAuth 2.0 é um framework poderoso e seguro que permite diferentes aplicações interagirem umas com as outras de forma segura em nome dos usuários, sem compartilhar credenciais sensíveis.
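Como ilustração do lado do cliente — com endpoint, client_id e redirect_uri hipotéticos — um esboço de como montar a URL do passo de autorização do fluxo _authorization code_ do OAuth 2.0, usando apenas a biblioteca padrão do Python:

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs

# Valores hipotéticos, apenas para ilustrar a forma da requisição
AUTHORIZE_ENDPOINT = "https://idp.example.com/oauth2/authorize"

state = secrets.token_urlsafe(16)   # valor anti-CSRF; deve ser validado no retorno
params = {
    "response_type": "code",                       # fluxo authorization code
    "client_id": "my-client-id",                   # hipotético
    "redirect_uri": "https://app.example.com/cb",  # hipotético
    "scope": "profile email",                      # acesso limitado, não "tudo"
    "state": state,
}
auth_url = f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"
print(auth_url)

# O usuário autentica no IDP; o IDP redireciona para redirect_uri com
# ?code=...&state=...; o cliente então troca o "code" por um access token.
```

O parâmetro `scope` é o que limita o token ao subconjunto de dados que o usuário autorizou, como descrito acima.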

-The entities involved in OAuth are the User, the Server, and the Identity Provider (IDP). +As entidades envolvidas no OAuth são o Usuário, o Servidor e o Provedor de Identidade (IDP, _Identity Provider_). -What Can an OAuth Token Do? +O que um Token OAuth pode fazer? -When you use OAuth, you get an OAuth token that represents your identity and permissions. This token can do a few important things: +Quando você utiliza OAuth, você recebe um token OAuth que representa sua identidade e permissões. Esse token pode fazer algumas coisas importantes: -Single Sign-On (SSO): With an OAuth token, you can log into multiple services or apps using just one login, making life easier and safer. +Single Sign-On (SSO): Com um token OAuth, você pode fazer login em vários serviços ou aplicativos com um único login, tornando a vida mais fácil e segura. -Authorization Across Systems: The OAuth token allows you to share your authorization or access rights across various systems, so you don't have to log in separately everywhere. +Autorização Entre Sistemas: O token OAuth permite que você compartilhe suas permissões ou direitos de acesso em vários sistemas, evitando que você precise fazer login separadamente em cada lugar. -Accessing User Profile: Apps with an OAuth token can access certain parts of your user profile that you allow, but they won't see everything. +Acesso ao Perfil do Usuário: Aplicativos com um token OAuth podem acessar partes específicas do seu perfil de usuário que você permite, mas eles não verão tudo. -Remember, OAuth 2.0 is all about keeping you and your data safe while making your online experiences seamless and hassle-free across different applications and services. +Lembre-se, o OAuth 2.0 visa manter você e seus dados seguros, tornando suas experiências online contínuas e sem complicações em diferentes aplicativos e serviços. -### Top 4 Forms of Authentication Mechanisms +### Principais 4 Formas de Mecanismos de Autenticação

-1. SSH Keys: +1. Chaves SSH: - Cryptographic keys are used to access remote systems and servers securely + Chaves criptográficas são utilizadas para acessar sistemas e servidores remotos de forma segura -1. OAuth Tokens: +2. Tokens OAuth: - Tokens that provide limited access to user data on third-party applications + Tokens que fornecem acesso limitado aos dados do usuário em aplicativos de terceiros. -1. SSL Certificates: +3. Certificados SSL: - Digital certificates ensure secure and encrypted communication between servers and clients + Certificados digitais garantem comunicação segura e encriptada entre servidores e clientes -1. Credentials: +4. Credenciais: - User authentication information is used to verify and grant access to various systems and services + As informações de autenticação do usuário são utilizadas para verificar e conceder acesso a vários sistemas e serviços. -### Session, cookie, JWT, token, SSO, and OAuth 2.0 - what are they? +### Sessão, cookie, JWT, token, SSO, e OAuth 2.0 - o que são? -These terms are all related to user identity management. When you log into a website, you declare who you are (identification). Your identity is verified (authentication), and you are granted the necessary permissions (authorization). Many solutions have been proposed in the past, and the list keeps growing. +Esses termos estão todos relacionados à gestão da identidade do usuário. Quando você faz login em um site, declara quem é (identificação). Sua identidade é verificada (autenticação) e são concedidas as permissões necessárias (autorização). Muitas soluções foram propostas no passado, e a lista continua crescendo.

-From simple to complex, here is my understanding of user identity management: +De simples até complexo, aqui está a minha compreensão sobre a gestão de identidade do usuário: -- WWW-Authenticate is the most basic method. You are asked for the username and password by the browser. As a result of the inability to control the login life cycle, it is seldom used today. +- WWW-Authenticate é o método mais básico. O navegador solicita o nome de usuário e a senha. Devido à incapacidade de controlar o ciclo de vida do login, raramente é usado hoje em dia. -- A finer control over the login life cycle is session-cookie. The server maintains session storage, and the browser keeps the ID of the session. A cookie usually only works with browsers and is not mobile app friendly. +- Um controle mais refinado sobre o ciclo de vida do login é feito com session-cookie (cookie da sessão). O servidor mantém o armazenamento de sessão, e o navegador mantém o ID da sessão. Um cookie geralmente funciona apenas com navegadores e não é amigável para aplicativos móveis. -- To address the compatibility issue, the token can be used. The client sends the token to the server, and the server validates the token. The downside is that the token needs to be encrypted and decrypted, which may be time-consuming. +- Para lidar com o problema de compatibilidade, o token pode ser usado. O cliente envia o token para o servidor, e o servidor valida o token. A desvantagem é que o token precisa ser criptografado e descriptografado, o que pode ser demorado. -- JWT is a standard way of representing tokens. This information can be verified and trusted because it is digitally signed. Since JWT contains the signature, there is no need to save session information on the server side. +- JWT é uma maneira padrão de representar tokens. Essas informações podem ser verificadas e confiáveis porque são digitalmente assinadas. 
Como o JWT contém a assinatura, não é necessário salvar informações de sessão no lado do servidor. -- By using SSO (single sign-on), you can sign on only once and log in to multiple websites. It uses CAS (central authentication service) to maintain cross-site information. +- Ao usar SSO (entrada única, _single sign-on_), você pode fazer login apenas uma vez e acessar vários sites. Ele utiliza o CAS (serviço de autenticação central, _central authentication service_) para manter informações entre sites. -- By using OAuth 2.0, you can authorize one website to access your information on another website. +- Ao usar OAuth 2.0, você pode autorizar um site a acessar suas informações em outro site. -### How to store passwords safely in the database and how to validate a password? +### Como armazenar senhas de forma segura em bancos de dados e como validá-las?

-**Things NOT to do** +**O que NÃO fazer** -- Storing passwords in plain text is not a good idea because anyone with internal access can see them. +- Armazenar senhas em texto puro não é uma boa ideia pois qualquer pessoa com acesso interno consegue vê-las. -- Storing password hashes directly is not sufficient because it is pruned to precomputation attacks, such as rainbow tables. +- Armazenar diretamente os hashes de senhas não é suficiente, pois está sujeito a ataques de pré-computação, como tabelas arco-íris (_rainbow tables_). -- To mitigate precomputation attacks, we salt the passwords. +- Para mitigar ataques de pré-computação, adicionamos um salt às senhas. -**What is salt?** +**O que é salt?** -According to OWASP guidelines, “a salt is a unique, randomly generated string that is added to each password as part of the hashing process”. +De acordo com as diretrizes da OWASP, "um _salt_ (sal) é uma string única e gerada aleatoriamente que é adicionada a cada senha como parte do processo de hash". -**How to store a password and salt?** +**Como armazenar uma senha e salt?** -1. the hash result is unique to each password. -1. The password can be stored in the database using the following format: hash(password + salt). +1. O resultado do hash é único para cada senha. +2. A senha pode ser armazenada no banco de dados usando o seguinte formato: hash(senha + salt). -**How to validate a password?** +**Como validar uma senha?** -To validate a password, it can go through the following process: +Para validar uma senha, ela pode passar pelo seguinte processo: -1. A client enters the password. -1. The system fetches the corresponding salt from the database. -1. The system appends the salt to the password and hashes it. Let’s call the hashed value H1. -1. The system compares H1 and H2, where H2 is the hash stored in the database. If they are the same, the password is valid. +1. Um cliente insere a senha. +2. O sistema recupera o salt correspondente do banco de dados. +3.
O sistema concatena o salt à senha e realiza o hash. Vamos chamar o valor hash resultante de H1. +4. O sistema compara H1 e H2, onde H2 é o hash armazenado no banco de dados. Se forem iguais, a senha é válida. -### Explaining JSON Web Token (JWT) to a 10 year old Kid +### Explicando JSON Web Token (JWT) para uma criança de 10 anos de idade

-Imagine you have a special box called a JWT. Inside this box, there are three parts: a header, a payload, and a signature. +Imagine que você tem uma caixa especial chamada JWT. Dentro dessa caixa, existem três partes: um cabeçalho, uma carga útil e uma assinatura. -The header is like the label on the outside of the box. It tells us what type of box it is and how it's secured. It's usually written in a format called JSON, which is just a way to organize information using curly braces { } and colons : . +O cabeçalho é como a etiqueta do lado de fora da caixa. Ele nos diz que tipo de caixa é e como ela está protegida. Geralmente, é escrito em um formato chamado JSON, que é apenas uma maneira de organizar informações usando chaves { } e dois-pontos : . -The payload is like the actual message or information you want to send. It could be your name, age, or any other data you want to share. It's also written in JSON format, so it's easy to understand and work with. -Now, the signature is what makes the JWT secure. It's like a special seal that only the sender knows how to create. The signature is created using a secret code, kind of like a password. This signature ensures that nobody can tamper with the contents of the JWT without the sender knowing about it. +A carga útil é como a mensagem ou a informação real que você deseja enviar. Pode ser seu nome, idade ou qualquer outro dado que você queira compartilhar. Também é escrito no formato JSON, tornando-o fácil de entender e utilizar. -When you want to send the JWT to a server, you put the header, payload, and signature inside the box. Then you send it over to the server. The server can easily read the header and payload to understand who you are and what you want to do. +Agora, a assinatura é o que torna o JWT seguro. É como um selo especial que apenas o remetente sabe como criar. A assinatura é criada usando um código secreto, algo semelhante a uma senha. 
Essa assinatura garante que ninguém possa adulterar o conteúdo do JWT sem que o remetente saiba sobre a alteração. -### How does Google Authenticator (or other types of 2-factor authenticators) work? +Quando você deseja enviar o JWT para um servidor, coloca o cabeçalho, a carga útil e a assinatura dentro da caixa. Em seguida, você envia para o servidor. O servidor pode ler facilmente o cabeçalho e a carga útil para entender quem você é e o que deseja fazer. -Google Authenticator is commonly used for logging into our accounts when 2-factor authentication is enabled. How does it guarantee security? +### Como o Autenticador do Google (ou outros tipos de autenticadores de 2-fatores) funciona? -Google Authenticator is a software-based authenticator that implements a two-step verification service. The diagram below provides detail. +O Autenticador do Google é comumente utilizado para fazer login em contas quando a autenticação de dois fatores está habilitada. Como ele garante segurança? + +O Autenticador do Google é um autenticador baseado em software que implementa um serviço de verificação em duas etapas. O diagrama abaixo fornece detalhes.

-There are two stages involved: +Existem duas etapas envolvidas: + +- Etapa 1 - O usuário habilita a verificação de duas etapas do Google. +- Etapa 2 - O usuário utiliza o autenticador para fazer login etc. + +Vamos analisar essas etapas. -- Stage 1 - The user enables Google two-step verification. -- Stage 2 - The user uses the authenticator for logging in, etc. +**Etapa 1** -Let’s look at these stages. +Passos 1 e 2: Bob abre a página da web para habilitar a verificação em duas etapas. A interface solicita uma chave secreta. O serviço de autenticação gera a chave secreta para Bob e a armazena no banco de dados. -**Stage 1** +Passo 3: O serviço de autenticação retorna uma URI para a interface. A URI é composta por um emissor de chave, nome de usuário e chave secreta. A URI é exibida na forma de um código QR na página da web. -Steps 1 and 2: Bob opens the web page to enable two-step verification. The front end requests a secret key. The authentication service generates the secret key for Bob and stores it in the database. +Passo 4: Bob então usa o Google Authenticator para escanear o código QR gerado. A chave secreta é armazenada no autenticador. -Step 3: The authentication service returns a URI to the front end. The URI is composed of a key issuer, username, and secret key. The URI is displayed in the form of a QR code on the web page. +**Etapa 2** -Step 4: Bob then uses Google Authenticator to scan the generated QR code. The secret key is stored in the authenticator. +Passos 1 e 2: Bob deseja fazer login em um site com a verificação em duas etapas do Google. Para isso, ele precisa da senha. A cada 30 segundos, o Google Authenticator gera uma senha de 6 dígitos usando o algoritmo TOTP (Senha de Uso Único Baseada em Tempo, _Time-based One Time Password_). Bob usa a senha para acessar o site. -**Stage 2** -Steps 1 and 2: Bob wants to log into a website with Google two-step verification. For this, he needs the password.
Every 30 seconds, Google Authenticator generates a 6-digit password using TOTP (Time-based One Time Password) algorithm. Bob uses the password to enter the website. +Passos 3 e 4: A interface envia a senha que Bob insere para o backend para autenticação. O serviço de autenticação lê a chave secreta do banco de dados e gera uma senha de 6 dígitos usando o mesmo algoritmo TOTP que o cliente. -Steps 3 and 4: The frontend sends the password Bob enters to the backend for authentication. The authentication service reads the secret key from the database and generates a 6-digit password using the same TOTP algorithm as the client. +Passo 5: O serviço de autenticação compara as duas senhas geradas pelo cliente e pelo servidor, e retorna o resultado da comparação para a interface. Bob pode prosseguir com o processo de login apenas se as duas senhas coincidirem. -Step 5: The authentication service compares the two passwords generated by the client and the server, and returns the comparison result to the frontend. Bob can proceed with the login process only if the two passwords match. +O mecanismo de autenticação é seguro? -Is this authentication mechanism safe? +- A chave secreta pode ser obtida por outras pessoas? -- Can the secret key be obtained by others? + Nós precisamos garantir que a chave secreta é transmitida via HTTPS. O cliente do autenticador e o banco de dados armazenam a chave secreta, e precisamos garantir que as chaves secretas sejam criptografadas. - We need to make sure the secret key is transmitted using HTTPS. The authenticator client and the database store the secret key, and we need to make sure the secret keys are encrypted. +- Os hackers podem adivinhar a senha de 6 dígitos? -- Can the 6-digit password be guessed by hackers? - No. The password has 6 digits, so the generated password has 1 million potential combinations. Plus, the password changes every 30 seconds. 
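O algoritmo TOTP citado acima (RFC 6238: HMAC-SHA-1, janelas de 30 segundos, 6 dígitos) pode ser esboçado apenas com a biblioteca padrão do Python — um esboço didático, sem as proteções extras (tolerância de janela, bloqueio de reuso de código) que um serviço real precisaria:

```python
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32, timestamp=None, step=30, digits=6):
    """Senha de uso único baseada em tempo (RFC 6238, HMAC-SHA-1)."""
    if timestamp is None:
        timestamp = time.time()
    counter = int(timestamp) // step                 # número da janela de 30 s
    key = base64.b32decode(secret_b32)
    msg = struct.pack(">Q", counter)                 # contador de 8 bytes big-endian
    digest = hmac.new(key, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                       # truncamento dinâmico (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# Cliente (autenticador) e servidor compartilham o mesmo segredo (o do QR code);
# dentro da mesma janela de 30 s, ambos calculam o mesmo valor.
SECRET = "JBSWY3DPEHPK3PXP"   # segredo de exemplo, em Base32
now = time.time()
print(totp(SECRET, now))
```

Como o cálculo depende apenas do segredo compartilhado e do relógio, o servidor não precisa se comunicar com o autenticador — basta repetir o mesmo cálculo e comparar, exatamente como descrito nos Passos 3 a 5.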
If hackers want to guess the password in 30 seconds, they need to enter 30,000 combinations per second. + Não. A senha possui 6 dígitos, o que resulta em 1 milhão de combinações potenciais. Além disso, a senha muda a cada 30 segundos. Se os hackers quiserem adivinhar a senha em 30 segundos, precisariam inserir 30.000 combinações por segundo. ## Real World Case Studies From 5ed8aa3be01d4673ca50cf4c3b22660f40af067d Mon Sep 17 00:00:00 2001 From: Daniel Lombardi Date: Thu, 21 Dec 2023 23:03:34 -0300 Subject: [PATCH 19/19] Case Studies Done Signed-off-by: Daniel Lombardi --- translations/README-ptbr.md | 230 ++++++++++++++++++------------------ 1 file changed, 115 insertions(+), 115 deletions(-) diff --git a/translations/README-ptbr.md b/translations/README-ptbr.md index 5bcadd9..0a03878 100644 --- a/translations/README-ptbr.md +++ b/translations/README-ptbr.md @@ -96,16 +96,16 @@ Seja que você esteja se preparando para uma Entrevista de Design de Sistemas ou - [Como armazenar senhas de forma segura em bancos de dados e como validá-las?](#como-armazenar-senhas-de-forma-segura-em-bancos-de-dados-e-como-validá-las) - [Explicando JSON Web Token (JWT) para uma criança de 10 anos de idade](#explicando-json-web-token-jwt-para-uma-criança-de-10-anos-de-idade) - [Como o Autenticador do Google (ou outros tipos de autenticadores de 2-fatores) funciona?](#como-o-autenticador-do-google-ou-outros-tipos-de-autenticadores-de-2-fatores-funciona) - - [Real World Case Studies](#real-world-case-studies) - - [Netflix's Tech Stack](#netflixs-tech-stack) - - [Twitter Architecture 2022](#twitter-architecture-2022) - - [Evolution of Airbnb’s microservice architecture over the past 15 years](#evolution-of-airbnbs-microservice-architecture-over-the-past-15-years) + - [Estudos de Caso do Mundo Real](#estudos-de-caso-do-mundo-real) + - [Pilha Tecnológica do Netflix](#pilha-tecnológica-do-netflix) + - [Arquitetura do Twitter 2022](#arquitetura-do-twitter-2022) + - [A Evolução da arquitetura de 
microsserviços do Airbnb nos ultimos 15 anos](#a-evolução-da-arquitetura-de-microsserviços-do-airbnb-nos-ultimos-15-anos) - [Monorepo vs. Microrepo.](#monorepo-vs-microrepo) - - [How will you design the Stack Overflow website?](#how-will-you-design-the-stack-overflow-website) - - [Why did Amazon Prime Video monitoring move from serverless to monolithic? How can it save 90% cost?](#why-did-amazon-prime-video-monitoring-move-from-serverless-to-monolithic-how-can-it-save-90-cost) - - [How does Disney Hotstar capture 5 Billion Emojis during a tournament?](#how-does-disney-hotstar-capture-5-billion-emojis-during-a-tournament) - - [How Discord Stores Trillions Of Messages](#how-discord-stores-trillions-of-messages) - - [How do video live streamings work on YouTube, TikTok live, or Twitch?](#how-do-video-live-streamings-work-on-youtube-tiktok-live-or-twitch) + - [Como você desenharia o website Stack Overflow?](#como-você-desenharia-o-website-stack-overflow) + - [Por que o monitoramenteo do Amazon Prime Video migrou de serverless para monólito? Como isso pode evitar 90% dos custos?](#por-que-o-monitoramenteo-do-amazon-prime-video-migrou-de-serverless-para-monólito-como-isso-pode-evitar-90-dos-custos) + - [Como o Disney Hotstar captura 5 Bilhões de Emojis durante um torneio?](#como-o-disney-hotstar-captura-5-bilhões-de-emojis-durante-um-torneio) + - [Como o Discord Armazena Trilhões de Mensagens](#como-o-discord-armazena-trilhões-de-mensagens) + - [Como live-streams de video funcionam no YouTube, TikTok live ou Twitch?](#como-live-streams-de-video-funcionam-no-youtube-tiktok-live-ou-twitch) - [License](#license) @@ -1211,7 +1211,7 @@ Cloud native inclui 4 aspectos: 2. Arquitetura de Aplicação - A arquitetura foi de monolito para microsserviços. Cada serviço é projetado para ser pequeno, adaptativo para os recursos limitados em containers na cloud. + A arquitetura foi de monólito para microsserviços. 
Cada serviço é projetado para ser pequeno, adaptativo para os recursos limitados em containers na cloud. 3. Implantação & Empacotamento @@ -1476,184 +1476,184 @@ O mecanismo de autenticação é seguro? Não. A senha possui 6 dígitos, o que resulta em 1 milhão de combinações potenciais. Além disso, a senha muda a cada 30 segundos. Se os hackers quiserem adivinhar a senha em 30 segundos, precisariam inserir 30.000 combinações por segundo. -## Real World Case Studies +## Estudos de Caso do Mundo Real -### Netflix's Tech Stack +### Pilha Tecnológica do Netflix -This post is based on research from many Netflix engineering blogs and open-source projects. If you come across any inaccuracies, please feel free to inform us. +Este post é baseado em pesquisas de diversos blogs de engenharia da Netflix e projetos de código aberto. Se encontrar qualquer imprecisão, sinta-se à vontade para nos informar.

-**Mobile and web**: Netflix has adopted Swift and Kotlin to build native mobile apps. For its web application, it uses React. +**Mobile e web**: A Netflix adotou Swift e Kotlin para construir seus aplicativos móveis nativos. Para a aplicação web, eles utilizam React. -**Frontend/server communication**: Netflix uses GraphQL. +**Comunicação Frontend/servidor**: A Netflix utiliza GraphQL. -**Backend services**: Netflix relies on ZUUL, Eureka, the Spring Boot framework, and other technologies. +**Serviços Backend**: A Netflix depende do ZUUL, Eureka, do framework Spring Boot e outras tecnologias. -**Databases**: Netflix utilizes EV cache, Cassandra, CockroachDB, and other databases. +**Bancos de Dados**: A Netflix utiliza EV cache, Cassandra, CockroachDB e outros bancos de dados. -**Messaging/streaming**: Netflix employs Apache Kafka and Fink for messaging and streaming purposes. +**Mensageria/Streaming**: A Netflix utiliza o Apache Kafka e o Flink para fins de mensagens e streaming. -**Video storage**: Netflix uses S3 and Open Connect for video storage. +**Armazenamento de Vídeo**: A Netflix utiliza o S3 e o Open Connect para armazenamento de vídeos. -**Data processing**: Netflix utilizes Flink and Spark for data processing, which is then visualized using Tableau. Redshift is used for processing structured data warehouse information. +**Processamento de Dados**: A Netflix utiliza o Flink e o Spark para processamento de dados, que é então visualizado usando o Tableau. O Redshift é usado para processar informações do data warehouse estruturado. -**CI/CD**: Netflix employs various tools such as JIRA, Confluence, PagerDuty, Jenkins, Gradle, Chaos Monkey, Spinnaker, Atlas, and more for CI/CD processes. +**CI/CD**: A Netflix utiliza diversas ferramentas como JIRA, Confluence, PagerDuty, Jenkins, Gradle, Chaos Monkey, Spinnaker, Atlas e mais para processos de CI/CD. -### Twitter Architecture 2022 +### Arquitetura do Twitter 2022 -Yes, this is the real Twitter architecture.
It is posted by Elon Musk and redrawn by us for better readability. +Sim, esta é a arquitetura real do Twitter. Foi postada por Elon Musk e redesenhada por nós para facilitar a leitura.

-### Evolution of Airbnb’s microservice architecture over the past 15 years
+### A evolução da arquitetura de microsserviços do Airbnb nos últimos 15 anos
-Airbnb’s microservice architecture went through 3 main stages.
+A arquitetura de microsserviços do Airbnb passou por 3 estágios principais.

-Monolith (2008 - 2017)
+Monólito (2008 - 2017)
-Airbnb began as a simple marketplace for hosts and guests. This is built in a Ruby on Rails application - the monolith.
+O Airbnb começou como um marketplace simples para anfitriões e hóspedes. Ele foi construído como uma aplicação Ruby on Rails - o monólito.
-What’s the challenge?
+Qual o desafio?
-- Confusing team ownership + unowned code
-- Slow deployment
+- Propriedade da equipe confusa + código não atribuído
+- Implantação lenta
-Microservices (2017 - 2020)
+Microsserviços (2017 - 2020)
-Microservice aims to solve those challenges. In the microservice architecture, key services include:
+A arquitetura de microsserviços visa resolver esses desafios. Na arquitetura de microsserviços, os serviços-chave incluem:
-- Data fetching service
-- Business logic data service
-- Write workflow service
-- UI aggregation service
-- Each service had one owning team
+- Serviço de busca de dados
+- Serviço de lógica de negócios para dados
+- Serviço de fluxo de escrita
+- Serviço de agregação de interface do usuário
+- Cada serviço tinha uma equipe responsável
-What’s the challenge?
+Qual o desafio?
-Hundreds of services and dependencies were difficult for humans to manage.
+Centenas de serviços e dependências eram difíceis de gerenciar por humanos.
-Micro + macroservices (2020 - present)
+Micro + macrosserviços (2020 - presente)
-This is what Airbnb is working on now. The micro and macroservice hybrid model focuses on the unification of APIs.
+É nisso que o Airbnb está trabalhando agora. O modelo híbrido de micro e macrosserviços foca na unificação de APIs.
### Monorepo vs. Microrepo.
-Which is the best? Why do different companies choose different options?
+Qual é o melhor? Por que empresas diferentes fazem escolhas diferentes?

-Monorepo isn't new; Linux and Windows were both created using Monorepo. To improve scalability and build speed, Google developed its internal dedicated toolchain to scale it faster and strict coding quality standards to keep it consistent.
+O monorepositório não é algo novo; tanto o Linux quanto o Windows foram criados usando um monorepositório. Para melhorar a escalabilidade e a velocidade de compilação, o Google desenvolveu sua própria cadeia de ferramentas interna dedicada para acelerar o processo e padrões estritos de qualidade de código para mantê-lo consistente.
-Amazon and Netflix are major ambassadors of the Microservice philosophy. This approach naturally separates the service code into separate repositories. It scales faster but can lead to governance pain points later on.
+Amazon e Netflix são grandes defensores da filosofia de microsserviços. Essa abordagem naturalmente separa o código do serviço em repositórios separados. Isso escala mais rapidamente, mas pode levar a problemas de governança mais tarde.
-Within Monorepo, each service is a folder, and every folder has a BUILD config and OWNERS permission control. Every service member is responsible for their own folder.
+Dentro do Monorepositório, cada serviço é uma pasta, e cada pasta possui uma configuração BUILD e controle de permissões OWNERS. Cada membro do serviço é responsável pela sua própria pasta.
-On the other hand, in Microrepo, each service is responsible for its repository, with the build config and permissions typically set for the entire repository.
+Por outro lado, no Microrrepositório, cada serviço é responsável por seu próprio repositório, com a configuração de compilação e permissões normalmente definidas para todo o repositório.
-In Monorepo, dependencies are shared across the entire codebase regardless of your business, so when there's a version upgrade, every codebase upgrades their version. 
+No Monorepositório, as dependências são compartilhadas em todo o código, independentemente do seu propósito comercial. Assim, quando há uma atualização de versão, todo o código atualiza sua versão. -In Microrepo, dependencies are controlled within each repository. Businesses choose when to upgrade their versions based on their own schedules. +No Microrrepositório, as dependências são controladas dentro de cada repositório. As empresas escolhem quando atualizar suas versões com base em seus próprios cronogramas. -Monorepo has a standard for check-ins. Google's code review process is famously known for setting a high bar, ensuring a coherent quality standard for Monorepo, regardless of the business. +No Monorepositório, há um padrão para check-ins. O processo de revisão de código do Google é conhecido por estabelecer um padrão de qualidade elevado, garantindo um padrão de qualidade coerente para o Monorepositório, independentemente do negócio. -Microrepo can either set its own standard or adopt a shared standard by incorporating the best practices. It can scale faster for business, but the code quality might be a bit different. -Google engineers built Bazel, and Meta built Buck. There are other open-source tools available, including Nx, Lerna, and others. +No Microrrepositório, pode-se definir seu próprio padrão ou adotar um padrão compartilhado incorporando as melhores práticas. Isso pode escalar mais rapidamente para os negócios, mas a qualidade do código pode ser um pouco diferente. -Over the years, Microrepo has had more supported tools, including Maven and Gradle for Java, NPM for NodeJS, and CMake for C/C++, among others. +Engenheiros do Google desenvolveram o Bazel, e a Meta construiu o Buck. Existem outras ferramentas de código aberto disponíveis, incluindo Nx, Lerna e outras. -### How will you design the Stack Overflow website? 
+Ao longo dos anos, o Microrrepositório teve mais ferramentas suportadas, incluindo Maven e Gradle para Java, NPM para NodeJS e CMake para C/C++, entre outras. -If your answer is on-premise servers and monolith (on the bottom of the following image), you would likely fail the interview, but that's how it is built in reality! +### Como você desenharia o website Stack Overflow? + +Se sua resposta for em servidores locais (on-premise) e monólito (na parte inferior da imagem a seguir), você provavelmente não passaria na entrevista, mas na realidade, é assim que é construído!

-**What people think it should look like**
+**Como as pessoas acham que deveria ser**
-The interviewer is probably expecting something like the top portion of the picture.
+O entrevistador provavelmente está esperando uma resposta como a parte superior da imagem.
-- Microservice is used to decompose the system into small components.
-- Each service has its own database. Use cache heavily.
-- The service is sharded.
-- The services talk to each other asynchronously through message queues.
-- The service is implemented using Event Sourcing with CQRS.
-- Showing off knowledge in distributed systems such as eventual consistency, CAP theorem, etc.
+- Microsserviço é utilizado para decompor o sistema em componentes pequenos.
+- Cada serviço tem seu próprio banco de dados. Usa cache intensivamente.
+- O serviço é particionado (sharded).
+- Os serviços se comunicam uns com os outros de forma assíncrona por meio de filas de mensagens.
+- O serviço é implementado utilizando Event Sourcing com CQRS.
+- Demonstrar conhecimento em sistemas distribuídos, como consistência eventual, teorema CAP etc.
-**What it actually is**
+**Como realmente é**
-Stack Overflow serves all the traffic with only 9 on-premise web servers, and it’s on monolith! It has its own servers and does not run on the cloud.
+O Stack Overflow atende todo o tráfego com apenas 9 servidores web locais e está em um monolito! Ele possui seus próprios servidores e não opera na nuvem.
-This is contrary to all our popular beliefs these days.
+Isso vai contra todas as nossas crenças populares nos dias de hoje.
-### Why did Amazon Prime Video monitoring move from serverless to monolithic? How can it save 90% cost?
+### Por que o monitoramento do Amazon Prime Video migrou de serverless para monólito? Como isso economizou 90% dos custos?
-The diagram below shows the architecture comparison before and after the migration.
+O diagrama abaixo mostra a comparação da arquitetura antes e depois da migração.

-What is Amazon Prime Video Monitoring Service?
+O que é o Serviço de Monitoramento do Amazon Prime Video?
-Prime Video service needs to monitor the quality of thousands of live streams. The monitoring tool automatically analyzes the streams in real time and identifies quality issues like block corruption, video freeze, and sync problems. This is an important process for customer satisfaction.
+O serviço do Prime Video precisa monitorar a qualidade de milhares de transmissões ao vivo. A ferramenta de monitoramento analisa automaticamente as streams em tempo real e identifica problemas de qualidade como corrupção de blocos, congelamentos de vídeo e problemas de sincronização. Esse é um processo importante para a satisfação do cliente.
-There are 3 steps: media converter, defect detector, and real-time notification.
+Existem 3 etapas: conversor de mídia, detector de defeitos e notificação em tempo real.
-- What is the problem with the old architecture?
+- Qual o problema com a arquitetura antiga?
- The old architecture was based on Amazon Lambda, which was good for building services quickly. However, it was not cost-effective when running the architecture at a high scale. The two most expensive operations are:
+ A arquitetura antiga era baseada no Amazon Lambda, que era bom para construir serviços rapidamente. Porém, não era econômica ao executar em larga escala. As duas operações mais caras eram:
-1. The orchestration workflow - AWS step functions charge users by state transitions and the orchestration performs multiple state transitions every second.
+1. O fluxo de trabalho de orquestração - as AWS Step Functions cobram os usuários por transições de estado, e a orquestração realiza várias transições de estado a cada segundo.
-2. Data passing between distributed components - the intermediate data is stored in Amazon S3 so that the next stage can download. The download can be costly when the volume is high.
+2. 
Passagem de dados entre componentes distribuídos - os dados intermediários são armazenados no Amazon S3 para que a próxima etapa possa fazer o download. O download pode ser caro quando o volume é alto.
-- Monolithic architecture saves 90% cost
+- A arquitetura monolítica economiza 90% dos custos
- A monolithic architecture is designed to address the cost issues. There are still 3 components, but the media converter and defect detector are deployed in the same process, saving the cost of passing data over the network. Surprisingly, this approach to deployment architecture change led to 90% cost savings!
+ Uma arquitetura monolítica foi projetada para lidar com as questões de custo. Ainda existem 3 componentes, mas o conversor de mídia e o detector de defeitos são implantados no mesmo processo, economizando o custo de passagem de dados pela rede. Surpreendentemente, essa mudança na arquitetura de implantação resultou em uma economia de custos de 90%!
-This is an interesting and unique case study because microservices have become a go-to and fashionable choice in the tech industry. It's good to see that we are having more discussions about evolving the architecture and having more honest discussions about its pros and cons. Decomposing components into distributed microservices comes with a cost.
+Este é um estudo de caso interessante e único, porque os microsserviços se tornaram uma escolha popular e na moda na indústria de tecnologia. É bom ver que estamos tendo mais discussões sobre a evolução da arquitetura e discussões mais honestas sobre seus prós e contras. Decompor componentes em microsserviços distribuídos tem um custo.
-- What did Amazon leaders say about this?
+- O que os líderes da Amazon disseram sobre isso?
- Amazon CTO Werner Vogels: “Building **evolvable software systems** is a strategy, not a religion. 
And revisiting your architecture with an open mind is a must.”
+ Werner Vogels, CTO da Amazon: "Construir **sistemas de software evoluíveis** é uma estratégia, não uma religião. E revisitar sua arquitetura com a mente aberta é indispensável".
-Ex Amazon VP Sustainability Adrian Cockcroft: “The Prime Video team had followed a path I call **Serverless First**…I don’t advocate **Serverless Only**”.
+Ex-VP de Sustentabilidade da Amazon, Adrian Cockcroft: "A equipe do Prime Video seguiu um caminho que eu chamo de **Serverless First**... Eu não defendo o **Serverless Only**."
-### How does Disney Hotstar capture 5 Billion Emojis during a tournament?
+### Como o Disney Hotstar captura 5 bilhões de emojis durante um torneio?

-1. Clients send emojis through standard HTTP requests. You can think of Golang Service as a typical Web Server. Golang is chosen because it supports concurrency well. Threads in Golang are lightweight.
-
-2. Since the write volume is very high, Kafka (message queue) is used as a buffer.
+1. Os clientes enviam emojis por meio de solicitações HTTP padrão. Você pode pensar no Serviço Golang como um servidor web típico. Golang é escolhido porque oferece bom suporte à concorrência. As threads em Golang são leves.
-3. Emoji data are aggregated by a streaming processing service called Spark. It aggregates data every 2 seconds, which is configurable. There is a trade-off to be made based on the interval. A shorter interval means emojis are delivered to other clients faster but it also means more computing resources are needed.
+2. Já que o volume de escrita é muito alto, o Kafka (fila de mensagens) é utilizado como um buffer.
-4. Aggregated data is written to another Kafka.
+3. Os dados dos emojis são agregados por um serviço de processamento de streaming chamado Spark. Ele agrega dados a cada 2 segundos, o que é configurável. Há um trade-off a ser avaliado com base no intervalo. Um intervalo mais curto significa que os emojis são entregues mais rapidamente a outros clientes, mas também significa que são necessários mais recursos computacionais.
+4. Os dados agregados são escritos em outro Kafka.
-5. The PubSub consumers pull aggregated emoji data from Kafka.
+5. Os consumidores do PubSub puxam dados agregados de emojis do Kafka.
-6. Emojis are delivered to other clients in real-time through the PubSub infrastructure. The PubSub infrastructure is interesting. Hotstar considered the following protocols: Socketio, NATS, MQTT, and gRPC, and settled with MQTT.
+6. Emojis são entregues a outros clientes em tempo real pela infraestrutura do PubSub. A infraestrutura do PubSub é interessante. A Hotstar considerou os seguintes protocolos: Socketio, NATS, MQTT e gRPC, e optou pelo MQTT.
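A agregação em janelas do passo 3 pode ser esboçada em poucas linhas. O exemplo abaixo é um esboço mínimo e hipotético em Python (sem Kafka nem Spark reais; nomes, função e valores são ilustrativos), apenas para mostrar o trade-off do intervalo de agregação:

```python
from collections import Counter

def agregar_emojis(eventos, janela_s=2):
    """Agrega eventos (timestamp, emoji) em janelas "tumbling" de janela_s segundos.

    Um intervalo menor entrega as contagens mais rápido aos clientes,
    mas produz mais janelas por minuto (mais trabalho computacional).
    """
    janelas = {}
    for ts, emoji in eventos:
        inicio = (ts // janela_s) * janela_s  # início da janela deste evento
        janelas.setdefault(inicio, Counter())[emoji] += 1
    return janelas

# Quatro reações ao longo de ~2,2 s de transmissão:
eventos = [(0.1, "❤️"), (0.5, "❤️"), (1.9, "👍"), (2.2, "❤️")]
print(agregar_emojis(eventos))  # duas janelas: [0, 2) e [2, 4)
```

Na prática, essa agregação é feita de forma distribuída pelo serviço de streaming, e o resultado de cada janela é publicado no segundo Kafka (passo 4) para os consumidores do PubSub.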
-A similar design is adopted by LinkedIn which streams a million likes/sec.
+Um design similar é adotado pelo LinkedIn, que transmite um milhão de curtidas por segundo.
-### How Discord Stores Trillions Of Messages
+### Como o Discord Armazena Trilhões de Mensagens
-The diagram below shows the evolution of message storage at Discord:
+O diagrama abaixo mostra a evolução do armazenamento de mensagens no Discord:

@@ -1661,52 +1661,52 @@ The diagram below shows the evolution of message storage at Discord:
MongoDB ➡️ Cassandra ➡️ ScyllaDB
-In 2015, the first version of Discord was built on top of a single MongoDB replica. Around Nov 2015, MongoDB stored 100 million messages and the RAM couldn’t hold the data and index any longer. The latency became unpredictable. Message storage needs to be moved to another database. Cassandra was chosen.
+Em 2015, a primeira versão do Discord foi construída sobre uma única réplica de MongoDB. Por volta de novembro de 2015, o MongoDB armazenava 100 milhões de mensagens e a RAM não conseguia mais manter os dados e o índice. A latência se tornou imprevisível. O armazenamento de mensagens precisava ser movido para outro banco de dados. O Cassandra foi o escolhido.
-In 2017, Discord had 12 Cassandra nodes and stored billions of messages.
+Em 2017, o Discord tinha 12 nós do Cassandra e armazenava bilhões de mensagens.
-At the beginning of 2022, it had 177 nodes with trillions of messages. At this point, latency was unpredictable, and maintenance operations became too expensive to run.
+No início de 2022, ele tinha 177 nós com trilhões de mensagens. Neste ponto, a latência era imprevisível, e as operações de manutenção ficaram caras demais para serem executadas.
-There are several reasons for the issue:
+Há várias razões para o problema:
-- Cassandra uses the LSM tree for the internal data structure. The reads are more expensive than the writes. There can be many concurrent reads on a server with hundreds of users, resulting in hotspots.
-- Maintaining clusters, such as compacting SSTables, impacts performance.
-- Garbage collection pauses would cause significant latency spikes
+- O Cassandra usa uma árvore LSM como estrutura de dados interna. As leituras são mais caras que as escritas. 
Pode haver várias leituras concorrentes em um servidor com centenas de usuários, resultando em hotspots.
+- Manter os clusters, por exemplo compactar SSTables, impacta o desempenho.
+- As pausas do coletor de lixo (_garbage collection_) causavam picos significativos de latência.
-ScyllaDB is Cassandra compatible database written in C++. Discord redesigned its architecture to have a monolithic API, a data service written in Rust, and ScyllaDB-based storage.
+O ScyllaDB é um banco de dados compatível com o Cassandra escrito em C++. O Discord redesenhou sua arquitetura para ter uma API monolítica, um serviço de dados escrito em Rust e armazenamento baseado em ScyllaDB.
-The p99 read latency in ScyllaDB is 15ms compared to 40-125ms in Cassandra. The p99 write latency is 5ms compared to 5-70ms in Cassandra.
+A latência de leitura p99 no ScyllaDB é de 15 ms, comparada com 40-125 ms no Cassandra. A latência de escrita p99 é de 5 ms, comparada com 5-70 ms no Cassandra.
-### How do video live streamings work on YouTube, TikTok live, or Twitch?
+### Como live streams de vídeo funcionam no YouTube, TikTok Live ou Twitch?
-Live streaming differs from regular streaming because the video content is sent via the internet in real-time, usually with a latency of just a few seconds.
+Live streaming difere do streaming tradicional porque o conteúdo de vídeo é enviado pela internet em tempo real, geralmente com uma latência de apenas alguns segundos.
-The diagram below explains what happens behind the scenes to make this possible.
+O diagrama abaixo explica o que acontece nos bastidores para tornar isso possível.

-Step 1: The raw video data is captured by a microphone and camera. The data is sent to the server side.
+Passo 1: Os dados brutos de vídeo são capturados por um microfone e uma câmera. Os dados são enviados para o lado do servidor.
-Step 2: The video data is compressed and encoded. For example, the compressing algorithm separates the background and other video elements. After compression, the video is encoded to standards such as H.264. The size of the video data is much smaller after this step.
+Passo 2: Os dados do vídeo são comprimidos e codificados. Por exemplo, o algoritmo de compressão separa o fundo dos outros elementos do vídeo. Depois da compressão, o vídeo é codificado em padrões como o H.264. O tamanho dos dados do vídeo é bem menor depois dessa etapa.
-Step 3: The encoded data is divided into smaller segments, usually seconds in length, so it takes much less time to download or stream.
+Passo 3: Os dados codificados são divididos em segmentos menores, geralmente com alguns segundos de duração, para que o download ou a transmissão leve muito menos tempo.
-Step 4: The segmented data is sent to the streaming server. The streaming server needs to support different devices and network conditions. This is called ‘Adaptive Bitrate Streaming.’ This means we need to produce multiple files at different bitrates in steps 2 and 3.
+Passo 4: Os dados segmentados são enviados para o servidor de streaming. O servidor de streaming precisa suportar diferentes dispositivos e condições de rede. Isso é chamado de 'Streaming de Bitrate Adaptativo' e significa que precisamos produzir vários arquivos com diferentes bitrates nas etapas 2 e 3.
-Step 5: The live streaming data is pushed to edge servers supported by CDN (Content Delivery Network.) Millions of viewers can watch the video from an edge server nearby. CDN significantly lowers data transmission latency.
+Passo 5: Os dados do live streaming são enviados para servidores edge suportados por uma CDN (_Content Delivery Network_). Milhões de espectadores podem assistir ao vídeo a partir de um servidor edge próximo. 
As CDNs reduzem significativamente a latência da transmissão de dados.
-Step 6: The viewers’ devices decode and decompress the video data and play the video in a video player.
+Passo 6: Os dispositivos dos espectadores decodificam e descomprimem os dados de vídeo e reproduzem o vídeo em um player.
-Steps 7 and 8: If the video needs to be stored for replay, the encoded data is sent to a storage server, and viewers can request a replay from it later.
+Passos 7 e 8: Se o vídeo precisa ser armazenado para replay, os dados codificados são enviados para um servidor de armazenamento, e os espectadores podem solicitar o replay mais tarde.
-Standard protocols for live streaming include:
+Protocolos comuns para live streaming são:
-- RTMP (Real-Time Messaging Protocol): This was originally developed by Macromedia to transmit data between a Flash player and a server. Now it is used for streaming video data over the internet. Note that video conferencing applications like Skype use RTC (Real-Time Communication) protocol for lower latency.
-- HLS (HTTP Live Streaming): It requires the H.264 or H.265 encoding. Apple devices accept only HLS format.
-- DASH (Dynamic Adaptive Streaming over HTTP): DASH does not support Apple devices.
-- Both HLS and DASH support adaptive bitrate streaming.
+- RTMP (_Real-Time Messaging Protocol_, Protocolo de Mensagens em Tempo Real): Foi originalmente desenvolvido pela Macromedia para transmitir dados entre o Flash player e um servidor. Hoje é utilizado para streaming de dados de vídeo pela internet. Note que aplicativos de videoconferência como o Skype utilizam o protocolo RTC (_Real-Time Communication_) para obter latência mais baixa.
+- HLS (HTTP Live Streaming): Requer codificação H.264 ou H.265. Dispositivos Apple aceitam apenas o formato HLS.
+- DASH (_Dynamic Adaptive Streaming over HTTP_, Streaming Adaptativo Dinâmico sobre HTTP): O DASH não suporta dispositivos Apple.
+- Tanto o HLS quanto o DASH suportam streaming de bitrate adaptativo.
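Para ilustrar o "streaming de bitrate adaptativo" do passo 4, segue um esboço hipotético em Python da lógica que um player usa para escolher, a cada segmento, qual variante de qualidade baixar. A escada de bitrates e a margem de segurança são valores ilustrativos, não os de um player real:

```python
# Variantes produzidas nas etapas 2 e 3 (valores ilustrativos, em kbps)
ESCADA_KBPS = [400, 1200, 3500, 8000]

def escolher_bitrate(banda_kbps, escada=ESCADA_KBPS, margem=0.8):
    """Maior bitrate que cabe na banda medida, com margem de segurança."""
    orcamento = banda_kbps * margem
    viaveis = [b for b in escada if b <= orcamento]
    # Sem opção viável, cai para a menor variante em vez de travar o vídeo
    return max(viaveis) if viaveis else min(escada)

# A banda cai de 6000 para 1000 kbps entre dois segmentos:
print(escolher_bitrate(6000))  # escolhe 3500
print(escolher_bitrate(1000))  # escolhe 400
```

É por isso que o servidor precisa manter o mesmo conteúdo em vários bitrates: o player troca de variante segmento a segmento, conforme a rede melhora ou piora.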
## License