EZF-002 Webhooks API Architecture Style, Load Balancing Methods, HTTP Status Codes, Fundamental Latency Metrics to Remember, Git commands Part 1 (21 commands)

Oct 10, 2023

Recap on system design for this week:

Webhooks API Architecture style
Load Balancing Methods. Part 1 (7 Methods)
HTTP Status Codes
Fundamental Latency Metrics to Remember
Git commands. Part 1 (21 commands)

1. Webhooks API Architecture style

1.1. Message formats:

JSON

1.2. ✅ Advantages:

Real-Time Notifications 🚨: Immediate alerts without polling.
Event-Driven ⚡: Supports dynamic, event-based interactions.
Reduced Server Load 📉: Sends data only on events.
Configurable 🔧: Tailored to specific events.
Simple to Use 🟢: Generally straightforward implementation.
Asynchronous Operations 🔄: Non-blocking data transfers.
Scalable 📈: Adapts easily with growing needs.

1.3. ❌ Disadvantages

Security Concerns 🔒: Public endpoints may pose risks if not secured.
Error Handling ❌: Challenges with endpoint failures or errors.
Latency Issues 🐢: Network delays can affect real-time nature.
Resource Use 🔄: Each call consumes server resources.
Debugging 🔍: Complexity increases with third-party services.
Limited Filtering 🚫: Some providers may lack detailed options.
Ordering 📦: No guaranteed sequence for events.
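
Two of the points above (security of public endpoints and error handling when deliveries are retried) can be sketched on the receiving side. This is a minimal sketch, not any provider's actual scheme: it assumes the sender signs the raw JSON body with HMAC-SHA256 and that every event carries an "id" field. The names SECRET, sign and handle_webhook are hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret agreed with the webhook provider.
SECRET = b"replace-with-shared-secret"


def sign(body: bytes, secret: bytes = SECRET) -> str:
    """Return the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def handle_webhook(body: bytes, signature: str, seen_ids: set) -> tuple:
    """Return (http_status, processed) for one incoming delivery."""
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(sign(body), signature):
        return 401, False
    event = json.loads(body)
    # Senders may retry, so the same event can arrive more than once.
    if event["id"] in seen_ids:
        return 200, False  # acknowledge, but do not process twice
    seen_ids.add(event["id"])
    return 200, True
```

Keeping the set of seen ids makes the handler idempotent, which also softens the ordering problem: a late duplicate is acknowledged without being applied again.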

1.4. 📋 Use cases

Real-Time Notifications 🚨: Immediate alerts for events like new posts.
CI/CD 🛠: Trigger builds, tests, and deployments on code changes.
E-Commerce Updates 🛍: Notify on order placements, payments, or stock changes.
CMS 📝: Alerts on content changes.
Social Media Integration 📲: Updates on posts and interactions.
Monitoring & Analytics 📊: Alerts for system or security issues.
Chatbots 💬: Notifications for new messages.
Payments 💳: Status updates on transactions.
IoT Device Communication 🌐: Send data when specific conditions are met by IoT devices.
Collaboration Tools Integration 🤝: Notify on task updates or new messages in collaboration apps.

❓ What API architecture style are you guys currently utilizing in your system?

2. Load Balancing Methods: Round Robin, Sticky Round Robin, Weighted Round Robin, IP Hash, Generic Hash, Least Connections, Least Time

More detailed:

2.1. 🔁Round Robin

Distributes requests sequentially across all servers in the pool. Simple and predictable.
Static
Pros:
- Easy Setup 🟢: Minimal configuration
- Even Distribution 🔁: Each server receives an equal share of requests.
- No Monitoring Needed 📊: Rotates through servers.
- Predictable 🔄: Can anticipate server handling.
Cons:
- Server Overload Risk ⚠️: Ignores current load, so it keeps sending requests to servers that are already heavily loaded.
- Requires Similar Server Capacity 📏: Best when all servers have roughly the same capacity.
- Content Uniformity Needed 📋: Requires all servers to host the same content.
Use Cases:
- Testing Environments 🧪: Balanced request distribution for testing.
- Stateless Applications 🔄: For independent, session-less requests.
- Uniform Server Capacities 📏: When all servers have similar resources.
- Microservices 🚀: Even distribution across stateless services.
Example (NGINX):
upstream backend {
  server s1.dmn.com;
  server s2.dmn.com;
  server s3.dmn.com;
}
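
To make the rotation concrete, here is a minimal Python sketch of the idea. It is an illustration only, not how NGINX implements it; next_server is a hypothetical helper name.

```python
from itertools import cycle

servers = ["s1.dmn.com", "s2.dmn.com", "s3.dmn.com"]

# cycle() yields the servers in order and starts over after the last one.
rotation = cycle(servers)


def next_server() -> str:
    """Return the server that should receive the next request."""
    return next(rotation)


# Six requests walk through the pool exactly twice.
picks = [next_server() for _ in range(6)]
```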

2.2. 🟢🔁 Sticky Round Robin

Distributes requests sequentially but ensures a user's subsequent requests stick to the initially assigned server, maintaining session persistence.
Static
Pros:
- Session Persistence 📌: Maintains user session data.
- Better User Experience 🚀: No session timeouts or data loss.
- Simpler App Design 🛠️: Assumes user requests hit the same server.
Cons:
- Potential Imbalance ⚖️: Uneven load distribution.
- Scaling Issues 📈: Challenges when adding new servers.
- Server Failure Impact 🔥: Disruptions if a sticky server fails.
Use Cases:
- Web Apps with Sessions 🛒: Maintains user carts and preferences.
- Authentication Systems 🔐: Keeps users logged in across requests.
- Multi-step Forms 📝: Remembers data across form pages.
- Streaming Services 🎥: Consistent server connection during streams.
- Online Gaming 🎮: Tracks player states and scores consistently.
Example (NGINX Plus):
upstream backend {
  server s1.dmn.com;
  server s2.dmn.com;
  server s3.dmn.com;
  sticky cookie srv_id expires=2h domain=.dmn.com path=/;
}
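
The sticky behaviour can be sketched as round robin plus a lookup table. This is a simplified illustration that assumes the client presents some session identifier (for example a cookie value); the class and method names are hypothetical.

```python
from itertools import cycle


class StickyRoundRobin:
    """Round robin for new sessions, fixed server for known sessions."""

    def __init__(self, servers):
        self._rotation = cycle(servers)
        self._assigned = {}  # session id -> server chosen on first request

    def route(self, session_id: str) -> str:
        # Only a session we have not seen before advances the rotation.
        if session_id not in self._assigned:
            self._assigned[session_id] = next(self._rotation)
        return self._assigned[session_id]
```

The in-memory table is also where the listed cons come from: if the assigned server fails, the mapping for its sessions has to be rebuilt.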

2.3. ⚖️🔁Weighted Round Robin

Distributes requests based on assigned server weights, favoring servers with higher capacities or priorities.
Static
Pros:
1. Adaptive Distribution 📊: Suits servers with different capacities.
2. Flexibility 🔄: Weights can be retuned when server capacity changes.
3. Efficient ⚙️: Maximizes resource utilization without overloading.
Cons:
1. Complexity 🧩: Requires monitoring and weight adjustments.
2. Potential Imbalance ⚠️: Incorrect weights can lead to overloads.
Use Cases:
1. Mixed Server Capacities 🏢: When servers have varying resources.
2. Dynamic Environments 🌪️: Adapting to changing server performance.
3. Traffic Prioritization 🚦: Directing more traffic to higher-performing servers.
Example (NGINX):
upstream backend {
  server s1.dmn.com weight=3;
  server s2.dmn.com weight=2;
  server s3.dmn.com weight=1;
}
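
One common way to honour weights without sending bursts to the heaviest server is often called smooth weighted round robin. The sketch below shows that variant under the same 3/2/1 weights; it is an illustration and is not claimed to match any particular load balancer's internals. The class name is hypothetical.

```python
from collections import Counter


class SmoothWeightedRoundRobin:
    """Pick servers in proportion to their weights, interleaved."""

    def __init__(self, weights: dict):
        self._weights = dict(weights)
        self._current = {server: 0 for server in weights}
        self._total = sum(weights.values())

    def next_server(self) -> str:
        # Every server gains its weight, the leader is picked,
        # and the leader is pushed back by the total weight.
        for server, weight in self._weights.items():
            self._current[server] += weight
        chosen = max(self._current, key=self._current.get)
        self._current[chosen] -= self._total
        return chosen


lb = SmoothWeightedRoundRobin(
    {"s1.dmn.com": 3, "s2.dmn.com": 2, "s3.dmn.com": 1}
)
picks = [lb.next_server() for _ in range(6)]
counts = Counter(picks)  # over one full cycle: 3, 2 and 1 picks
```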

2.4. 🔗#️⃣IP Hash

Distributes requests based on a hash of the client's IP address, ensuring consistent routing to the same server for a specific IP.
Static
Pros:
- Session Persistence 📌: Clients consistently directed to the same server.
- Predictable Distribution 🔀: The same client IP always maps to the same server.
Cons:
- Limited Flexibility ⛓️: Hard to adjust once set up.
- Imbalance Risk ⚠️: Some servers might get more traffic if certain IP ranges are more active.
Use Cases:
- Stateful Applications 🛒: Where session data needs to be retained.
- Geo-specific Content 🌍: Clients from the same IP range tend to reach the same server, which can help when content is regional.
- Security & Monitoring 🛡️: Easier tracking and management of client sessions.
Example (NGINX):
upstream backend {
  ip_hash;
  server s1.dmn.com weight=3;
  server s2.dmn.com weight=2;
  server s3.dmn.com weight=1;
}
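
The core idea fits in a few lines: hash the client address and take it modulo the pool size. This is a simplified sketch that ignores weights and uses CRC32 purely for illustration; pick_by_ip is a hypothetical name.

```python
import zlib

SERVERS = ["s1.dmn.com", "s2.dmn.com", "s3.dmn.com"]


def pick_by_ip(client_ip: str, servers=SERVERS) -> str:
    """Map a client IP to a server; the same IP always gets the same one."""
    digest = zlib.crc32(client_ip.encode("ascii"))
    return servers[digest % len(servers)]
```

Because the result depends on len(servers), adding or removing a server remaps most clients, which is one reason this method is hard to adjust once in use.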

2.5. 🔳#️⃣Generic Hash

Distributes requests using a hash of customizable inputs, offering flexible and consistent routing based on various data points.
Static
Pros:
- Versatility 🌐: Hashes on diverse inputs, including text, variables, or combinations like IP-port pairs or URIs.
- Uniform Distribution 🔀: Aims for even distribution across servers.
Cons:
- Complexity 🧩: Requires careful selection of hash function.
- Potential Imbalance ⚠️: Hash collisions or poor hash functions can skew distribution.
Use Cases:
- Custom Inputs 🛠️: Hash based on specific application data or headers.
- Dynamic Environments 🌪️: Where inputs for distribution change frequently.
- Cache Distribution 💾: Hashing on the URI sends each resource to the same server, so caches are not duplicated across the pool.
Example (NGINX):
upstream backend {
  hash $request_uri;
  server s1.dmn.com weight=3;
  server s2.dmn.com weight=2;
  server s3.dmn.com weight=1;
}
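
With a plain hash, changing the pool remaps most keys. A consistent hash ring limits that, and NGINX exposes a similar idea through the optional consistent parameter of the hash directive. The sketch below is a generic ring, not NGINX's implementation; HashRing, the replica count and the use of SHA-256 are assumptions made for illustration.

```python
import bisect
import hashlib


def _point(value: str) -> int:
    """Place a string on the ring as a 64-bit integer."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, servers, replicas: int = 100):
        # Each server appears many times on the ring to even out the load.
        self._ring = sorted(
            (_point(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def route(self, key: str) -> str:
        # The first ring point clockwise from the key owns it.
        idx = bisect.bisect(self._points, _point(key)) % len(self._points)
        return self._ring[idx][1]
```

When a server is removed, only the keys it owned move to other servers; every other key keeps its mapping, which is what makes this attractive for cache distribution.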

2.6. 🔌🔽Least Connections

Directs traffic to servers with the fewest active connections, optimizing for server availability.
Dynamic
Pros:
1. Adaptive 🌐: Directs traffic to less-busy servers.
2. Efficiency ⚙️: Maximizes server utilization without overburdening.
Cons:
1. Delayed Reaction ⏱️: Might not account for sudden server load spikes.
2. Potential Overhead 📊: Requires monitoring of active connections.
Use Cases:
1. Varying Server Capacities 🏢: Balances load in mixed-capacity environments.
2. High Traffic Sites 🚦: Distributes large volumes of requests effectively.
3. Real-time Applications 🎮: Where quick response times are crucial.
Example (NGINX):
upstream backend {
  least_conn;
  server s1.dmn.com;
  server s2.dmn.com;
  server s3.dmn.com;
}
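
A minimal sketch of the bookkeeping behind this method, assuming the balancer is told when each request starts and finishes. The class and method names are hypothetical.

```python
class LeastConnections:
    """Send each new request to the server with the fewest open ones."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self) -> str:
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        # Call this when the request on that server has completed.
        self.active[server] -= 1
```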

2.7. ⏱️🔽Least Time

Routes traffic to servers with the quickest response times, ensuring faster user experiences.
Dynamic
Pros:
1. Fast Responses 🚀: Prioritizes quickest servers.
2. Dynamic Adaptation 🌐: Adjusts based on server response times.
Cons:
1. Monitoring Overhead 📊: Requires continuous tracking of server latencies.
2. Fluctuation Risk ⚠️: Rapid changes in response times can lead to frequent server switches.
Use Cases:
1. User Experience 🛍️: Ensures users get the fastest server response.
2. High Demand Applications 🎥: For services like video streaming where latency matters.
3. Variable Server Performance 📉: Balances in environments with fluctuating server speeds.
Example (NGINX Plus):
upstream backend {
  least_time header;
  server s1.dmn.com;
  server s2.dmn.com;
  server s3.dmn.com;
}
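
The idea can be sketched with a moving average of observed response times. This simplified version looks only at latency and ignores active connection counts; the class name, the smoothing factor alpha and the millisecond unit are assumptions for illustration.

```python
class LeastTime:
    """Prefer the server with the lowest smoothed response time."""

    def __init__(self, servers, alpha: float = 0.2):
        self.alpha = alpha
        # 0.0 means "not measured yet", so unmeasured servers get tried first.
        self.avg_ms = {server: 0.0 for server in servers}

    def record(self, server: str, elapsed_ms: float) -> None:
        previous = self.avg_ms[server]
        if previous == 0.0:
            self.avg_ms[server] = elapsed_ms
        else:
            # Exponentially weighted average: recent samples count more.
            self.avg_ms[server] = (1 - self.alpha) * previous + self.alpha * elapsed_ms

    def pick(self) -> str:
        return min(self.avg_ms, key=self.avg_ms.get)
```

A small alpha smooths out single slow responses, which helps against the fluctuation risk mentioned in the cons.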

❓ What combination of load balancing methods are you guys currently implementing in your system?

3. HTTP Status Codes

HTTP status codes are three-digit responses sent by servers to indicate the outcome of a request. They categorize the result into five broad classes: informational (1xx), successful (2xx), redirection (3xx), client errors (4xx), and server errors (5xx). Understanding these codes is crucial as they provide insight into the success or failure of an HTTP request, helping diagnose issues, optimize user experience, and ensure smooth communication between client and server.

In detail:

1xx: Informational 🔄

100 Continue 📤
101 Switching Protocols 🔄
102 Processing ⏳
103 Early Hints 💡

2xx: Successful ✅

200 OK 🆗
201 Created 🆕
202 Accepted 🔄
203 Non-Authoritative Information ℹ️
204 No Content 🚫
205 Reset Content 🔄
206 Partial Content ⏳
207 Multi-Status 📊
208 Already Reported 📢
226 IM Used 🔄

3xx: Redirection ➡️

300 Multiple Choices 🤔
301 Moved Permanently ➡️
302 Found 🔄
303 See Other 👀
304 Not Modified 🔄
305 Use Proxy 🚧
307 Temporary Redirect ➡️
308 Permanent Redirect ➡️

4xx: Client Errors ❌

400 Bad Request 🚫
401 Unauthorized 🔒
402 Payment Required 💰
403 Forbidden 🚷
404 Not Found 🕳️
405 Method Not Allowed ❌
406 Not Acceptable 🚫
407 Proxy Authentication Required 🔒
408 Request Timeout ⏰
409 Conflict ⚠️
410 Gone 🕳️
411 Length Required ❗
412 Precondition Failed ❌
413 Payload Too Large 📦
414 URI Too Long 📏
415 Unsupported Media Type ❌
416 Range Not Satisfiable 🚫
417 Expectation Failed ❌
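418 I'm a Teapot ☕ (April Fools' joke, RFC 2324)
419 Page Expired ⏳ (unofficial, Laravel)
420 Method Failure/Enhance Your Calm 🚫 (unofficial, Spring/Twitter)
421 Misdirected Request ❗
422 Unprocessable Entity ❌ (WebDAV)
423 Locked 🔒 (WebDAV)
424 Failed Dependency ❌ (WebDAV)
425 Too Early ⏰
426 Upgrade Required ⬆️
428 Precondition Required ❗
429 Too Many Requests 🚫
430 Request Header Fields Too Large 🚫 (unofficial, Shopify)
431 Request Header Fields Too Large 📏
440 Login Time-Out ⏰ (unofficial, IIS)
444 No Response 🚫 (unofficial, NGINX)
449 Retry With 🔁 (unofficial, IIS)
450 Blocked by Parental Controls 🔒 (unofficial, Microsoft)
451 Unavailable For Legal Reasons ⚖️
460 Client Closed Connection Prematurely 🚫 (unofficial, AWS ELB)
463 Too Many Forwarded IP Addresses 🚫 (unofficial, AWS ELB)
494 Request Header Too Large 📏 (unofficial, NGINX)
495 SSL Certificate Error ❌ (unofficial, NGINX)
496 SSL Certificate Required 🔒 (unofficial, NGINX)
497 HTTP Request Sent to HTTPS Port ❌ (unofficial, NGINX)
498 Invalid Token ❌ (unofficial, Esri)
499 Token Required/Client Closed Request 🚫 (unofficial, Esri/NGINX)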

5xx: Server Errors🚨

500 Internal Server Error 🚨
501 Not Implemented ❌
502 Bad Gateway 🚧
503 Service Unavailable ⛔
504 Gateway Timeout ⏰
505 HTTP Version Not Supported ❌
506 Variant Also Negotiates 🔄
507 Insufficient Storage 💾(WebDAV)
508 Loop Detected 🔁(WebDAV)
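509 Bandwidth Limit Exceeded 📊 (unofficial, Apache/cPanel)
510 Not Extended ➕
511 Network Authentication Required 🔒
520 Unknown Error ❓ (unofficial, Cloudflare)
521 Web Server Is Down ⛔ (unofficial, Cloudflare)
522 Connection Timed Out ⏰ (unofficial, Cloudflare)
523 Origin Is Unreachable 🚧 (unofficial, Cloudflare)
524 A Timeout Occurred ⏰ (unofficial, Cloudflare)
525 SSL Handshake Failed 🔒 (unofficial, Cloudflare)
526 Invalid SSL Certificate ❌ (unofficial, Cloudflare)
527 Railgun Listener to Origin 🚄 (unofficial, Cloudflare)
529 Service Overloaded ⚠️ (unofficial)
530 Site Frozen ❄️ (unofficial)
598 Network Read Timeout ⏰ (unofficial)
599 Network Connect Timeout ⏰ (unofficial)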
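
Since the first digit carries the class, client code can often branch on it before looking at the exact code. A minimal sketch using Python's standard http module follows; describe is a hypothetical helper, and codes that Python does not register (such as the vendor-specific ones above) fall back to a generic label.

```python
from http import HTTPStatus

CLASSES = {
    1: "informational",
    2: "successful",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return 'code phrase (class)' for an HTTP status code."""
    category = CLASSES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Not in Python's registry, e.g. a vendor-specific code.
        phrase = "unregistered or vendor-specific"
    return f"{code} {phrase} ({category})"
```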

❓ Imagine you're designing a new HTTP status code that indicates a server has understood the request but refuses to fulfill it, not due to authorization issues but because it deems the request to be potentially harmful or malicious. What would you name this status code, and how would you differentiate its use from the existing 403 Forbidden and 451 Unavailable For Legal Reasons codes?

4. Fundamental Latency Metrics to Remember

The numbers provided are approximate, based on figures from Peter Norvig’s article: http://norvig.com/21-days.html#answers.

It's essential to have a general understanding of the orders of magnitude rather than the exact values for the following reasons:

Performance Optimization: Helps identify system bottlenecks and areas for improvement.
System Design: Guides decisions about data placement and communication strategies in distributed systems.
Realistic Expectations: Sets achievable benchmarks for system operations.
Efficient Coding: Enables developers to write latency-aware code, reducing wasteful operations.
Debugging: Assists in quickly spotting performance anomalies.
User Experience: Directly impacts how users perceive application responsiveness.
Cost Efficiency: Faster operations can lead to savings, especially in cloud environments.
Educated Trade-offs: Allows for informed decisions when balancing performance, cost, and reliability.
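
To see why orders of magnitude matter more than exact values, it helps to compare a few entries as ratios. The figures below are a subset commonly quoted from the table in Norvig's article, in nanoseconds; treat them as rough and check the article for the full list. The dictionary keys and the times_slower helper are hypothetical names.

```python
# Approximate latencies in nanoseconds (rough, order-of-magnitude values).
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "main memory reference": 100,
    "read 1 MB sequentially from memory": 250_000,
    "disk seek": 8_000_000,
    "read 1 MB sequentially from disk": 20_000_000,
    "packet round trip US to Europe": 150_000_000,
}


def times_slower(slow: str, fast: str) -> float:
    """How many times longer the 'slow' operation takes than the 'fast' one."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


# A main memory access costs about 200 L1 cache hits.
memory_vs_l1 = times_slower("main memory reference", "L1 cache reference")

# One transatlantic round trip costs about 1.5 million memory accesses.
network_vs_memory = times_slower(
    "packet round trip US to Europe", "main memory reference"
)
```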

❓ Guys, if you had to design a real-time collaborative document editing platform (similar to Google Docs) where multiple users from around the world can edit a document simultaneously, which latency numbers would be most critical to consider, and how would they influence your design choices to ensure a seamless user experience?

5. Git commands. Part 1 (21 commands).

Use case: Developing a new feature for a web application.

Git config: Before starting, Alice sets her Git username and email.

git config --global user.name "Alice Smith"
git config --global user.email "alice.smith@example.com"

Git init: Alice initializes a new Git repository for her project.

git init my-web-app

Git clone: Bob wants to collaborate with Alice. He clones her repository.

git clone https://github.com/alice/my-web-app.git

Git status: Alice checks the status of her files.

git status

Git add: Alice adds a new file to the staging area.

git add index.html

Git commit: Alice commits her changes with a message.

git commit -m "Added homepage."

Git push: Alice pushes her changes to the remote repository.

git push origin master

Git branch: Bob creates a new branch to work on a feature.

git branch feature-navbar

Git checkout: Bob switches to his new branch.

git checkout feature-navbar

Git merge: Alice merges Bob's feature branch into the master branch.

git checkout master
git merge feature-navbar

Git pull: Bob pulls the latest changes from the remote repository.

git pull origin master

Git log: Alice views the commit history.

git log

Git show: Bob checks the details of the last commit.

git show

Git diff: Alice checks the differences between her working directory and the staging area (git diff HEAD compares against the last commit instead).

git diff

Git tag: Alice tags the current commit as a new release.

git tag v1.0

Git rm: Bob removes a file from the repository.

git rm old-file.txt

Git stash: Alice temporarily saves her changes to work on something else.

git stash

Git reset: Bob undoes the last commit, keeping the changes in the working directory.

git reset HEAD~1

Git revert: Alice undoes a specific commit by creating a new commit.

git revert <commit_id>

Git remote: Bob checks the remote repositories connected to his local repository.

git remote -v

Git fetch: Alice fetches the latest changes from the remote repository without merging.

git fetch origin

❓ Guys, imagine you're working on a feature branch in Git and you've made several commits. You then realize that one of the commits in the middle introduced a critical bug. Without reverting or affecting the subsequent commits, how would you use Git commands to isolate and address just that specific commit?