If you guys have never heard of Disaggregated Database Systems, I'll refer you to an article of the same name:
https://www.cs.purdue.edu/homes/csjgwang/pubs/SIGMOD23_Tutorial_DisaggregatedDB.pdf
A distributed systems reading list by Fred Hebert, a well-known software architect in the industry and a well-known contributor to the Erlang community.
https://ferd.ca/a-distributed-systems-reading-list.html
The Single Transferable Vote (STV) system is a way of voting in elections that aims to achieve proportional representation. It's like trying to make sure everyone's voice is heard as fairly as possible in deciding who gets elected. Here's a simple way to understand how it works, using a "pizza party" analogy.
Imagine you and your friends are deciding what kind of pizza to order for a party. Everyone has different tastes, so you decide to vote on it, but you want to make sure as many people as possible get something they like.
Everyone Votes for Their Favorites: Each person makes a list of their favorite pizzas in order, from most to least favorite.
Counting the Votes: To decide which pizzas to order, there's a certain number of "pizza spots" available (like seats in an election). A pizza needs a certain number of votes to claim a spot (this is like the quota in STV).
First Choices Counted First: Initially, everyone's first choice is counted. If your top choice pizza gets more votes than it needs to secure a spot, it's definitely being ordered.
Extra Votes Go to Next Favorites: If your favorite pizza gets more votes than it needs, your vote isn't wasted. Instead, it's as if your vote is partly for your top choice and partly for your next favorite, based on how many extra votes the winning pizza got. This way, part of your vote helps decide the next pizza.
No Hope Pizzas Are Out: If a pizza doesn't get enough votes to be in the running, it's out. Then, votes for that pizza go to the next choice on those voters' lists.
Repeat Until All Spots Filled: This process of redistributing votes from winning pizzas with too many votes and from losing pizzas continues until all the pizza spots are filled.
So, STV tries to make sure that the pizzas ordered reflect what the group as a whole likes, not just the most popular choice. It reduces wasted votes and helps more people get at least one of their top choices. In real elections, this means that the elected representatives better reflect the diverse preferences of the voters.
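To make the counting steps above concrete, here is a minimal sketch of the pizza vote in Python. It is a simplification, not an official STV implementation: the Droop quota formula, the fractional surplus transfer, and the alphabetical tie-break are assumptions chosen for illustration, and real election rules differ in these details. The names `stv` and `top_choice` are hypothetical.

```python
from fractions import Fraction


def top_choice(prefs, hopeful):
    """Return the highest-ranked pizza on a ballot that is still in the running."""
    return next((c for c in prefs if c in hopeful), None)


def stv(ballots, seats):
    """Pick `seats` winners from ranked ballots (lists, favourite first)."""
    quota = len(ballots) // (seats + 1) + 1  # Droop quota (assumed formula)
    piles = [(list(b), Fraction(1)) for b in ballots]  # every ballot starts at weight 1
    hopeful = {c for b in ballots for c in b}
    elected = []

    while len(elected) < seats and hopeful:
        # If the remaining pizzas just fill the remaining spots, they all get in.
        if len(hopeful) <= seats - len(elected):
            elected.extend(sorted(hopeful))
            break

        # Count each ballot for its top pizza that is still in the running.
        totals = {c: Fraction(0) for c in hopeful}
        for prefs, weight in piles:
            top = top_choice(prefs, hopeful)
            if top is not None:
                totals[top] += weight

        ranked = sorted(hopeful)  # alphabetical tie-break (assumption)
        leader = max(ranked, key=lambda c: totals[c])
        if totals[leader] >= quota:
            # Winner: its ballots keep only the surplus share and move on.
            keep = (totals[leader] - quota) / totals[leader]
            piles = [
                (prefs, w * keep if top_choice(prefs, hopeful) == leader else w)
                for prefs, w in piles
            ]
            elected.append(leader)
            hopeful.remove(leader)
        else:
            # Nobody reached the quota: the weakest pizza is out.
            hopeful.remove(min(ranked, key=lambda c: totals[c]))

    return elected


party = (
    [["pepperoni", "margherita", "veggie"]] * 6
    + [["margherita", "veggie"]] * 2
    + [["veggie", "margherita"]] * 2
)
print(stv(party, seats=2))  # ['pepperoni', 'margherita']
```

With 10 voters and 2 spots the quota is 4. Pepperoni gets 6 first choices, so its 2 surplus votes flow to the pepperoni fans' second choice, which is what breaks the 2-2 tie between margherita and veggie.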
What Is Social Contract?
The Social Contract is a simple and effective way to enable team autonomy and self-accountability for engagements. The Social Contract or Agreement is created by and for the team. It looks to codify the behaviours and expectations of the team. It also provides a mechanism for the team to radiate and share its desired behaviours with management and other stakeholders.
To effectively use this practice you should look to create the following outcomes:
- Public display of the social contract.
- Nobody is above the contract.
- The team agrees to hold each other accountable to the contract. Having every team member physically sign the contract can provide a good starting point for this.
- Revisit the social contract often and update it as necessary.
https://openpracticelibrary.com/practice/social-contract/
#team #management #worktogether
The Five Dysfunctions of a Team
1. Absence of trust
2. Fear of conflict
3. Lack of commitment
4. Avoidance of accountability
5. Inattention to results
https://files.tablegroup.com/wp-content/uploads/2020/12/11224029/FiveDysfunctions.pdf
Empathy isn't just a social nicety
Empathy is really important for close relationships. It's like how a mom just gets what her baby needs without the baby saying anything. Empathy is about trying to think like someone else and understand that everyone thinks differently.
It's a big part of being good at getting along with others and doing well at work. Real empathy means you not only know how someone feels but also feel it with them. For example, some people can tell what makes others upset but don't really care about their feelings. They use what they know to take advantage, not to connect.
If someone can't truly share someone else's feelings, it might mean they haven't worked on understanding themselves enough. Empathy isn't just knowing; it's feeling together. It helps make real connections and grow as a person.
An article about ways to optimize INSERT operations, plus a look at alternatives such as the LOAD DATA INFILE statement for when you need to load more than a few rows.
https://www.red-gate.com/simple-talk/databases/mysql/optimizing-mysql-adding-data-to-tables/
#mysql #performance #insert
Best practices for right-sizing your Apache Kafka clusters to optimize performance and cost
https://aws.amazon.com/blogs/big-data/best-practices-for-right-sizing-your-apache-kafka-clusters-to-optimize-performance-and-cost/
#MSK
Amazon MSK on AWS Graviton3
https://aws.amazon.com/blogs/big-data/amazon-msk-now-provides-up-to-29-more-throughput-and-up-to-24-lower-costs-with-aws-graviton3-support/
#MSK
Best practices for running production workloads using Amazon MSK tiered storage
https://aws.amazon.com/blogs/big-data/best-practices-for-running-production-workloads-using-amazon-msk-tiered-storage/
#MSK
RLHF & DPO: Simplifying and Enhancing Fine-Tuning for Language Models
https://www.linkedin.com/pulse/rlhf-dpo-simplifying-enhancing-fine-tuning-language-models-kirouane/
ZomboDB brings powerful text-search and analytics features to Postgres by using Elasticsearch as an index type. Its comprehensive query language and SQL functions enable new and creative ways to query your relational data.
ZomboDB is a 100% native Postgres extension written in Rust with PGRX. ZomboDB uses Postgres's Index Access Method API to directly manage and optimize ZomboDB's specialized indices. As a native Postgres index type, ZomboDB allows you to CREATE INDEX ... USING zombodb on your existing Postgres tables. At that point, ZomboDB takes over and fully manages the remote Elasticsearch index, guaranteeing transactionally-correct text-search query results.
https://github.com/zombodb/zombodb/
This project contains a series of tiny broken programs (and one nasty surprise). By fixing them, you'll learn how to read and write Zig code.
#zig
https://codeberg.org/ziglings/exercises
KahaDB is a file-based persistence database that is local to the message broker that is using it. It has been optimized for fast persistence. It has been the default storage mechanism since ActiveMQ Classic 5.4. KahaDB uses fewer file descriptors and provides faster recovery than its predecessor, the AMQ Message Store.
In order to facilitate rapid retrieval of messages from the data logs, a B-tree index is created, which contains pointers to the locations of all the messages embedded in the data log files. The complete B-tree index is stored on disk and part or all of the B-tree index is held in a cache in memory. Evidently, the B-tree index can work more efficiently, if the complete index fits into the cache.
https://github.com/apache/activemq/tree/main/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb
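To picture the "data log plus index of pointers" idea described above, here is a toy sketch in Python. It is not KahaDB's code or file format: the class `TinyMessageStore` is hypothetical, it writes to a single log file, and a plain in-memory dict stands in for the B-tree index. It only shows why a lookup needs one seek instead of a scan of the log.

```python
import os
import tempfile


class TinyMessageStore:
    """Toy append-only data log with an index of message locations."""

    def __init__(self, path):
        self.path = path
        # message id -> (offset, length); a dict stands in for the B-tree (assumption)
        self.index = {}
        open(path, "ab").close()  # create the log file if it does not exist yet

    def append(self, msg_id, payload):
        """Append the payload bytes to the log and remember where they landed."""
        with open(self.path, "ab") as log:
            offset = log.seek(0, os.SEEK_END)
            log.write(payload)
        self.index[msg_id] = (offset, len(payload))

    def read(self, msg_id):
        """Jump straight to the message using the pointer kept in the index."""
        offset, length = self.index[msg_id]
        with open(self.path, "rb") as log:
            log.seek(offset)
            return log.read(length)


workdir = tempfile.mkdtemp()
store = TinyMessageStore(os.path.join(workdir, "data.log"))
store.append("m1", b"hello")
store.append("m2", b"world!")
print(store.index["m2"])  # (5, 6): m2 starts right after the 5 bytes of m1
print(store.read("m2"))   # b'world!'
```

In this sketch the whole index lives in memory, which corresponds to the best case mentioned above where the complete index fits into the cache.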
An interesting article about a Kafka latency issue at scale caused by the ext4 filesystem. The scientific approach they followed is really beautiful.
https://blog.allegro.tech/2024/03/kafka-performance-analysis.html
Slim allows developers to inspect, optimize, and debug their containers.
#docker #optimization
https://github.com/slimtoolkit/slim
The Kafka tiered storage feature, a novel approach for archiving Kafka messages. IMO NOT READY FOR PRODUCTION, BUT KEEP AN EYE ON IT
https://developers.redhat.com/articles/2024/03/13/kafka-tiered-storage-deep-dive
https://aws.amazon.com/blogs/big-data/deep-dive-on-amazon-msk-tiered-storage/
How do neural networks work under the hood? An explanation of vector embeddings and their use cases.
https://www.datastax.com/guides/what-is-a-vector-embedding