Jim's Lib

If you guys have never heard of Disaggregated Database Systems, I'll refer you to a tutorial with the same name:

https://www.cs.purdue.edu/homes/csjgwang/pubs/SIGMOD23_Tutorial_DisaggregatedDB.pdf

67 views · 21:44

Jim's Lib

Distributed systems reading list by Fred Hebert, a well-known software architect in industry and a well-known contributor to the Erlang community.

https://ferd.ca/a-distributed-systems-reading-list.html

ferd.ca

A Distributed Systems Reading List

An old document I surfaced with my quick tour of distributed systems theory fundamentals

67 views · 21:54

Jim's Lib

Elasticsearch toolkit by LeakIX

https://github.com/LeakIX/estk

GitHub

ES ToolKit is a standalone solution to navigate and backup data for a wide range of Elasticsearch and Kibana versions. - LeakIX/estk

69 views · 12:12

Jim's Lib

Forwarded from Jim Mim

https://skillsmatter.com/skillscasts/5247-the-lmax-exchange-architecture-high-throughput-low-latency-and-plain-old-java

https://martinfowler.com/articles/lmax.html

49 views · 16:32

Jim's Lib

The Single Transferable Vote (STV) system is a way of voting in elections that aims to achieve proportional representation. It's like trying to make sure everyone's voice is heard as fairly as possible in deciding who gets elected. Here's a simple way to understand how it works, using a "pizza party" analogy.

Imagine you and your friends are deciding what kind of pizza to order for a party. Everyone has different tastes, so you decide to vote on it, but you want to make sure as many people as possible get something they like.

Everyone Votes for Their Favorites: Each person makes a list of their favorite pizzas in order, from most to least favorite.

Counting the Votes: To decide which pizzas to order, there's a certain number of "pizza spots" available (like seats in an election). A pizza needs a certain number of votes to claim a spot (this is like the quota in STV).

First Choices Counted First: Initially, everyone's first choice is counted. If your top choice pizza gets more votes than it needs to secure a spot, it's definitely being ordered.

Extra Votes Go to Next Favorites: If your favorite pizza gets more votes than it needs, your vote isn't wasted. Instead, it's as if your vote is partly for your top choice and partly for your next favorite, based on how many extra votes the winning pizza got. This way, part of your vote helps decide the next pizza.

No Hope Pizzas Are Out: If no remaining pizza has enough votes to claim a spot, the pizza with the fewest votes is out. Then, votes for that pizza go to the next choice on those voters' lists.

Repeat Until All Spots Filled: This process of redistributing votes from winning pizzas with too many votes and from losing pizzas continues until all the pizza spots are filled.

So, STV tries to make sure that the pizzas ordered reflect what the group as a whole likes, not just the most popular choice. It reduces wasted votes and helps more people get at least one of their top choices. In real elections, this means that the elected representatives better reflect the diverse preferences of the voters.
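
To make the pizza counting concrete, here is a minimal Python sketch of the steps above. It is a simplified count, not any official election rule: the Droop quota, the fractional surplus transfer, and the alphabetical tie-break are all assumptions, since the text only says a pizza needs "a certain number of votes". The function name `stv` and the ballots are made up for illustration.

```python
from fractions import Fraction


def stv(ballots, seats):
    """Simplified STV count. Each ballot is a list of choices, favorite first."""
    # Assumption: Droop quota. The text does not name a specific quota.
    quota = len(ballots) // (seats + 1) + 1
    # Every ballot starts with weight 1; surplus transfers shrink that weight.
    piles = [(list(b), Fraction(1)) for b in ballots]
    candidates = {c for b in ballots for c in b}
    elected, eliminated = [], set()

    while len(elected) < seats:
        hopeful = candidates - set(elected) - eliminated
        # If the remaining pizzas just fill the remaining spots, order them all.
        if len(hopeful) <= seats - len(elected):
            elected.extend(sorted(hopeful))
            break

        # Count each ballot for its highest-ranked pizza still in the running.
        tally = {c: Fraction(0) for c in hopeful}
        top = []
        for prefs, weight in piles:
            choice = next((c for c in prefs if c in hopeful), None)
            top.append(choice)
            if choice is not None:
                tally[choice] += weight

        # Ties are broken alphabetically (an arbitrary choice for this sketch).
        winner = max(sorted(hopeful), key=lambda c: tally[c])
        if tally[winner] >= quota:
            elected.append(winner)
            # Only the surplus share of each winning ballot moves on.
            keep = (tally[winner] - quota) / tally[winner]
            piles = [(p, w * keep if t == winner else w)
                     for (p, w), t in zip(piles, top)]
        else:
            # Nobody reached the quota: drop the pizza with the fewest votes.
            loser = min(sorted(hopeful), key=lambda c: tally[c])
            eliminated.add(loser)
    return elected


# Hypothetical party: 11 friends, 2 pizza spots.
ballots = (
    [["margherita", "veggie"]] * 6
    + [["pepperoni"]] * 3
    + [["veggie"]] * 2
)
print(stv(ballots, 2))
```

In this made-up party, veggie has fewer first choices than pepperoni, but the extra margherita votes flow to veggie and carry it over the quota. That is the "your vote isn't wasted" part in action.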

60 views · 10:02

Jim's Lib

What Is Social Contract?
The Social Contract is a simple and effective way to enable team autonomy and self-accountability for engagements. The Social Contract or Agreement is created by and for the team. It looks to codify the behaviors and expectations of the team. It also provides a mechanism for the team to radiate and share its desired behaviors with management and other stakeholders.

To effectively use this practice you should look to create the following outcomes:
- Public display of the social contract.
- Nobody is above the contract.
- The team agrees to hold each other accountable to the contract. Having every team member physically sign the contract can provide a good starting point for this.
- Revisit the social contract often and update it as necessary.

https://openpracticelibrary.com/practice/social-contract/

#team #management #worktogether

58 views · edited 11:48

Jim's Lib

The Five Dysfunctions of a Team
1. Absence of trust
2. Fear of conflict
3. Lack of commitment
4. Avoidance of accountability
5. Inattention to results

https://files.tablegroup.com/wp-content/uploads/2020/12/11224029/FiveDysfunctions.pdf

57 views · 11:54

Jim's Lib

Empathy isn't just a social nicety

Empathy is really important for close relationships. It's like how a mom just gets what her baby needs without the baby saying anything. Empathy is about trying to think like someone else and understand that everyone thinks differently.

It's a big part of being good at getting along with others and doing well at work. Real empathy means you not only know how someone feels but also feel it with them. For example, some people can tell what makes others upset but don't really care about their feelings. They use what they know to take advantage, not to connect.

If someone can't truly share someone else's feelings, it might mean they haven't worked on understanding themselves enough. Empathy isn't just knowing; it's feeling together. It helps make real connections and grow as a person.

59 views · 18:53

Jim's Lib

An article about ways to optimize INSERT operations, plus a look at the alternative for when you need to load more than a few rows: the LOAD DATA INFILE statement.
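
As a rough illustration of the INSERT side, here is a minimal Python sketch that groups rows into multi-row INSERT statements so that fewer statements hit the server. This is one common batching approach and an assumption on my part, not necessarily what the article recommends. The helper name `batched_inserts`, the table, and the columns are hypothetical; the `%s` placeholders follow the style used by common MySQL drivers for Python.

```python
def batched_inserts(table, columns, rows, batch_size=1000):
    """Yield (sql, params) pairs, one multi-row INSERT per batch of rows.

    Note: table and column names are interpolated directly, so they must
    come from trusted code, never from user input.
    """
    col_list = ", ".join(columns)
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        sql = (
            f"INSERT INTO {table} ({col_list}) VALUES "
            + ", ".join([row_placeholder] * len(batch))
        )
        # Flatten the rows so the values line up with the placeholders.
        params = [value for row in batch for value in row]
        yield sql, params


# Hypothetical table and data, just to show the shape of the output.
rows = [(1, "a"), (2, "b"), (3, "c")]
for sql, params in batched_inserts("demo_table", ["id", "name"], rows, batch_size=2):
    print(sql, params)

# For bulk loads from a file, the statement looks roughly like this
# (the path and table name are placeholders):
LOAD_DATA_EXAMPLE = (
    "LOAD DATA INFILE '/path/to/data.csv' INTO TABLE demo_table "
    "FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n'"
)
```

Each yielded pair can be handed to a driver cursor's `execute(sql, params)`; wrapping all batches in a single transaction is usually worth trying as well.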

https://www.red-gate.com/simple-talk/databases/mysql/optimizing-mysql-adding-data-to-tables/

#mysql #performance #insert

Simple Talk

Optimizing MySQL: Adding Data to Tables - Simple Talk

Welcome back to the MySQL optimization series! In case you haven’t been following this series, in the past couple of articles we have discussed the basics

47 views · 09:49

Jim's Lib

Best practices for right-sizing your Apache Kafka clusters to optimize performance and cost

https://aws.amazon.com/blogs/big-data/best-practices-for-right-sizing-your-apache-kafka-clusters-to-optimize-performance-and-cost/

#MSK

Amazon

Best practices for right-sizing your Apache Kafka clusters to optimize performance and cost | Amazon Web Services

Apache Kafka is well known for its performance and tunability to optimize for various use cases. But sometimes it can be challenging to find the right infrastructure configuration that meets your specific performance requirements while minimizing the infrastructure…

48 views · edited 15:25

Jim's Lib

Amazon MSK on AWS Graviton
https://aws.amazon.com/blogs/big-data/amazon-msk-now-provides-up-to-29-more-throughput-and-up-to-24-lower-costs-with-aws-graviton3-support/

#MSK

Amazon

Amazon MSK now provides up to 29% more throughput and up to 24% lower costs with AWS Graviton3 support | Amazon Web Services

Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service that enables you to build and run applications that use Apache Kafka to process streaming data. Today, we’re excited to bring the benefits of Graviton3 to Kafka workloads, with…

52 views · edited 15:26

Jim's Lib

Best practices for running production workloads using Amazon MSK tiered storage

https://aws.amazon.com/blogs/big-data/best-practices-for-running-production-workloads-using-amazon-msk-tiered-storage/

#MSK

Amazon

Best practices for running production workloads using Amazon MSK tiered storage | Amazon Web Services

In the second post of the series, we discussed some core concepts of the Amazon Managed Streaming for Apache Kafka (Amazon MSK) tiered storage feature and explained how read and write operations work in a tiered storage enabled cluster. This post focuses…

54 views · edited 15:42

Jim's Lib

RLHF & DPO: Simplifying and Enhancing Fine-Tuning for Language Models

https://www.linkedin.com/pulse/rlhf-dpo-simplifying-enhancing-fine-tuning-language-models-kirouane/

What Is RLHF? Reinforcement Learning from Human Feedback (RLHF) is a cutting-edge approach in the field of artificial intelligence that leverages human preferences and guidance to train and improve machine learning models. At its core, RLHF is a machine learning…

57 views · 08:45

Jim's Lib

yet another message broker

Memphis

https://docs.memphis.dev/memphis/memphis-broker/architecture

docs.memphis.dev

Architecture | Memphis.dev

This section describes Memphis' architecture

61 views · edited 09:32

Jim's Lib

ZomboDB brings powerful text-search and analytics features to Postgres by using Elasticsearch as an index type. Its comprehensive query language and SQL functions enable new and creative ways to query your relational data.

ZomboDB is a 100% native Postgres extension written in Rust with PGRX. ZomboDB uses Postgres's Index Access Method API to directly manage and optimize ZomboDB's specialized indices. As a native Postgres index type, ZomboDB allows you to CREATE INDEX ... USING zombodb on your existing Postgres tables. At that point, ZomboDB takes over and fully manages the remote Elasticsearch index, guaranteeing transactionally-correct text-search query results.

https://github.com/zombodb/zombodb/

GitHub

GitHub - zombodb/zombodb: Making Postgres and Elasticsearch work together like it's 2023

Making Postgres and Elasticsearch work together like it's 2023 - zombodb/zombodb

109 views · 13:21

Jim's Lib

This project contains a series of tiny broken programs (and one nasty surprise). By fixing them, you'll learn how to read and write Zig code.

#zig

https://codeberg.org/ziglings/exercises

Codeberg.org

exercises

Learn the ⚡Zig programming language by fixing tiny broken programs.

70 views · 17:03

Jim's Lib

KahaDB is a file-based persistence database that is local to the message broker that is using it. It has been optimized for fast persistence. It has been the default storage mechanism since ActiveMQ Classic 5.4. KahaDB uses fewer file descriptors and provides faster recovery than its predecessor, the AMQ Message Store.

To facilitate rapid retrieval of messages from the data logs, a B-tree index is created, which contains pointers to the locations of all the messages embedded in the data log files. The complete B-tree index is stored on disk, and part or all of it is held in a cache in memory. Evidently, the B-tree index works more efficiently if the complete index fits into the cache.
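
The "data log plus index of pointers" idea can be sketched in a few lines of Python. This is a toy model, not KahaDB's actual code or file format: a plain dict stands in for the on-disk B-tree, an in-memory buffer stands in for the data log files, and the class name `TinyMessageStore` and the length-prefixed record layout are made up for illustration.

```python
import io
import struct


class TinyMessageStore:
    """Append-only data log plus an index of message locations (toy model)."""

    def __init__(self):
        self.log = io.BytesIO()  # stand-in for the data log files
        self.index = {}          # message id -> (payload offset, payload length)

    def append(self, message_id, payload):
        # Messages are only ever appended to the end of the log.
        self.log.seek(0, io.SEEK_END)
        record_start = self.log.tell()
        # Hypothetical record layout: 4-byte big-endian length, then the payload.
        self.log.write(struct.pack(">I", len(payload)))
        self.log.write(payload)
        # The index keeps a pointer to the payload, not the payload itself.
        self.index[message_id] = (record_start + 4, len(payload))

    def read(self, message_id):
        # One index lookup, then one seek into the log: no scanning needed.
        offset, length = self.index[message_id]
        self.log.seek(offset)
        return self.log.read(length)


store = TinyMessageStore()
store.append("m1", b"hello")
store.append("m2", b"world!")
print(store.read("m2"), store.index)
```

The index only holds small pointers, which is why keeping all of it cached in memory is realistic even when the log itself is large.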

https://github.com/apache/activemq/tree/main/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb

GitHub

activemq/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb at main · apache/activemq

Mirror of Apache ActiveMQ. Contribute to apache/activemq development by creating an account on GitHub.

76 views · edited 15:29

Jim's Lib

An interesting article about a Kafka latency issue at scale caused by the ext4 filesystem. The scientific approach they followed is really beautiful.

https://blog.allegro.tech/2024/03/kafka-performance-analysis.html

blog.allegro.tech

Unlocking Kafka’s Potential: Tackling Tail Latency with eBPF

At Allegro, we use Kafka as a backbone for asynchronous communication between microservices. With up to 300k messages published and 1M messages consumed every second, it is a key part of our infrastructure. A few months ago, in our main Kafka cluster, we…

80 views · edited 12:29

Jim's Lib

Slim allows developers to inspect, optimize and debug their containers.

#docker #optimization
https://github.com/slimtoolkit/slim

GitHub

Slim(toolkit): Don't change anything in your container image and minify it by up to 30x (and for compiled languages even more) making it secure too! (free and open source) - slimtoolkit/slim

84 views · 10:17

Jim's Lib

Kafka tiered storage feature, a novel approach for archiving Kafka messages. IMO NOT READY FOR PRODUCTION, BUT KEEP AN EYE ON IT.

https://developers.redhat.com/articles/2024/03/13/kafka-tiered-storage-deep-dive

https://aws.amazon.com/blogs/big-data/deep-dive-on-amazon-msk-tiered-storage/

Red Hat Developer

Kafka tiered storage deep dive | Red Hat Developer

Tiered storage is a new early access feature available as of Apache Kafka 3.6.0 that allows you to scale compute and storage resources independently, provides better client isolation, and allows

73 views · edited 00:47

Jim's Lib

How do neural networks work under the hood? A guide explaining vector embeddings and their use cases.
https://www.datastax.com/guides/what-is-a-vector-embedding

DataStax

What are Vector Embeddings? Applications, Use Cases & More

Read this detailed guide to learn what vector embeddings are, how they are used in Generative AI, and how they can be stored and accessed in vector databases.
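
For a feel of how embeddings get used, here is a tiny Python sketch of similarity search with cosine similarity. The three-dimensional vectors are toy numbers invented for illustration; real embeddings come from a trained model and have hundreds or thousands of dimensions. The names `cosine_similarity` and `nearest` are hypothetical helpers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, NOT real model output.
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}


def nearest(word):
    """Return the other word whose vector is most similar to the given word's."""
    others = [w for w in embeddings if w != word]
    return max(others, key=lambda w: cosine_similarity(embeddings[word], embeddings[w]))


print(nearest("cat"))
```

A vector database does essentially this lookup, just with indexes that avoid comparing against every stored vector.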

73 views · 17:34
