GitHub Trends
See what the GitHub community is most excited about today.

A bot automatically fetches new repositories from https://github.com/trending and sends them to the channel.

Author and maintainer: https://github.com/katursis
#java #batch #cdc #change_data_capture #data_integration #data_pipeline #distributed #elt #etl #flink #kafka #mysql #paimon #postgresql #real_time #schema_evolution

Flink CDC is a tool that helps you move and transform data in real-time or in batches. It makes data integration simple by using YAML files to describe how data should be moved and transformed. This tool offers features like full database synchronization, table sharding, schema evolution, and data transformation. To use it, you need to set up an Apache Flink cluster, download Flink CDC, create a YAML file to define your data sources and sinks, and then run the job. This benefits you by making it easier to manage and integrate your data efficiently across different databases.
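
As a rough illustration of the "create a YAML file" step, the sketch below builds a pipeline definition as a Python dictionary and renders it as YAML text. The key names (`source`, `sink`, `pipeline`, `hostname`, `tables`, `fenodes`, `parallelism`) follow the project's quickstart as best recalled and should be checked against the Flink CDC documentation. The host names, credentials, the Doris sink, and the `render_yaml` helper are assumptions made for this example.

```python
# Hypothetical pipeline definition; all values are placeholders.
pipeline_definition = {
    "source": {
        "type": "mysql",
        "hostname": "localhost",
        "port": 3306,
        "username": "root",
        "password": "change-me",
        "tables": "app_db.\\.*",  # pattern meant to match every table in app_db
    },
    "sink": {
        "type": "doris",
        "fenodes": "127.0.0.1:8030",
        "username": "root",
        "password": "change-me",
    },
    "pipeline": {
        "name": "Sync MySQL Database to Doris",
        "parallelism": 2,
    },
}


def render_yaml(mapping, indent=0):
    """Render a nested dict of scalars as YAML lines (illustrative helper only)."""
    lines = []
    pad = "  " * indent
    for key, value in mapping.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(render_yaml(value, indent + 1))
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


print("\n".join(render_yaml(pipeline_definition)))
```

The printed text could be saved to a file and passed to the Flink CDC launcher once the keys have been verified for the connector versions in use.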

https://github.com/apache/flink-cdc
#go #cncf #distributed_tracing #hacktoberfest #jaeger #observability #opentelemetry #tracing

Jaeger is a tool that helps you understand how different parts of your software work together. It's like a map that shows where data goes and how long it takes to get there. This helps you find and fix problems faster. Jaeger is free and open source, meaning anyone can use and improve it. It's supported by a big community and has clear guides on how to get started and contribute. Using Jaeger can make your software run more smoothly and efficiently.

https://github.com/jaegertracing/jaeger
#java #analytics #big_data #cloudnative #database #datalake #delta_lake #distributed_database #hudi #iceberg #join #lakehouse #lakehouse_platform #mpp #olap #real_time_analytics #real_time_updates #realtime_database #sql #star_schema #vectorized

StarRocks is a query engine built for fast analytics, with a focus on sub-second query latency. The project describes itself as about 3 times faster than comparable tools, and it can query data in place, so you don't have to move or transform it first. Here are some key benefits:
- It uses advanced technology to speed up queries.
- It supports standard SQL and works with various clients and BI software.
- It optimizes complex queries efficiently.
- It allows real-time updates and direct access to data from different sources.
- It manages resources well and is easy to maintain and scale.

Using StarRocks can help you analyze data much faster and more efficiently, making your work easier and quicker.

https://github.com/StarRocks/starrocks
#other #architecture #awesome #awesome_list #backend #big_data #computer_science #design_patterns #devops #distributed_systems #interview #interview_practice #interview_questions #lists #machine_learning #programming #resources #scalability #system #system_design #web_development

This resource is a curated guide to building scalable, reliable, and performant large-scale systems. Here are the key benefits:
- It offers detailed articles and case studies from prominent engineers on designing systems that handle heavy load and perform well for both a single user and millions of users.
- It covers how to scale teams effectively, focusing on increasing team output and value rather than just growing the team size.
- It covers a wide range of topics, including scalability, availability, stability, performance, intelligence, and architecture, giving a holistic view of system engineering.

Overall, this guide is invaluable for anyone looking to build or maintain large-scale systems efficiently.

https://github.com/binhnguyennus/awesome-scalability
#typescript #apm #application_monitoring #distributed_tracing #go #good_first_issue #jaeger #log #logs #metrics #monitoring #nextjs #observability #open_source #opentelemetry #prometheus #react #reactjs #self_hosted #tracing #typescript

SigNoz is a tool that helps you monitor and troubleshoot your applications. It combines logs, metrics, and traces in one place, so you can spot issues early and fix problems quickly. It is open source and positions itself as a lower-cost alternative to Datadog and New Relic. With SigNoz, you can monitor application performance, manage logs, track user requests across services, create custom dashboards, and set alerts for unusual activity.

https://github.com/SigNoz/signoz
#cplusplus #deep_learning #deep_neural_networks #distributed #machine_learning #ml #neural_network #python #tensorflow

TensorFlow is a powerful tool for machine learning that helps you build and deploy AI applications easily. It was developed by Google and is now open source, meaning anyone can use and contribute to it. TensorFlow provides tools, libraries, and a strong community to support your work. You can install it using Python with a simple command like `pip install tensorflow`, and it supports various devices including GPUs. This makes it versatile for researchers and developers alike, allowing you to push the boundaries of machine learning and create innovative applications.

https://github.com/tensorflow/tensorflow
#python #ai #big_model #data_parallelism #deep_learning #distributed_computing #foundation_models #heterogeneous_training #hpc #inference #large_scale #model_parallelism #pipeline_parallelism

Colossal-AI is a tool that aims to make large AI models faster, cheaper, and easier to use. It uses parallelism techniques (data, pipeline, and model parallelism, per the project's tags) to speed up training of big models and reduce hardware requirements. The project's pitch is that this brings complex model training within reach of more modest hardware, saving time and money. Colossal-AI also supports applications across industries such as medicine, video generation, and chatbots, making it versatile for developers.

https://github.com/hpcaitech/ColossalAI
#go #bugtracker #decentralized_application #distributed_systems #git #gitdb

git-bug is a powerful, decentralized issue tracker that stores issues, comments, and users directly inside a Git repository as versioned objects, not just files. This means you can manage your issues offline, sync them later, and keep everything clean and organized within your existing Git workflow. It’s very fast, supports syncing with platforms like GitHub and GitLab, and offers multiple ways to interact, including command line, text user interface, or web browser. This tool helps you track and manage project issues efficiently without needing a separate server or database, making collaboration and version control seamless.
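
To make "stored as versioned objects" more concrete, here is a minimal sketch of how Git derives a content-addressed ID for a blob. It assumes a default SHA-1 repository. The issue payload and its JSON layout are hypothetical and are not git-bug's actual serialization format.

```python
import hashlib
import json


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical issue record, used only to show that identical content
# always maps to the same object ID.
issue = {"title": "Crash on startup", "status": "open"}
payload = json.dumps(issue, sort_keys=True).encode("utf-8")

print(git_blob_id(payload))
```

Because the ID depends only on the content, two clones that record the same data offline end up with the same object, which is what lets Git-based tools sync later without a central server.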

https://github.com/git-bug/git-bug
#rust #blockchain #distributed_ledger_technology #move #smart_contracts

Sui is a next-generation blockchain platform designed for very fast, low-cost transactions and high scalability, making it ideal for apps like gaming, DeFi, and NFTs. It uses a unique object-based data model and the Move programming language, which helps create secure, flexible smart contracts and allows many transactions to happen at the same time. This means you get instant transaction finality and a smooth user experience. Sui’s native token, SUI, is used for fees, staking, and governance, helping keep the network secure and decentralized. Overall, Sui offers a powerful, efficient foundation for building and using web3 applications.

https://github.com/MystenLabs/sui
#cplusplus #compaction #database #distributed_database #kvstore #nosql #rocksdb

ToplingDB is a faster and more advanced key-value database built on RocksDB, designed for better performance and flexibility. It supports easy configuration through JSON/YAML, has an embedded web server to monitor and change settings without restarting, and improves speed with features like faster transaction locks and concurrent IO. It also offers plugins for enhanced functions and cloud-native services like MySQL and Redis on ToplingDB. This means you get a powerful, efficient database that is easier to manage and scales well for large or distributed systems, saving you time and improving your application's speed and reliability.

https://github.com/topling/toplingdb
#rust #bigdata #cloud_native #distributed_systems #filesystem #minio #object_storage #oss #rust #s3

RustFS is a fast and safe distributed object storage system built with Rust, offering high performance and scalability for large data needs like AI and big data. It is compatible with S3, easy to use, and open source under the business-friendly Apache 2.0 license. Compared to others like MinIO, RustFS provides better memory safety, no risky data logging, and supports local cloud providers. You can quickly install it via a script or Docker, manage storage through a simple web console, and benefit from a strong community and detailed documentation. This makes RustFS a reliable, cost-effective choice for secure, scalable storage.

https://github.com/rustfs/rustfs
#javascript #distributed_companies #hacktoberfest #jobs_search #jobsearch #jobseeker #remote #remote_companies #remote_job #remote_work

This list shows hundreds of companies, mostly in tech, that let people work from home either part-time or full-time, with many offering jobs to people all over the world. The list includes big names like Microsoft, Amazon, and Shopify, as well as smaller companies, and covers many different types of work, from software and design to education and health. For anyone looking for a remote job, this is a helpful starting point because it saves time—instead of searching one by one, you can quickly see which companies are open to remote work and find links to their websites for more details or to apply. This makes it much easier to find a job that fits your skills and lets you work from anywhere.

https://github.com/remoteintech/remote-jobs
#java #distributed_systems #durable_execution #grpc #java #javascript #microservice_orchestration #orchestration_engine #orchestrator #reactjs #spring_boot #workflow_automation #workflow_engine #workflow_management #workflows

Conductor is an open-source tool that helps you manage and automate complex workflows involving many microservices and systems. It makes your workflows flexible, reliable, and scalable by handling retries, errors, and monitoring automatically. You can define workflows as code in JSON, use various task types, and manage workflows dynamically without tightly coupling services. It offers an easy-to-use web interface and supports multiple databases like Redis and MySQL. This helps you build, run, and monitor workflows efficiently, saving time and reducing errors in managing distributed applications. It also has SDKs for Java, Python, JavaScript, Go, and C# to integrate easily with your projects.
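
A workflow defined "as code in JSON" might look roughly like the sketch below, built here as a Python dictionary and serialized with the standard library. The field names (`name`, `version`, `schemaVersion`, `tasks`, `taskReferenceName`, `type`, `inputParameters`) follow the Conductor workflow definition format as best recalled and should be verified against the project's documentation. The workflow name, task names, and the `duplicate_task_refs` helper are assumptions made for illustration.

```python
import json

# Hypothetical two-step workflow; the second task reads the first task's output.
workflow_definition = {
    "name": "order_fulfillment",
    "description": "Charge a payment, then ship the order",
    "version": 1,
    "schemaVersion": 2,
    "tasks": [
        {
            "name": "charge_payment",
            "taskReferenceName": "charge_payment_ref",
            "type": "SIMPLE",
            "inputParameters": {"orderId": "${workflow.input.orderId}"},
        },
        {
            "name": "ship_order",
            "taskReferenceName": "ship_order_ref",
            "type": "SIMPLE",
            "inputParameters": {
                "paymentId": "${charge_payment_ref.output.paymentId}"
            },
        },
    ],
}


def duplicate_task_refs(definition):
    """Return task reference names that appear more than once (illustrative check)."""
    seen, duplicates = set(), []
    for task in definition["tasks"]:
        ref = task["taskReferenceName"]
        if ref in seen and ref not in duplicates:
            duplicates.append(ref)
        seen.add(ref)
    return duplicates


print(json.dumps(workflow_definition, indent=2))
print("duplicate references:", duplicate_task_refs(workflow_definition))
```

Keeping the wiring between tasks in the definition, rather than in the services themselves, is what lets the workers stay loosely coupled.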

https://github.com/conductor-oss/conductor
#rust #artificial_intelligence #big_data #data_engineering #distributed_computing #machine_learning #multimodal #python #rust

Daft is a powerful, easy-to-use data engine that lets you process large-scale data using Python or SQL with high speed and efficiency. It supports complex data types like images and tensors, works well interactively for quick data exploration, and can scale to huge cloud clusters using Ray. Daft integrates smoothly with cloud storage and data catalogs, making it ideal for data engineering, analytics, and machine learning workflows. By using Daft, you can handle big, multimodal datasets faster and more flexibly, improving your ability to analyze and prepare data for AI models without complex setup or slowdowns.

https://github.com/Eventual-Inc/Daft
#go #blob_storage #cloud_drive #distributed_file_system #distributed_storage #distributed_systems #erasure_coding #fuse #hadoop_hdfs #hdfs #kubernetes #object_storage #posix #replication #s3 #s3_storage #seaweedfs #tiered_file_system

SeaweedFS is a fast, simple, and highly scalable distributed file system designed to store billions of files and serve them quickly, especially small files. It uses a master server to manage volumes on volume servers, which handle file data and metadata, enabling very fast file access with minimal disk reads. It supports features like replication, erasure coding, cloud integration for elastic storage, and compatibility with many metadata stores and APIs including Amazon S3. This means you get efficient, cost-effective storage with fast access, easy scaling, and flexible deployment options for large-scale file storage needs.
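
The master-then-volume lookup can be sketched with a toy model. It assumes the file ID layout described in the project's README: a volume ID, a comma, then a hex file key followed by an 8-hex-digit cookie. The `volume_locations` table stands in for the master server's answer, and its address is a placeholder. `FileId`, `parse_fid`, and `read_url` are hypothetical helpers, not part of any SeaweedFS client.

```python
from typing import NamedTuple


class FileId(NamedTuple):
    volume_id: int
    file_key: int
    cookie: int


def parse_fid(fid: str) -> FileId:
    """Split a file ID such as '3,01637037d6' into its three parts."""
    volume_part, _, rest = fid.partition(",")
    if not volume_part or len(rest) <= 8:
        raise ValueError(f"malformed file id: {fid!r}")
    key_hex, cookie_hex = rest[:-8], rest[-8:]
    return FileId(int(volume_part), int(key_hex, 16), int(cookie_hex, 16))


# Stand-in for the master server's volume lookup (placeholder address).
volume_locations = {3: "127.0.0.1:8080"}


def read_url(fid: str) -> str:
    """Resolve the volume server for a file ID and build the URL to read it from."""
    server = volume_locations[parse_fid(fid).volume_id]
    return f"http://{server}/{fid}"


print(parse_fid("3,01637037d6"))
print(read_url("3,01637037d6"))
```

Since the master only maps volumes to servers, a client that caches that small table can go straight to the volume server for each read.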

https://github.com/seaweedfs/seaweedfs
#java #awesome #backend #computer_science #distributed_systems #high_level_design #hld #interview #interview_questions #scalability #system_design

You can learn important system design concepts for free, covering topics like scalability, availability, CAP theorem, caching, databases, APIs, microservices, and distributed systems. This resource offers clear explanations, interview preparation guides, and practical design problems from easy to hard, helping you understand how to build reliable, scalable software systems. It also provides links to courses, books, newsletters, and videos to deepen your knowledge. Using these materials can improve your skills for system design interviews and real-world software architecture, making you more confident and effective in designing complex systems.

https://github.com/ashishps1/awesome-system-design-resources