#python #big_data #data_engineering #data_quality #data_science #feature_store #features #machine_learning #ml #mlops
https://github.com/feast-dev/feast
GitHub - feast-dev/feast: The Open Source Feature Store for AI/ML
#java #big_data #caching #data_in_motion #data_insights #distributed #distributed_computing #distributed_systems #hacktoberfest #hazelcast #in_memory #low_latency #real_time #scalability #stream_processing
https://github.com/hazelcast/hazelcast
Hazelcast is a unified real-time data platform combining stream processing with a fast data store, allowing customers to act instantly on data-in-motion for real-time insights. - hazelcast/hazelcast
#java #android #big_data #deserialization #fastjson #fastjson2 #graal #graalvm_native_image #high_performance #java #java_json #json #json_deserialization #json_parser #json_path #json_serialization #json_serializer #jsonb #serialization
FASTJSON 2 is a high-performance, easy-to-use Java JSON library. According to the project, it offers significant performance improvements over other popular JSON libraries such as Jackson and Gson, which makes it a good fit for applications that need fast data processing.
- **Advanced Features** It includes support for JSONPath, among other features.
You can add it to your project as a Maven or Gradle dependency. Using FASTJSON 2 simplifies JSON handling while providing strong performance and a broad feature set.
https://github.com/alibaba/fastjson2
🚄 FASTJSON2 is a Java JSON library with excellent performance. - alibaba/fastjson2
#cplusplus #ai #analytics #big_data #clickhouse #cpp #dbms #distributed_database #hacktoberfest #mpp #olap #rust #sql
ClickHouse is a free, open-source database for generating real-time analytical reports. It installs with a single command on Linux, macOS, or FreeBSD. The project website offers tutorials, documentation, and videos, and there are community meetups and online chats where you can learn from other users. ClickHouse enables fast, efficient analysis of large amounts of data, which helps with quick decision-making and improving business operations.
https://github.com/ClickHouse/ClickHouse
ClickHouse® is a real-time analytics database management system - ClickHouse/ClickHouse
#c_lang #ai #big_data #c #cloudberry #data_analysis #data_warehouse #database #distributed_database #greenplum #mpp #olap #postgres #postgresql #sql
Apache Cloudberry is a powerful, open-source database designed for large-scale data processing and analytics. It is built by the creators of Greenplum Database and uses a newer PostgreSQL kernel, making it suitable for data warehouses and AI/ML workloads. You can easily try it out using a Docker-based sandbox or build it from source on Linux or macOS. The community is active, with many channels for support, discussions, and contributions. This means you can get help quickly, share ideas, and even contribute to the project yourself. It's licensed under the Apache License, Version 2.0, making it free to use and modify. Overall, Apache Cloudberry offers advanced database capabilities and a supportive community, which can greatly benefit users needing robust data management solutions.
https://github.com/apache/cloudberry
One advanced and mature open-source MPP (Massively Parallel Processing) database. Open source alternative to Greenplum Database. - apache/cloudberry
#java #analytics #big_data #cloudnative #database #datalake #delta_lake #distributed_database #hudi #iceberg #join #lakehouse #lakehouse_platform #mpp #olap #real_time_analytics #real_time_updates #realtime_database #sql #star_schema #vectorized
StarRocks is a fast query engine built for sub-second analytics. The project claims it runs about 3 times faster than comparable tools, and it can query data in place without requiring you to move or transform it. Here are some key benefits:
- It uses advanced technology to speed up queries.
- It supports standard SQL and works with various clients and BI software.
- It optimizes complex queries efficiently.
- It allows real-time updates and direct access to data from different sources.
- It manages resources well and is easy to maintain and scale.
Using StarRocks can help you analyze data much faster and more efficiently, making your work easier and quicker.
https://github.com/StarRocks/starrocks
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class perf...
#other #architecture #awesome #awesome_list #backend #big_data #computer_science #design_patterns #devops #distributed_systems #interview #interview_practice #interview_questions #lists #machine_learning #programming #resources #scalability #system #system_design #web_development
This resource is a curated guide to building scalable, reliable, and performant large-scale systems. Here are the key benefits:
- It collects detailed articles and case studies from prominent engineers on designing systems that handle heavy load and perform well for a single user and for millions of users alike.
- It covers how to scale teams effectively, focusing on increasing team output and value rather than just growing team size.
- It spans a wide range of topics, including scalability, availability, stability, performance, intelligence, and architecture, giving a holistic view of system engineering.
- It also includes material for system design interviews and accepts community contributions.
Overall, this guide is valuable for anyone building or maintaining large-scale systems.
https://github.com/binhnguyennus/awesome-scalability
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems - binhnguyennus/awesome-scalability
#java #apache_kafka #big_data #cluster_management #event_streaming #hacktoberfest #kafka #kafka_brokers #kafka_client #kafka_cluster #kafka_connect #kafka_manager #kafka_producer #kafka_streams #kafka_ui #opensource #streaming_data #streams #web_ui
UI for Apache Kafka is a free, open-source web tool that helps you manage and monitor Apache Kafka clusters easily. It's lightweight and fast, making it simple to track key metrics like brokers, topics, partitions, production, and consumption. You can set it up quickly with a few commands and run it locally or in the cloud. The tool offers features like multi-cluster management, performance monitoring, browsing messages, and dynamic topic configuration. It also supports secure authentication and role-based access control. This makes it easier to observe data flows, troubleshoot issues, and ensure optimal performance of your Kafka clusters.
https://github.com/provectus/kafka-ui
GitHub - provectus/kafka-ui: Open-Source Web UI for Apache Kafka Management
#rust #artificial_intelligence #big_data #data_engineering #distributed_computing #machine_learning #multimodal #python #rust
Daft is a powerful, easy-to-use data engine that lets you process large-scale data using Python or SQL with high speed and efficiency. It supports complex data types like images and tensors, works well interactively for quick data exploration, and can scale to huge cloud clusters using Ray. Daft integrates smoothly with cloud storage and data catalogs, making it ideal for data engineering, analytics, and machine learning workflows. By using Daft, you can handle big, multimodal datasets faster and more flexibly, improving your ability to analyze and prepare data for AI models without complex setup or slowdowns.
https://github.com/Eventual-Inc/Daft
High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale - Eventual-Inc/Daft