Data1984
This channel is mostly about data-related stuff. Some of the main topics are #DataEngineering #SQL #Python #cloud.

Contact: @gorros
This use case puts Apache Doris into perspective. I don't know many alternatives to ClickHouse as an open-source data warehouse, but it looks like Doris is one of them.
Arroyo is an open-source stream processing engine, enabling users to transform, filter, aggregate, and join their data streams in real-time with SQL queries. It's designed to be easy enough for any SQL user to build correct, reliable, and scalable streaming pipelines.
https://www.arroyo.dev/blog/why-arrow-and-datafusion
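To make "aggregate data streams" a bit more concrete, here is a rough sketch in plain Python of what a tumbling-window count per key computes. This is not Arroyo's API or SQL dialect; the function name, the `(timestamp, key)` event shape, and the 10-second window are all made up for illustration, and a real engine would do this incrementally over an unbounded stream rather than over a list.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window_start, key) using fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_seconds, key) pairs (hypothetical shape).
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Align each event to the start of the window it falls into.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical sample events: (timestamp in seconds, key)
events = [(0, "a"), (3, "b"), (7, "a"), (12, "a"), (14, "b")]
result = tumbling_window_counts(events, 10)
```

In a streaming SQL engine the same idea would typically be written as a `GROUP BY` over a window expression and the key.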
Interesting projects from Microsoft for building LLM applications:
- AICI: Prompts as (Wasm) Programs
- AutoGen: A programming framework for agentic AI
- Semantic Kernel: Integrate cutting-edge LLM technology quickly and easily into your apps
From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples

https://arxiv.org/html/2404.07544v1
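For a feel of what "regression from in-context examples" means in practice, below is a minimal sketch that turns a few (features, target) pairs into a few-shot prompt ending at the value the model should predict. The `Feature i:` / `Output:` layout and the function name are my assumptions for illustration, not necessarily the exact template used in the paper, and the call to an actual LLM is left out on purpose.

```python
def build_regression_prompt(examples, query):
    """Format (features, target) pairs plus a query row as a few-shot prompt.

    The prompt layout is an assumed, illustrative format.
    """
    lines = []
    for features, target in examples:
        for i, value in enumerate(features):
            lines.append(f"Feature {i}: {value}")
        lines.append(f"Output: {target}")
        lines.append("")  # blank line between examples
    for i, value in enumerate(query):
        lines.append(f"Feature {i}: {value}")
    lines.append("Output:")  # the model is expected to continue with a number
    return "\n".join(lines)


# Hypothetical toy data: one in-context example and one query row
prompt = build_regression_prompt([((1.0, 2.0), 5.0)], (3.0, 4.0))
```

The model's text completion would then need to be parsed back into a float, which is where most of the practical fragility tends to be.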
Not sure if this is a dbt alternative or just an abstraction layer 🤔
https://www.malloydata.dev/
DataFrames at Scale Comparison: TPC-H

https://docs.coiled.io/blog/tpch.html