Data1984

727 views, 15:36

Data1984

Open source self-hosted Delta Sharing server | Delta Lake
https://delta.io/blog/2023-04-24-open-source-selfhosted-delta-sharing-server/

delta.io

Open source self-hosted Delta Sharing server

This post explains the basic instructions for the Kotosiro Delta Sharing server

👍5

667 views, 17:20

Data1984

dbt Guide | GitLab
https://about.gitlab.com/handbook/business-technology/data-team/platform/dbt-guide/

👍5

620 views, 19:26

Data1984

Microsoft introduced Fabric, a combination of Power BI, Azure Synapse, Data Factory, and Data Explorer on top of ADLS Gen2, using Delta (Parquet) as the data lake format. A new component is Data Activator, which seems to be a no-code rule engine.
https://azure.microsoft.com/en-us/blog/introducing-microsoft-fabric-data-analytics-for-the-era-of-ai/

Microsoft Azure Blog

Introducing Microsoft Fabric: The data platform for the era of AI | Microsoft Azure Blog | Microsoft Azure

Announcing Microsoft Fabric—a unified analytics platform that brings together all the data and analytics tools that organizations need. Learn more.

❤1

539 views, 18:43

Data1984

547 views, 11:32

Data1984

Thread about how Slack uses Apache Kafka at scale.

541 views, edited 19:47

Data1984

615 views, 19:50

Data1984

The State of Data Engineering 2023
https://lakefs.io/blog/the-state-of-data-engineering-2023/

lakeFS

The State of Data Engineering 2023

Explore the leading tools and trends that shaped data engineering in 2023. Read the detailed report on data version control at scale.

697 views, 20:58

Data1984

Microsoft OneLake in Fabric, the OneDrive for data.

Microsoft

Microsoft OneLake in Fabric, the OneDrive for data | Microsoft Fabric Blog | Microsoft Fabric

Introducing Microsoft OneLake – “the OneDrive for Data”. OneLake is a complete, rich, ready-to-go enterprise-wide data lake provided as a SaaS service.

719 views, edited 21:24

Data1984

This is huge! Power BI now has Git integration and a Power BI Desktop "Developer Mode", which means you can edit files in a code editor like VS Code and deploy them using Power BI deployment pipelines. In other words, "dashboards as code". Apparently, this functionality was added to enable Copilot.

Microsoft

Introducing git integration in Microsoft Fabric for seamless source control management | Microsoft Fabric Blog | Microsoft Fabric

Git integration enables developers to integrate their development processes, tools, and best practices straight into the Microsoft Fabric workspace.

👍4

589 views, edited 14:24

Data1984

Empower every BI professional to do more with Microsoft Fabric
https://build.microsoft.com/en-US/sessions/8b23c96e-7c35-463d-88b4-564d23dc14a5

629 views, 15:45

Data1984

Choosing an open table format for your transactional data lake on AWS | AWS Big Data Blog
https://aws.amazon.com/blogs/big-data/choosing-an-open-table-format-for-your-transactional-data-lake-on-aws/

Amazon

Choosing an open table format for your transactional data lake on AWS | Amazon Web Services

August 2023: This post was updated to include Apache Iceberg support in Amazon Redshift. Disclaimer: Due to rapid advancements in AWS service support for open table formats, recent developments might not yet be reflected in this post. For the latest information…

499 views, 16:58

Data1984

AWS Glue Data Quality is Generally Available | AWS Big Data Blog
https://aws.amazon.com/blogs/big-data/aws-glue-data-quality-is-generally-available/

Amazon

AWS Glue Data Quality is Generally Available | Amazon Web Services

We are excited to announce the General Availability of AWS Glue Data Quality. Our journey started by working backward from our customers who create, manage, and operate data lakes and data warehouses for analytics and machine learning. To make confident business…

557 views, 17:06

Data1984

Choosing an open table format for your transactional data lake on AWS | AWS Big Data Blog
https://aws.amazon.com/blogs/big-data/choosing-an-open-table-format-for-your-transactional-data-lake-on-aws/

While AWS tries to support all three formats and helps you choose the right one for your use case, Databricks is introducing a unified format, so you will not need to pick 😎

Datanami

Databricks Puts Unified Data Format on the Table with Delta Lake 3.0

Databricks today rolled out a new open table format in Delta Lake 3.0 that it says will eliminate the possibility of picking the wrong one. Dubbed

698 views, 18:23

Data1984

https://www.databricks.com/blog/introducing-english-new-programming-language-apache-spark

🔥5

1.02K views, 21:00

Data1984

https://www.databricks.com/blog/announcing-delta-lake-30-new-universal-format-and-liquid-clustering

1.02K views, 06:47

Data1984

https://engineering.linkedin.com/blog/2023/taking-charge-of-tables--introducing-openhouse-for-big-data-mana

Taking Charge of Tables: Introducing OpenHouse for Big Data Management

720 views, 20:50

Data1984

This reminds me of a solution we designed and implemented a couple of years ago. Back then we used DynamoDB Streams to capture item-level changes with exactly-once semantics, Lambda to modify the data, and Kinesis Data Firehose to deliver it to Redshift. Looks like things are simpler now.

Amazon

Near-real-time analytics using Amazon Redshift streaming ingestion with Amazon Kinesis Data Streams and Amazon DynamoDB | Amazon…

Amazon Redshift is a fully managed, scalable cloud data warehouse that accelerates your time to insights with fast, easy, and secure analytics at scale. Tens of thousands of customers rely on Amazon Redshift to analyze exabytes of data and run complex analytical…
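
For anyone who has not seen the older DynamoDB Streams → Lambda → Firehose → Redshift pattern mentioned above, here is a rough sketch of what the Lambda step might look like. It is not the original implementation: the helper names, the `_event_name` and `_sequence_number` columns, and the `DELIVERY_STREAM_NAME` variable are all hypothetical, and only a subset of DynamoDB attribute types is handled.

```python
import json
import os
from decimal import Decimal

# Hypothetical configuration: name of the Firehose delivery stream.
STREAM_NAME = os.environ.get("DELIVERY_STREAM_NAME", "example-stream")


def from_attr(value):
    """Convert one DynamoDB-typed attribute (e.g. {"N": "3"}) to a plain value."""
    type_tag, raw = next(iter(value.items()))
    if type_tag == "S":
        return raw
    if type_tag == "N":
        number = Decimal(raw)
        return int(number) if number == number.to_integral_value() else float(number)
    if type_tag == "BOOL":
        return raw
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [from_attr(item) for item in raw]
    if type_tag == "M":
        return {key: from_attr(item) for key, item in raw.items()}
    raise ValueError(f"unsupported attribute type: {type_tag}")


def to_firehose_records(event):
    """Flatten a DynamoDB Streams event into newline-delimited JSON records."""
    records = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        # REMOVE events carry no NewImage, so fall back to the item keys.
        image = change.get("NewImage") or change.get("Keys", {})
        row = {name: from_attr(attr) for name, attr in image.items()}
        row["_event_name"] = record["eventName"]
        row["_sequence_number"] = change.get("SequenceNumber")
        line = json.dumps(row, sort_keys=True) + "\n"
        records.append({"Data": line.encode("utf-8")})
    return records


def handler(event, context):
    """Lambda entry point: transform the batch and forward it to Firehose."""
    import boto3  # imported lazily so the transform can be tested without AWS

    records = to_firehose_records(event)
    if records:
        # Assumes the batch fits in one call; a real handler would also
        # check FailedPutCount in the response and retry failed records.
        boto3.client("firehose").put_record_batch(
            DeliveryStreamName=STREAM_NAME, Records=records
        )
    return {"forwarded": len(records)}
```

Keeping the transform in a pure function separate from the AWS call makes it easy to unit test with a hand-written stream event.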

671 views, 19:52

Data1984

https://github.com/unum-cloud/ustore

GitHub

GitHub - unum-cloud/UStore: Multi-Modal Database replacing MongoDB, Neo4J, and Elastic with 1 faster ACID solution, with NetworkX…

Multi-Modal Database replacing MongoDB, Neo4J, and Elastic with 1 faster ACID solution, with NetworkX and Pandas interfaces, and bindings for C 99, C++ 17, Python 3, Java, GoLang 🗄️ - unum-cloud/US...

708 views, 15:31

Data1984

A VP and distinguished engineer at S3 tells the story of building S3.