#data-engineering
Articles with this tag
Building data pipelines in Dataform · Dataform is a tool that creates data pipelines using SQL. If you’re familiar with dbt, Dataform is probably best...
Demystifying how Parquet works internally · After a lot of theory, it's finally time to talk about the code. Since there is a lot going on in the codebase,...
Demystifying how Parquet works internally · Previously, we talked about how Parquet writes data. In this article, we will talk about how Parquet reads...
Using S3 and Athena is great for data storage and retrieval using queries. But when I first started using it, one common problem that came up fairly...
Demystifying how Parquet works internally · In this article we will get into how data is written in the Parquet format. To do that, one of the first things we...
Demystifying how Parquet works internally · The Parquet file format is a well-known data storage format that is famed for its "efficient storage" and...