
Like so many tech projects, Apache Iceberg grew out of frustration. “We kept seeing problems that were not really at the file level that people were trying to solve at the file level, or, you know, basically trying to work around limitations,” he said.

With Hive, he explained, the idea was to keep data in directories and be able to prune out the directories you don’t need.

The problem with holding that state in the file system is that you can’t make fine-grained changes to it.
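The directory-based pruning described above can be illustrated with a minimal Python sketch. It is not Hive's actual implementation: the `events/date=.../` layout, the file names, and the `partition_of` and `prune` helpers are all hypothetical, chosen only to show how partition values encoded in directory names let a reader skip whole directories.

```python
from pathlib import PurePosixPath

# Hypothetical Hive-style layout: one directory per partition value.
# These paths are illustrative and do not come from a real table.
FILES = [
    "events/date=2021-01-01/part-0000.parquet",
    "events/date=2021-01-01/part-0001.parquet",
    "events/date=2021-01-02/part-0000.parquet",
    "events/date=2021-01-03/part-0000.parquet",
]


def partition_of(path):
    """Read key=value pairs out of the directory names in a file path."""
    parent_dirs = PurePosixPath(path).parent.parts
    return dict(d.split("=", 1) for d in parent_dirs if "=" in d)


def prune(files, key, value):
    """Keep only the files whose partition directory matches key=value."""
    return [f for f in files if partition_of(f).get(key) == value]


# A query filtered on date = 2021-01-02 only needs one directory.
selected = prune(FILES, "date", "2021-01-02")
print(selected)
```

In this sketch the path is the only place where a file's partition membership is recorded, so the smallest unit the pruning logic can reason about is a directory. That is the coarse granularity the passage above is pointing at.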

Iceberg is part of the trend toward decoupling compute and data, according to Tomer Shiran, co-founder of Dremio, and offers an alternative to the tradeoffs between a data lake and a data warehouse.
