Declarative Data Lake using Apache Hudi
Follow-up post: https://www.ekalavya.dev/how-to-run-apache-hudi-deltastreamer-kubevela-addon/
More content at https://denote.dev/
I have been working on operationalizing Apache Hudi recently and have tried to build some user-friendly tooling to make it easier to adopt. I had a table service in mind but did not get a chance to complete my implementation.
Now I want to write some tutorials about Apache Hudi while it is still fresh in my mind, and for that I need a simple demo setup for readers.
I also read a blog post about the fastest way to try out Apache Iceberg on a laptop, so I want to create my own version:
Kubernetes, Spark, and Hudi: The Fastest Way to Try Apache Hudi!
I am making slow progress towards it, and I may start posting the tutorial blogs next week. The workflow looks something like this:
Set up the local environment
kind create cluster --config kind.yaml
Set up hudi-operator
kubectl apply -k hudi-operator/config
Create a Hudi table
kubectl apply -f hudi_table.yaml
Run a Hudi query
kubectl apply -f hudi_query.yaml
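The first step reads a kind.yaml that is not shown here. As a minimal sketch, assuming one control-plane node plus one worker is enough for a laptop demo, it could be generated as follows. The node layout is an assumption for illustration and not necessarily the file used for this demo.

```shell
# Write a minimal kind cluster config.
# Assumed layout (not from the original post): 1 control-plane + 1 worker.
cat > kind.yaml <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
EOF

# Show what was written before handing it to kind.
cat kind.yaml
```

Dropping the worker entry gives a single-node cluster, which may be enough if the laptop is short on resources.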
Once the first two steps are done, we can try out many Apache Hudi features using just the last two commands: one to create a table and one to run a query.
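The four steps can also be wrapped in one small script. This is only a sketch and not part of the demo itself: the `run` helper and the `DRY_RUN` variable are hypothetical names introduced here, while the file paths are the ones used in the commands above. By default it only prints the commands, so it can be inspected safely before anything touches a cluster.

```shell
#!/bin/sh
set -eu

# DRY_RUN=1 (the default) only prints each command;
# set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# One-time setup: local cluster and operator.
run kind create cluster --config kind.yaml
run kubectl apply -k hudi-operator/config

# Repeatable part: create a table, then query it.
run kubectl apply -f hudi_table.yaml
run kubectl apply -f hudi_query.yaml
```

Running it with `DRY_RUN=0` executes the commands for real, assuming `kind` and `kubectl` are installed and the script is started from the directory that contains `kind.yaml` and `hudi-operator/`.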
To demo the whole gamut of Hudi features, I am also implementing Hudi's lock configuration on top of Kubernetes, so this is taking some time.
This may sound like going a little overboard just for a blog, but I am kind of obsessed with developer experience (DevX) in general.