Apache Hudi and Kubernetes: The Fastest Way to Try Apache Hudi!
Follow up is here: https://www.ekalavya.dev/how-to-run-apache-hudi-deltastreamer-kubevela-addon/
More related content at https://www.denote.dev/
As I previously stated, I am developing a set of scenarios to try out Apache Hudi features at https://github.com/replication-rs/apache-hudi-scenarios
Here is how you can try it out quickly if you have Docker running on your machine. You need at least 4 CPUs and 8 GB of memory allocated to Docker.
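Before starting, you can sanity-check the resources Docker reports. The helper below is a small sketch (the function name is mine, not part of the repo); it compares a CPU count and a memory figure in bytes, such as the ones `docker info --format '{{.NCPU}} {{.MemTotal}}'` prints, against the minimums above.

```shell
# Hypothetical helper: check that Docker has at least 4 CPUs and 8 GiB
# of memory. Pass CPU count and total memory in bytes as arguments.
check_docker_resources() {
  cpus="$1"
  mem_bytes="$2"
  min_mem=$((8 * 1024 * 1024 * 1024))   # 8 GiB in bytes
  if [ "$cpus" -lt 4 ] || [ "$mem_bytes" -lt "$min_mem" ]; then
    echo "insufficient"
  else
    echo "ok"
  fi
}

# Typical usage with a running Docker daemon:
# check_docker_resources $(docker info --format '{{.NCPU}} {{.MemTotal}}')
```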
git clone https://github.com/h7kanna/apache-hudi-scenarios
cd apache-hudi-scenarios
Quick start on macOS (Intel)
./quickstarts/kind-macos.sh
Quick start on macOS (M1)
./quickstarts/kind-macos-m1.sh
Quick start on Linux (Intel)
./quickstarts/kind-linux.sh
Wait for the services to initialize:
./bin/kubectl get pods --all-namespaces -w
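If you prefer to script the wait instead of watching the output by eye, here is one way to check the status column. This is a sketch of my own (the function name and sample layout are assumptions): it reads `kubectl get pods --all-namespaces` style output on stdin, where STATUS is the fourth column, and reports whether every pod is Running.

```shell
# Sketch: report "ready" if every pod in `kubectl get pods --all-namespaces`
# output is Running, otherwise "waiting". Assumes STATUS is column 4.
all_pods_running() {
  awk 'NR > 1 && $4 != "Running" { bad = 1 }
       END { print (bad ? "waiting" : "ready") }'
}

# Usage against a live cluster:
# ./bin/kubectl get pods --all-namespaces | all_pods_running
```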
Hello Hudi
cd hello-hudi
../bin/kubectl apply -f huditable.yaml
After about two minutes, go to http://localhost:30001/ and log in with admin/password.
If the execution succeeded, you will find the Apache Hudi data lake bucket containing the table we just created.
You can query the table using:
../bin/kubectl apply -f hudisparkquery.yaml
Watch the progress and output using the driver pod logs. First capture the pod name:
export POD_NAME=$(../bin/kubectl get pods -l spark-role=driver -n spark-system | grep hudisparkquery | awk '{print $1;}')
Then use the captured pod name to get the query output:
../bin/kubectl logs $POD_NAME -n spark-system
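The pod-name extraction above can be packaged as a small reusable filter. This is my own sketch (the function name and the sample listing are assumptions): it reads `kubectl get pods` output on stdin and prints the name of the first pod matching a pattern.

```shell
# Sketch: print the first pod name matching the given pattern from
# `kubectl get pods` output read on stdin.
pod_name_for() {
  grep "$1" | awk '{ print $1; exit }'
}

# Usage against a live cluster:
# POD_NAME=$(../bin/kubectl get pods -l spark-role=driver -n spark-system | pod_name_for hudisparkquery)
```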
I will keep adding more complex scenarios, such as clustering and the sync service with a Kubernetes-based lock provider, and will improve the operator as time permits. Keep watching if you are interested in learning how to build an Apache Hudi data lake on Kubernetes.
My end goal is to build a production-ready hudi-operator from the knowledge I currently have, before it fades.