This repository is an installer for anyone who wants to set up a Hadoop single-node cluster on Windows using WSL.
Prerequisite: a Windows machine with WSL installed. Note that this repository has only been tested on the Ubuntu distribution.
Install git in WSL
sudo apt-get update && sudo apt-get install git
Open Windows Terminal and start WSL
wsl --distribution Ubuntu # or Ubuntu-22.04, ....
git clone https://github.com/cukhoaimon/hadoop-auto-installer && cd hadoop-auto-installer
- Change HADOOP_USER_PASSWORD to the password of your WSL user.
HADOOP_USER_PASSWORD=[your wsl-password]
- Change HADOOP_PARENT_DIR so that it contains your username. Note that this is the username of your WSL distribution, not your Windows username (although in some cases the two are the same).
HADOOP_PARENT_DIR=/home/[wsl-username]/hadoop
- Change all of the environment variables below to your WSL username
# Set Hadoop-specific environment variables here.
export HDFS_NAMENODE_USER=[wsl-username]
export HDFS_DATANODE_USER=[wsl-username]
export HDFS_SECONDARYNAMENODE_USER=[wsl-username]
export YARN_RESOURCEMANAGER_USER=[wsl-username]
export YARN_NODEMANAGER_USER=[wsl-username]
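If you are unsure which username to use, the following minimal sketch prints it from inside WSL and builds the matching HADOOP_PARENT_DIR value. The helper name hadoop_parent_dir is hypothetical and not part of this repository, and the sketch assumes your home directory is /home/[wsl-username].

```shell
# Hypothetical helper (not part of this repository): build the
# HADOOP_PARENT_DIR value for a given WSL username.
# Assumes the home directory is /home/<username>.
hadoop_parent_dir() {
  printf '/home/%s/hadoop\n' "$1"
}

# Run this inside WSL (not PowerShell) so that whoami reports the WSL user.
echo "WSL username: $(whoami)"
echo "HADOOP_PARENT_DIR=$(hadoop_parent_dir "$(whoami)")"
```

The printed username is the value to use for the HDFS_* and YARN_* variables above.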
Replace [wsl-username] in the values below with your WSL username
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/[wsl-username]/hadoop/dfs/name332</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/[wsl-username]/hadoop/dfs/data332</value>
</property>
</configuration>
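Instead of editing each placeholder by hand, the replacement can be scripted. The sketch below is one possible approach, not part of the installer: the function name fill_wsl_username is hypothetical, you must pass the path of the file you want to edit, and it assumes GNU sed (the default on Ubuntu) and a username without characters that are special to sed such as / or &.

```shell
# Hypothetical helper (not part of this repository): replace every literal
# "[wsl-username]" placeholder in a file with the given username.
# Usage: fill_wsl_username <file> <username>
fill_wsl_username() {
  file="$1"
  user="$2"
  if [ ! -f "$file" ]; then
    echo "No such file: $file" >&2
    return 1
  fi
  # \[ and \] match the literal brackets; -i edits the file in place (GNU sed).
  sed -i "s/\[wsl-username\]/${user}/g" "$file"
}
```

For example, fill_wsl_username path/to/config-file "$(whoami)" would fill in the current WSL user, where path/to/config-file stands for whichever file you are editing.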
chmod +rwx install-hadoop.sh && ./install-hadoop.sh
cd ~/hadoop/hadoop-3.3.6/
Format the NameNode
bin/hdfs namenode -format
Start the HDFS daemons
sbin/start-dfs.sh
Open a web browser and check the NameNode UI at http://localhost:9870/
Start YARN
sbin/start-yarn.sh
Open a web browser and check the YARN UI at http://localhost:8088/
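Besides the web pages, you can check from the terminal that all daemons are running. The sketch below is an optional check, not part of the installer: the function name missing_daemons is hypothetical, and it assumes the jps tool from the JDK is on your PATH and prints one "<pid> <ClassName>" pair per line.

```shell
# Hypothetical helper (not part of this repository): read jps-style output
# ("<pid> <ClassName>" per line) from standard input and print the name of
# every expected Hadoop daemon that is missing from it.
missing_daemons() {
  jps_output="$(cat)"
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Compare the whole second field so "NameNode" does not match
    # "SecondaryNameNode".
    if ! printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "$daemon"
    fi
  done
}

# Usage, after start-dfs.sh and start-yarn.sh:
#   jps | missing_daemons
```

If the command prints nothing, all five daemons were found; any name it prints is a daemon that is not running.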
For a step-by-step guide in Vietnamese, please check this link