Saturday, June 14, 2025
LinuxBoost

How to Set up Apache Hadoop on Rocky Linux

Apache Hadoop is an open-source distributed storage and processing framework that can handle large data sets across clusters of computers. It has become a popular choice for organizations looking to process and analyze massive amounts of data. In this guide, we’ll show you how to set up Apache Hadoop on Rocky Linux, a community-supported enterprise operating system.

Prerequisites

Before we begin, make sure you have the following:

  • A Rocky Linux server
  • Root or sudo access

Update Your System

First, update your system to the latest available packages:

sudo dnf update -y

Install Java Development Kit (JDK)

Hadoop requires Java to function properly, so we’ll install JDK using the following command:

sudo dnf install java-11-openjdk-devel -y

Verify the installation by checking the Java version:

java -version

Create a Hadoop User on Rocky Linux

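Create a dedicated user for Hadoop. On Rocky Linux, useradd creates a matching hadoop group by default, so the group does not need to be passed with -G (which fails if the group does not exist yet):

sudo useradd -m -s /bin/bash hadoop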

Set a password for the Hadoop user:

sudo passwd hadoop

Install Apache Hadoop on Rocky Linux

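Download Hadoop from the official Apache website. This guide uses version 3.3.1; check the Apache Hadoop releases page for the current version and adjust the file names below if you pick a newer one. Older releases such as 3.3.1 are served from the Apache archive:

wget https://archive.apache.org/dist/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz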
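Before extracting the archive, it can be worth checking that the download is intact. The sketch below compares the SHA-512 digest of a file with an expected value. The verify_sha512 function is a hypothetical helper, not part of Hadoop, and the expected digest is assumed to be the one Apache publishes next to the release you downloaded.

```shell
# Hypothetical helper: succeed only if FILE has the EXPECTED SHA-512 digest.
# Usage: verify_sha512 FILE EXPECTED_HEX_DIGEST
verify_sha512() {
  local file="$1" expected="$2" actual
  # sha512sum prints "<digest>  <file>"; keep only the digest.
  actual=$(sha512sum "$file" | awk '{print $1}')
  [ -n "$actual" ] || return 1
  # Published digests may be upper-case, so compare case-insensitively.
  expected=$(printf '%s' "$expected" | tr 'A-F' 'a-f')
  [ "$actual" = "$expected" ]
}
```

For example, verify_sha512 hadoop-3.3.1.tar.gz "<digest from the release page>" && echo "checksum OK" prints the message only when the digests match.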

Extract the downloaded archive:

tar xzf hadoop-3.3.1.tar.gz

Move the extracted files to the /opt/hadoop directory:

sudo mv hadoop-3.3.1 /opt/hadoop

Change the ownership of the /opt/hadoop directory to the Hadoop user:

sudo chown -R hadoop:hadoop /opt/hadoop

Configure the Hadoop Environment

Switch to the Hadoop user:

su - hadoop

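Add the following lines to the .bashrc file. Hadoop’s scripts also need JAVA_HOME, so the first line derives it from the java binary installed earlier. If the daemons later report that JAVA_HOME is not set, add the same JAVA_HOME line to $HADOOP_CONF_DIR/hadoop-env.sh, which the Hadoop scripts read on startup:

export JAVA_HOME=$(dirname "$(dirname "$(readlink -f "$(command -v java)")")")
export HADOOP_HOME=/opt/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin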

Load the new environment variables:

source .bashrc
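A quick way to confirm that the variables were loaded is to check that each one is set and points to an existing directory. The following is a minimal sketch; check_env_dirs is a hypothetical helper name, and it simply takes the names of the variables to inspect.

```shell
# Hypothetical helper: report variables that are unset or point to a missing directory.
# Usage: check_env_dirs VAR_NAME [VAR_NAME ...]
check_env_dirs() {
  local name value status=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set"
      status=1
    elif [ ! -d "$value" ]; then
      echo "$name points to a missing directory: $value"
      status=1
    fi
  done
  return "$status"
}
```

Running check_env_dirs HADOOP_HOME HADOOP_CONF_DIR prints nothing and returns success when both directories exist.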

Configure Hadoop on Rocky Linux

Edit the core-site.xml file in the $HADOOP_CONF_DIR directory:

vi $HADOOP_CONF_DIR/core-site.xml

Add the following configuration:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
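If you prefer to script this step instead of editing the file in vi, a here-document can write the same configuration. This is only a sketch: write_core_site is a hypothetical helper, and it overwrites any existing core-site.xml in the target directory, including the license comment in the file that ships with Hadoop.

```shell
# Hypothetical helper: write a minimal core-site.xml into CONF_DIR.
# Usage: write_core_site CONF_DIR FS_URI
write_core_site() {
  local conf_dir="$1" fs_uri="$2"
  # The unquoted EOF lets the shell substitute $fs_uri into the XML.
  cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>$fs_uri</value>
  </property>
</configuration>
EOF
}
```

For the single-node setup in this guide, the call would be write_core_site "$HADOOP_CONF_DIR" "hdfs://localhost:9000".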

Edit the hdfs-site.xml file:

vi $HADOOP_CONF_DIR/hdfs-site.xml

Add the following configuration:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/var/lib/hadoop/hdfs/namenode</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/var/lib/hadoop/hdfs/datanode</value>
    </property>
</configuration>

Save and close the file. These settings define the replication factor, name node directory, and data node directory for HDFS. Next, configure YARN by editing the yarn-site.xml file:

vi $HADOOP_CONF_DIR/yarn-site.xml

Add the following configuration:

<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>localhost</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

Save and close the file. This configuration sets the resource manager hostname and enables the MapReduce shuffle service on the node manager. Now, set up the MapReduce framework by editing the mapred-site.xml file. Hadoop 3.x ships this file directly, so there is no mapred-site.xml.template to copy first (that template existed only in Hadoop 2.x):

vi $HADOOP_CONF_DIR/mapred-site.xml

Add the following configuration:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Save and close the file. This configuration sets the MapReduce framework to use YARN. After configuring Hadoop, create the HDFS directories defined earlier. These commands need sudo, which the hadoop user does not have by default, so type exit to return to your administrative account before running them:

sudo mkdir -p /var/lib/hadoop/hdfs/namenode
sudo mkdir -p /var/lib/hadoop/hdfs/datanode
sudo chown -R hadoop:hadoop /var/lib/hadoop
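The directories on disk must match the values in hdfs-site.xml, and a typo in either place only shows up later when the daemons fail to start. The sketch below reads the two directory properties back from the file and checks that each directory exists. Both functions are hypothetical helpers, and the parsing assumes every <name> and <value> element sits on a single line, as in this guide.

```shell
# Hypothetical helper: print the <value> of a named property in a Hadoop XML file.
# Usage: get_property FILE PROPERTY_NAME
get_property() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1 }
    found && match($0, /<value>[^<]*<\/value>/) {
      # Strip the 7-character <value> tag and the 8-character </value> tag.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}

# Hypothetical helper: report whether each configured HDFS directory exists.
# Usage: check_hdfs_dirs HDFS_SITE_XML
check_hdfs_dirs() {
  local file="$1" key dir status=0
  for key in dfs.namenode.name.dir dfs.datanode.data.dir; do
    dir=$(get_property "$file" "$key")
    if [ -n "$dir" ] && [ -d "$dir" ]; then
      echo "ok: $key -> $dir"
    else
      echo "missing: $key -> ${dir:-<not set>}"
      status=1
    fi
  done
  return "$status"
}
```

Calling check_hdfs_dirs /opt/hadoop/etc/hadoop/hdfs-site.xml prints one line per property and returns a non-zero status if a directory is absent.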

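Now format the Hadoop Distributed File System (HDFS). Run this and the following Hadoop commands as the hadoop user (su - hadoop) so that the environment variables from .bashrc are available; running them through sudo -u hadoop does not load that environment:

hdfs namenode -format

This command initializes the HDFS name node. Run it only once, because reformatting erases the existing HDFS metadata. The start scripts connect to each node over SSH, so make sure the hadoop user can run ssh localhost without a password prompt (for example, with a key generated by ssh-keygen and added to ~/.ssh/authorized_keys). With everything set up, start the Hadoop daemons:

start-dfs.sh
start-yarn.sh

To verify that Hadoop is running correctly, use the jps command:

jps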

You should see output similar to the following, indicating that the Hadoop daemons are running:

12345 NameNode
23456 SecondaryNameNode
34567 DataNode
45678 ResourceManager
56789 NodeManager
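Reading the list by eye works for one node, but a small check is easier to reuse in scripts. The sketch below reads jps-style output on standard input and prints the name of every expected daemon that is absent. missing_daemons is a hypothetical helper, and the list of five daemons is taken from the sample output above.

```shell
# Hypothetical helper: read "PID Name" lines on stdin and print each expected
# Hadoop daemon that does not appear.
missing_daemons() {
  local output daemon
  output=$(cat)
  for daemon in NameNode SecondaryNameNode DataNode ResourceManager NodeManager; do
    # Compare the whole second field, so SecondaryNameNode is not counted as NameNode.
    if ! printf '%s\n' "$output" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "$daemon"
    fi
  done
}
```

Run it as jps | missing_daemons; empty output means all five daemons were found.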

Congratulations! You have successfully set up Apache Hadoop on your Rocky Linux system. You can now start using Hadoop for your big data processing tasks. For more information on how to use Hadoop, refer to the official Hadoop documentation.

In this article, we’ve covered the installation and configuration of Apache Hadoop on Rocky Linux. We also discussed how to set up HDFS, YARN, and the MapReduce framework for big data processing. If you are interested in related topics, see How to Install and Configure Kibana on Rocky Linux and How to Install and Configure Puppet on Rocky Linux.

Copyright © 2023 linuxboost.com All Rights Reserved.
