Install and set up Cloudera Hadoop on Linux
This article explains how to install Cloudera Hadoop on a single host machine.
Hadoop is a framework for processing large amounts of data in parallel.
Hadoop distributions are provided by different vendors, such as Hortonworks and Cloudera.
To set up Cloudera Hadoop, Java is required. If Java is not already installed, install JDK 1.6, update 8 or later.
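The Java requirement above can be checked before going further. The following is a minimal sketch, assuming the classic `1.6.0_18` style version string that `java -version` prints on its first line; `java_version_ok` is a hypothetical helper name, not part of any Cloudera tooling.

```shell
# Succeeds when a version string such as "1.6.0_18" is JDK 1.6 update 8 or newer.
# Hypothetical helper; it only checks the minimum stated in this article.
java_version_ok() (
    version=$1
    base=${version%%_*}                 # "1.6.0_18" -> "1.6.0"
    case $version in
        *_*) update=${version#*_} ;;    # "1.6.0_18" -> "18"
        *)   update=0 ;;                # no update suffix present
    esac
    update=$(printf '%s' "$update" | sed 's/^0*//')   # "08" -> "8"
    update=${update:-0}
    major=$(printf '%s' "$base" | cut -d. -f1)
    minor=$(printf '%s' "$base" | cut -d. -f2)
    minor=${minor:-0}

    [ "$major" -gt 1 ] && return 0      # newer numbering scheme, above the minimum
    [ "$major" -eq 1 ] || return 1
    [ "$minor" -gt 6 ] && return 0
    [ "$minor" -eq 6 ] || return 1
    [ "$update" -ge 8 ]
)

# `java -version` writes to stderr, for example: java version "1.6.0_18"
if command -v java >/dev/null 2>&1; then
    installed=$(java -version 2>&1 | sed -n '1s/.*"\(.*\)".*/\1/p')
    if java_version_ok "$installed"; then
        echo "Java $installed meets the minimum"
    else
        echo "Java $installed is older than 1.6.0_08; install a newer JDK"
    fi
else
    echo "java not found; install JDK 1.6 update 8 or later"
fi
```

The check only enforces the lower bound; it does not say whether a much newer JDK is compatible with this Hadoop release.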
Download the cloudera-testing.repo file from here, copy it to /etc/yum.repos.d/,
and then update yum so that it picks up the new repository.
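The repository step can be sketched as a small script. It assumes the repo file has already been downloaded to the current directory; `install_repo_file` is a hypothetical helper, and refreshing the metadata with `yum clean all` followed by `yum makecache` is only one reading of "update yum".

```shell
# Copy a downloaded .repo file into a yum repository directory.
# The directory defaults to /etc/yum.repos.d; the optional second argument
# exists only so the function can be tried against a scratch directory.
install_repo_file() (
    src=$1
    dest_dir=${2:-/etc/yum.repos.d}
    [ -f "$src" ] || { echo "missing repo file: $src" >&2; return 1; }
    [ -d "$dest_dir" ] || { echo "missing directory: $dest_dir" >&2; return 1; }
    cp "$src" "$dest_dir/" && echo "installed $dest_dir/$(basename "$src")"
)

# Typical use, run as root:
#   install_repo_file ./cloudera-testing.repo
#   yum clean all      # discard cached metadata
#   yum makecache      # fetch metadata, including the new repository
```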
Run the commands below to install Hadoop, Hive, and Pig:
yum install hadoop-0.20 -y
yum install hadoop-hive -y
yum install hadoop-pig -y
The above commands install Hadoop to /usr/lib/hadoop, Hive to /usr/lib/hive, and Pig to /usr/lib/pig.
Set up the environment variables in the ~/.bashrc file as described below:
$ vi ~/.bashrc
export HADOOP_HOME=/usr/lib/hadoop
export HIVE_HOME=/usr/lib/hive
export PIG_HOME=/usr/lib/pig
export PATH=$HADOOP_HOME/bin:$PATH:$PIG_HOME/bin:$HIVE_HOME/bin
Save the file, then reload it in the current shell:
$ source ~/.bashrc
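After reloading ~/.bashrc, it can help to confirm that the variables took effect. This is a rough sketch rather than a required step; `path_contains` is a hypothetical helper that tests whether a directory is one of the entries in PATH.

```shell
# Succeeds when directory $1 is one of the colon-separated entries in $PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Report each variable set in ~/.bashrc and whether its bin directory is on PATH.
for name in HADOOP_HOME HIVE_HOME PIG_HOME; do
    eval "value=\${$name:-}"            # indirect lookup of the variable named in $name
    if [ -z "$value" ]; then
        echo "$name is not set"
    elif path_contains "$value/bin"; then
        echo "$name=$value (bin directory is on PATH)"
    else
        echo "$name=$value (bin directory is NOT on PATH)"
    fi
done
```

Wrapping PATH in colons before matching avoids false positives from partial matches such as /usr/lib/hadoop/bin2.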
Open $HADOOP_HOME/conf/hadoop-env.sh and add the JAVA_HOME path. Ex: export JAVA_HOME=/usr/java/jdk1.6.0_18
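The same edit can be scripted instead of made by hand. The sketch below assumes hadoop-env.sh uses plain `export JAVA_HOME=...` lines; `set_java_home` is a hypothetical helper, and the JDK path is the example value from above.

```shell
# Add or replace the JAVA_HOME export in a hadoop-env.sh style file.
# Hypothetical helper; running it twice leaves a single JAVA_HOME line.
set_java_home() (
    file=$1
    java_home=$2
    [ -f "$file" ] || { echo "missing file: $file" >&2; return 1; }
    tmp=$(mktemp) || return 1
    # Drop any existing (possibly commented-out) JAVA_HOME export.
    # grep exits with status 1 when nothing is left to print, which is fine here.
    grep -v '^[[:space:]]*#\{0,1\}[[:space:]]*export JAVA_HOME=' "$file" > "$tmp" || true
    echo "export JAVA_HOME=$java_home" >> "$tmp"
    cat "$tmp" > "$file"    # overwrite in place, keeping the file's permissions
    rm -f "$tmp"
)

# Typical use:
#   set_java_home "$HADOOP_HOME/conf/hadoop-env.sh" /usr/java/jdk1.6.0_18
```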
Open $HADOOP_HOME/conf/core-site.xml and set fs.default.name to the NameNode server name (or localhost) and port. Ex:
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
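For a fresh single-node setup, the whole core-site.xml can also be generated from the shell. This is a minimal sketch under that assumption: `write_core_site` is a hypothetical helper, and it overwrites the target file, so it is not suitable if the file already holds other properties.

```shell
# Write a minimal core-site.xml that sets only fs.default.name.
# $1 is the target file; $2 is the NameNode URI (defaults to the article's example).
# Note: this replaces the whole file.
write_core_site() (
    file=$1
    uri=${2:-hdfs://localhost:9000}
    cat > "$file" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>$uri</value>
  </property>
</configuration>
EOF
)

# Typical use:
#   write_core_site "$HADOOP_HOME/conf/core-site.xml" hdfs://localhost:9000
```

The property must sit inside the top-level configuration element for Hadoop to read it.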