Cluster nodes
Host | OS | hostname | IP |
---|---|---|---|
master | CentOS7.2 | master.hadoop | 192.168.142.129 |
slave1 | CentOS7.2 | 1.slave.hadoop | 192.168.142.131 |
slave2 | CentOS7.2 | 2.slave.hadoop | 192.168.142.132 |
slave3 | CentOS7.2 | 3.slave.hadoop | 192.168.142.133 |
Operations on all nodes
Configure hosts
vim /etc/hosts
# Add the following:
192.168.142.129 master.hadoop
192.168.142.131 1.slave.hadoop
192.168.142.132 2.slave.hadoop
192.168.142.133 3.slave.hadoop
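Optionally, a quick sanity check that every entry in /etc/hosts resolves as expected (a minimal sketch; the hostnames are the ones from the table above):
```shell
# Each line should echo back the IP / hostname pair configured above
for h in master.hadoop 1.slave.hadoop 2.slave.hadoop 3.slave.hadoop; do
    getent hosts "$h"
done
```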
Create the user, user group, and related directories
- Create the user group and user
groupadd hadoop
useradd -d /home/hadoop -m hadoop -g hadoop
Set a password for the hadoop user
passwd hadoop
Generate an SSH key pair for the hadoop user
sudo -Hu hadoop ssh-keygen -t rsa
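ssh-keygen will prompt for a key location and passphrase; for unattended logins between the nodes the defaults and an empty passphrase are normally used. A non-interactive equivalent could look like this:
```shell
# Generate the hadoop user's RSA key pair with an empty passphrase, no prompts
sudo -Hu hadoop ssh-keygen -t rsa -P '' -f /home/hadoop/.ssh/id_rsa
```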
- Create the related directories
```shell
mkdir -p /hadoop/tmp
mkdir -p /hadoop/hdfs/data
mkdir -p /hadoop/hdfs/name
chown -R hadoop:hadoop /hadoop/
```
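An optional check that the directories exist and are owned by the hadoop user:
```shell
# All three paths should show hadoop:hadoop as owner and group
ls -ld /hadoop/tmp /hadoop/hdfs/name /hadoop/hdfs/data
```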
Install the NTP service and other related packages
yum install ntp -y
systemctl enable ntpd
systemctl start ntpd
yum install openssl-devel -y
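Clock drift between nodes can cause confusing errors later, so it is worth confirming that ntpd is actually synchronizing. ntpq is installed together with the ntp package above:
```shell
# List the peers ntpd is using; an asterisk marks the currently selected time source
ntpq -p
```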
Install the JDK
- Uninstall the OpenJDK that ships with CentOS
# List the installed JDKs
rpm -qa | grep jdk
# Sample output
java-1.8.0-openjdk-1.8.0.121-0.b13.el7_3.x86_64
java-1.8.0-openjdk-headless-1.8.0.121-0.b13.el7_3.x86_64
java-1.7.0-openjdk-1.7.0.131-2.6.9.0.el7_3.x86_64
java-1.7.0-openjdk-headless-1.7.0.131-2.6.9.0.el7_3.x86_64
copy-jdk-configs-1.2-1.el7.noarch
# Uninstall
yum remove java-1.7.0-openjdk-headless.x86_64 -y
yum remove java-1.7.0-openjdk -y
yum remove java-1.8.0-openjdk-headless-1.8.0.121-0.b13.el7_3.x86_64 -y
yum remove java-1.8.0-openjdk -y
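After removal, re-running the query should no longer list any openjdk packages:
```shell
# Only non-JDK leftovers such as copy-jdk-configs should remain, if anything
rpm -qa | grep jdk
```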
- Download the JDK RPM package from the official site and install it
Official download page:
http://www.oracle.com/technetwork/java/javase/downloads/index.html
cd /usr/src/
wget http://download.oracle.com/otn-pub/java/jdk/8u121-b13/e9e7ea248e2c4826b92b3f075a80e441/jdk-8u121-linux-x64.rpm?AuthParam=1489989096_11dbb8b04d10d8c53d34c4aea30bdd71
# The link above is only valid for a limited time, so it can only be used temporarily
# After downloading, rename the file to jdk1.8.0_121.rpm
mv jdk-8u121-linux-x64.rpm?AuthParam=1489989096_11dbb8b04d10d8c53d34c4aea30bdd71 jdk1.8.0_121.rpm
# Renaming directly with wget's -O option was also tried, but for some reason the download was very slow that way
rpm -ivh jdk1.8.0_121.rpm
# Install location: /usr/java/jdk1.8.0_121
# The install location differs for other versions
- Configure the JAVA system environment variables
vim /etc/profile.d/java.sh
Add the following:
#!/bin/bash
JAVA_HOME=/usr/java/jdk1.8.0_121
JRE_HOME=$JAVA_HOME/jre
PATH=$PATH:$JAVA_HOME/bin
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/jre/lib/rt.jar
export PATH JAVA_HOME JRE_HOME CLASSPATH
Apply the changes immediately and verify
source /etc/profile.d/java.sh
java -version
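If the script was picked up correctly, the checks below should point at the Oracle JDK installed above (paths assume the jdk1.8.0_121 install location):
```shell
# Expected: /usr/java/jdk1.8.0_121
echo $JAVA_HOME
# Expected: java version "1.8.0_121"
$JAVA_HOME/bin/java -version
```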
Install Hadoop
- Download Hadoop
cd /usr/src/
wget http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
tar -zxvf hadoop-2.7.3.tar.gz
# Copy to the target directory
cp -R /usr/src/hadoop-2.7.3/ /usr/local/hadoop/
chown -R hadoop:hadoop /usr/local/hadoop/
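If the mirror above no longer hosts 2.7.3, the Apache release archive keeps old versions; the following URL is a likely alternative:
```shell
# Fallback download from the Apache archive (keeps historical releases)
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
```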
- Configure the Hadoop environment variables
vim /etc/profile.d/hadoop.sh
Add the following:
#!/bin/bash
HADOOP_HOME=/usr/local/hadoop
PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export PATH HADOOP_HOME
Apply the changes immediately
source /etc/profile.d/hadoop.sh
hadoop version
Sample output
Hadoop 2.7.3
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff
Compiled by root on 2016-08-18T01:41Z
Compiled with protoc 2.5.0
From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.7.3.jar
Configure Hadoop
HADOOP_HOME=/usr/local/hadoop
Since JAVA_HOME has already been configured as a system environment variable, there is no need to edit hadoop-env.sh or yarn-env.sh to set JAVA_HOME manually.
- core-site.xml
vim /usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master.hadoop:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>4096</value>
</property>
</configuration>
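Once the file is saved, the effective value can be read back with hdfs getconf, which is a quick way to catch XML typos (run as the hadoop user, assuming the environment variables above are in place):
```shell
# Should print hdfs://master.hadoop:9000
hdfs getconf -confKey fs.defaultFS
```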
- hdfs-site.xml
vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/hadoop/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master.hadoop:9001</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
- mapred-site.xml
cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
vim /usr/local/hadoop/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<final>true</final>
</property>
<property>
<name>mapreduce.jobtracker.http.address</name>
<value>master.hadoop:50030</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master.hadoop:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master.hadoop:19888</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>http://master.hadoop:9001</value>
</property>
</configuration>
- yarn-site.xml
vim /usr/local/hadoop/etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master.hadoop</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master.hadoop:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master.hadoop:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master.hadoop:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master.hadoop:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master.hadoop:8088</value>
</property>
</configuration>
Per-node operations
Before proceeding with the steps below, make sure all of the steps above have been completed on every node (including the master).
- Operations on the master node
# Configure slaves
vim /usr/local/hadoop/etc/hadoop/slaves
# Delete the original localhost entry and add the following
1.slave.hadoop
2.slave.hadoop
3.slave.hadoop
# Passwordless SSH login
# Run these line by line; each step will prompt for the hadoop user account's password
su hadoop
ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@master.hadoop
ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@1.slave.hadoop
ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@2.slave.hadoop
ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@3.slave.hadoop
# Test connectivity
su hadoop
ssh master.hadoop
exit
ssh 1.slave.hadoop
exit
ssh 2.slave.hadoop
exit
ssh 3.slave.hadoop
exit
Basic usage
Run on the master node
cd /usr/local/hadoop/sbin/
# Switch to the hadoop user
su hadoop
# Start
./start-all.sh
# Stop
./stop-all.sh
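Two things the scripts above do not cover: HDFS must be formatted once before the very first start, and after starting it is worth checking which daemons came up. A rough sequence (run as the hadoop user on the master; the web UI ports are the Hadoop 2.x defaults):
```shell
# One-time only, before the first ./start-all.sh: format the NameNode metadata directory
hdfs namenode -format

# After starting, list the running Java daemons on each node
jps
# master should show NameNode, SecondaryNameNode and ResourceManager
# slaves should show DataNode and NodeManager

# Web UIs (default ports): HDFS at http://master.hadoop:50070, YARN at http://master.hadoop:8088
```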