This three-node cluster is built on three virtual machines (hadoop01, hadoop02, hadoop03). The guest operating system is CentOS 6.5 and the Hadoop version is 2.9.1.
Procedure
1. Building the base cluster
Download and install VMware Workstation Pro. Link: https://pan.baidu.com/s/1rA30rE9Px5tDJkWSrghlZg password: dydq
Download a CentOS or Ubuntu image (both work) from the official site; CentOS 6.5 is used here.
Use VMware to install Linux and create three virtual machines.
2. Cluster configuration
Set the hostname:
vi /etc/sysconfig/network
Change:
HOSTNAME=hadoop01
The hostnames of the three virtual machines are hadoop01, hadoop02, and hadoop03 respectively.
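The edit above can also be scripted. The sketch below applies it to a scratch copy of the file; on the real node you would target /etc/sysconfig/network as root, then reboot or run `hostname hadoop01` for it to take effect immediately:

```shell
# Sketch: set HOSTNAME with sed. Uses a scratch file here instead of the
# real /etc/sysconfig/network (which requires root).
F=./network.test
printf 'NETWORKING=yes\nHOSTNAME=localhost.localdomain\n' > "$F"
sed -i 's/^HOSTNAME=.*/HOSTNAME=hadoop01/' "$F"
grep '^HOSTNAME' "$F"   # prints HOSTNAME=hadoop01
```

Run the same command with hadoop02 / hadoop03 on the other two machines.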
Edit the hosts file:
vi /etc/hosts
Contents:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.216.15 www.hadoop01.com hadoop01
192.168.216.16 www.hadoop02.com hadoop02
192.168.216.17 www.hadoop03.com hadoop03
Note: do this on all three virtual machines.
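Since the same three entries go onto every machine, a small idempotent loop saves repeated editing. This sketch writes to a scratch file rather than the real /etc/hosts:

```shell
# Sketch: append the cluster entries only if not already present.
# Targets a scratch file; on a real node use /etc/hosts (as root).
HOSTS=./hosts.test
printf '127.0.0.1 localhost\n' > "$HOSTS"
for entry in '192.168.216.15 hadoop01' '192.168.216.16 hadoop02' '192.168.216.17 hadoop03'; do
  grep -qxF "$entry" "$HOSTS" || echo "$entry" >> "$HOSTS"
done
grep -c 'hadoop0' "$HOSTS"   # prints 3, even if the loop is run twice
```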
Network configuration:
vi /etc/sysconfig/network-scripts/ifcfg-eth0
Example contents:
DEVICE=eth0
HWADDR=00:0C:29:0F:84:86
TYPE=Ethernet
UUID=70d880d5-6852-4c85-a1c9-2491c4c1ac11
ONBOOT=yes
IPADDR=192.168.216.111
PREFIX=24
GATEWAY=192.168.216.2
DNS1=8.8.8.8
DNS2=114.114.114.114
NM_CONTROLLED=yes
BOOTPROTO=static
DEFROUTE=yes
NAME="System eth0"
hadoop01:192.168.216.15
hadoop02:192.168.216.16
hadoop03:192.168.216.17
Once this is set, test connectivity with ping.
Note: cloning a virtual machine by copying its files can duplicate the NIC MAC address. Regenerate the MAC in the VMware network adapter settings, and change the internal IP on each clone after copying.
Install the JDK:
Download the JDK, link:
http://download.oracle.com/otn-pub/java/jdk/8u181-b13/96a7b8442fe848ef90c96a2fad6ed6d1/jdk-8u181-linux-x64.tar.gz
Extract:
tar zxvf jdk-8u181-linux-x64.tar.gz -C /usr/local
Configure environment variables:
vi /etc/profile
Add:
export JAVA_HOME=/usr/local/jdk1.8.0_181
export PATH=$PATH:$JAVA_HOME/bin
Apply the changes:
source /etc/profile
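As a quick sanity check, the two profile lines can be entered in a shell and echoed back; the JDK path assumes the tarball was extracted to /usr/local as above:

```shell
# Sketch: the two lines /etc/profile needs, then a check that they took effect.
export JAVA_HOME=/usr/local/jdk1.8.0_181
export PATH=$PATH:$JAVA_HOME/bin
echo "$JAVA_HOME"   # prints /usr/local/jdk1.8.0_181
```

After `source /etc/profile`, `java -version` should report 1.8.0_181.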
Set up passwordless login:
With passwordless login, an ssh session from hadoop01 to the other machines no longer prompts for a password. (Note: if ssh is not installed, install it first.)
First, on hadoop01:
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
Then, hadoop01 —> hadoop02:
ssh-copy-id hadoop02
Then, hadoop01 —> hadoop03:
ssh-copy-id hadoop03
How it works:
hadoop01 sends a connection request to hadoop02. hadoop02 looks up the matching user and key in authorized_keys; if found, it encrypts a random string with the corresponding public key (if not found, it falls back to password login) and returns the ciphertext to hadoop01. hadoop01 decrypts it with its private key and sends the result back to hadoop02, which compares it to the original string: if they match, the login succeeds, otherwise it is refused.
To enable passwordless access between other hosts, repeat the same steps.
3. Installing and configuring Hadoop
Before installing, turn off the firewall on all three nodes:
service iptables stop : stop the firewall
chkconfig iptables off : do not start the firewall on boot
Installing Hadoop
Install and configure Hadoop on hadoop01 first; hadoop02 and hadoop03 need the same setup, which we can get by copying the hadoop directory over, e.g.: scp -r /usr/local/hadoop-2.9.1 hadoop02:/usr/local/
Download the Hadoop tarball from: https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.9.1/hadoop-2.9.1.tar.gz
(1) Extract and configure environment variables
tar -zxvf /home/hadoop-2.9.1.tar.gz -C /usr/local/
vi /etc/profile
export HADOOP_HOME=/usr/local/hadoop-2.9.1/
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile : apply the environment variables
hadoop version : check the Hadoop version
[root@hadoop01 hadoop-2.9.1]# source /etc/profile
[root@hadoop01 hadoop-2.9.1]# hadoop version
Hadoop 2.9.1
Subversion https://github.com/apache/hadoop.git -r e30710aea4e6e55e69372929106cf119af06fd0e
Compiled by root on 2018-04-16T09:33Z
Compiled with protoc 2.5.0
From source with checksum 7d6d2b655115c6cc336d662cc2b919bd
This command was run using /usr/local/hadoop-2.9.1/share/hadoop/common/hadoop-common-2.9.1.jar
The environment variables are now configured.
(2) Edit the configuration files (note: the following is done from the Hadoop installation directory)
[root@hadoop03 hadoop-2.9.1]# pwd
/usr/local/hadoop-2.9.1
vi ./etc/hadoop/hadoop-env.sh : set the JAVA_HOME path
Change:
export JAVA_HOME=/usr/local/jdk1.8.0_181
vi ./etc/hadoop/core-site.xml : edit the core configuration file
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop01:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/bigdata/tmp</value>
  </property>
</configuration>
vi ./etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoopdata/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hadoopdata/dfs/data</value>
  </property>
  <property>
    <name>fs.checkpoint.dir</name>
    <value>/home/hadoopdata/dfs/checkpoint/cname</value>
  </property>
  <property>
    <name>fs.checkpoint.edits.dir</name>
    <value>/home/hadoopdata/dfs/checkpoint/edit</value>
  </property>
  <property>
    <name>dfs.http.address</name>
    <value>hadoop01:50070</value>
  </property>
  <property>
    <name>dfs.secondary.http.address</name>
    <value>hadoop02:50090</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
cp ./etc/hadoop/mapred-site.xml.template ./etc/hadoop/mapred-site.xml
vi ./etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <final>true</final>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop03:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop03:19888</value>
  </property>
</configuration>
vi ./etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop02</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop02:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop02:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop02:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>hadoop02:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hadoop02:8088</value>
  </property>
</configuration>
vi ./etc/hadoop/slaves
hadoop01
hadoop02
hadoop03
Now use scp to copy the hadoop directory to hadoop02 and hadoop03, so these files do not have to be configured again on each node.
Commands:
scp -r /usr/local/hadoop-2.9.1 hadoop02:/usr/local/
scp -r /usr/local/hadoop-2.9.1 hadoop03:/usr/local/
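Since the same command runs once per worker, a loop avoids typos such as copying to the same host twice. This sketch only prints the commands; drop the `echo` to actually copy once passwordless SSH is in place:

```shell
# Sketch: generate one scp command per worker node.
# The echo makes this a dry run; remove it to execute for real.
for host in hadoop02 hadoop03; do
  echo scp -r /usr/local/hadoop-2.9.1 "$host":/usr/local/
done
```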
4. Starting Hadoop
Format the namenode on hadoop01:
bin/hdfs namenode -format
If the output contains: INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
then the namenode was initialized successfully.
Start Hadoop:
start-all.sh
[root@hadoop01 hadoop-2.9.1]# start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
18/09/13 23:29:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop01]
hadoop01: starting namenode, logging to /usr/local/hadoop-2.9.1/logs/hadoop-root-namenode-hadoop01.out
hadoop03: starting datanode, logging to /usr/local/hadoop-2.9.1/logs/hadoop-root-datanode-hadoop03.out
hadoop01: starting datanode, logging to /usr/local/hadoop-2.9.1/logs/hadoop-root-datanode-hadoop01.out
hadoop02: starting datanode, logging to /usr/local/hadoop-2.9.1/logs/hadoop-root-datanode-hadoop02.out
Starting secondary namenodes [hadoop02]
hadoop02: starting secondarynamenode, logging to /usr/local/hadoop-2.9.1/logs/hadoop-root-secondarynamenode-hadoop02.out
18/09/13 23:29:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop-2.9.1/logs/yarn-root-resourcemanager-hadoop01.out
hadoop03: nodemanager running as process 61198. Stop it first.
hadoop01: starting nodemanager, logging to /usr/local/hadoop-2.9.1/logs/yarn-root-nodemanager-hadoop01.out
hadoop02: starting nodemanager, logging to /usr/local/hadoop-2.9.1/logs/yarn-root-nodemanager-hadoop02.out
The NativeCodeLoader warning can be ignored; if you want to fix it, see my earlier post.
Check that everything started with jps:
hadoop01:
[root@hadoop01 hadoop-2.9.1]# jps
36432 Jps
35832 DataNode
36250 NodeManager
35676 NameNode
hadoop02
[root@hadoop02 hadoop-2.9.1]# jps
54083 SecondaryNameNode
53987 DataNode
57031 Jps
49338 ResourceManager
54187 NodeManager
hadoop03
[root@hadoop03 hadoop-2.9.1]# jps
63570 Jps
63448 DataNode
61198 NodeManager
Everything looks good.
Next, start the JobHistoryServer on hadoop03:
[root@hadoop03 hadoop-2.9.1]# mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /usr/local/hadoop-2.9.1/logs/mapred-root-historyserver-hadoop03.out
[root@hadoop03 hadoop-2.9.1]# jps
63448 DataNode
63771 Jps
63724 JobHistoryServer
61198 NodeManager
Check the web UIs in a browser:
http://hadoop01:50070 : namenode web UI
http://hadoop02:50090 : secondarynamenode web UI
http://hadoop02:8088 : resourcemanager web UI
http://hadoop03:19888 : jobhistoryserver web UI
Verification: (screenshots of the namenode, secondarynamenode, resourcemanager, and jobhistoryserver web UIs omitted)
All good!
5. Testing the Hadoop cluster
Goal: verify that the cluster is installed and configured correctly.
The test case runs the wordcount example program on MapReduce.
Create a file named helloworld:
echo "Hello world Hello hadoop" > ./helloworld
Upload the helloworld file to HDFS:
hdfs dfs -put /home/words/helloworld /
Check it:
[root@hadoop01 words]# hdfs dfs -ls /
18/09/13 23:49:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 3 items
-rw-r--r--   3 root supergroup         26 2018-09-13 23:46 /helloworld
drwxr-xr-x   - root supergroup          0 2018-09-13 23:41 /input
drwxrwx---   - root supergroup          0 2018-09-13 23:36 /tmp
[root@hadoop01 words]# hdfs dfs -cat /helloworld
18/09/13 23:50:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Hello world Hello hadoop
Run the wordcount program:
[root@hadoop01 words]# yarn jar /usr/local/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.1.jar wordcount /helloworld/ /out/00000
Output like the following means the job succeeded:
Virtual memory (bytes) snapshot=4240179200
Total committed heap usage (bytes)=287309824
Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
File Input Format Counters
        Bytes Read=26
File Output Format Counters
        Bytes Written=25
Inspect the generated output from the command line:
[root@hadoop01 words]# hdfs dfs -cat /out/00000/part-r-00000
18/09/14 00:32:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Hello	2
hadoop	1
world	1
The result is correct.
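For intuition, the same counts can be reproduced locally with coreutils; this is only an illustration of what the MapReduce job computes (split, group, count), not part of the cluster setup:

```shell
# Local sketch of wordcount: split words onto lines, sort to group
# duplicates, count each group, then print "word<TAB>count".
echo "Hello world Hello hadoop" \
  | tr ' ' '\n' \
  | LC_ALL=C sort \
  | uniq -c \
  | awk '{print $2"\t"$1}'
# prints: Hello 2 / hadoop 1 / world 1 (tab-separated)
```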
The JobHistory UI now shows the execution details for this job, along with plenty of other information I won't go through one by one. It's 00:37, time for bed...
See the official documentation for more details.