Error when formatting the NameNode: No Route to Host from node1/192.168.3.101 to hadoop05:8485 failed on socket timeout exception


Solution for the error "No Route to Host from node1/192.168.3.101 to hadoop05:8485 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host" raised when formatting the NameNode.

1. Error summary:

While setting up a Hadoop high-availability (HA) cluster, running hadoop namenode -format to format the NameNode kept failing with the following error:

No Route to Host from  hadoop01/192.168.3.101 to hadoop05:8485 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see:  http://wiki.apache.org/hadoop/NoRouteToHost
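Before reading the full log, it is worth confirming whether hadoop05 and its JournalNode port are reachable from the NameNode host at all. A minimal check, using the hostname and port from the error message (nc may need to be installed separately):

# run on the NameNode host (hadoop01)
ping -c 3 hadoop05        # does the name resolve and does the host answer?
nc -zv hadoop05 8485      # can a TCP connection be opened to the JournalNode RPC port?

With a NoRouteToHostException, the port probe usually fails with "No route to host" as well, which points at routing or a firewall rather than at Hadoop itself.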

2. Full error output:

[work@hadoop01 sbin]$ hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

20/02/16 01:51:52 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = hadoop01/192.168.3.101
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.7.7
STARTUP_MSG:   classpath = /home/work/softwares/had

**************************************************
*****************************************

20/02/16 01:51:59 INFO util.GSet: capacity      = 2^15 = 32768 entries
20/02/16 01:52:06 INFO ipc.Client: Retrying connect to server: hadoop05/192.168.3.105:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
20/02/16 01:52:07 INFO ipc.Client: Retrying connect to server: hadoop05/192.168.3.105:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
20/02/16 01:52:08 INFO ipc.Client: Retrying connect to server: hadoop05/192.168.3.105:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
20/02/16 01:52:09 INFO ipc.Client: Retrying connect to server: hadoop05/192.168.3.105:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
20/02/16 01:52:10 INFO ipc.Client: Retrying connect to server: hadoop05/192.168.3.105:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
20/02/16 01:52:11 INFO ipc.Client: Retrying connect to server: hadoop05/192.168.3.105:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
20/02/16 01:52:12 INFO ipc.Client: Retrying connect to server: hadoop05/192.168.3.105:8485. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
20/02/16 01:52:13 INFO ipc.Client: Retrying connect to server: hadoop05/192.168.3.105:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
20/02/16 01:52:14 INFO ipc.Client: Retrying connect to server: hadoop05/192.168.3.105:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
20/02/16 01:52:15 INFO ipc.Client: Retrying connect to server: hadoop05/192.168.3.105:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
20/02/16 01:52:15 WARN namenode.NameNode: Encountered exception during format: 
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 2 successful responses:
192.168.3.104:8485: false
192.168.3.106:8485: false
1 exceptions thrown:
192.168.3.105:8485: No Route to Host from  hadoop01/192.168.3.101 to hadoop05:8485 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see:  http://wiki.apache.org/hadoop/NoRouteToHost
    at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:286)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:233)
    at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:901)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:202)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1011)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1457)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1582)
20/02/16 01:52:15 ERROR namenode.NameNode: Failed to start namenode.
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 2 successful responses:
192.168.3.104:8485: false
192.168.3.106:8485: false
1 exceptions thrown:
192.168.3.105:8485: No Route to Host from  hadoop01/192.168.3.101 to hadoop05:8485 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see:  http://wiki.apache.org/hadoop/NoRouteToHost
    at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:286)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:233)
    at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:901)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:202)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1011)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1457)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1582)
20/02/16 01:52:15 INFO util.ExitUtil: Exiting with status 1
20/02/16 01:52:15 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop01/192.168.3.101
************************************************************/

3. Troubleshooting and analysis:

3.1 Based on the error message, check https://cwiki.apache.org/confluence/display/HADOOP2/NoRouteToHost, which lists the possible causes of this exception.

The IP configuration, the /etc/hosts entries, and DNS name resolution all checked out, so the most likely culprit is the firewall.
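For reference, a rough sketch of those checks, plus a firewall check on the unreachable JournalNode host (host names are the ones used in this cluster):

# on the NameNode host (hadoop01)
ip addr show                     # local IP configuration
cat /etc/hosts                   # hadoop01..hadoop06 should map to the intended 192.168.3.x addresses
ping -c 3 hadoop05               # the host is reachable by name

# on the suspect JournalNode host (hadoop05)
ss -lntp | grep 8485             # is the JournalNode actually listening on 8485?
systemctl is-active firewalld    # is the CentOS 7 default firewall running?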

3.2 Following the firewall fix described in this blog post: https://blog.csdn.net/weixin_30379973/article/details/95539379

The error above appeared while formatting the NameNode; the system is CentOS 7.

Cause: the firewall had been "stopped" with service iptables stop.

Fix: stop the firewall with systemctl stop firewalld instead.

Formatting the NameNode again then succeeded.

However, the service iptables commands mentioned above fail with the following errors:

[root@hadoop05 sbin]# service iptables status
Redirecting to /bin/systemctl status iptables.service
Unit iptables.service could not be found.
[root@hadoop05 sbin]# service iptables stop
Redirecting to /bin/systemctl stop iptables.service
Failed to stop iptables.service: Unit iptables.service not loaded.
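This is expected: on CentOS 7 the default firewall front end is firewalld, and an iptables.service unit only exists if the iptables-services package has been installed, so the redirected service call finds nothing to stop. A quick way to see which firewall units a host actually has:

# run on the JournalNode host (hadoop05)
systemctl list-unit-files | grep -E 'iptables|firewalld'   # which firewall units are installed
systemctl is-active firewalld                              # is firewalld currently running?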
 

Check the Linux version:

[root@hadoop05 sbin]# cat /proc/version
Linux version 3.10.0-862.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-28) (GCC) ) #1 SMP Fri Apr 20 16:44:24 UTC 2018

[root@hadoop05 sbin]# uname -a
Linux hadoop05 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
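The el7 kernel string already indicates a CentOS/RHEL 7 machine; to confirm the distribution release rather than just the kernel, the release file can be checked as well (the exact release value is not shown in the original post):

cat /etc/redhat-release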

4. The correct fix:

[work@hadoop05 logs]$ systemctl status firewalld.service
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-02-17 09:53:00 CST; 54min ago
     Docs: man:firewalld(1)
 Main PID: 712 (firewalld)
    Tasks: 2
   CGroup: /system.slice/firewalld.service
           └─712 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid

[root@hadoop05 logs]# systemctl stop firewalld.service

[root@hadoop05 sbin]# systemctl status firewalld.service
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Mon 2020-02-17 10:48:34 CST; 9s ago
     Docs: man:firewalld(1)
  Process: 712 ExecStart=/usr/sbin/firewalld --nofork --nopid $FIREWALLD_ARGS (code=exited, status=0/SUCCESS)
 Main PID: 712 (code=exited, status=0/SUCCESS)

Feb 17 09:52:45 localhost.localdomain systemd[1]: Starting firewalld - dynamic firewall daemon...
Feb 17 09:53:00 localhost.localdomain systemd[1]: Started firewalld - dynamic firewall daemon.
Feb 17 10:48:30 hadoop05 systemd[1]: Stopping firewalld - dynamic firewall daemon...
Feb 17 10:48:34 hadoop05 systemd[1]: Stopped firewalld - dynamic firewall daemon.

[root@hadoop05 sbin]# systemctl status firewalld.service
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Mon 2020-02-17 10:48:34 CST; 1h 12min ago
     Docs: man:firewalld(1)
  Process: 712 ExecStart=/usr/sbin/firewalld --nofork --nopid $FIREWALLD_ARGS (code=exited, status=0/SUCCESS)
 Main PID: 712 (code=exited, status=0/SUCCESS)

Feb 17 09:52:45 localhost.localdomain systemd[1]: Starting firewalld - dynamic firewall daemon...
Feb 17 09:53:00 localhost.localdomain systemd[1]: Started firewalld - dynamic firewall daemon.
Feb 17 10:48:30 hadoop05 systemd[1]: Stopping firewalld - dynamic firewall daemon...
Feb 17 10:48:34 hadoop05 systemd[1]: Stopped firewalld - dynamic firewall daemon.

Use systemctl stop firewalld.service to stop the firewall.

Format the NameNode again, and this time it succeeds.
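Note that systemctl stop only lasts until the next reboot. Two follow-up options, sketched for whichever JournalNode hosts run firewalld (hadoop05 here); the firewall-cmd variant is an alternative the original post does not use:

# option 1: keep firewalld off across reboots (acceptable on an internal test cluster)
systemctl stop firewalld.service
systemctl disable firewalld.service

# option 2: leave firewalld running and open only the JournalNode RPC port
firewall-cmd --permanent --add-port=8485/tcp
firewall-cmd --reload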


Author: Data_IT_Farmer


