2016-11-09

Node disconnected in a 2-server grid in Apache Ignite

I have a 2-server grid in Apache Ignite. While a database is being loaded into the cache, one of the nodes disconnects, and I get the error message below. I tried setting the FailureDetectionTimeout and NetworkTimeout values to their maximum limit, i.e. 2147483647. I also tried tuning the JVM settings on both nodes as described in the JVM Tuning post, but I still get the same error.

[16:30:31,244][SEVERE][pub-#96%null%][DataStreamProcessor] Failed to respond to node [nodeId=797bf03b-3baf-4724-8eca-ccccec64605c, res=DataStreamerResponse [reqId=34834, forceLocDep=true]]
class org.apache.ignite.IgniteCheckedException: Failed to send message (node may have left the grid or TCP connection cannot be established due to firewall issues) [node=TcpDiscoveryNode [id=797bf03b-3baf-4724-8eca-ccccec64605c, addrs=[0:0:0:0:0:0:0:1%lo, 10.0.42.1, 127.0.0.1, 192.168.140.52], sockAddrs=[01hw146471/192.168.140.52:47500, /10.0.42.1:47500, /0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500], discPort=47500, order=2, intOrder=2, lastExchangeTime=1478687030160, loc=false, ver=1.7.0#20160801-sha1:383273e3, isClient=false], topic=T1 [topic=TOPIC_DATASTREAM, id=803ded84851-797bf03b-3baf-4724-8eca-ccccec64605c], msg=DataStreamerResponse [reqId=34834, forceLocDep=true], policy=0]
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1309)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1361)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1331)
    at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.sendResponse(DataStreamProcessor.java:348)
    at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.processRequest(DataStreamProcessor.java:313)
    at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.access$000(DataStreamProcessor.java:50)
    at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor$1.onMessage(DataStreamProcessor.java:80)
    at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
    at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
    at org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
    at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: class org.apache.ignite.spi.IgniteSpiException: Failed to send message to remote node: TcpDiscoveryNode [id=797bf03b-3baf-4724-8eca-ccccec64605c, addrs=[0:0:0:0:0:0:0:1%lo, 10.0.42.1, 127.0.0.1, 192.168.140.52], sockAddrs=[01hw146471/192.168.140.52:47500, /10.0.42.1:47500, /0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500], discPort=47500, order=2, intOrder=2, lastExchangeTime=1478687030160, loc=false, ver=1.7.0#20160801-sha1:383273e3, isClient=false]
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1996)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1936)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1304)
    ... 13 more
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to connect to node (is node still alive?). Make sure that each ComputeTask and cache Transaction has a timeout set in order to prevent parties from waiting forever in case of network issues [nodeId=797bf03b-3baf-4724-8eca-ccccec64605c, addrs=[01hw146471/192.168.140.52:47100, /10.0.42.1:47100, /0:0:0:0:0:0:0:1%lo:47100, /127.0.0.1:47100]]
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2499)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2140)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2034)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1970)
    ... 15 more
    Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address: 01hw146471/192.168.140.52:47100
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2504)
        ... 18 more
    Caused by: class org.apache.ignite.IgniteCheckedException: Failed to read remote node recovery handshake (connection closed).
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.safeHandshake(TcpCommunicationSpi.java:2709)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2371)
        ... 18 more
    Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address: /10.0.42.1:47100
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2504)
        ... 18 more
    Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2363)
        ... 18 more
    Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address: /0:0:0:0:0:0:0:1%lo:47100
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2504)
        ... 18 more
    Caused by: class org.apache.ignite.IgniteCheckedException: Remote node ID is not as expected [expected=797bf03b-3baf-4724-8eca-ccccec64605c, rcvd=54ac75f7-7b87-4502-ba8c-1e3a82e87be3]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.safeHandshake(TcpCommunicationSpi.java:2614)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2371)
        ... 18 more
    Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address: /127.0.0.1:47100
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2504)
        ... 18 more
    Caused by: class org.apache.ignite.IgniteCheckedException: Remote node ID is not as expected [expected=797bf03b-3baf-4724-8eca-ccccec64605c, rcvd=54ac75f7-7b87-4502-ba8c-1e3a82e87be3]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.safeHandshake(TcpCommunicationSpi.java:2614)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2371)
        ... 18 more

[16:30:31] Topology snapshot [ver=7, servers=1, clients=0, CPUs=48, heap=50.0GB]

Answer

This message usually means that the destination node is already dead or is not responding to requests. Make sure that:

  • Both nodes have enough heap, do not run out of memory, and do not suffer from long GC pauses.
  • The network is stable, and both nodes can connect to each other in either direction.
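
Both checks can be roughed out with the plain JDK before touching any Ignite settings. The sketch below is only an illustration: `NodeHealthCheck`, `canConnect` and `totalGcTimeMillis` are hypothetical names and not part of Apache Ignite. It assumes a plain TCP connect to the communication port is a reasonable first test of reachability, and that the GC figure is read inside the JVM you want to inspect (it reports on the JVM it runs in, not on a remote node).

```java
import java.io.IOException;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper for quick diagnostics; not part of Apache Ignite. */
final class NodeHealthCheck {
    private NodeHealthCheck() {
    }

    /** Returns true if a plain TCP connection to host:port succeeds within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: all count as "not reachable".
            return false;
        }
    }

    /** Sums the accumulated collection time (ms) reported by this JVM's collectors. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 when the collector does not report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Accumulated GC time in this JVM: " + totalGcTimeMillis() + " ms");
        if (args.length == 2) {
            String host = args[0];
            int port = Integer.parseInt(args[1]);
            System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
        }
    }
}
```

On Java 11 or later this can be run from each server against the other, for example `java NodeHealthCheck.java 192.168.140.52 47100` for the communication port and `47500` for the discovery port shown in the log. A successful connect only shows that the port is open; it does not prove the Ignite handshake will succeed, as the "Remote node ID is not as expected" entries for the loopback addresses in the log illustrate. For GC pauses on the Ignite nodes themselves, the GC logs of those JVMs are the more direct source.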