I have a cluster of Swarm nodes:
- N10 and N11, hosted by provider P1 (some datacenter)
- N20 and N21, hosted by provider P2 (some other datacenter)
Traefik is deployed on N10:
- if my service is deployed on N11: Traefik routes the calls correctly.
- if my service is deployed on N20: Traefik returns a 504 Gateway Timeout.
Within my service:
- if my app is deployed on N11 and my database on N12, everything works.
- if my app is deployed on N11 and my database on N20, the connection to the database fails.
Clearly, there’s a connectivity problem between N10/N11 and N20, and the same errors occur when I try with a fresh, clean server N21 from provider P2. What I have checked so far:
- nodes from P2 have no firewall (test servers)
- nodes from P1 have ports 2377, 4789 and 7946 (both UDP and TCP) open to N20/N21
- nodes from P2 join the swarm joyfully
- I can deploy stacks to nodes from P2 without errors
- Swarmpit, a swarm manager, deployed on N11 can access info and logs from services deployed on N20/N21
- from within the Traefik container on N10, connections to services on N20/N21 fail (timeout), but connections from N10 itself (Traefik’s host) to these services (via the N20/N21 direct IP) succeed.
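To make the port check above reproducible, here is a minimal Python sketch that can be run from one node against another. It only covers the TCP side (2377 and 7946). The function names and the timeout value are my own choices for illustration, not part of any Docker tooling. Note that 2377 only listens on manager nodes, so a refusal on a worker is expected.

```python
import socket

# Ports Docker Swarm uses between nodes:
#   2377/tcp           cluster management (manager nodes only)
#   7946/tcp and /udp  node-to-node gossip
#   4789/udp           VXLAN overlay network data traffic
# Only the TCP ports can be probed with a plain connect().
SWARM_TCP_PORTS = (2377, 7946)


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def swarm_tcp_report(host: str, timeout: float = 3.0) -> dict:
    """Map each Swarm TCP port to whether it is reachable on host."""
    return {port: tcp_port_open(host, port, timeout) for port in SWARM_TCP_PORTS}
```

Calling `swarm_tcp_report()` with the IP of N20 from N10, and then with the IP of N10 from N20, shows whether the TCP ports are reachable in both directions. It says nothing about 4789/udp, which carries the overlay (container-to-container) traffic: UDP is connectionless, so a simple connect test cannot confirm that it gets through.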
I’m quite puzzled, and I still have a lot to learn about networking.
Any help or clue would be appreciated.