{ Josh Rendek }

<3 Go & Kubernetes

You’ve raised your file descriptor limits, updated security limits, tweaked your network settings and done everything else in preperation to launch your shiny new dockerized application.

Then you have performance issues and you can’t understand why, it looks to be network related. Alright! Let’s see what’s going on:

1ping google.com
2unknown host google.com

Maybe its DNS related…. Let’s try again:

1ping 8.8.8.8
2ping: sendmsg: Operation not permitted

That’s odd, maybe it’s a networking issue outside of our servers. Lets try pinging another host on the subnet:

1ping 10.10.0.50
2ping: sendmsg: Operation not permitted

That’s even more odd, our other host isn’t having network issues at all. Lets try going the other way:

1ping 10.10.0.49 # the bad host
2# Lots of packet loss

We’re getting a lot of packet loss going from Host B to Host A (the problem machine). Maybe it’s a bad NIC?

Just for fun I decided to try and ping localhost/127.0.0.1:

1ping 127.0.0.1
2ping: sendmsg: Operation not permitted

That’s a new one. What the heck is going on? Now at this point I derped out and didn’t think to check dmesg. Lets assume you went down the road I went and derped.

What’s the different between host A and B? Well, host B doesn’t have docker installed!

 1apt-get remove docker-engine; reboot
 2
 3# .... wait for reboot
 4
 5ping 127.0.0.1
 6# working
 7ping 8.8.8.8
 8# working
 9ping google.com
10# working
1apt-get install docker-engine
2ping 127.0.0.1
3ping: sendmsg: Operation not permitted
4
5ping 8.8.8.8
6ping: sendmsg: Operation not permitted

Okay so it happens when docker is installed. We’ve isolated it. Kernel bug maybe? Queue swapping around kernels and still the same issue happens.

Fun side note: Ubuntu 14.04 has a kernel bug that prevents booting into LVM or software raided grub. Launchpad Bug

Switching back to the normal kernel (3.13) that comes with 14.04, we proceed. Docker bug? Hit up #docker on Freenode. Someone mentions checking dmesg and conntrack information.

Lo-and-behold, dmesg has tons of these:

1ip_conntrack: table full, dropping packet
2# x1000

How does docker networking work? NAT! That mean’s iptables needs to keep track of all your connections, hence the full message.

If you google the original message you’ll see a lot of people telling you to check your iptables rules and ACCEPT/INPUT chains to make sure there isn’t anything funky in there. If we combine this knowledge + the dmesg errors, we now know what to fix.

Lets update sysctl.conf and reboot for good measure ( you could also apply them with sysctl -p but I wanted to make sure everything was fresh. )

1net.ipv4.netfilter.ip_conntrack_tcp_timeout_established = 54000
2net.netfilter.nf_conntrack_generic_timeout = 120
3net.netfilter.nf_conntrack_max = 556000

Adjust the conntrack max until you hit a stable count (556k worked well for me) and don’t get anymore connection errors. Start your shiny new docker application that makes tons of network connections and everything should be good now.

Hope this helps someone in the future, as Google really didn’t have a lot of useful information on this message + Docker.

comments powered by Disqus