One of the mysteries for me in IP networks was the Path MTU Discovery (PMTUD) process. I’ve seldom seen any problems with the MTU at all. Fortunately, while troubleshooting some router issues, I captured several ICMP “packet too big” errors along with the original packets. 👍🏻
Let’s have a look at those PMTUD processes for IPv6 and legacy IP with Wireshark. Of course, these captured connections are part of the Ultimate PCAP as well, so you can download the most current version of it and analyze it yourself.
Path MTU Discovery is a technique used in computer networking to determine the maximum transmission unit (MTU) size on the network path between two IP hosts. The MTU is the largest size of a packet that can be transmitted without needing fragmentation.
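The core idea can be sketched in a few lines of Python. This is a hypothetical simulation, not a real implementation: the `link_mtus` list and the function name are made up for illustration, and it only models the clamping behaviour described above (each ICMP error reports a smaller MTU, and the sender lowers its packet size accordingly):

```python
def discover_path_mtu(local_mtu: int, link_mtus: list[int]) -> int:
    """Simulate the MTU a sender converges on via PMTUD.

    link_mtus models the MTUs of the links along the path, in order.
    Whenever a packet exceeds a link's MTU, the router terminating that
    link answers with an ICMP "Packet Too Big" / "Fragmentation Needed"
    error carrying the link's MTU, and the sender clamps to that value.
    """
    pmtu = local_mtu
    for link_mtu in link_mtus:
        if pmtu > link_mtu:
            # Router replies: "next link only supports link_mtu bytes"
            pmtu = link_mtu
    return pmtu
```

With the values from the IPv6 example below, `discover_path_mtu(1500, [1500, 1492, 1480])` converges on 1480, matching the two ICMPv6 errors the client received.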
First example: an IPv6 client is trying to reach an IPv6 server. Please refer to the screenshot along with the numbered red dots (x). Within the Ultimate PCAP, find those packets with the following display filter: ipv6.addr eq 2003:c6:af35:ed00:4c3e:a836:be17:2f13 and ipv6.addr eq 2001:470:1f0b:16b0:6986:b8d4:3649:9cbe.
- Right after the TCP 3-way handshake, the first packet with data is 1971 bytes long (1), which is too long for one link on the path.
- The router that terminates this link replies with an ICMPv6 “Packet Too Big” message, type eq 2, stating that the MTU must not exceed 1492 bytes (2). Side note: this was a PPPoE port. Please note the “MTU” column in Wireshark I’m using for this screenshot (3), which is a custom column with the field value set to “icmpv6.mtu or icmp.mtu”. Don’t confuse it with the “Length” column.
- The client honours the new maximum MTU size and fragments the initial packet into smaller ones, now with an overall length of 1506 bytes (4). Note that the length displayed in Wireshark is the whole frame, that is, the IP payload (= MTU) plus the Ethernet overhead of 18 bytes. The new MTU of 1492 + 18 bytes of overhead = 1510 bytes max, and the 1506 bytes sent fit within that. ✅
- However, a router at another link along the path replied with a “Packet Too Big” as well, stating that the MTU must not exceed 1480 bytes (5).
- Hence, the client sent the same payload a third time (6), fragmented smaller still: 1494 bytes, which is below 1480 + 18 = 1498 bytes. ✅
- Note the packet details of the ICMPv6 message, packet no. 5, in the lower Wireshark pane. The error not only states the required MTU size but also embeds as much of the original packet as possible (7).
- Also note that both ICMPv6 errors were sent from formerly unknown routers along the path, that is: from IPv6 addresses *other* than the communication participants!
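The frame-length arithmetic from the list above can be double-checked in a few lines. The 18-byte figure is the one used in this post (14-byte Ethernet II header plus 4-byte FCS); the helper name is made up for illustration:

```python
# Ethernet II header (14 bytes) + frame check sequence (4 bytes)
ETHERNET_OVERHEAD = 14 + 4

def max_frame_len(path_mtu: int) -> int:
    """Largest on-the-wire Ethernet frame for a given IP MTU."""
    return path_mtu + ETHERNET_OVERHEAD

# Values from the IPv6 example above:
assert max_frame_len(1492) == 1510  # the sent 1506-byte frame fits
assert max_frame_len(1480) == 1498  # the sent 1494-byte frame fits
```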
The second example shows a legacy IPv4 connection. Use the following display filter to find those packets within the Ultimate PCAP: ip.addr eq 192.168.1.111 and ip.addr eq 100.25.225.34. Normally, every IPv4 hop along the path can fragment packets if the next link has a smaller MTU size. This is in contrast to IPv6, where only end nodes are allowed to fragment! But in this example, the “Don’t fragment” bit was set by the source node:
The same as in the first example applies:
- The first packet has a frame length of 1967 bytes, which is too big to traverse the whole path (1).
- Since this packet has the “Don’t fragment” bit set, the router whose next link has a lower MTU can neither forward nor fragment it, hence it replies with a “Destination unreachable, Fragmentation needed” ICMP error (2), type eq 3, code eq 4. This ICMP error lists the max MTU for the next link as 1492 bytes (3).
- The next packet from the source has a lower overall frame length of 1506 bytes (4). Again, subtracting the 18-byte Ethernet overhead gives an IP packet of 1488 bytes, which is below the required 1492. ✅
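The wire format of that ICMP error is small enough to build by hand. RFC 792 defines the Destination Unreachable message, and RFC 1191 repurposes its former “unused” field to carry the next-hop MTU. A minimal sketch in pure Python (the dummy 28-byte `original` packet below is a placeholder, not taken from the capture):

```python
import struct

def icmp_checksum(data: bytes) -> int:
    """Standard Internet (ones'-complement) checksum over 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_frag_needed(next_hop_mtu: int, original: bytes) -> bytes:
    """ICMP "Destination Unreachable, Fragmentation Needed" (type 3, code 4).

    Layout per RFC 792/1191: type (1), code (1), checksum (2),
    unused (2), next-hop MTU (2), then the IP header + first 8 payload
    bytes of the dropped packet.
    """
    header = struct.pack("!BBHHH", 3, 4, 0, 0, next_hop_mtu)
    csum = icmp_checksum(header + original)
    return struct.pack("!BBHHH", 3, 4, csum, 0, next_hop_mtu) + original

# Placeholder "original packet": a 28-byte stub starting like an IPv4 header.
msg = build_frag_needed(1492, b"\x45\x00" + b"\x00" * 26)
icmp_type, code, _, _, mtu = struct.unpack("!BBHHH", msg[:8])
assert (icmp_type, code, mtu) == (3, 4, 1492)
```

The `mtu` field extracted here is exactly what Wireshark exposes as `icmp.mtu`, i.e. what the custom “MTU” column in the screenshots displays.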
Soli Deo Gloria!
Photo by Ugne Vasyliute on Unsplash.
Thanks a lot. Very interesting read!
Hi Johannes,
I was looking into exactly this just recently. :-) I found the following link on the topic: https://networkengineering.stackexchange.com/questions/13417/exactly-when-is-pmtud-performed-path-mtu-discovery
I found this statement interesting: “Note that in practice, PMTUD is a continuous process; packets are sent with the DF bit set – including the 3-way handshake packets – you can think of it as a connection property (although an implementation may be willing to accept a certain degree of fragmentation at some point and stop sending packets with the DF bit set). Thus, PMTUD is just a consequence of the fact that everything on that connection is being sent with DF.”
To understand this in more detail, I actually went and read RFC 1191. I find the communication between the IP layer and the “packetization protocols” (a.k.a. TCP and friends) really interesting. It *must* be a continuous process, since the path MTU can change at any time during an ongoing connection (re-routing, …) – and in both directions (increase/decrease).
You can also determine the PMTU manually, or even display the per-destination PMTU value maintained by the IP layer.
Windows
– Determine: Test-Connection [-TargetName] -MTUSize OR ping -f -l
– Display: netsh interface ipv4 show destinationcache
Linux
– Determine: tracepath OR ping -M do -s
– Display: ip route get
Pro tip: display -> determine -> display should reveal your PMTU cache filling up.
Regards,
Lukas :-)
Yeah, wow. Thanks for this follow-up information!