I am currently working with a few HTTP proxy installations. As always when troubleshooting network issues, I look at the traffic in Wireshark and try to understand the different packets.
Here is a short overview of the differences between HTTP requests that are sent directly to the destination and HTTP requests that are sent via a proxy. Wireshark screenshots and a downloadable pcap round things off.
Proxy Traffic vs. Normal Traffic
Following is the main figure for this article. It shows the two different packet types:
- Direct HTTP requests: The destination IP is the HTTP server, and the requested URI shows only the path behind the domain. The request is preceded by a DNS query from the client to its configured recursive DNS server.
- HTTP proxy requests: The first packet is sent to the proxy. The requested URI shows the complete URL (host + path). The second packet is sent from the proxy to the final destination. Since it is a “real” proxy, each of the two packets belongs to its own TCP connection, with different source addresses as well. Only the proxy queries the DNS server. No DNS query is sent from the client itself.
In both scenarios, the “Host” value in the HTTP request is set to the requested domain. In the case of a proxy, the HTTP X-Forwarded-For header with the client IP address might be inserted.
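To make the difference in the request line concrete, here is a minimal Python sketch that builds both variants as raw bytes. The function name `build_get_request` and the `Connection: close` header are my own assumptions for illustration; a real browser sends many more headers.

```python
from urllib.parse import urlsplit


def build_get_request(url: str, via_proxy: bool) -> bytes:
    """Return the raw bytes of a minimal HTTP/1.1 GET request for url.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # Direct request: only the path. Proxy request: the complete URL.
    target = url if via_proxy else path
    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {parts.netloc}",  # identical in both scenarios
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


direct = build_get_request("http://ip.webernetz.net/", via_proxy=False)
proxied = build_get_request("http://ip.webernetz.net/", via_proxy=True)
print(direct.decode().splitlines()[0])   # GET / HTTP/1.1
print(proxied.decode().splitlines()[0])  # GET http://ip.webernetz.net/ HTTP/1.1
```

The direct variant would be sent to the web server's IP address, the proxied variant to the proxy's IP address and port. The bytes of the `Host` line are the same in both cases.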
Note that the arrows in the figure show only the first HTTP packet flow, though it is a bi-directional communication in which the returning packets have the inverse order of source/destination IPs and ports.
Wireshark HTTP vs. DIRECT
As an example, I opened my What-is-my-IP script at http://ip.webernetz.net two times: The first one without a proxy and the second one with a proxy on port 3128. (Note that the proxy server can run on different ports, e.g., 80, 8080, 3128.) The proxy IP came from a free proxy list (link below).
Note that you won’t see the X-Forwarded-For header here, because I captured at the client and not between the proxy and the webserver.
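As a rough idea of what a proxy might do on its outgoing side, here is a small Python sketch that adds the client address to the X-Forwarded-For header, appending it to an existing value if one is already present. The function name `add_x_forwarded_for` and the addresses (taken from the documentation ranges) are assumptions for illustration; whether a given proxy inserts the header at all depends on its configuration.

```python
def add_x_forwarded_for(headers: dict[str, str], client_ip: str) -> dict[str, str]:
    """Return a copy of headers with client_ip added to X-Forwarded-For.

    Hypothetical helper; not taken from any particular proxy implementation.
    """
    result = dict(headers)
    # Header names are case-insensitive, so look for any spelling of the key.
    existing_key = next(
        (key for key in result if key.lower() == "x-forwarded-for"), None
    )
    if existing_key is None:
        result["X-Forwarded-For"] = client_ip
    else:
        # An earlier proxy already added an address: append to the list.
        result[existing_key] = f"{result[existing_key]}, {client_ip}"
    return result


# 192.0.2.10 is a placeholder client address, not one from the capture.
outgoing = add_x_forwarded_for({"Host": "ip.webernetz.net"}, "192.0.2.10")
print(outgoing)
```

Such a header would only be visible in a capture taken between the proxy and the webserver.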
Wireshark HTTPS
[I initially wrote this post in 2014, when HTTPS was not the default on the Internet.] As everything is HTTPS nowadays, let’s have a look at an HTTPS connection through a proxy. It uses the “CONNECT” method, and the URI lists only the FQDN/host of the website (plus the port), not the full path.
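In text form, such a CONNECT request can be sketched in a few lines of Python. The function name `build_connect_request` is hypothetical and the default port 443 is an assumption; a real client usually adds further headers such as a User-Agent.

```python
def build_connect_request(host: str, port: int = 443) -> bytes:
    """Return the raw bytes of a minimal HTTP/1.1 CONNECT request.

    Hypothetical helper for illustration only.
    """
    authority = f"{host}:{port}"
    lines = [
        # Only host and port appear here, neither a scheme nor a path.
        f"CONNECT {authority} HTTP/1.1",
        f"Host: {authority}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


connect = build_connect_request("ip.webernetz.net")
print(connect.decode().splitlines()[0])  # CONNECT ip.webernetz.net:443 HTTP/1.1
```

Once the proxy has answered with a success status, the client starts the TLS handshake through the same TCP connection, so the actual path of the HTTPS request is never visible to the proxy in clear text.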
PCAP Download
As always, here’s the appropriate pcap for download, 7zipped, 184 KB:
Links
- Wikipedia: Proxy Server
- Free Proxy Lists
- Small What-is-my-IP script
- Firefox Add-on X-Forwarded-For Header
Featured image “Packets of spice” by Saaleha Bamjee is licensed under CC BY 2.0.
Great article, just the answer I was looking for :-)
by the way, Perfect!!! haha
Many thanks! It’s what I was looking for, quick and to the point.
quick and to the point. Thanks!
Why is the CONNECT method not used?
CONNECT is for SSL traffic; you can see in the capture that the request contains http, not https, hence GET will be used. If the request had been for https://ip.webernetz.net, you’d see the CONNECT request. You can play with the Fiddler application and experience this if you are interested.
Hey guys. Thanks for the hint. I just captured an HTTPS connection, updated the post with a Wireshark screenshot showing the CONNECT method, and uploaded an enhanced PCAP. ;)
Still helpful in 2017. Thanks!
Thank you. I could never find the answer on Chinese forums. Thanks for your help.
Just what I was looking for.
Good article. It would be nice to see the equivalent for transparent proxy situations, where the client is unaware of an upstream proxy, especially in light of TLS 1.3 use cases, where you have to be careful because 1.3 does not allow selective decryption at the proxy level.
Thx
Rik
Thanks, very well explained and easy.
Thanks for that, Giorgio. Almost 10 years after this post was published. ;) It’s always funny for me to see which posts are still relevant and which aren’t.
Interesting to see! In particular, the exact way the HTTPS proxy works was not quite clear to me before.
Lukas