HTTPS is widely used on the Internet to secure data in transit. When a browser sends an HTTPS request through a proxy, however, the request line and headers, including the target hostname, are encrypted, so the proxy cannot read them. How, then, does the proxy know where to send the client's request?
To solve this problem, the browser first sends the proxy a plaintext HTTP request with the CONNECT method, naming the target hostname and port number. On receiving the CONNECT request, the proxy opens a TCP connection to that hostname on the specified port and returns an HTTP 200 response to tell the browser that the requested connection was made. From then on, the proxy should blindly forward data in both directions between the client and the server, without inspecting it, until the tunnel is closed.
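The proxy side of this exchange can be sketched in a few lines of Python. This is only an illustrative sketch, not how Fiddler or any particular proxy implements it: the names `parse_connect_target`, `pipe`, and `handle_connect` are made up for this example, the 10-second connect timeout is arbitrary, and error handling is left out.

```python
import socket
import threading


def parse_connect_target(request_line: str) -> tuple[str, int]:
    """Split 'CONNECT host:port HTTP/1.x' into (host, port)."""
    method, target, _version = request_line.strip().split(" ")
    if method.upper() != "CONNECT":
        raise ValueError(f"not a CONNECT request: {method}")
    host, sep, port = target.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"CONNECT target must be host:port, got {target!r}")
    # Strip the brackets that surround IPv6 literals such as [::1].
    return host.strip("[]"), int(port)


def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes one way until src closes; the bytes are never inspected."""
    try:
        while chunk := src.recv(65536):
            dst.sendall(chunk)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # signal end-of-stream to the peer
        except OSError:
            pass


def handle_connect(client: socket.socket, request_line: str) -> None:
    """Serve one CONNECT request whose headers were already read from client."""
    host, port = parse_connect_target(request_line)
    upstream = socket.create_connection((host, port), timeout=10)
    upstream.settimeout(None)  # the timeout applied to connecting only
    client.sendall(b"HTTP/1.1 200 Connection Established\r\n\r\n")
    # Relay both directions: one in a helper thread, one in this thread.
    back = threading.Thread(target=pipe, args=(upstream, client), daemon=True)
    back.start()
    pipe(client, upstream)
    back.join()
    upstream.close()
```

For the request line `CONNECT www.microsoft.com:443 HTTP/1.0`, `parse_connect_target` yields the host `www.microsoft.com` and the port `443`, which is everything the proxy needs to open the tunnel.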
When does a browser use CONNECT?
CONNECT is intended only for use in requests to a proxy.
CONNECT tunnel workflow
We use Fiddler as the proxy server and a browser to visit https://www.microsoft.com/ to show what happens.
The browser sends an HTTP CONNECT request to the proxy:
    CONNECT www.microsoft.com:443 HTTP/1.0
    User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
    Host: www.microsoft.com
    Content-Length: 0
    DNT: 1
    Connection: Keep-Alive
    Pragma: no-cache
The proxy then returns HTTP/1.0 200 indicating the requested connection was established:
    HTTP/1.0 200 Connection Established
    FiddlerGateway: Direct
    StartTime: 11:56:22.008
    Connection: close
    EndTime: 11:56:22.538
    ClientToServerBytes: 1416
    ServerToClientBytes: 1358
The browser then starts the TLS handshake with the server and exchanges encrypted data. Fiddler, acting as a proxy, only forwards packets between the browser and the server and cannot see the HTTPS data being transferred (unless you enable the Decrypt HTTPS traffic option in Fiddler and install the Fiddler root certificate). In a Wireshark capture, we can see that the browser started the TLS handshake right after the HTTP/1.0 200 message in frame 12.
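The same workflow can be reproduced from the client side with Python's standard library. The sketch below is illustrative only: the helper names (`build_connect_request`, `parse_status_code`, `open_tunnel`) are invented for this example, and the proxy address in the usage note assumes Fiddler is listening on its default port 8888 on the local machine.

```python
import socket
import ssl


def build_connect_request(host: str, port: int) -> bytes:
    """Build the plaintext CONNECT request that is sent to the proxy."""
    target = f"{host}:{port}"
    return (f"CONNECT {target} HTTP/1.1\r\n"
            f"Host: {target}\r\n"
            "\r\n").encode("ascii")


def parse_status_code(response_head: bytes) -> int:
    """Extract the numeric status code from the proxy's response head."""
    status_line = response_head.split(b"\r\n", 1)[0].decode("latin-1")
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"malformed status line: {status_line!r}")
    return int(parts[1])


def open_tunnel(proxy_host: str, proxy_port: int,
                host: str, port: int = 443) -> ssl.SSLSocket:
    """Ask the proxy for a tunnel, then start TLS with the real server."""
    sock = socket.create_connection((proxy_host, proxy_port), timeout=10)
    sock.sendall(build_connect_request(host, port))
    # Read until the blank line that ends the proxy's response headers.
    # The server sends nothing before our TLS ClientHello, so reading in
    # chunks does not swallow any tunnel data here.
    head = b""
    while b"\r\n\r\n" not in head:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("proxy closed the connection during CONNECT")
        head += chunk
    status = parse_status_code(head)
    if not 200 <= status < 300:
        sock.close()
        raise ConnectionError(f"proxy refused CONNECT with status {status}")
    # From here on the proxy only relays bytes; TLS is end to end.
    context = ssl.create_default_context()
    return context.wrap_socket(sock, server_hostname=host)
```

With Fiddler running locally, `open_tunnel("127.0.0.1", 8888, "www.microsoft.com")` would return a TLS socket to the server. If HTTPS decryption is enabled in Fiddler, certificate verification fails unless the Fiddler root certificate is trusted.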
How to handle server error 5xx?
If the proxy server runs into an error while connecting to the target server, such as a DNS resolution failure or a connection timeout, how should it handle the CONNECT request? Because these errors prevent the proxy from establishing the TCP connection to the server, the proxy should return the corresponding error status code to the browser to reveal the reason for the failure.
- 502: Bad Gateway
- 504: Gateway Timeout
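One way a proxy might pick between these two codes is sketched below. The mapping is an assumption made for illustration (timeouts become 504, every other connection failure, including a DNS resolution failure, becomes 502), and the function names are hypothetical.

```python
import socket


def status_for_connect_error(exc: OSError) -> tuple[int, str]:
    """Map a failed upstream connection attempt to an HTTP status."""
    # socket.timeout is an alias of TimeoutError since Python 3.10;
    # checking both keeps the sketch working on older versions.
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return 504, "Gateway Timeout"
    # DNS failures (socket.gaierror), refused connections, unreachable
    # networks and similar errors all end up here.
    return 502, "Bad Gateway"


def error_response(exc: OSError) -> bytes:
    """Build the response the proxy sends instead of 200."""
    code, reason = status_for_connect_error(exc)
    return (f"HTTP/1.1 {code} {reason}\r\n"
            "Content-Length: 0\r\n"
            "Connection: close\r\n"
            "\r\n").encode("ascii")
```

A proxy would call `error_response(exc)` from the `except OSError as exc:` branch around its connection attempt and then close the client connection.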
Can the proxy return 3xx redirection?
Some proxy servers return a 302 redirect to the browser after a connection failure, in order to send the user to a friendlier page that explains the exact failure reason. Although the HTTP/1.1 RFC does not explicitly forbid this, a 302 is not appropriate for an unreachable server. By definition, 302 means that the target resource resides temporarily under a different URI; here the resource has not moved anywhere, the destination is simply unreachable. And if the requested resource had indeed moved somewhere else, the browser should learn that only from the target server, over the encrypted channel, not from the proxy in plaintext.
Besides, most modern browsers do not follow a 302 redirect returned for a CONNECT request. We can use the AutoResponder feature of Fiddler to simulate such a proxy response and observe how browsers behave.
HTTP CONNECT request:
    CONNECT www.nonexistingwebsite.com:443 HTTP/1.1
    Host: www.nonexistingwebsite.com:443
    Proxy-Connection: keep-alive
    User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36
Fiddler HTTP Response:
    HTTP/1.1 302 Redirect
    FiddlerTemplate: True
    Date: Fri, 25 Jan 2013 16:49:29 GMT
    Location: http://www.fiddler2.com/sandbox/FormAndCookie.asp
    Content-Length: 0
Chrome, Firefox, and IE all behaved the same way: each showed a built-in error page and did not follow the Location specified in the 302 Redirect response.
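The behavior observed in the browsers can be modeled as a simple rule: only a 2xx response to CONNECT establishes a tunnel, and a Location header on any other response is never followed. The following sketch illustrates that rule only. It is not taken from any browser's source code, and `ConnectResult` and `evaluate_connect_response` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConnectResult:
    status: int
    tunnel_established: bool
    ignored_location: Optional[str]  # recorded for diagnostics, never followed


def evaluate_connect_response(head: bytes) -> ConnectResult:
    """Decide what a client does with a proxy's reply to CONNECT."""
    lines = head.decode("latin-1").split("\r\n")
    status = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        if not line:  # a blank line ends the header section
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    established = 200 <= status < 300
    # The redirect arrived in plaintext from the proxy, not from the
    # HTTPS origin, so the client reports a failure instead of following it.
    location = None if established else headers.get("location")
    return ConnectResult(status, established, location)
```

For the 302 response shown above, the result has `tunnel_established` set to `False`, and the Location value is only carried along for diagnostics.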
References
- Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content
- Understanding CONNECT Tunnels
- Bug 479880 (CVE-2009-1836) Non-200 responses to proxy CONNECT requests lead to attacks on HTTPS