The HTTP CONNECT tunnel
Aug 12, 2018

HTTPS is widely used on the Internet to secure data in transit. However, when a browser sends an HTTPS request through a proxy, the request line and headers that carry the target hostname and port number are encrypted inside the TLS session, so the proxy cannot read them. How, then, does the proxy know where to send the client's request? To solve this problem, the browser first sends the proxy a plaintext HTTP request with the CONNECT method, naming the target hostname and port number. On receiving the CONNECT request, the proxy opens a TCP connection to the requested hostname on the specified port and returns an HTTP 200 response to tell the browser that the connection was made. From then on, the proxy blindly forwards bytes in both directions between the client and the server, without inspecting them, until the tunnel is closed.
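To make the message format concrete, here is a minimal Python sketch of the two ends of that first exchange: the client building a CONNECT request, and the proxy extracting the target from the request line. The function names build_connect_request and parse_connect_target are hypothetical, chosen for illustration, and the sketch assumes a plain host:port target (no IPv6 literals).

```python
from typing import Tuple


def build_connect_request(host: str, port: int) -> bytes:
    """Return the bytes of a minimal CONNECT request for host:port."""
    authority = f"{host}:{port}"
    lines = [
        f"CONNECT {authority} HTTP/1.1",
        f"Host: {authority}",
        "",  # blank line that ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_connect_target(request_line: str) -> Tuple[str, int]:
    """Extract (host, port) from a line like 'CONNECT example.com:443 HTTP/1.1'."""
    method, target, _version = request_line.split()
    if method != "CONNECT":
        raise ValueError(f"not a CONNECT request: {method}")
    host, _, port = target.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"CONNECT target must be host:port, got {target!r}")
    return host, int(port)
```

Note that the CONNECT target is always in host:port form, not a full URL, because the proxy only needs to know where to open the TCP connection.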

When does a browser use CONNECT?

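CONNECT is intended only for use in requests to a proxy. A browser therefore sends it only when it is configured to use an HTTP proxy and needs a tunnel to the target server, as it does for an HTTPS URL.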

CONNECT tunnel workflow

We use Fiddler as the proxy server and a browser to visit https://www.microsoft.com/ to show what happens.

  1. The browser sends an HTTP CONNECT request to the proxy:

    CONNECT www.microsoft.com:443 HTTP/1.0
    User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
    Host: www.microsoft.com
    Content-Length: 0
    DNT: 1
    Connection: Keep-Alive
    Pragma: no-cache
    
  2. The proxy then returns HTTP/1.0 200 indicating the requested connection was established:

    HTTP/1.0 200 Connection Established
    FiddlerGateway: Direct
    StartTime: 11:56:22.008
    Connection: close
    EndTime: 11:56:22.538
    ClientToServerBytes: 1416
    ServerToClientBytes: 1358
    
  3. The browser starts the TLS handshake with the server and exchanges encrypted data through the tunnel. Fiddler, acting as a proxy, only forwards packets between the browser and the server and cannot see the HTTPS data being transferred (unless you have enabled the Decrypt HTTPS traffic option in Fiddler and installed the Fiddler root certificate). In a Wireshark capture, we can see that the browser starts the TLS handshake right after the HTTP/1.0 200 message in frame 12.
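The three steps above can be sketched from the proxy's side in a few lines of Python. This is a minimal illustration under stated assumptions, not how Fiddler is implemented: handle_client and relay are hypothetical names, the connect parameter is an assumption added so the dialing function can be swapped out, and error handling for malformed requests or failed connections is left out.

```python
import select
import socket


def relay(a: socket.socket, b: socket.socket) -> None:
    """Copy bytes in both directions until either side closes its end."""
    while True:
        readable, _, _ = select.select([a, b], [], [])
        for src in readable:
            data = src.recv(65536)
            if not data:  # EOF: one peer closed, so tear the tunnel down
                return
            (b if src is a else a).sendall(data)


def handle_client(client: socket.socket, connect=socket.create_connection) -> None:
    """Serve a single CONNECT request on an already accepted client socket."""
    with client:
        # Step 1: read the request head, which ends at the first blank line.
        head = b""
        while b"\r\n\r\n" not in head:
            chunk = client.recv(4096)
            if not chunk:
                return
            head += chunk
        head, _, early_data = head.partition(b"\r\n\r\n")
        request_line = head.split(b"\r\n", 1)[0].decode("ascii")
        method, target, _version = request_line.split()
        if method != "CONNECT":
            client.sendall(
                b"HTTP/1.1 405 Method Not Allowed\r\nContent-Length: 0\r\n\r\n"
            )
            return
        host, _, port = target.rpartition(":")

        # Step 2: open the TCP connection, then report success to the client.
        with connect((host, int(port)), timeout=10) as upstream:
            upstream.settimeout(None)  # the timeout applied to connecting only
            client.sendall(b"HTTP/1.1 200 Connection Established\r\n\r\n")
            if early_data:  # bytes the client sent right after its headers
                upstream.sendall(early_data)
            # Step 3: forward opaque bytes until one side hangs up.
            relay(client, upstream)
```

The proxy never parses anything after the 200 response. Whatever the client sends next, normally a TLS ClientHello, is passed through as opaque bytes.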

How to handle server error 5xx?

If the proxy encounters an error such as a DNS resolution failure or a connection timeout while connecting to the target server, how should it handle the CONNECT request? Because these errors prevent the proxy from establishing the TCP connection to the server, the proxy should return the corresponding error status code to the browser to reveal the reason for the failure.

  • 502: Bad Gateway
  • 504: Gateway Timeout
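As a rough sketch of how a proxy written in Python might choose between these two codes, the function below maps the exception raised while connecting to a response. The name error_response_for is hypothetical, and the mapping itself is an assumption made for illustration: timeouts become 504, and every other connection failure (DNS resolution failure, connection refused, and so on) becomes 502.

```python
import socket


def error_response_for(exc: OSError) -> bytes:
    """Build the response a proxy could send when the upstream connect fails."""
    if isinstance(exc, (socket.timeout, TimeoutError)):
        # The target did not answer within the proxy's connect timeout.
        status = "504 Gateway Timeout"
    else:
        # Assumed catch-all: socket.gaierror (DNS failure),
        # ConnectionRefusedError, unreachable network, and similar errors.
        status = "502 Bad Gateway"
    return (
        f"HTTP/1.1 {status}\r\n"
        "Content-Length: 0\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
```

A proxy would call this from the except OSError branch around its connect call and then close the client connection, since no tunnel was established.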

Can the proxy return 3xx redirection?

Some proxy servers return a 302 redirect to the browser after a connection failure, in order to send the user to a friendlier page that explains the exact reason for the failure. Although the HTTP/1.1 RFC does not explicitly forbid this, a 302 is not appropriate for an unreachable server. By definition, 302 means that the target resource resides temporarily under a different URI, but here the resource has not moved anywhere; the destination is simply unreachable. And if the requested resource really had moved, the browser should learn that from the target server over the encrypted connection, not from the proxy in plaintext.

Besides, most modern browsers do not follow a 302 redirect returned in response to a CONNECT request. We can use Fiddler's AutoResponder feature to return such a response and observe how browsers behave.

HTTP CONNECT request:

    CONNECT www.nonexistingwebsite.com:443 HTTP/1.1
    Host: www.nonexistingwebsite.com:443
    Proxy-Connection: keep-alive
    User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36

Fiddler HTTP Response:

    HTTP/1.1 302 Redirect
    FiddlerTemplate: True
    Date: Fri, 25 Jan 2013 16:49:29 GMT
    Location: http://www.fiddler2.com/sandbox/FormAndCookie.asp
    Content-Length: 0

Chrome, Firefox, and IE all behaved the same way: each showed its built-in error page and did not follow the Location given in the 302 Redirect response.
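The observed behavior matches a simple client-side rule: only a 2xx response to CONNECT means the tunnel is open, and any other status is treated as a failure regardless of its headers. The Python sketch below illustrates that rule; tunnel_established is a hypothetical name, and this is not how any particular browser is implemented.

```python
def tunnel_established(response_head: bytes) -> bool:
    """Return True only if the proxy answered the CONNECT with a 2xx status."""
    status_line = response_head.split(b"\r\n", 1)[0].decode("iso-8859-1")
    parts = status_line.split(None, 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/") or not parts[1].isdigit():
        raise ValueError(f"malformed status line: {status_line!r}")
    # Any other status, including a 3xx with a Location header, means no
    # tunnel exists; the Location header is deliberately ignored.
    return 200 <= int(parts[1]) < 300
```

Fed the 302 response from Fiddler shown above, this function returns False, so a client built this way would report a connection error instead of requesting the Location URL.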

References