My Journey Building a WebSocket Server in Go

Feb 24, 2024

A favorite YouTuber of mine once said that the best way to learn a new programming language is to build projects with it. Inspired by this video, I decided to dive in and create my own WebSocket server using Go.

(Plenty of articles already explain what the WebSocket protocol is, so I won’t repeat that here and will leave it to them.)

Read RFC6455

Before starting, my knowledge of the WebSocket protocol was pretty limited:

It’s built on top of HTTP.
Communication starts with a client sending an HTTP “Upgrade” request to switch to WebSocket.
The server responds, and boom! A WebSocket connection is established.

Not knowing much else, I decided to consult the source: RFC 6455. This is where I learned that WebSocket really only uses HTTP for the initial handshake. After that, the client and server exchange data frames, in the format defined by the RFC, directly over the underlying TCP connection.
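To give a rough idea of what those data frames look like on the wire, here is a minimal sketch that decodes only the frame header described in RFC 6455, section 5.2. The names frameHeader and parseFrameHeader are my own for illustration, and the sample bytes are the masked "Hello" text frame given as an example in the RFC.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// frameHeader holds the header fields of one WebSocket frame (RFC 6455, section 5.2).
// The type and its field names are illustrative, not taken from any library.
type frameHeader struct {
	Fin        bool    // true if this is the final fragment of a message
	Opcode     byte    // 0x1 = text, 0x2 = binary, 0x8 = close, ...
	Masked     bool    // frames sent by a client are masked
	PayloadLen uint64  // length of the payload in bytes
	MaskKey    [4]byte // only meaningful when Masked is true
	HeaderLen  int     // number of bytes the header occupies
}

var errShortHeader = errors.New("not enough bytes for a complete frame header")

// parseFrameHeader decodes the header at the start of b.
func parseFrameHeader(b []byte) (frameHeader, error) {
	var h frameHeader
	if len(b) < 2 {
		return h, errShortHeader
	}
	h.Fin = b[0]&0x80 != 0
	h.Opcode = b[0] & 0x0F
	h.Masked = b[1]&0x80 != 0
	h.PayloadLen = uint64(b[1] & 0x7F)
	n := 2
	switch h.PayloadLen {
	case 126: // the real length is in the next 2 bytes
		if len(b) < n+2 {
			return h, errShortHeader
		}
		h.PayloadLen = uint64(binary.BigEndian.Uint16(b[n:]))
		n += 2
	case 127: // the real length is in the next 8 bytes
		if len(b) < n+8 {
			return h, errShortHeader
		}
		h.PayloadLen = binary.BigEndian.Uint64(b[n:])
		n += 8
	}
	if h.Masked {
		if len(b) < n+4 {
			return h, errShortHeader
		}
		copy(h.MaskKey[:], b[n:n+4])
		n += 4
	}
	h.HeaderLen = n
	return h, nil
}

func main() {
	// Single-frame masked text message from RFC 6455, section 5.7.
	frame := []byte{0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58}
	h, err := parseFrameHeader(frame)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// Unmask the payload by XOR-ing each byte with the masking key.
	payload := frame[h.HeaderLen:]
	for i := range payload {
		payload[i] ^= h.MaskKey[i%4]
	}
	fmt.Printf("fin=%v opcode=%d masked=%v len=%d payload=%q\n",
		h.Fin, h.Opcode, h.Masked, h.PayloadLen, payload)
	// prints: fin=true opcode=1 masked=true len=5 payload="Hello"
}
```

This only covers the header. A real server also has to deal with fragmentation, control frames, and payloads that arrive across several reads.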

The Challenge: Mixing HTTP and TCP

This is where things got tricky. Normally in Go, an HTTP handler only sees http.Request and http.ResponseWriter, which give no direct access to the underlying TCP connection. So, how do we combine HTTP and raw TCP communication in our server?

One straightforward approach is running a separate TCP server:

separate WebSocket server

// Here, conn is the TCP connection (net.Conn).
// ...
const delimiter = "\r\n" // CR LF
buf := make([]byte, 2048)
// Note: this assumes the whole handshake request arrives in a single Read.
n, err := conn.Read(buf)
if err != nil {
    log.Println("error reading payload to buffer:", err)
    return
}
payload := strings.Split(string(buf[:n]), delimiter)
reqLine := strings.Split(payload[0], " ")
// validate the HTTP method and version in reqLine
headers := getHTTPHeaders(payload[1:])
// check the Host, Upgrade, Connection, and Sec-WebSocket-Version headers
// derive the Sec-WebSocket-Accept value from the client's key
key := hash(headers["Sec-WebSocket-Key"])
// respond
if _, err := conn.Write([]byte("HTTP/1.1 101 Switching Protocols\r\nUpgrade: websocket\r\n...")); err != nil {
    log.Println("error writing handshake response:", err)
    return
}
fmt.Println("=== handshake done! ===")

// From now on, we can send and receive WebSocket data frames between server and client.
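The hash step in the snippet above is the one part of the handshake that RFC 6455 spells out exactly: append a fixed GUID to the client's Sec-WebSocket-Key, take the SHA-1 of the result, and base64-encode it. Here is a minimal sketch of that step and of the response it goes into. The function names acceptKey and handshakeResponse are my own placeholders, not the helpers used in the snippet above, and the response carries only the required headers.

```go
package main

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
)

// websocketGUID is the fixed GUID defined in RFC 6455, section 1.3.
const websocketGUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

// acceptKey derives the Sec-WebSocket-Accept value from the client's
// Sec-WebSocket-Key header value.
func acceptKey(clientKey string) string {
	sum := sha1.Sum([]byte(clientKey + websocketGUID))
	return base64.StdEncoding.EncodeToString(sum[:])
}

// handshakeResponse builds a minimal "101 Switching Protocols" response.
// (Hypothetical helper: subprotocols and extensions are not handled.)
func handshakeResponse(clientKey string) string {
	return "HTTP/1.1 101 Switching Protocols\r\n" +
		"Upgrade: websocket\r\n" +
		"Connection: Upgrade\r\n" +
		"Sec-WebSocket-Accept: " + acceptKey(clientKey) + "\r\n\r\n"
}

func main() {
	// Sample key from RFC 6455, section 1.3.
	fmt.Print(handshakeResponse("dGhlIHNhbXBsZSBub25jZQ=="))
}
```

With the sample key from the RFC, the accept value comes out as s3pPLMBiTxaQ9kYGzzhZRbK+xOo=, which makes for a handy sanity check.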

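This is great for learning, as I got to manually parse HTTP requests and construct HTTP responses over TCP. However, it’s not the most practical solution: it means running a second server next to the HTTP one and re-implementing HTTP handling by hand. This is where Go’s HTTP Hijack API saves the day!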

Hijacking for the Win!

Go’s http.Hijacker interface lets us take over the raw TCP connection behind an HTTP request, which is perfect for our use case. Even popular libraries like Gorilla WebSocket use this technique under the hood.

This means we can have our cake and eat it too: a single server handling both HTTP and WebSocket traffic.

combined WebSocket server

Here’s how the Hijack API fits into my WebSocket server:

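// after validating the HTTP headers in http.Request...
hj, ok := w.(http.Hijacker)
if !ok {
    // not every http.ResponseWriter supports hijacking
    http.Error(w, "hijacking not supported", http.StatusInternalServerError)
    return
}
conn, _, err := hj.Hijack()
if err != nil {
    fmt.Println("error hijacking http response writer:", err)
    w.WriteHeader(http.StatusInternalServerError)
    w.Write([]byte("internal server error"))
    return
}
// After a successful Hijack, the HTTP server no longer manages the connection,
// so closing it is our responsibility.
defer conn.Close()
if _, err := conn.Write([]byte(WSHandshakeResponse(key))); err != nil {
    fmt.Println("error writing handshake response:", err)
    return
}
fmt.Println("=== handshake done! ===")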

// From now on, we can send and receive WebSocket data frames between server and client.
// ...
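The snippet above starts after the request has been validated. As a rough sketch of what that validation could look like, the function below checks the headers that RFC 6455 requires in the opening handshake. validateUpgrade, headerContainsToken, and sampleUpgradeRequest are hypothetical names of mine, and the checks are deliberately minimal (no Origin check, no subprotocol negotiation).

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// headerContainsToken reports whether the comma-separated header name
// contains token, compared case-insensitively.
func headerContainsToken(h http.Header, name, token string) bool {
	for _, v := range h.Values(name) {
		for _, part := range strings.Split(v, ",") {
			if strings.EqualFold(strings.TrimSpace(part), token) {
				return true
			}
		}
	}
	return false
}

// validateUpgrade checks the request against the client handshake
// requirements in RFC 6455, section 4.2.1.
func validateUpgrade(r *http.Request) error {
	switch {
	case r.Method != http.MethodGet:
		return errors.New("handshake must be a GET request")
	case !r.ProtoAtLeast(1, 1):
		return errors.New("handshake requires HTTP/1.1 or later")
	case r.Host == "":
		return errors.New("missing Host header")
	case !headerContainsToken(r.Header, "Upgrade", "websocket"):
		return errors.New("missing 'Upgrade: websocket' header")
	case !headerContainsToken(r.Header, "Connection", "upgrade"):
		return errors.New("missing 'Connection: Upgrade' header")
	case r.Header.Get("Sec-WebSocket-Version") != "13":
		return errors.New("unsupported Sec-WebSocket-Version")
	case r.Header.Get("Sec-WebSocket-Key") == "":
		return errors.New("missing Sec-WebSocket-Key header")
	}
	return nil
}

// sampleUpgradeRequest builds a well-formed handshake request for the demo.
func sampleUpgradeRequest() *http.Request {
	r := httptest.NewRequest(http.MethodGet, "/ws", nil)
	r.Header.Set("Upgrade", "websocket")
	r.Header.Set("Connection", "keep-alive, Upgrade")
	r.Header.Set("Sec-WebSocket-Version", "13")
	r.Header.Set("Sec-WebSocket-Key", "dGhlIHNhbXBsZSBub25jZQ==")
	return r
}

func main() {
	r := sampleUpgradeRequest()
	fmt.Println("valid request:", validateUpgrade(r))

	r.Header.Del("Sec-WebSocket-Key")
	fmt.Println("without key:", validateUpgrade(r))
}
```

In a handler, a failed check would typically be answered with something like http.Error(w, err.Error(), http.StatusBadRequest) before the hijack is attempted, since the ResponseWriter is still usable at that point.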

That’s the Handshake!

That’s a whirlwind tour of the WebSocket handshake process. If I have the time, I’ll follow up with an article about how to implement WebSocket data frames in Go.

Thanks for reading ✌️