Notes from Building WebSocketReflectorX

The problem

CTF competitions often include dynamic challenges, where every team gets its own isolated challenge container and every container carries a different flag. That raises the cost of cheating quite a bit, which helps cut down on abuse. It also prevents one team from permanently breaking the environment for everyone else, and if a team accidentally wrecks its own setup, it can usually just restart the container and recover on its own.

There are already some fairly mature solutions for dynamic challenges, such as frankli0324/CTFd-Whale. Still, there are always a few awkward corners that make those solutions hard to use in certain environments.

One of the hardest problems in competition-platform development is giving contestants a comfortable way to reach their containers, and that becomes even harder in highly constrained deployments. We ran into exactly that kind of setup:

The university server had a single domain name, no wildcard DNS, and no way for us to issue our own certificates. All traffic passed through a CDN and then a bastion host, which audited it, terminated TLS, and forwarded it to our server as plaintext HTTP. We had no control over the CDN, the bastion host, or the domain itself. The bastion host’s security group exposed exactly one path: bastion port 443 -> server port 80. No other server port was reachable from the outside, not even by connecting to the IP directly.

With a server environment this restrictive, how are you supposed to expose dynamic challenge containers at all?

Existing reverse-proxy approaches

The solutions mentioned above all address this problem in one way or another. For example, CTFd-Whale uses frp for NAT traversal, then reverse-proxies challenge traffic through a relay server we actually control, and finally distributes access through our own wildcard domain and port mapping.

The big upside is that the contestant experience is great. It feels almost identical to hitting the original challenge port directly. The downside is just as obvious: if the relay server has a bad day, every dynamic challenge container goes down at once. And because pwn and crypto challenges need raw interactive traffic, you cannot solve this with HTTPS-based L7 forwarding. You are forced into L4 forwarding that tells containers apart by port. That means the relay server has to sit directly on the public Internet, and once it gets hit with a DDoS, there is often not much you can do.

Note: L4 and L7 refer to layer 4 (transport, e.g. TCP) and layer 7 (application, e.g. HTTP) of the OSI model. I will use those abbreviations below.

An SNI-based approach for certain scenarios

Later, senior teammates zkonge and Frank suggested another idea: using Server Name Indication (SNI) for L4 traffic forwarding. In a TLS handshake, the server has to present a certificate before it has seen any application data, so a host serving several sites on one address cannot tell from the HTTP request which certificate to use. SNI solves this by having the client send the requested host name unencrypted in its ClientHello. It was introduced as a TLS extension in RFC 3546 back in June 2003 and is currently specified in RFC 6066. Today, basically every mainstream TLS client supports SNI. An L4 proxy can read that host name and route the still-encrypted connection to the right container without decrypting it.

This approach solves the multi-port and no-encryption issues, but it still does nothing about the lack of CDN protection. On top of that, it depends on multiple domains, which means we would need both a wildcard certificate and a domain we fully control. That made it a poor fit for our heavily restricted university deployment as well.
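To make the SNI idea concrete, here is a minimal Python sketch of how an L4 proxy could read the host name from a ClientHello before deciding where to send the connection. It is an illustration under assumptions, not code from any of the projects mentioned: `extract_sni` and `make_client_hello` are hypothetical helper names, the parser assumes the whole ClientHello arrives in a single TLS record, and the host name used with it is a placeholder.

```python
import ssl
import struct
from typing import Optional


def extract_sni(data: bytes) -> Optional[str]:
    """Return the SNI host name from a TLS ClientHello, or None if absent."""
    try:
        # Record header: content type (1), version (2), length (2). 22 = handshake.
        if data[0] != 22:
            return None
        body = data[5:5 + struct.unpack("!H", data[3:5])[0]]
        # Handshake header: message type (1), length (3). 1 = ClientHello.
        if body[0] != 1:
            return None
        pos = 4 + 2 + 32                 # handshake header, client version, random
        pos += 1 + body[pos]             # session id
        pos += 2 + struct.unpack("!H", body[pos:pos + 2])[0]  # cipher suites
        pos += 1 + body[pos]             # compression methods
        end = pos + 2 + struct.unpack("!H", body[pos:pos + 2])[0]
        pos += 2
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack("!HH", body[pos:pos + 4])
            pos += 4
            if ext_type == 0:            # server_name extension
                # list length (2), name type (1), name length (2), then the name
                name_len = struct.unpack("!H", body[pos + 3:pos + 5])[0]
                return body[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        pass                             # truncated or malformed input
    return None


def make_client_hello(hostname: str) -> bytes:
    """Produce a real ClientHello in memory, without opening a socket."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass                             # expected: no server has answered yet
    return outgoing.read()
```

A proxy built this way would peek at the first bytes of each connection, call `extract_sni`, and look the result up in a table mapping host names such as `team1.ctf.example` to containers. That lookup is exactly why the approach needs one host name per container.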

A WebSocket relay approach

Starting from there, I came up with a different idea: relay TCP traffic over WebSocket - in other words, build an L4 tunnel on top of L7.

graph LR;
    script["Exploit Script"] -->|Raw TCP traffic| relay["Forwarder"]
    relay -->|Specific URI + WebSocket traffic| receiver["Server Receiver"] -->|Raw TCP traffic| service["Dynamic Container Service"]

The forwarder runs on the contestant’s local machine, while the server receiver runs on the challenge server. When the forwarder starts, it is given a server address and a specific URI, then opens a local TCP port and behaves like a tiny TCP server. When the contestant opens a TCP connection to that local port, the forwarder immediately initiates a WebSocket connection to the server and passes the original traffic through. The receiver listens on a route, checks whether the requested URI maps to a running challenge container, accepts the WebSocket connection from the forwarder, and then blindly forwards all bytes to the corresponding container port.
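The "passes the original traffic through" step amounts to wrapping each chunk of raw TCP bytes in a WebSocket binary frame on one side and unwrapping it on the other. The sketch below shows that framing as defined in RFC 6455, using only the Python standard library. It is a minimal illustration rather than the project's actual code: `encode_binary_frame` and `decode_frame` are hypothetical names, and the sketch handles only single, unfragmented frames with no control frames.

```python
import os
import struct


def encode_binary_frame(payload: bytes, mask: bool = True) -> bytes:
    """Wrap payload in one unfragmented WebSocket binary frame."""
    header = bytearray([0x80 | 0x2])     # FIN = 1, opcode 0x2 (binary)
    mask_bit = 0x80 if mask else 0x00
    size = len(payload)
    if size < 126:
        header.append(mask_bit | size)
    elif size < 1 << 16:
        header.append(mask_bit | 126)    # 16-bit extended length follows
        header += struct.pack("!H", size)
    else:
        header.append(mask_bit | 127)    # 64-bit extended length follows
        header += struct.pack("!Q", size)
    if not mask:
        return bytes(header) + payload
    # Client-to-server frames must be masked with a fresh 4-byte key.
    key = os.urandom(4)
    masked = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return bytes(header) + key + masked


def decode_frame(frame: bytes) -> bytes:
    """Recover the payload of one complete frame, unmasking if needed."""
    size = frame[1] & 0x7F
    pos = 2
    if size == 126:
        size = struct.unpack("!H", frame[2:4])[0]
        pos = 4
    elif size == 127:
        size = struct.unpack("!Q", frame[2:10])[0]
        pos = 10
    if frame[1] & 0x80:                  # mask bit set: next 4 bytes are the key
        key = frame[pos:pos + 4]
        pos += 4
        return bytes(b ^ key[i % 4] for i, b in enumerate(frame[pos:pos + size]))
    return frame[pos:pos + size]
```

In this picture the forwarder would call `encode_binary_frame` on whatever it reads from the local TCP socket, and the receiver would call `decode_frame` and write the result to the container port unchanged. Traffic in the other direction uses the same format without masking.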

Because a WebSocket connection begins as an ordinary HTTP request that asks to be upgraded, the server can sit behind a CDN or firewall as long as that layer supports forwarding WebSocket traffic. Fortunately, both the university bastion host and its CDN service supported WebSocket. With a bit of nginx configuration, the receiver could be placed under a subpath while other paths continued serving unrelated services such as the competition platform itself. That solved the routing problem nicely.
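The Upgrade request is what makes the tunnel look like normal web traffic to the CDN and the bastion host. The following Python sketch builds such a request and computes the `Sec-WebSocket-Accept` value the server must answer with, as specified in RFC 6455. The function names, the host, and the `/traffic/<token>` path layout are assumptions made for illustration, not the project's real routes.

```python
import base64
import hashlib
import os
from typing import Tuple

# Fixed GUID that RFC 6455 appends to the client's key.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_value(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header for a given client key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_upgrade_request(host: str, path: str) -> Tuple[str, bytes]:
    """Return (key, request bytes) for a client-side WebSocket handshake."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"
    )
    return key, request.encode("ascii")
```

A forwarder could send the bytes from `build_upgrade_request("ctf.example.edu", "/traffic/some-token")` over the TLS connection and check that the `101 Switching Protocols` response carries `accept_value(key)`. Since the path travels in the request line, an nginx `location` block is enough to route it to the receiver.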

That said, the solution is not perfect.

One obvious issue is platform coverage. Contestants use wildly different environments, so the forwarder has to be truly cross-platform: not just across current Windows, Linux, and macOS releases, but often across older versions of those systems as well. CTF challenge types vary a lot, and some of them require fairly specific local setups to solve comfortably. If the forwarder cannot run smoothly in those environments, it immediately becomes annoying.

There is also the user-experience problem. Contestants now have one more tool to download, one more process to launch, and one more local port to point their exploit at. Every extra step hurts the competition experience a little, so the forwarder’s design itself still needs more thought.