UMDCTF 2026 - Rush Hour
roobet recently introduced their new game: rush hour. the objective of the game is to successfully bet on how many cars cross an arbitrary section of road in a given time. i have identified a highway exit on a california traffic cam that experts predict will have no traffic during this ctf. however, you somewhat unwisely bet that one car would take this exit. well, now's your chance. can you hack the autonomous car RL policy to drive to the target?
Challenge Overview
We are given a local challenge bundle containing a fixed
reinforcement-learning driving policy and a remote service at
rush-hour.challs.umdctf.io. The service accepts a small
neural network that we control, which perturbs the observation vector
seen by the fixed policy. The goal is to make the autonomous car drive
into a hidden CTF target instead of the legitimate goal.
The twist: the local simulator uses a different physics timestep than the remote environment. An attack that works locally at one timestep may fail entirely at another.
Given Files
The challenge directory contains:
- policy.py -- the fixed driving policy
- attack.py -- the attacker network definition
- game.py -- the environment/game loop
- observations.py -- observation generation
- physics.py -- car dynamics
- weights.npz -- the fixed policy weights
We do not control the main driving policy. We only control a small attack network that adds a bounded perturbation to the policy's 8-dimensional observation.
Reverse Engineering
The Real Target
From game.py:
LEGIT_GOAL = (-20.0, -20.0)
The environment computes observations using LEGIT_GOAL,
but the flag is awarded if the car reaches CTF_GOAL. The
whole challenge is an adversarial-control problem: make the policy think
it should do something slightly different at every timestep until it
reaches the hidden target.
The Attacker Model
From attack.py, the network constraints are:
- Input dimension: 8
- Hidden dimension: 16
- Output dimension: 8
- Per-weight absolute value bound: 10.0
- Output L2 norm bound after forward pass: 0.5
The submitted .npz must contain:
- W0, shape (16, 8)
- b0, shape (16,)
- W1, shape (8, 16)
- b1, shape (8,)
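A pre-submission check along the following lines can catch shape or bound mistakes before upload. This is only a sketch: the helper name `validate_attack` is hypothetical, and applying the 10.0 bound to the biases as well as the weight matrices is an assumption. Only the key names, shapes, and bound value come from the constraints listed above. The function returns the total parameter count, which for these shapes is 16*8 + 16 + 8*16 + 8 = 280.

```python
import numpy as np

# Expected array shapes, as listed in the writeup.
EXPECTED_SHAPES = {"W0": (16, 8), "b0": (16,), "W1": (8, 16), "b1": (8,)}
WEIGHT_BOUND = 10.0  # per-weight absolute bound from attack.py


def validate_attack(params):
    """Raise ValueError if params would not satisfy the stated constraints.

    Assumption: the bound is applied to biases as well as weight matrices.
    Returns the total number of parameters on success.
    """
    for name, shape in EXPECTED_SHAPES.items():
        if name not in params:
            raise ValueError(f"missing array {name}")
        arr = np.asarray(params[name], dtype=np.float64)
        if arr.shape != shape:
            raise ValueError(f"{name} has shape {arr.shape}, expected {shape}")
        if not np.all(np.isfinite(arr)):
            raise ValueError(f"{name} contains non-finite values")
        if np.max(np.abs(arr)) > WEIGHT_BOUND:
            raise ValueError(f"{name} exceeds the +/-{WEIGHT_BOUND} bound")
    return sum(int(np.prod(s)) for s in EXPECTED_SHAPES.values())
```

Once a dictionary passes the check, `np.savez("attack.npz", **params)` writes it in the .npz container format.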
The forward pass is simple:
h = np.tanh(W0 @ obs + b0)
The perturbation is then added to the observation before the fixed policy runs.
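To make the perturbation path concrete, here is a minimal sketch of the whole forward pass. It assumes a linear output layer and assumes the 0.5 L2 bound is enforced by rescaling the output onto the norm ball. The real attack.py may enforce the bound differently, and the function name `perturb` is hypothetical.

```python
import numpy as np

MAX_DELTA_NORM = 0.5  # output L2 bound stated above


def perturb(obs, W0, b0, W1, b1):
    """Return the observation the fixed policy would see after the attack."""
    h = np.tanh(W0 @ obs + b0)   # hidden layer, shape (16,)
    delta = W1 @ h + b1          # assumed linear output layer, shape (8,)
    norm = np.linalg.norm(delta)
    if norm > MAX_DELTA_NORM:    # assumed: rescale onto the L2 ball
        delta = delta * (MAX_DELTA_NORM / norm)
    return obs + delta
```

Because the hidden layer depends on `obs`, the perturbation changes as the car moves, even though its magnitude never exceeds 0.5.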
Policy Inputs
From observations.py, the policy sees an 8-dimensional
vector:
- Normalized speed
- Normalized steer angle
- Heading cosine/sine
- Goal-forward and goal-right coordinates in the car frame
- Log-distance-to-goal
- Constant bias term
This means the attack must be state-dependent -- a fixed offset would not work because the policy's inputs change as the car moves.
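The state dependence is easier to see with a rough reconstruction of the observation vector. The sketch below is not the code from observations.py: the normalization constants, the axis conventions for the car frame, and the use of `log1p` for the log-distance term are all assumptions chosen for illustration.

```python
import math

MAX_SPEED = 20.0  # hypothetical normalization constant
MAX_STEER = 0.6   # hypothetical normalization constant (radians)


def build_observation(x, z, heading, speed, steer, goal):
    """Assemble an 8-dimensional observation in the order listed above."""
    dx, dz = goal[0] - x, goal[1] - z
    cos_h, sin_h = math.cos(heading), math.sin(heading)
    forward = dx * cos_h + dz * sin_h    # goal offset along the car's nose (assumed convention)
    right = -dx * sin_h + dz * cos_h     # goal offset to the car's side (assumed convention)
    log_dist = math.log1p(math.hypot(dx, dz))  # assumed form of the log-distance term
    return [speed / MAX_SPEED, steer / MAX_STEER, cos_h, sin_h,
            forward, right, log_dist, 1.0]


# Same goal, two different poses: the goal-relative entries differ.
at_start = build_observation(0.0, 0.0, 0.0, 5.0, 0.0, (-20.0, -20.0))
mid_route = build_observation(-10.0, -10.0, math.pi / 2, 5.0, 0.0, (-20.0, -20.0))
```

The goal-forward and goal-right entries rotate and shrink as the pose changes, so the same small offset nudges the policy in a different direction at every pose.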
Initial Solve Strategy
The most direct approach is to optimize the attack network's weights against the provided simulator. This is the right starting point because:
- The remote service accepts exactly this network format
- The bundle includes the full local environment and fixed policy
- The attack is small enough to search directly (280 parameters)
- The simulator is deterministic (same seed = same result)
I built a local solver (solver.py) that:
- Simulates episodes from reset
- Evaluates attack candidates using the local environment
- Scores candidates by:
  - Huge reward for reaching the CTF goal
  - Otherwise minimizing distance to the CTF goal
- Uses an evolution-style search over attack weights
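The scoring rule in the list above can be sketched as a single function. The bonus constant and the function name are hypothetical, and `ctf_goal` is passed as a parameter because the actual CTF_GOAL coordinates are not reproduced here.

```python
import math

GOAL_BONUS = 1_000_000.0  # hypothetical "huge reward" value


def score_episode(goal_reached, final_x, final_z, ctf_goal):
    """Higher is better: a large bonus on success, else negative distance."""
    if goal_reached:
        return GOAL_BONUS
    # Failed episodes are ranked by how close the car ended up to the target.
    return -math.hypot(ctf_goal[0] - final_x, ctf_goal[1] - final_z)
```

Using negative distance for failed episodes gives the search a gradient-like signal long before any candidate actually reaches the target.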
Solver Architecture
The solver uses a simple evolution strategy. The core idea: maintain a "center" set of weights in 280-dimensional space, sample random variations around it, run each variant through the simulator, pick the best ones, and move the center toward them.
See the Full Solver Code section below for the complete 205-line script.
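The sample, evaluate, select, and recenter loop can be sketched generically as follows. This is not the code from solver.py: the elite-averaging update, the fixed `sigma`, and every default value are assumptions made for illustration. It uses only the standard library so it can run on a toy objective without the challenge bundle.

```python
import random


def evolve(score_fn, dim, iterations=200, population=32, elite=8,
           sigma=0.3, bound=10.0, seed=0):
    """Elite-averaging evolution strategy over a flat weight vector."""
    rng = random.Random(seed)
    center = [0.0] * dim
    for _ in range(iterations):
        candidates = []
        for _ in range(population):
            # Sample a variant around the center, clipped to the weight bound.
            cand = [max(-bound, min(bound, c + rng.gauss(0.0, sigma)))
                    for c in center]
            candidates.append((score_fn(cand), cand))
        # Keep the highest-scoring variants and move the center to their mean.
        candidates.sort(key=lambda pair: pair[0], reverse=True)
        best = [cand for _, cand in candidates[:elite]]
        center = [sum(col) / elite for col in zip(*best)]
    return center


# Toy objective: negative squared distance to a known target vector.
TOY_TARGET = [1.0, -2.0, 0.5]


def toy_score(w):
    return -sum((a - b) ** 2 for a, b in zip(w, TOY_TARGET))


found = evolve(toy_score, dim=len(TOY_TARGET))
```

For the challenge itself, `dim` would be 280 and `score_fn` would unpack the vector into W0, b0, W1, and b1 before running an episode in the local simulator.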
Local Success
Running at the default dt=0.1, the search converged
quickly:
SearchConfig(seed=7, iterations=600, population=96, dt=0.1)
The winning artifact looked great locally:
{
At this point, it looked solved.
Why the First Solve Failed Remotely
Uploading the local-winning artifact to the remote service produced:
episode timed out
This was the key twist in the challenge. The remote behavior clearly diverged from local, even though both appeared to represent the same game.
Root Cause Investigation
Instead of guessing, I connected directly to the websocket endpoint used by the frontend:
wss://rush-hour.challs.umdctf.io/ws
The frontend JavaScript bundle showed that the page renders state
messages including x, z, heading,
speed, obs, obsDelta,
goalReached, timedOut, and flag.
This allowed me to stream live state from the remote server.
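A small decoder for these state messages might look like the sketch below. It assumes each websocket frame is a single JSON object with the listed fields at the top level, plus the `t` timestamp visible in the remote stream. The helper name and the sample message are hypothetical, not captured traffic.

```python
import json

# Field names taken from the frontend bundle, as listed above.
STATE_FIELDS = ("x", "z", "heading", "speed", "obs", "obsDelta",
                "goalReached", "timedOut", "flag")


def parse_state(raw):
    """Decode one JSON state message, filling absent fields with None."""
    msg = json.loads(raw)
    state = {name: msg.get(name) for name in STATE_FIELDS}
    state["t"] = msg.get("t")  # simulation timestamp, if present
    return state


# Hypothetical message for illustration only.
sample = parse_state('{"t": 0.02, "x": 1.5, "z": -2.0, "goalReached": false}')
```

Filling missing keys with `None` keeps downstream logging code simple when the server omits fields such as `flag` until the episode ends.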
What the Remote Stream Showed
The remote server was sending state updates at a much finer time cadence:
{"t": 0.01986314600071637, ...}
So the remote simulator steps at approximately:
dt ~= 0.02
My original local solver was optimized at:
dt = 0.1
That difference turned out to be fatal.
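One way to turn a stream of `t` values into a timestep estimate is to take the median gap between consecutive messages. The sketch below assumes each message corresponds to exactly one simulator step, which the stream alone does not prove. The function name and the synthetic timestamps are hypothetical.

```python
import statistics


def estimate_dt(timestamps):
    """Median gap between consecutive timestamps, robust to dropped frames."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:]) if b > a]
    if not gaps:
        raise ValueError("need at least two increasing timestamps")
    return statistics.median(gaps)


# Synthetic timestamps at a 0.02 cadence with one frame dropped.
stamps = [0.02 * k for k in range(1, 50) if k != 17]
dt_hat = estimate_dt(stamps)
```

The median is used instead of the mean so that an occasional dropped or delayed frame does not skew the estimate.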
Local Confirmation
I replayed the same "winning" artifact locally under multiple timesteps:
dt = 0.1 -> goal_reached = True
The artifact was not robust. It only won under a coarse fixed-step local simulation. The finer timestep changes the car's trajectory enough that the attack perturbations no longer steer toward the CTF goal.
This explained the remote timeout perfectly.
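The effect of step size on a trajectory can be reproduced with a toy model. The sketch below is not the dynamics from physics.py: it integrates a constant-speed, constant-turn-rate car with explicit Euler steps, and all parameter values are made up. Even with identical controls, the two step sizes end at different positions.

```python
import math


def rollout(dt, duration=8.0, speed=5.0, yaw_rate=0.5):
    """Explicit-Euler rollout of a toy constant-turn car; returns (x, z)."""
    x = z = heading = 0.0
    for _ in range(round(duration / dt)):
        x += speed * math.cos(heading) * dt
        z += speed * math.sin(heading) * dt
        heading += yaw_rate * dt
    return x, z


coarse = rollout(0.1)   # timestep used by the first local search
fine = rollout(0.02)    # approximate remote timestep
gap = math.hypot(coarse[0] - fine[0], coarse[1] - fine[1])
```

In the real challenge the divergence compounds, because the policy and the attack network both react to the drifting state at every step, so an artifact tuned to one step size is effectively tuned to one specific trajectory.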
The Real Exploit
The real solve was:
- Use the provided local simulator to understand the control surface
- Discover that the remote environment runs at a different timestep (dt ~ 0.02)
- Retune the search against the remote-like cadence
- Submit the new artifact
I re-ran the search with the corrected timestep:
SearchConfig(
I tested several seeds and saved multiple candidates:
- remote_seed1.npz
- remote_seed7.npz
- remote_seed42.npz
- remote_seed99.npz
All of them reached the CTF goal in the local simulator at the finer timestep.
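The multi-seed rerun can be organized as a small sweep over configurations. In the sketch below, the `SearchConfig` fields mirror the ones in the first local run quoted earlier. Keeping `iterations` and `population` unchanged for the rerun is an assumption, and the dataclass definition and the helper `remote_like_runs` are hypothetical stand-ins for whatever solver.py actually defines.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SearchConfig:
    seed: int
    iterations: int
    population: int
    dt: float


# Values from the first local run quoted earlier in this writeup.
LOCAL = SearchConfig(seed=7, iterations=600, population=96, dt=0.1)


def remote_like_runs(base, seeds, dt=0.02):
    """Map output filename -> config for each seed at the finer timestep."""
    return {f"remote_seed{s}.npz": replace(base, seed=s, dt=dt) for s in seeds}


runs = remote_like_runs(LOCAL, [1, 7, 42, 99])
```

Running several seeds is cheap insurance: evolution-style searches are noisy, and having multiple independent candidates makes it less likely that every one of them is overfit to a single trajectory.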
Verification
I verified each candidate at both dt=0.02 and the exact
remote cadence dt=0.019863146:
for seed in [1, 7, 42, 99]:
All four seeds succeeded at both timesteps.
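A verification pass of this kind can be sketched as a grid over seeds and timesteps. The `run_episode` argument is a hypothetical callable that wraps the local simulator and returns whether the CTF goal was reached. It is not a function from the challenge bundle, and the stub used below exists only to make the example runnable.

```python
def verify_candidates(run_episode, seeds, timesteps):
    """Return {(seed, dt): reached} for every seed/timestep combination.

    run_episode is a hypothetical callable (path, dt) -> bool wrapping the
    local simulator; it is not part of the challenge bundle.
    """
    results = {}
    for seed in seeds:
        path = f"remote_seed{seed}.npz"
        for dt in timesteps:
            results[(seed, dt)] = bool(run_episode(path, dt))
    return results


def all_passed(results):
    """True only if every seed succeeded at every timestep."""
    return all(results.values())


def stub_run(path, dt):
    # Stand-in for a real simulator call: pretend fine timesteps succeed.
    return dt < 0.05


grid = verify_candidates(stub_run, [1, 7, 42, 99], [0.02, 0.019863146])
```

Checking both the rounded and the observed timestep guards against a candidate that only works at one exact step size.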
Final Remote Submission
Submitting remote_seed99.npz to the websocket and
waiting for server-side state updates eventually produced:
{
The relevant terminal output near the end:
state 400 {'t': 8.4639, 'x': 16.118, 'z': -25.057, 'goalReached': False, 'flag': None}
Full Solver Code
The complete solver script:
from __future__ import annotations