blog - page 2 | Shuhong Dai

Why SUMO’s Rendered Videos Should Never Be Used as RL Training Data

An examination of why the visual output of SUMO, despite being clean and intuitive, cannot serve as learning data for RL agents, and why this limitation is inherent in how the simulator is built.

5 min read · January 27, 2025

2025 · Simulation Traffic RL SUMO
Re-running an RL Experiment and Getting a Different Answer

An engineering reflection on why two RTX 4090 machines produced diverging RL curves despite identical code, seeds, and configurations, and what this reveals about RL’s numerical sensitivity.

5 min read · November 17, 2024

2024 · Reinforcement Learning CUDA Numerical Stability Reproducibility
Using Local v2rayN Proxy for Cloud Servers via SSH Reverse Tunnel

A practical record of troubleshooting outbound network restrictions on Chinese cloud servers and enabling stable access to overseas academic resources.

5 min read · February 02, 2024

2024 · Networking Proxy SSH DevOps