Shuhong Dai
  • Re-running an RL Experiment and Getting a Different Answer

    An engineering reflection on why two RTX 4090 machines produced diverging RL curves despite identical code, seeds, and configurations, and what this reveals about RL’s numerical sensitivity.

    5 min read   ·   November 17, 2024

    2024   ·   Reinforcement Learning   CUDA   Numerical Stability   Reproducibility

  • Using Local v2rayN Proxy for Cloud Servers via SSH Reverse Tunnel

    A practical record of troubleshooting outbound network restrictions on Chinese cloud servers and enabling stable access to foreign academic resources.

    5 min read   ·   February 02, 2024

    2024   ·   Networking   Proxy   SSH   DevOps

© Copyright 2026 Shuhong Dai.