-
A Quick Mental Model for Estimating LLM GPU Memory Use
Before downloading a large model or spinning up a container, it’s useful to know whether an open-source LLM will actually fit on your GPU.
-
Designing a Maintainable Replay Buffer in RL Systems
A structured and engineering-focused reflection on replay buffer design in RL, emphasizing clarity, extensibility, and long-term maintainability.
-
Tracing the Root Cause of Missing GPUs in Docker Containers
A debugging record of why Docker refused to expose GPUs inside a container even though the host recognized them perfectly, and how every layer of the system contributed a small piece to the failure.
-
Running dm-control on a Headless Server: A Complete Debugging Log
A practical record of configuring dm-control with MuJoCo on a headless Ubuntu server, covering rendering failures, version mismatches, and the final workable setup.
-
Why SUMO’s Rendered Videos Should Never Be Used as RL Training Data
An examination of why the visual output of SUMO, despite being clean and intuitive, cannot serve as learning data for RL agents, and why this limitation is inherent in how the simulator is built.