annalhq

hey there! i am currently exploring rl, low-level optimizations, and gpu-based llm inference.

my interests include inference-time systems such as vllm and sglang, cuda, and kernel-level optimizations. more broadly, i am interested in where tools from mathematics and optimization help explain representation learning and generalization in models.