Gumbel Softmax
It models a continuous relaxation of the categorical distribution, allowing for differentiable sampling.
2025-06-17, 2 min read

DeepSeek's customised CUDA PTX instruction
Sample blog for testing, written with the help of AI and a bit of a human touch.
2025-03-26, 3 min read