Writing

Machine Learning and Quant Essays

A slow-growing collection. New entries are published when there is something substantive to share.

  • Jun 12, 2026

    How to correctly report LLM-as-a-Judge evaluations

    A practical guide to running, calibrating, and reporting LLM-as-a-Judge results — covering judge selection, position bias, pairwise vs scoring setups, and the statistics that actually belong in the paper.

  • Jun 10, 2026

    10 must-read machine learning research papers for ML engineers

    An annotated bibliography of foundational and recent work in LLM evaluation and reinforcement learning, with notes on why each paper matters in practice.

  • Apr 20, 2026

    Notes on evaluating reasoning models across families

    Observations from disentangling reasoning length effects from forced re-entry across Llama and Qwen distilled models.

  • Nov 2, 2025

    Reproducibility on a shared Slurm cluster

    Small operational habits that yield significant returns when several collaborators share the same GPU resources.