TIL

Writing things down helps me actually remember them, so I figured I’d share! This page is basically where I capture quick summaries or takeaways from talks, papers, or courses without the formality of a full blog post.

They’re all clickable links, but the notes themselves should be relatively short. Except for the course notes, which tend to get a bit out of hand.



  • International Programme on AI Evaluation: Capabilities and Safety 17 February 2026

    Module 4: New Benchmarking Paradigms

    Overview AI research embraces an “anything goes” philosophy where you can try any architecture, training method, or data preprocessing approach, but this freedom makes systems hard to compare. Benchmarks provide the necessary constraint. You can explore freely during development, but eventually you have to submit to competitive empirical testing on...
  • International Programme on AI Evaluation: Capabilities and Safety 17 February 2026

    Module 4: The Science of Benchmarking

    Overview This lecture covered the emerging science of benchmarking AI systems and why most current evaluation methods have serious flaws. The focus was on common problems like data contamination, construct validity issues, and measurement biases that make benchmark scores unreliable indicators of actual AI capabilities. “When measures become targets, they...
  • International Programme on AI Evaluation: Capabilities and Safety 12 February 2026

    Module 3: ML Model Deployment and Monitoring

    Overview This lecture covered the practical realities of putting machine learning models into production and keeping them working over time. The focus was on deployment strategies, why models degrade after deployment, and comprehensive monitoring approaches for both model performance and system health. This lecture was taught by Cèsar Ferri. Key...
  • International Programme on AI Evaluation: Capabilities and Safety 12 February 2026

    Module 3: Experiment Design

    Overview This lecture covered how to design experiments for evaluating AI systems. It covered traditional experimental design principles, statistical testing methods, and the specific challenges that come up when trying to evaluate AI. This lecture was taught by Line Clemmensen. Key Concepts & Takeaways Start small and specific. Don’t try...
  • International Programme on AI Evaluation: Capabilities and Safety 11 February 2026

    Module 3: Calibration

    Overview This lecture covered why we’re interested in calibration, techniques for calibrating models and metrics for evaluating it, multi-class calibration, and proper scoring rules. This lecture was taught by Peter Flach. Detailed Notes Calibration is about whether the confidence scores your machine learning model outputs actually mean what they claim to mean....
  • International Programme on AI Evaluation: Capabilities and Safety 10 February 2026

    Module 3: Statistical Foundation of AI Evaluation

    Overview This lecture establishes the statistical foundations necessary for AI evaluation. Statistics in AI evaluation are frequently misused or misinterpreted, and impressive-looking numbers that seem authoritative may be meaningless or misleading without understanding their underlying assumptions and limitations. This lecture was taught by Line Clemmensen. Key Concepts & Takeaways Takeaways...
  • International Programme on AI Evaluation: Capabilities and Safety 3 February 2026

    Module 1: AI Evaluation as a Scientific Discipline

    Overview The main focus of this lecture is on establishing evaluation as a scientific discipline. “Something is rotten in the field of evaluation… not because the science is wrong, but because it is really complicated and very cross-disciplinary.” This lecture was taught by José Hernández-Orallo. Key Take-Aways and Concepts The...
  • AI Safety, Ethics & Society course 21 July 2025

    Chapter 5: Complex Systems

    This chapter makes the argument that AI systems and the societies they operate within are complex systems, which makes AI safety a wicked problem with no single solution, potential unintended consequences from interventions, and requiring ongoing effort. AI Safety is a Wicked Problem Puzzles (like sudoku) have one correct answer...
  • AI Safety, Ethics & Society course 21 July 2025

    Chapter 4: Safety Engineering

    This chapter introduces the idea that AI safety should be seen as a specialized part of safety engineering, a concept borrowed from fields like aviation and medicine that focuses on designing systems to manage and reduce risks effectively. It also points out that AI brings unique challenges and risks that...
  • AI Safety, Ethics & Society course 14 July 2025

    Chapter 3: Single-agent safety

    This chapter focuses on the fundamental technical challenges of making individual single-agent AI systems safe, not even considering multi-agent dynamics or complex systems. Essentially, this can be summarized as problems with monitoring, robustness and alignment, which in turn reinforce each other. Monitoring We cannot monitor what we cannot understand. Current...
  • AI Safety, Ethics & Society course 1 July 2025

    Chapter 1+2: "Overview of catastrophic AI risks"

    AI Safety, Ethics & Society is a textbook written by Dan Hendrycks of the Center for AI Safety. As part of the Summer 2025 Cohort, I’ll work through the course content and take part in small-group discussions, led by a facilitator. In these notes, I’ll summarize the chapters we read...