The Fed Just Rewrote the Rulebook for Bank Supervision
The Federal Reserve's November 2025 Statement of Supervisory Operating Principles signals a seismic shift from checkbox compliance to a focus on material risk. Here's what changed and why it matters.
Long-form analysis on AI, regulation, and financial technology.
Fast.ai uncovered something strange in LLM fine-tuning: training loss dropped suddenly after just one pass through the data — suggesting models can memorize inputs almost immediately. Here's what it means.
A comprehensive framework for analyzing open-source GenAI across near, mid, and long-term development stages — and why the benefits generally outweigh the risks when governance keeps pace.
Re-training LLMs from scratch when new data arrives is prohibitively expensive. Three simple strategies (learning-rate re-warming, learning-rate re-decaying, and replaying a small amount of earlier data) match the performance of full re-training at a fraction of the cost.
Traditional ensemble methods fail when correct answers are in the minority. AoR introduces hierarchical reasoning chain evaluation and dynamic sampling to fix this — and consistently outperforms standard approaches.
xLSTM revisits the classic LSTM architecture with exponential gating and new memory structures — then scales it to 300B tokens. The results are more competitive than most expected.
LLMs used as evaluators show signs of bias in about 40% of comparisons on average, and their rankings reach an average Rank-Biased Overlap (RBO) of only 49.6% with human preferences. The COBBLER benchmark quantifies exactly how and where these biases emerge.
LLM agents hit 94% success on basic web tasks — but drop to 25% on compositional tasks that combine multiple steps. The CompWoB benchmark exposes exactly where and why this happens.
Gradio lets you wrap any Python function in a browser UI in under 10 lines. Here's how the Interface class works, what components are available, and why it's the fastest way to build an ML proof of concept.
A practical walkthrough of building regularised regression models for house price prediction — from raw data preprocessing to comparing Ridge and Lasso, with residual diagnostics to validate the result.
An EDA of 2007–2011 lending data to identify the driving factors behind loan defaults — amount-to-income ratios, revolving utilisation, derogatory records, and loan purpose all tell a story.
How do you detect fraudulent users at scale when they look like everyone else? A walkthrough of using search-to-communication ratios, geographic patterns, and temporal behaviour to surface bot activity.