Projects — Sameer Maurya

Featured

LLM Eval Framework

SR 11-7 & EU AI Act compliant LLM validation suite for financial services

A local-first auditing suite that automates the generation of the Model Risk Management artifacts regulators require, converting raw LLM outputs into compliant evidence. Built on experience validating LLMs at financial institutions. Features include adversarial red-teaming (50+ templates), SHAP/LIME token attribution, Financial-F1 accuracy scoring, and a YAML-based regulatory mapping registry.

Python · LLMs · Model Risk · SR 11-7 · EU AI Act · HuggingFace

GitHub ↗ Live Demo ↗

Housing Valuation Engine

Production ML app: Ames, Iowa and Delhi NCR price prediction

A production-grade real estate prediction portal with two model families: a log-transformed LassoCV model for Ames, Iowa, with SHAP attributions and MRM diagnostics, and per-region Gradient Boosting models for Delhi NCR (Gurgaon, Noida, Delhi) with a Folium map and a Buy/Rent toggle. Raw data never leaves the machine; the app is local-first by design.

Python · scikit-learn · Streamlit · SHAP · Folium · LassoCV

GitHub ↗

OCR Evaluation Suite

Multi-engine OCR benchmark with audit trail reporting

"In OCR, 'it reads' is not a valid test result. Accuracy is the only metric." A production-grade OCR framework built around PaddleOCR with F1 score evaluation, Character Error Rate (CER) based on Levenshtein distance, and timestamped audit trail reports. It replaces black-box OCR implementations with measurable, auditable evidence, which is critical for document intelligence in regulated industries.
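As a rough illustration of the CER metric mentioned above, the sketch below computes Levenshtein distance with a standard dynamic-programming table and divides it by the reference length. The function names `levenshtein` and `character_error_rate` are hypothetical and are not taken from the project's code, which may normalise text or handle empty references differently.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    if len(ref) < len(hyp):
        ref, hyp = hyp, ref  # distance is symmetric; keep the inner row short
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference (assumed definition)."""
    if not reference:
        raise ValueError("reference text must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Illustrative strings only: two character confusions in a ten-character reference.
cer = character_error_rate("Total: 100", "Tota1: 1O0")
```

Because CER is normalised by the reference length, it can exceed 1.0 when the OCR output is much longer than the ground truth.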

Python · PaddleOCR · Tesseract · OpenCV · Docker

GitHub ↗ Live Demo ↗

Regression Analysis

Linear regression deep-dive — Ridge, Lasso, MRM diagnostics

An end-to-end regression study on the Ames, Iowa housing dataset (2,919 sales, 80 features). Covers EDA, feature engineering, Ridge vs Lasso comparison, SHAP attributions, and Model Risk Management diagnostics aligned with SR 11-7 — VIF, Durbin-Watson (2.03), Breusch-Pagan, and Jarque-Bera. Companion notebook for a Medium article.
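For readers unfamiliar with the Durbin-Watson statistic listed above, here is a minimal sketch of how it is computed from a sequence of regression residuals. The function name and the toy residuals are illustrative assumptions, not the notebook's code. In practice statsmodels provides `statsmodels.stats.stattools.durbin_watson`, which computes the same quantity.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    residuals = list(residuals)
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    numerator = sum(
        (curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:])
    )
    return numerator / denominator


# Toy residuals (not from the Ames study): perfectly alternating signs
# push the statistic towards its upper range, identical values give 0.
alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
constant = durbin_watson([1.0, 1.0, 1.0, 1.0])
```

The statistic ranges from 0 to 4, and values close to 2, such as the 2.03 reported here, suggest little first-order autocorrelation in the residuals.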

Python · scikit-learn · SHAP · statsmodels · Jupyter

GitHub ↗

Lending Business Case Study

EDA on 2007–2011 lending data to identify default drivers

Exploratory data analysis of a real lending portfolio to identify the key factors behind loan defaults: amount-to-income ratio, revolving utilisation, derogatory records, and loan purpose. The study uses univariate, bivariate, and trivariate analysis. The overall default rate is 14.16%, with Spain flagged as the highest-risk geography at 18.31%.
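The kind of segment-level default rate described above can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the field names (`loan_amount`, `annual_income`, `defaulted`), the band thresholds, and the sample records are invented and do not come from the case study's data.

```python
from collections import defaultdict


def default_rate_by_segment(loans, key):
    """Share of defaulted loans within each segment returned by key(loan)."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for loan in loans:
        segment = key(loan)
        totals[segment] += 1
        defaults[segment] += int(loan["defaulted"])
    return {segment: defaults[segment] / totals[segment] for segment in totals}


def income_ratio_band(loan):
    """Bucket a loan by amount-to-income ratio (thresholds are hypothetical)."""
    ratio = loan["loan_amount"] / loan["annual_income"]
    if ratio < 0.10:
        return "low"
    if ratio < 0.25:
        return "medium"
    return "high"


# Invented sample records, for illustration only.
sample_loans = [
    {"loan_amount": 5000, "annual_income": 80000, "defaulted": False},
    {"loan_amount": 4000, "annual_income": 60000, "defaulted": False},
    {"loan_amount": 10000, "annual_income": 50000, "defaulted": False},
    {"loan_amount": 12000, "annual_income": 60000, "defaulted": True},
    {"loan_amount": 20000, "annual_income": 40000, "defaulted": True},
    {"loan_amount": 15000, "annual_income": 45000, "defaulted": False},
]

rates = default_rate_by_segment(sample_loans, income_ratio_band)
```

Passing a different `key` function, for example one that returns the loan purpose, gives the same breakdown along another dimension.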

Python · Pandas · EDA · Risk Analysis · Jupyter

GitHub ↗