Quarto Santander presentation
---
title: "Hello Santander 👋"
format:
  revealjs:
    mermaid:
      theme: default
---
# Who am I?
* Iván Moreno ([ivan@ivan.build](mailto:ivan@ivan.build))
* (Ex) Lead ML Engineer at Nieve Consulting
* PhD candidate in AI Safety at Universidad Internacional de Valencia
* Full-time motorcycle & coffee nerd 🛵☕
# What do I do?
* Recommender systems
* Customer segmentation
* Demand forecasting
* Zero-knowledge model verification
* Autonomous systems & agentic workloads
* Unstructured data processing (text and image)
* Conversational AI
* MLOps & LLMOps strategy and implementation
# Academic Research 👨‍🔬
# No Answer Needed: Predicting LLM Answer Accuracy
---
![](no_answer_needed.png)
---
::: {.incremental}
* **Idea:** do LLMs know whether they’re going to fail before generating a single token of the answer?
* **Setup:** extract residual stream activations at the last token of the input.
* **Methodology:** train a linear probe (a linear classifier) to distinguish questions the model will answer correctly from those it will answer incorrectly.
* **Results:** we found a generalizable latent "correctness" direction encoded in the model's internals.
* [arxiv.org/abs/2509.10625](https://arxiv.org/abs/2509.10625)
:::
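---

A minimal sketch of the probing idea, assuming synthetic data: real residual stream activations are replaced here by random vectors with a planted "correctness" direction, and `d_model`, the sample count, and the training hyperparameters are all illustrative choices rather than the paper's setup. The probe is plain logistic regression trained with gradient descent.

```python
import numpy as np

def train_linear_probe(X, y, lr=0.1, epochs=500):
    """Fit a logistic-regression probe with batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(answer is correct)
        grad = p - y                            # gradient of the log loss w.r.t. the logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def probe_accuracy(X, y, w, b):
    """Fraction of examples whose sign(logit) matches the label."""
    return float((((X @ w + b) > 0).astype(int) == y).mean())

# Synthetic stand-in for last-input-token activations (assumption, not real model data).
rng = np.random.default_rng(0)
d_model, n = 64, 2000
direction = rng.normal(size=d_model)
direction /= np.linalg.norm(direction)   # planted unit-norm "correctness" direction
y = rng.integers(0, 2, size=n)           # 1 = model answers correctly, 0 = incorrectly
X = rng.normal(size=(n, d_model)) + np.outer(2.0 * (2 * y - 1), direction)

split = int(0.8 * n)
w, b = train_linear_probe(X[:split], y[:split])
test_acc = probe_accuracy(X[split:], y[split:], w, b)
cosine = float(w @ direction / np.linalg.norm(w))  # alignment with the planted direction
print(f"held-out accuracy: {test_acc:.3f}, cosine to planted direction: {cosine:.3f}")
```

On real activations, `X` would hold one vector per question taken at a chosen layer, and `y` would come from grading the model's generated answers.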
# Predictive Selection of Optimal Language for Reasoning Tasks in LLMs
---
![](lsk.png)
---
::: {.incremental}
* **Motivation:** recent empirical evidence shows non-English languages can match or exceed English performance on specific reasoning tasks.
* **Problem:** there is a large gap between standard voting across languages and the theoretical "upper bound", under which a question counts as solved if the correct answer appears in at least one language.
* **Idea:** predict the optimal language subset a priori by analyzing the activation geometry of a single forward pass.
:::
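---

To make the gap concrete, here is a toy calculation with made-up answers (the languages, questions, and outputs are purely illustrative, not results from the work). It compares majority voting across languages with the oracle "upper bound" that counts a question as solved when any language produced the gold answer.

```python
from collections import Counter

def majority_vote(answers):
    """Most common answer across languages (ties go to the first one seen)."""
    return Counter(answers.values()).most_common(1)[0][0]

def voting_and_oracle_accuracy(examples):
    """Return (majority-vote accuracy, oracle upper-bound accuracy)."""
    voted = sum(majority_vote(ex["answers"]) == ex["gold"] for ex in examples)
    oracle = sum(ex["gold"] in ex["answers"].values() for ex in examples)
    n = len(examples)
    return voted / n, oracle / n

# Hypothetical per-language model outputs for four questions.
examples = [
    {"gold": "12", "answers": {"en": "12", "es": "12", "zh": "10"}},
    {"gold": "7", "answers": {"en": "5", "es": "5", "zh": "7"}},
    {"gold": "3", "answers": {"en": "4", "es": "3", "zh": "9"}},
    {"gold": "8", "answers": {"en": "6", "es": "6", "zh": "6"}},
]

vote_acc, oracle_acc = voting_and_oracle_accuracy(examples)
print(f"voting: {vote_acc:.2f}, oracle upper bound: {oracle_acc:.2f}")
```

In this toy set the correct answer is present for three of the four questions, but voting recovers only one of them; picking the right language subset per input is what would close that gap.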
---
::: {.incremental}
* **Hypothesis:** "representational diversity" (measured via activation similarity kernels) correlates with accessing different, complementary subsets of parametric knowledge, or alternative reasoning paths.
* **Methodology:** analyze the geometry of activations using non-linear kernels (e.g., RBF) and select the most diverse language subset for a given input algorithmically (e.g., with a DPP, a determinantal point process).
:::
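---

One possible reading of the kernel-based selection step, sketched under assumptions: activations are random stand-ins, the language list is hypothetical, and the subset is chosen by greedy log-determinant maximization, a common approximation to selecting a diverse set under a determinantal point process. Whether this matches the actual method is an assumption.

```python
import numpy as np

def rbf_kernel(A, gamma=None):
    """Pairwise RBF similarities between the rows of A."""
    sq = np.sum(A**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * A @ A.T, 0.0)  # squared distances
    if gamma is None:
        gamma = 1.0 / A.shape[1]  # illustrative default bandwidth
    return np.exp(-gamma * d2)

def greedy_diverse_subset(K, k):
    """Greedily pick k items that maximize log det of the kernel submatrix."""
    K = K + 1e-6 * np.eye(len(K))  # small jitter keeps submatrices positive definite
    selected = []
    for _ in range(k):
        best, best_score = None, -np.inf
        for i in range(len(K)):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(K[np.ix_(idx, idx)])
            score = logdet if sign > 0 else -np.inf
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return selected

# Hypothetical per-language activations for one input: two near-duplicate pairs plus one outlier.
rng = np.random.default_rng(0)
base_a, base_b, base_c = rng.normal(size=(3, 16))
languages = ["en", "es", "zh", "ja", "ar"]
acts = np.stack([
    base_a,
    base_a + 0.01 * rng.normal(size=16),  # nearly identical to "en"
    base_b,
    base_b + 0.01 * rng.normal(size=16),  # nearly identical to "zh"
    base_c,
])

K = rbf_kernel(acts)
chosen = greedy_diverse_subset(K, k=3)
print([languages[i] for i in chosen])
```

Because near-duplicate rows make the kernel submatrix almost singular, the greedy step avoids picking two languages whose activations are nearly identical and ends up with one language per cluster.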
# Industry 👨‍🔧
# Text-to-object pipeline for CRM SaaS
## Challenge
* Custom query language in two variants, with no formal specification and no common representation
* Single-tenant hybrid (on-premise, multi-cloud) deployments
* Unique, large & evolving data schemas per tenant
* Multiple languages, geographies and industries with unique terminology & knowledge requirements
## Stack
* **State machine:** Apache Burr
* **LLMOps (prompt versioning, tracing, experiment tracking, feedback collection):** Langfuse
* **AI Gateway:** LiteLLM
* **Structured generation:** Pydantic + Instructor
* **Evaluation framework:** DeepEval
* **PII redaction:** Presidio
* **Schema processing:** Celery + Apache Hamilton
---
::: {.r-stretch}
![](c4.png)
:::
---
![](seq.png)