Taktile Labs

AI is coming to financial services. Let’s make sure we can trust it.

Taktile Labs is an applied research group focused on making AI agents reliable, explainable, and production-ready for regulated financial institutions. We publish benchmarks, build evaluation frameworks, and conduct research grounded in realistic banking data.

Bridging the gap between frontier AI and regulated deployment.

Financial services will spend $97B on AI by 2027. Yet the distance between what general-purpose AI can do and what regulated institutions can reliably deploy remains vast. Taktile Labs exists to close this gap.

Focus on AI adoption within critical decision-making.

Every day, financial institutions run on a current of critical decisions, from determining if a transaction is fraudulent to calculating the amount of risk to accept when underwriting a loan. At Taktile Labs, we research what it takes to deploy AI in high-stakes decision-making, where errors aren’t an option.

Powered by high-quality, realistic data.

We work closely with development partners and industry experts to create high-quality evaluation data sets, which we use to assess AI performance in financial services-specific contexts.


What we’re working on.

We’re pursuing five research tracks focused on the most important requirements for trustworthy AI in regulated financial institutions.

01

Evaluations & Benchmarking

In collaboration with our development partners, we use real-world data to build trusted benchmarks for model performance in core financial services use cases. Every benchmark is designed around the KPIs business teams care about most: accuracy, cost per decision, and latency.

02

Human-Agent Design Patterns

Not every decision should be fully automated, and not every decision requires a human to intervene. We pursue balanced research that helps teams unlock the benefits of AI in complex decision-making while preserving the value of human judgment and engagement.

03

Governance, Risk & Compliance

Agentic systems based on LLMs are stochastic and hard to inspect, which challenges the assumptions behind SR 11-7 (the Federal Reserve's supervisory guidance on model risk management) and traditional model risk management more broadly. We work with institutions and regulators to clarify what responsible adoption looks like in practice.

04

Foundation Models for Financial Data

Foundation models are trained on public text, but financial decisioning runs on data with rich sequential and relational structure. We explore transformer-based architectures purpose-built for common data structures in financial services.

05

Hybrid Decision Architectures

The most effective AI systems will be built using hybrid architectures. We help financial institutions navigate AI hype with clarity and choose the right tool for each task while balancing cost, risk, and performance.

Discover our benchmark for financial spreading.

FinSpread-Bench evaluates how well agentic AI systems can extract, calculate, and reason across financial documents in realistic decisioning scenarios. Built with anonymized results from our development partners.

Model              Extraction model  Tooling             Field match rate
GPT-5.2            Gemini 3.1 Pro    All tools           96.5%
GPT-5.2            Gemini 2.5 Pro    All tools           96.5%
GPT-5.2            Gemini 2.5 Flash  All tools           96.2%
Gemini 3.1 Pro     Gemini 3.1 Pro    All tools           95.9%
GPT-5.2            Gemini 3 Flash    All tools           95.2%
Claude Opus 4.6    Gemini 3.1 Pro    All tools           94.3%
GPT-5              Gemini 3.1 Pro    All tools           94.1%
GPT-5.2            Gemini 3.1 Pro    No calculator tool  94.0%
Gemini 2.5 Pro     Gemini 3.1 Pro    All tools           83.0%
Claude Sonnet 4.5  Gemini 3.1 Pro    All tools           54.6%
Claude Haiku 4.5   Gemini 3.1 Pro    All tools           34.1%

Human baseline: 89%
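
The leaderboard reports a field match rate, but this page does not define how it is scored. As a rough illustration only, one plausible reading is the share of expected fields in a spread for which the system's value matches the reference. The sketch below assumes that reading; the function name, the tolerance, the string normalization, and the sample fields are all hypothetical and are not taken from FinSpread-Bench.

```python
from math import isclose


def field_match_rate(predicted: dict, expected: dict, rel_tol: float = 1e-6) -> float:
    """Return the share of expected fields whose predicted value matches.

    Assumption: numeric fields match within a relative tolerance and all
    other fields match after trimming whitespace and lowercasing. The real
    benchmark may score fields differently.
    """
    if not expected:
        raise ValueError("expected must contain at least one field")

    matches = 0
    for name, want in expected.items():
        got = predicted.get(name)
        if got is None:
            # A missing field counts as a miss.
            continue
        if isinstance(want, (int, float)) and isinstance(got, (int, float)):
            matches += isclose(got, want, rel_tol=rel_tol)
        else:
            matches += str(got).strip().lower() == str(want).strip().lower()
    return matches / len(expected)


# Hypothetical spread: three of the four fields match the reference.
expected = {"revenue": 1200.0, "cogs": 700.0, "gross_profit": 500.0, "currency": "USD"}
predicted = {"revenue": 1200.0, "cogs": 710.0, "gross_profit": 500.0, "currency": "usd"}
rate = field_match_rate(predicted, expected)
print(f"{rate:.1%}")
```

Under this definition, a figure such as 96.5% would mean that roughly 96 or 97 of every 100 reference fields were reproduced, and the human baseline would be scored the same way against the same reference values.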

Meet our team.

Taktile Labs is powered by consistent collaboration between a dedicated internal research team and an external Research Council and Advisory Board.

Advisory Board

Parag Agrawal, Tom Glocer, Harald Schneider, Karim Lakhani, Robin Greenwood, Tina Reich, Jill Zucker Sheckman

Research Council

Fagner Abreu, Ben Liebald, Daniel Meyer, Jonas Nelle, Mikey Shulman, Henry Venturelli, Pieter Viljoen, Michael Zambrano

Research Team

David Ahn; Maximilian Eber, PhD; Nico Klees; Alexia Pastré; Fabian Peters, PhD; Robin Raymond, PhD

Browse our published work.

Benchmarks, technical reports, and research.

Benchmark, February 2026

FinSpread-Bench: Evaluating Agentic AI for Financial Spreading

Nico Klees, Maximilian Eber, PhD

The first public benchmark for agentic financial document processing. Evaluates extraction accuracy, cross-document reasoning, calculation correctness, and structured output quality across seven frontier models.

Paper, coming Q1 2026

AI in AML: Understanding the New Model Risk Mandate for Banks and Fintechs

Dustin Eaton, Maximilian Eber, PhD

Why AML teams must now apply model risk management (MRM) standards to AI systems. Published in ACAMS Today, this paper explores how regulators are extending MRM frameworks to AI deployed in compliance functions, and what institutions need to do to prepare.