Every financial data API on the market was built for the same customer: a developer writing formulas in a spreadsheet, or a quant building a backtest in Python. The response format reflects this. You get raw OHLCV bars. Open, high, low, close, volume. Rows and rows of numbers. Maybe an RSI calculation on top.
This worked fine for decades. But the consumer has changed. Today, the fastest-growing use case for financial data is feeding it into LLMs. AI agents, MCP integrations, autonomous trading bots. And raw OHLCV is a terrible format for all of them.
The token math
Here's the cost of asking a basic question with raw data.
"Is AAPL oversold?"
To answer this with a traditional API, your agent needs historical price data to compute RSI. That means pulling daily bars. A single year of daily OHLCV for one ticker is 252 trading days. Each bar carries at minimum 6 fields: date, open, high, low, close, volume.
What one year of daily bars actually costs
A typical OHLCV bar in JSON looks like this:
```json
{
  "date": "2026-03-28",
  "open": 182.15,
  "high": 184.95,
  "low": 181.43,
  "close": 184.25,
  "volume": 62338249
}
```

That single bar is roughly 25 tokens. Multiply that across a full year:
| Metric | Value |
|---|---|
| Trading days per year | 252 |
| Fields per bar | 6 (date, open, high, low, close, volume) |
| Tokens per bar (JSON) | ~25 |
| Total tokens (1 ticker, 1 year) | ~6,300 |
| 10 tickers, 1 year | ~63,000 |
| 10 tickers + 3 indicators each | ~100,000+ |
And that is before the LLM does any reasoning. That is just the raw data sitting in context.
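The arithmetic above is easy to sanity-check. A rough sketch, using the common ~4 characters-per-token rule of thumb (actual tokenizer counts vary by model, so treat these as ballpark figures):

```python
import json

# The example bar from above. The token estimate uses the rough
# ~4 characters-per-token heuristic, not a real tokenizer.
bar = {
    "date": "2026-03-28",
    "open": 182.15,
    "high": 184.95,
    "low": 181.43,
    "close": 184.25,
    "volume": 62338249,
}

def estimate_tokens(obj, chars_per_token=4):
    """Ballpark token count for a JSON payload."""
    return len(json.dumps(obj)) / chars_per_token

per_bar = estimate_tokens(bar)   # roughly 25 tokens
per_year = per_bar * 252         # one ticker, one year of daily bars
```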
LLMs cannot compute indicators
Now your agent needs to actually compute RSI from that data. LLMs are not reliable calculators. GPT-4, Claude, Gemini: none of them can consistently apply a 14-period RSI formula across 252 data points and arrive at the correct answer. They approximate. They hallucinate intermediate steps. They get the math wrong in ways that are difficult to detect because the output looks plausible.
Here is what happens when you ask an LLM to compute RSI from raw close prices:
- It receives 252 bars of OHLCV (6,300 tokens consumed)
- It attempts to calculate 14-period average gains and losses
- It makes arithmetic errors in the running average (common with long sequences)
- It returns a plausible-looking number, say `RSI: 34.2`
- The actual RSI is 22.4. The agent makes a decision based on wrong data.
So you end up needing to compute the indicator yourself before passing it to the LLM. At which point you have built the exact infrastructure the API was supposed to save you from.
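Computing it yourself means code like the following — a minimal sketch of the standard 14-period Wilder RSI (names and structure are my own, not any particular provider's implementation):

```python
def wilder_rsi(closes, period=14):
    """RSI with Wilder's smoothing: seed with simple averages over the
    first `period` changes, then smooth each subsequent gain/loss."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closes")
    gains, losses = [], []
    for prev, curr in zip(closes, closes[1:]):
        change = curr - prev
        gains.append(max(change, 0.0))
        losses.append(max(-change, 0.0))
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window: RSI pegs at 100
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)
```

A deterministic twenty-line function, but it is exactly the kind of multi-step running-average arithmetic an LLM fumbles when asked to do it in-context.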
And that was just one indicator, for one ticker.
A typical agent workflow might monitor 10 tickers across RSI, MACD, trend direction, and support/resistance levels. With a raw data provider, you are looking at dozens of API calls and tens of thousands of tokens before the LLM even starts reasoning about what to do.
The menu problem
Some providers have noticed the token problem and started optimizing. Massive, for example, reduced their MCP tool definitions from 25,000 tokens down to 1,500 tokens. That is a real improvement. Fewer tokens describing what the API can do means the LLM spends less context on the menu and more on the task.
Tool definitions are the menu. Response payloads are the meal. You can shrink the menu and still serve a meal that overflows the table.
You can compress the menu to 1,500 tokens and still get back a response that dumps thousands of tokens of raw OHLCV data into context. The agent still has to process dense numerical arrays. It still cannot reliably compute a moving average from those arrays. The fundamental problem remains: the data format is wrong for LLMs.
Optimizing tool definitions is a good start. But the real leverage is in the response format.
Categories, not numbers
Consider the difference between these two responses to the question "What's the RSI situation for INTC?"
Raw numeric response
```json
{
  "indicator": "RSI",
  "period": 14,
  "values": [
    {"date": "2026-03-28", "rsi": 22.41},
    {"date": "2026-03-27", "rsi": 24.87},
    {"date": "2026-03-26", "rsi": 28.33},
    {"date": "2026-03-25", "rsi": 31.02}
  ]
}
```

The LLM receives these numbers and now has to reason about what they mean. Is 22.41 oversold? Very oversold? How does it compare to this ticker's history? Is this rare or common? The model might know that RSI below 30 is generally considered oversold, but it has no context about whether this particular ticker has been at this level before, or how long it typically stays there.
Categorical response
```json
{
  "ticker": "INTC",
  "rsi_zone": "deep_oversold",
  "days_in_oversold": 8,
  "historical_median_oversold_days": 4,
  "historical_max_oversold_days": 14,
  "condition_rarity": "very_rare",
  "condition_percentile": 2.1,
  "volume_context": "spike",
  "trend_context": "downtrend",
  "accumulation_state": "distribution",
  "sector": "Semiconductors",
  "valuation_zone": "deep_value"
}
```

The LLM doesn't need to compute anything. Every field is a pre-computed fact:

- `deep_oversold` is unambiguous. No threshold interpretation needed.
- `condition_rarity: "very_rare"` tells it this situation is unusual for this specific ticker.
- `days_in_oversold: 8` with `historical_median_oversold_days: 4` tells it this has persisted longer than typical.
- `volume_context: "spike"` confirms something is actively happening.
- `accumulation_state: "distribution"` indicates sellers are in control.
- `valuation_zone: "deep_value"` adds a fundamental dimension the agent can weigh against the technical picture.
The model can immediately start reasoning about the implications rather than trying to derive them from raw numbers.
Why categories work better for LLMs
This is the core idea behind categorical data for LLMs. Instead of passing numbers and expecting the model to interpret them, you pass pre-computed, labeled facts. The interpretation has already happened on the server, where it can be done deterministically.
| Property | Raw numbers | Categorical labels |
|---|---|---|
| Interpretation | LLM must derive meaning from values | Meaning is the data |
| Consistency | Different LLMs interpret thresholds differently | Same label, same meaning, every time |
| Historical context | Requires additional data + computation | Built into fields like condition_rarity |
| Multi-indicator synthesis | LLM must cross-reference multiple numeric arrays | All dimensions in one flat object |
| Failure mode | Silent math errors (plausible but wrong) | No math to get wrong |
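Server-side, the bucketing itself is trivial deterministic code. A sketch — the zone names follow the examples in this post, but the exact boundaries here are illustrative assumptions, not TickerDB's actual cutoffs:

```python
def rsi_zone(rsi: float) -> str:
    """Map a raw RSI value to a categorical zone label.
    Boundary values are illustrative assumptions."""
    if rsi < 25:
        return "deep_oversold"
    if rsi < 30:
        return "oversold"
    if rsi < 45:
        return "neutral_low"
    if rsi < 55:
        return "neutral"
    if rsi < 70:
        return "neutral_high"
    if rsi < 80:
        return "overbought"
    return "deep_overbought"
```

Run once on the server, the comparison is correct every time; run implicitly inside an LLM's context window, the same threshold check is unreliable.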
The scan-to-lookback workflow
The difference becomes even more dramatic in multi-step workflows. Consider a common agent pattern: scan the market for opportunities, then look back at historical precedents to validate the thesis.
Step 1: Scan for oversold assets
```shell
$ curl -G https://api.tickerdb.com/v1/search \
  --data-urlencode 'filters=[{"field":"asset_class","op":"eq","value":"stock"},{"field":"momentum_rsi_zone","op":"eq","value":"deep_oversold"}]' \
  -H "Authorization: Bearer YOUR_API_KEY"
```

```json
{
  "timeframe": "daily",
  "date": "2026-03-28",
  "fields": [
    "ticker",
    "asset_class",
    "momentum_rsi_zone",
    "extremes_days_in_condition",
    "extremes_condition_rarity",
    "extremes_condition_percentile",
    "volume_ratio_band",
    "volume_accumulation_state",
    "trend_direction",
    "fundamentals_valuation_zone"
  ],
  "filter_count": 2,
  "result_count": 1,
  "results": [
    {
      "ticker": "BGM",
      "asset_class": "stock",
      "momentum_rsi_zone": "deep_oversold",
      "extremes_days_in_condition": 12,
      "extremes_condition_rarity": "extremely_rare",
      "extremes_condition_percentile": 0.8,
      "volume_ratio_band": "extremely_high",
      "volume_accumulation_state": "strong_distribution",
      "trend_direction": "strong_downtrend",
      "fundamentals_valuation_zone": "deep_value"
    }
  ]
}
```

One API call. The agent immediately sees that BGM is in `deep_oversold` territory and that this condition is `extremely_rare` (0.8th percentile). But it also picks up contradicting context:
- Bullish case: `deep_oversold`, `deep_value`, `extremely_rare` condition
- Bearish case: `strong_downtrend`, `strong_distribution`, `volume_ratio_band: extremely_high` (active selling pressure)
Oversold and cheap, but with sellers in control. A classic potential value trap setup. An LLM can read these labels and immediately reason about the tension between them. With raw numbers, it would need to compute RSI, rank the condition historically, infer whether volume confirms the move, and derive the broader trend state. Each step is a potential point of failure.
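With labels, the agent-side logic stays trivial. A hypothetical triage step over a search result like the one above — the field names come from the example response, but the bull/bear groupings are my own illustration, not an official taxonomy:

```python
# Which categorical fields count as bullish vs. bearish evidence.
# Illustrative groupings only.
BULLISH = {
    "momentum_rsi_zone": {"oversold", "deep_oversold"},
    "fundamentals_valuation_zone": {"value", "deep_value"},
}
BEARISH = {
    "trend_direction": {"downtrend", "strong_downtrend"},
    "volume_accumulation_state": {"distribution", "strong_distribution"},
}

def triage(result: dict) -> tuple[list[str], list[str]]:
    """Split a categorical search result into bullish and bearish signals."""
    bull = [f for f, vals in BULLISH.items() if result.get(f) in vals]
    bear = [f for f, vals in BEARISH.items() if result.get(f) in vals]
    return bull, bear
```

Handing both lists to the LLM makes the value-trap tension explicit without a single numeric comparison.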
Step 2: Check historical precedent
The agent wants to know: what happened the last time BGM was in this condition?
```shell
$ curl "https://api.tickerdb.com/v1/summary/BGM?field=momentum_rsi_zone&band=deep_oversold&limit=5" \
  -H "Authorization: Bearer YOUR_API_KEY"
```

```json
{
  "ticker": "BGM",
  "field": "momentum_rsi_zone",
  "events": [
    {
      "date": "2025-11-14",
      "band": "deep_oversold",
      "prev_band": "oversold",
      "duration_days": 9,
      "aftermath": {
        "5d": {"performance": "slight_decline"},
        "10d": {"performance": "moderate_decline"},
        "20d": {"performance": "slight_decline"},
        "50d": {"performance": "moderate_decline"},
        "100d": {"performance": "sharp_decline"}
      }
    },
    {
      "date": "2025-06-03",
      "band": "deep_oversold",
      "prev_band": "neutral_low",
      "duration_days": 5,
      "aftermath": {
        "5d": {"performance": "slight_gain"},
        "10d": {"performance": "flat"},
        "20d": {"performance": "slight_decline"},
        "50d": {"performance": "moderate_decline"},
        "100d": {"performance": "sharp_decline"}
      }
    }
  ],
  "total_occurrences": 3,
  "query_range": "5y"
}
```

Two API calls total. TickerDB's summary event mode returns pre-computed aftermath data showing what actually happened after each historical occurrence. The pattern is clear:
| Event date | 5d | 10d | 20d | 50d | 100d |
|---|---|---|---|---|---|
| 2025-11-14 | slight decline | moderate decline | slight decline | moderate decline | sharp decline |
| 2025-06-03 | slight gain | flat | slight decline | moderate decline | sharp decline |
Even the brief 5-day bounce in June faded to a `sharp_decline` by 100 days. This is a stock that looks cheap but keeps getting cheaper. The value trap hypothesis is confirmed by actual historical data.
No indicator computation. No pulling years of price bars. No forward return calculations. The agent gets pre-computed facts and immediately starts reasoning about what to do.
What the raw data equivalent looks like
To answer the same question with a traditional provider like Alpha Vantage, your agent (or your code) would need to:
- Pull the full RSI history: `GET /query?function=RSI&symbol=BGM&outputsize=full`
- Scan the array for values below 20 (the deep_oversold threshold)
- Pull daily price bars in a separate call: `GET /query?function=TIME_SERIES_DAILY&symbol=BGM&outputsize=full`
- For each oversold crossing, compute forward returns at 5d, 10d, 20d, 50d, 100d
- Categorize those returns into performance bands
- Pass all of this derived data to the LLM
That is 2+ API calls returning thousands of rows, plus custom code for the entire aftermath pipeline. And you would need to build and maintain that pipeline for every indicator you want to look back on.
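The aftermath step alone looks something like this — a sketch of the forward-return pipeline you would have to own yourself (the performance-band thresholds are illustrative assumptions):

```python
def performance_band(ret: float) -> str:
    """Bucket a forward return into a coarse label (illustrative cutoffs)."""
    if ret <= -0.15:
        return "sharp_decline"
    if ret <= -0.05:
        return "moderate_decline"
    if ret < -0.01:
        return "slight_decline"
    if ret <= 0.01:
        return "flat"
    if ret < 0.05:
        return "slight_gain"
    if ret < 0.15:
        return "moderate_gain"
    return "sharp_gain"

def aftermath(closes: list[float], event_idx: int,
              horizons=(5, 10, 20, 50, 100)) -> dict[str, str]:
    """Forward returns after an event date, bucketed into bands."""
    base = closes[event_idx]
    out = {}
    for h in horizons:
        if event_idx + h < len(closes):  # skip horizons past end of data
            out[f"{h}d"] = performance_band(closes[event_idx + h] / base - 1.0)
    return out
```

And this is only the final step: before it runs, you still need years of bars per ticker, a daily indicator series, and crossing detection.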
Full comparison
Here is the same question answered both ways: "Is there an oversold stock worth looking at today, and what happened last time it was in this condition?"
| | Raw data provider | TickerDB |
|---|---|---|
| API calls | 10+ (scan universe, pull bars, compute RSI per ticker, pull historical bars, compute forward returns) | 2 (search + summary event mode) |
| Developer code required | RSI calculation, percentile ranking, forward return computation, performance bucketing | None |
| Tokens into LLM context | Thousands per ticker (raw bars + indicator arrays) | A fraction of that (categorical labels, pre-computed aftermath) |
| LLM computation | Interpret raw numbers, compare to thresholds, derive meaning | None. Labels are the meaning. |
| Risk of LLM math errors | High. RSI, MACD, and moving averages are multi-step calculations LLMs get wrong | Zero. Computation happens on the server. |
| Aftermath context | You build it. Pull years of bars, compute indicators daily, identify crossings, calculate forward returns, categorize performance. | Built in. One parameter. |
The gap is not just about token count. It is about where the computation happens. Raw data providers outsource the hard part to the LLM (or to you). Categorical data handles it on the server, where deterministic code can do it correctly every time.
What this means for agents
If you are building an AI agent, MCP integration, or any LLM-powered system that needs financial data, the format of that data matters more than the breadth of the API.
An API with 200 endpoints returning raw numbers will underperform an API with a handful of endpoints returning the right abstractions. Your agent does not need 252 daily close prices. It needs to know that the stock is in a `strong_uptrend` with decelerating momentum and an active volatility squeeze.
Consider what a well-structured categorical response gives your agent in a single call:
- Trend state: `direction`, `ma_alignment`, `volume_confirmation`
- Momentum: `rsi_zone`, `macd_state`, `divergence_detected`
- Extremes: `condition_rarity`, `condition_percentile`, `days_in_condition`
- Volatility: `regime`, `squeeze_active`, `regime_trend`
- Volume: `accumulation_state`, `climax_detected`
- Fundamentals: `valuation_zone`, `growth_zone`, `earnings_proximity`
All of that in one flat JSON object. No arrays of 252 bars. No multi-step indicator pipelines. No room for LLM math errors. Here is what that looks like in practice:
```json
{
  "ticker": "AAPL",
  "trend": {
    "direction": "uptrend",
    "duration_days": 18,
    "ma_alignment": "aligned_bullish",
    "volume_confirmation": "confirmed"
  },
  "momentum": {
    "rsi_zone": "neutral_high",
    "macd_state": "contracting_positive",
    "direction": "decelerating",
    "divergence_detected": false
  },
  "volatility": {
    "regime": "normal",
    "squeeze_active": false,
    "regime_trend": "stable"
  },
  "volume": {
    "accumulation_state": "accumulation",
    "climax_detected": false
  },
  "fundamentals": {
    "valuation_zone": "fair_value",
    "growth_zone": "moderate_growth",
    "earnings_proximity": "this_month"
  }
}
```

One call. Every field is a label the LLM can reason about directly. An agent reading this response can immediately synthesize: AAPL is in an uptrend with bullish alignment, but momentum is decelerating. Volatility is stable with no squeeze. Accumulation is healthy. Fair value with moderate growth and earnings coming this month. That is useful analysis from a single HTTP request.
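The synthesis step is then string assembly rather than math. A hypothetical reduction an agent might run before prompting the model (field names taken from the snapshot response above):

```python
def summarize(snapshot: dict) -> str:
    """Collapse a categorical snapshot into one sentence for the prompt."""
    trend, momentum = snapshot["trend"], snapshot["momentum"]
    return (
        f"{snapshot['ticker']}: {trend['direction']} "
        f"({trend['ma_alignment']}, volume {trend['volume_confirmation']}), "
        f"momentum {momentum['direction']} in the {momentum['rsi_zone']} RSI zone"
    )
```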
Categorical data is not a simplification. It is the correct abstraction layer for LLM consumption. The same way you would not pass raw pixel data to a language model and ask it to identify objects, you should not pass raw OHLCV data and ask it to identify market conditions.
The right data format eliminates an entire class of errors, reduces token usage, and lets the model focus on what it is actually good at: reasoning about labeled facts and making decisions.
TickerDB turns raw market data into categorical market intelligence for APIs, MCP clients, and AI agents. Read the docs or try it free.