---
title: Structured Data Analysis
position: 8
---

## The Problem: LLMs Can't Do Math at Scale

LLMs are powerful reasoners, but they have a fundamental limitation: they struggle with mathematical and analytical tasks. A classic example is asking a model *"How many r's are in strawberry?"* - it confidently answers two, when the correct answer is three. Without code execution, the model is guessing at character counts rather than actually counting. Ask it to verify with code, and it gets it right every time.

Now scale that problem up. If an LLM can't reliably count letters in a single word, imagine asking it to compute an NPS score across 2,000 survey responses, or total ARR by industry across 500 accounts. It will summarize from a tiny sample, hallucinate totals, or silently truncate the data. The results look plausible but aren't trustworthy. The Reasoning Engine needs a different tool for the job.

## The Solution: Code Interpreter

The answer is to let the Reasoning Engine do what it's good at: understanding the question and writing code, then executing that code deterministically against the full dataset. Instead of the Reasoning Engine trying to reason over 500 ARR values, it writes a Python script that computes the exact total and hands the result back. This is called the **Code Interpreter**, and it's a built-in skill of the Reasoning Engine. In the Moveworks platform, this capability is called **Structured Data Analysis (SDA)**.

## What is SDA?

SDA is the **Reasoning Engine's Code Interpreter capability**. When a plugin returns structured data that's too large for inline summarization, the Reasoning Engine automatically writes and executes Python code against the full dataset - performing aggregations, groupings, filters, charts, and more - then synthesizes the result into a final user-facing message.

SDA activates automatically. After your plugin's final action completes, the output is checked against the **7K token limit**.
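As a rough rule of thumb (assuming about 4 characters per token, so 7K tokens is roughly 28KB of serialized JSON), you can estimate whether a payload will cross that threshold. This is a sketch for intuition only, not the platform's actual tokenizer:

```python
import json

# Illustrative heuristic: ~4 characters per token, 7K tokens ~ 28KB of JSON.
CHARS_PER_TOKEN = 4
TOKEN_LIMIT = 7_000

def estimated_tokens(payload: dict) -> int:
    """Roughly estimate the token footprint of a plugin output payload."""
    return len(json.dumps(payload)) // CHARS_PER_TOKEN

# A payload of 500 accounts with only a few fields already crosses the line.
payload = {"accounts": [{"account_name": f"Account {i}",
                         "annual_recurring_revenue": 250000,
                         "churn_rate": 8.2} for i in range(500)]}

print(estimated_tokens(payload) > TOKEN_LIMIT)  # this payload would route to SDA
```
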
If the output exceeds that limit and is valid structured JSON, SDA takes over. There's nothing to enable, wire up, or configure.

| Without SDA | With SDA |
| -------------------------------------------------------- | ------------------------------------------------------ |
| Limited to \~7K tokens of inline data | Handles thousands of rows via code execution |
| Reasoning Engine attempts to summarize, often truncating | Writes targeted Python - aggregations, filters, charts |
| Users hit "too much data" errors on large datasets | Transparent to the user - they just get an answer |

Consider a plugin that returns 2,247 NPS survey responses. Without SDA, the Reasoning Engine sees only a handful of records that fit within its context window. It produces a surface-level summary with no actual NPS calculation, no regional breakdown, no theme frequency counts. With SDA, the same plugin and same dataset produce a complete statistical analysis: overall NPS score, promoter/passive/detractor splits by region, comment theme extraction with frequencies, and actionable signals - all computed from the full 2,247 records.

SDA triggers based on output size alone - there's no configuration needed. Your only job is to return well-structured JSON. This guide covers how to do that effectively.

## How SDA Works

### Data Flow Architecture

Every plugin response follows one of three paths based on its size:

1. **≤ 7K tokens:** The Reasoning Engine summarizes the data. No code execution involved.
2. **> 7K tokens:** The output routes to SDA. SDA samples a snippet of the dataset to understand its structure, writes Python code against the full data, executes it, and hands the result back to the Reasoning Engine for the final user-facing message.
3. **> 100MB:** Hard failure - the entire plugin response is rejected and the user sees an error.

The key implication: your plugin doesn't decide whether SDA runs. The **size of the final output** decides.
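The three paths can be sketched in plain Python. The constants and function here are illustrative, not platform APIs, and token counts use a rough 4-characters-per-token estimate:

```python
import json

TOKEN_LIMIT = 7_000                      # ~28KB; the routing threshold
MAX_RESPONSE_BYTES = 100 * 1024 * 1024   # 100MB hard ceiling

def route(output: str) -> str:
    """Illustrative sketch of the three size-based paths."""
    if len(output.encode("utf-8")) > MAX_RESPONSE_BYTES:
        return "rejected"                 # hard failure: the user sees an error
    if len(output) // 4 <= TOKEN_LIMIT:   # rough ~4 chars/token estimate
        return "inline_summarization"     # Reasoning Engine handles it alone
    try:
        json.loads(output)                # must be valid structured JSON
    except ValueError:
        return "inline_summarization"     # SDA won't activate on raw text
    return "sda"                          # Code Interpreter writes Python

print(route('{"ok": true}'))                              # small payload stays inline
print(route(json.dumps({"rows": list(range(50_000))})))   # large valid JSON routes to SDA
```
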
Two conditions must both be true for SDA to activate:

1. **Output exceeds 7K tokens.** Below this threshold, the Reasoning Engine handles everything with reasoning alone.
2. **Output is valid structured JSON.** SDA will not activate on raw text, semi-structured content, or data embedded inside a string field (e.g., `"data": "{\"accounts\": [...]}"`). If your source API returns stringified JSON, parse it in your output mapper before returning it.

### Critical Limits

| Threshold | Value | What Happens | Why It Matters |
| ----------------- | ------------------ | ----------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------- |
| **Token limit** | 7K tokens (\~28KB) | SDA activates | This is the routing threshold. Below it, the Reasoning Engine handles your data with reasoning alone. Above it, the Code Interpreter takes over. |
| **Response size** | 100MB | Hard failure - the plugin returns nothing | The entire response is rejected. |

## Designing Your Output

SDA can only work with what it receives. It doesn't know what your plugin *could* return; it only sees what it *did* return, plus any metadata you attached. That makes output design the highest-leverage area for improving SDA quality.

### Include All Data SDA Needs in the Final Output

SDA receives only the output of your plugin's **action activities**. It can't see intermediate results inside compound actions, it can't call APIs, and it can't ask for more data. If a field isn't in the final output, SDA will either skip it or produce an incorrect analysis.

This matters most for multi-step plugins. If your process fetches accounts from one API and enrichment data from another, make sure the final action merges both into a single output.
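In plain Python terms, that merge is just attaching the shared context to every row. This sketch uses illustrative data (a user list from one action, company context from another) and mirrors the platform mapper shown next:

```python
# Output of Action A: a list of users from one API (illustrative data)
raw_user_list = [
    {"user_id": "u1", "name": "ada lovelace", "username": "ada"},
    {"user_id": "u2", "name": "alan turing", "username": "alan"},
]

# Output of Action B: company context from a second API (illustrative data)
company_context = {"primary_hq_address": "New York", "email_domain": "example.com"}

# The final output merges both sources into one flat list,
# so SDA receives everything it needs in a single payload.
processed_employee_list = [
    {
        "id": user["user_id"],
        "display_name": user["name"].title(),
        "office_location": company_context["primary_hq_address"],
        "email_link": f"mailto:{user['username']}@{company_context['email_domain']}",
    }
    for user in raw_user_list
]
```
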
Use `MAP()` to combine data from multiple actions:

```yaml
return:
  output_mapper:
    processed_employee_list:
      MAP():
        # Iterate over the output of Action A (List of users)
        items: data.raw_user_list
        converter:
          # Map fields from the current item
          id: item.user_id
          display_name: item.name.$TITLECASE()
          # Merge data from Action B (Company Context) into every item
          office_location: data.company_context.primary_hq_address
          # Create a derived field combining both sources
          email_link: $CONCAT(["mailto:", item.username, "@", data.company_context.email_domain], "")
```

In this example, `data.raw_user_list` comes from one action and `data.company_context` comes from another. The final output merges both so SDA has everything it needs in a single flat list.

### Use Friendly Field Names

SDA uses field names as its primary signal for understanding what each column represents. A name like `churn_rate` is self-documenting; `ChurnRate__c` forces the model to infer meaning from a system-generated suffix. Friendly names also improve the user experience - these names surface in messages and charts.

| System Name | Friendly Name |
| ------------------- | --------------------------- |
| `ARR__c` | `annual_recurring_revenue` |
| `MRR__c` | `monthly_recurring_revenue` |
| `NPS_Score__c` | `net_promoter_score` |
| `ChurnRate__c` | `churn_rate` |
| `Territory__c` | `territory` |
| `NumberOfEmployees` | `employees` |
| `LastModifiedDate` | `date_last_modified` |

### Flatten Nested Structures

SDA works best with tabular data - flat rows and columns that map cleanly to a dataframe. Deeply nested JSON forces SDA to write extra code to unpack the structure before it can even start the analysis, and that extra complexity increases the chance of errors.
**Bad - Nested with cryptic API field names:**

```json
{
  "d": {
    "results": [
      {
        "rec_id": "001A00000",
        "attr": {
          "n1": "Acme Corp",
          "loc": { "line1": "New York", "geo_cd": "US-NY" }
        },
        "fin": { "amt1": 250000, "amt2": 20833, "pct_r": 8.2 },
        "_metadata": { "sys_mod": "2024-01-15", "rec_type": "enterprise" }
      }
    ]
  }
}
```

**Good - Flat with friendly names:**

```json
{
  "accounts": [
    {
      "account_name": "Acme Corp",
      "city": "New York",
      "region": "US-NY",
      "annual_recurring_revenue": 250000,
      "monthly_recurring_revenue": 20833,
      "churn_rate": 8.2
    }
  ]
}
```

### Complete Output Mapper Example

Here's a production-ready output mapper that ties together everything above - friendly names, flat structure, and `display_instructions_for_model` for baseline analysis rules:

```yaml
MAP():
  items: response.accounts
  converter:
    account_name: item.Name
    annual_recurring_revenue: item.ARR__c
    monthly_recurring_revenue: item.MRR__c
    net_promoter_score: item.NPS_Score__c
    churn_rate: item.ChurnRate__c
    territory: item.Territory__c
    industry: item.Industry
    employees: item.NumberOfEmployees
    next_renewal_date: item.NextRenewalDate__c
    open_opportunities: item.OpenOpportunities__c
    closed_won_opportunities: item.ClosedWonOpportunities__c
    support_tickets: item.SupportTickets__c
display_instructions_for_model: >
  CRM accounts. annual_recurring_revenue is USD. monthly_recurring_revenue is USD.
  churn_rate is 0-100%. Group by territory when comparing regions.
  Flag churn_rate > 15 with renewal in 90 days as at-risk.
```

## Steering SDA with Instructions

You steer SDA's behavior at two levels: the **plugin name + description** and the `display_instructions_for_model` field (which steers SDA on how to interpret the data once it has it).

### Name and Describe Your Plugin Effectively

SDA uses your plugin's **name** and **description** as context when writing code against your data. A descriptive name like `NPS_Score_Analysis` immediately tells SDA what domain it's working in and what kinds of calculations make sense.
A vague name like `excel_data_plugin` gives it nothing to work with.

The description matters even more. When your description lists the fields your plugin returns, SDA can map the user's question to the right columns and choose appropriate analysis methods. If your description says "Returns insights on NPS score with account owner, internal scoring, customer, and region," SDA knows to look for score fields, group by region, and segment by customer. If your description just says "Returns data from excel," SDA lacks critical context.

| Quality | Plugin Name | Plugin Description |
| -------- | -------------------- | ------------------------------------------------------------------------------------------ |
| **Good** | `NPS_Score_Analysis` | "Returns insights on NPS score with account owner, internal scoring, customer, and region" |
| **Bad** | `excel_data_plugin` | "Returns data from excel" |

The bad example may fail a question like *"Which accounts have the highest churn rate?"* - nothing in the description signals that churn data is available.

### Use display\_instructions\_for\_model for Baseline Rules

Once the Reasoning Engine routes to your plugin, SDA decides what code to write based on two inputs: the **user's question** and the `display_instructions_for_model` field attached to your data. Think of the user's question as *what* to compute and the instructions as *how* to interpret the data while computing it.

The distinction matters. If you put `"Calculate total ARR by industry"` in your instructions, SDA might execute that calculation even when the user asks a completely different question. Instructions should define **baseline rules** that apply regardless of the query, not specific analyses.
Stick to rules that should hold true for **every** query against this dataset:

* **Units and formats** - `"annual_recurring_revenue is USD"`, `"churn_rate is 0-100%"`
* **Default groupings** - `"Group by territory when comparing regions"`
* **Business rules** - `"Flag churn_rate > 15 with renewal in 90 days as at-risk"`

```yaml
display_instructions_for_model: >
  CRM accounts. annual_recurring_revenue is USD. churn_rate is 0-100%.
  Group by territory. Flag churn_rate > 15 with renewal in 90 days as at-risk.
```

### What NOT to Put in Instructions

Don't use instructions to dictate specific analyses. That's the user's job - their question tells SDA what to compute. Instructions that overlap with user intent create conflicts, and SDA may ignore one or both.

| Quality | Instruction Example | Why |
| -------- | ----------------------------------------- | ----------------------------------------------------------------------------------- |
| **Good** | `"annual_recurring_revenue is USD"` | Tells SDA the unit - applies to any query involving revenue |
| **Good** | `"Flag churn_rate > 15 as at-risk"` | Business rule that should surface in every analysis touching churn |
| **Bad** | `"Calculate total ARR by industry"` | If the user asks about churn by territory, SDA now has conflicting directives |
| **Bad** | `"Please fuzzy match Telecam to Telecom"` | One-off data quality issues belong in your data pipeline, not in model instructions |

Instructions that tell the model *how* to analyze (e.g., `"fuzzy match Telecam to Telecom"`) are unreliable. SDA may skip them entirely or, worse, apply them in contexts where they don't make sense. Fix data quality issues upstream in your output mapper or API call, not in instructions.

## Size Optimization

Your plugin's response has a hard ceiling (100MB), and every field you return eats into that budget. But size isn't just about avoiding failures. Smaller, focused datasets also produce better analysis.
When SDA has 15 fields to work with instead of 50, it writes simpler code with fewer opportunities for error. Optimize at three levels: pre-filter at the API, prune the schema, and post-filter with DSL.

A 500-row dataset with 30 fields is dramatically larger than the same 500 rows with 10 fields. Aggressive pruning can be the difference between a dataset that fits the size limits and one that doesn't, and it can shave seconds off your plugin's run time.

### Pre-filtering at the API Level

The cheapest data is data you never fetch. Use plugin inputs to narrow the query **before** hitting the external API. You can define multiple filter inputs on the same plugin and mark them as optional, so the plugin works broadly but can be scoped when the user provides constraints:

* **Date ranges** - `start_date` and `end_date` to bound time-series queries
* **Current user** - `assigned_to` the requesting user, so each person sees only their records
* **Numeric thresholds** - e.g., only accounts with more than 500 employees
* **Optional inputs** - Mark all of the above as optional so the plugin still works when the user doesn't specify a filter

### Schema Pruning with MAP()

`MAP()` lets you project only the fields SDA needs from the raw API response. Drop system metadata, internal IDs, audit timestamps, and long text blobs. If SDA won't reference a field in a calculation or summary, it shouldn't be in the output.

```yaml
MAP():
  items: response.accounts
  converter:
    account_name: item.Name
    annual_recurring_revenue: item.ARR__c
    territory: item.Territory__c
    churn_rate: item.ChurnRate__c
    # Omitted: Id, SystemModstamp, CreatedById,
    # LastModifiedById, Notes__c, Description, etc.
```

### Post-filtering with FILTER() and Other Data Mapper Syntax

When the source API doesn't support server-side filtering, use DSL and Data Mapper expressions to remove records inside your plugin's process before the output reaches SDA:

```yaml
FILTER():
  items: data.accounts
  condition: item.annual_recurring_revenue > 100000
```

## Debugging and Validation

After launching your plugin, verify SDA is working correctly:

1. **Look for the "Running analysis" indicator.** When SDA activates, the chat shows a progress indicator during code execution. If you don't see it, your output is likely under the 7K token threshold or isn't valid structured JSON.
2. **Check the code citation.** SDA generates a reference labeled "Analysis Code" alongside its response. Expand it to see the exact Python code that ran - including how it parsed your fields, what aggregations it performed, and whether it interpreted your data correctly. This is your primary debugging tool.
3. **Test across question types.** The same dataset should support different analyses depending on the user's question. Ask for totals, groupings, trends, and filters to confirm SDA handles your schema correctly across all of them.

## Common Pitfalls

| Pitfall | What Goes Wrong | Fix |
| ------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------- |
| Incomplete final output | Your plugin fetches accounts in step 1 and enrichment data in step 2, but only step 2's output reaches SDA - so SDA can't answer questions about the account fields. | Merge all data into the final action's output (a compound action may be needed). Verify it contains every field the user might ask about. |
| System field names (`ARR__c`) | SDA sees `ARR__c` and doesn't know if it's annual revenue, an account type, or a status code. It may write code that treats it as a string or ignores it entirely. | Use `MAP()` to rename: `annual_recurring_revenue: item.ARR__c` |
| Deeply nested JSON | SDA has to write extra code to unpack `response._sfdcMeta.rec_obj.info` before it can start the actual analysis. More code means more room for bugs. | Flatten to `{ "city": "NY" }`. Keep nesting only where it's semantically meaningful. |
| Data serialized as a string | Your API returns data inside a string field (`"data": "{\"accounts\": [...]}"`) instead of structured JSON. SDA can't parse or operate on it. | Return real JSON objects and arrays. If the source API returns stringified data, parse it in your output mapper. |
| Analysis logic in instructions | You add `"Calculate total ARR by industry"` to instructions. A user asks about churn by territory, and SDA now has two conflicting directives. | Instructions are for baseline rules (units, formats, thresholds). Let the user's question drive the analysis. |
| Multiple overlapping plugins | Three plugins cover the same Salesforce accounts with different filters. The Reasoning Engine picks `GetEnterpriseAccountsPlugin` when the user wanted all accounts. | One plugin per data source. Let SDA and user queries handle the filtering. |
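For the stringified-data pitfall, the upstream fix is a single parse before you build the final output. A minimal Python sketch with illustrative field names:

```python
import json

# What some source APIs return: JSON serialized inside a string field.
api_response = {
    "data": "{\"accounts\": [{\"account_name\": \"Acme Corp\", \"churn_rate\": 8.2}]}"
}

# SDA cannot operate on the string form, so parse it before returning.
parsed = json.loads(api_response["data"])
output = {"accounts": parsed["accounts"]}  # real objects and arrays, not a string

print(output["accounts"][0]["account_name"])  # Acme Corp
```
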