Claude Code AI analyst tutorial helps you build an SQL-free analyst that delivers fast, clear insights
This Claude Code AI analyst tutorial shows how to turn a chat-first coding model into a hands-on data teammate. You will learn a four-step workflow (Monitor, Explore, Craft, Impact), how to avoid context blowups, and how to connect Slack and Drive so the agent answers business questions with speed and clarity—without writing SQL.
Data moves fast. Your team needs answers, not dashboards to babysit. With Claude Code, you can build an AI analyst that runs checks, digs into context, and writes clear updates. It can pull facts from your tools, explain anomalies, and propose next steps. This guide summarizes the approach shared by Sumeet Marwaha, Head of Data at Brex, and adds practical patterns you can ship today.
What you need before you start
Tools and access
Claude Code with model context tools and MCP support
Read access to your warehouse (Snowflake, BigQuery, Redshift, Postgres)
Connections to Slack and Drive (or similar) for business context
Optional MCP connectors for services like issue trackers and analytics
Data scope and permissions
Define which schemas and tables the agent can query
Create a service account with least-privilege access
Mask PII and set row-level filters if needed
Decide where outputs live: Slack channels, docs, or BI boards
Claude Code AI analyst tutorial: from question to insight
This framework keeps the agent focused and useful. It mirrors how a human analyst works, but with faster loops.
Step 1: Monitor
Set the agent to run a small set of key queries on a schedule. Do not start with everything. Pick five metrics that matter.
Revenue or GMV by day and week
Activation rate and onboarding funnel
Churn and reactivation
Acquisition by channel and cohort quality
Latency, error rate, or other product health signals
Design outputs for action:
Agent posts a daily Slack summary with deltas vs last week
It flags anomalies that pass a simple threshold or z-score check (see the sketch after this list)
It links to raw query results with row limits, not full dumps
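As a minimal sketch of that anomaly check in Python: the metric values, window length, and 3-sigma threshold below are illustrative assumptions, not prescriptions from the source.

from statistics import mean, stdev

def flag_anomaly(history, today, z_threshold=3.0):
    # history: daily values for the trailing window (e.g. the last four weeks)
    # today: the latest value the scheduled query returned
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return False, 0.0
    z = (today - mu) / sigma
    return abs(z) >= z_threshold, z

# Example: decide whether the daily Slack summary should flag activations
is_anomaly, z = flag_anomaly([412, 398, 405, 421, 390, 415, 402], 310)
if is_anomaly:
    print(f"Activation anomaly: z-score {z:.1f} vs the trailing week")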
Step 2: Explore
When a metric looks off, the agent gathers context. It searches Slack threads, issue trackers, and code changes to learn what changed.
Quote key messages from Slack (“Feature X turned on for 50% on Tuesday”)
Pull open tickets and change logs that match the time window
Show relevant dashboards or queries used before
Ask the agent to form a simple hypothesis and propose the next query. Keep each loop short. The agent should run one or two focused queries, then report.
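A sketch of one such loop, under stated assumptions: search_slack and run_query are hypothetical wrappers for whatever Slack and warehouse tools you expose through MCP, and the table and column names in the query are placeholders.

def explore_loop(metric, start_date, end_date, search_slack, run_query):
    # Gather context: quote only messages about launches, flags, or rollbacks
    context = search_slack(keywords=[metric, "launch", "flag", "rollback"],
                           after=start_date, before=end_date)
    # One focused query per loop, then report back; the SQL is illustrative
    rows = run_query(
        "SELECT experiment_variant, COUNT(*) AS users, AVG(activated) AS rate "
        "FROM onboarding_events "
        "WHERE event_date BETWEEN :start AND :end "
        "GROUP BY experiment_variant LIMIT 50",
        {"start": start_date, "end": end_date},
    )
    return {"evidence": context[:3], "result": rows}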
Step 3: Craft
Now the agent writes an explanation anyone can read. It uses clear language and cites sources.
Lead with the headline and a single chart or table
Explain who is affected and how big the change is
Add quotes or links from Slack and Drive for context
Call out unknowns and suggest how to fill the gaps
Ask the agent to include a one-paragraph summary and one-page detail. This mix serves both execs and builders.
Step 4: Impact
Have the agent estimate the business effect and propose an action.
Compare to past experiments or incidents
Estimate impact size based on historical lift or drop (a sizing sketch follows this list)
Identify the owner and next step (ship, rollback, test)
Create or update tickets with the summary and links
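One way to sketch that sizing step in Python. The formula (affected users x conversion delta x revenue per conversion) and every number below are illustrative assumptions, not figures from the source.

def weekly_revenue_at_risk(affected_users_per_week, conversion_delta, revenue_per_conversion):
    # e.g. 10,000 affected signups per week, a 3% activation drop, $40 per activation
    return affected_users_per_week * conversion_delta * revenue_per_conversion

print(weekly_revenue_at_risk(10_000, 0.03, 40))  # -> 12000.0 per week at risk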
This flow is fast because it limits each step. The agent never tries to do everything at once.
Build a data MCP in under an hour
You can spin up a useful agent with just a few queries and a handful of skills.
Map your data model
Teach the agent your domain with a short schema note.
Main entities (users, accounts, workspaces, projects)
Key events (signup, activation, paid, churn, feature used)
Join keys and grain (user_id, account_id, timestamps)
Business definitions (active = event within 7 days)
Keep this as a compact system prompt or a pinned “schema card” the agent can recall. Do not paste 20k lines of DDL.
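A schema card can be as small as a dictionary the agent keeps in its system prompt. The entity, event, and definition names below are illustrative; mirror your own warehouse.

SCHEMA_CARD = {
    "entities": {
        "users": {"key": "user_id", "grain": "one row per user"},
        "accounts": {"key": "account_id", "grain": "one row per account"},
    },
    "events": ["signup", "activation", "paid", "churn", "feature_used"],
    "joins": "users.account_id = accounts.account_id; events keyed by user_id + timestamp",
    "definitions": {"active": "at least one event within the last 7 days"},
}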
Write three starter queries
Use a simple example, like startup funding data, to show the payoff quickly. Three queries can unlock a lot:
Entities: a list of startups with id, name, industry, founding date
Rounds: funding rounds with date, amount, round type
Trends: counts by month and averages by round type
Teach the agent how to join entities to rounds safely. Add rules like “always join rounds on company_id and restrict to last 24 months unless asked.”
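A sketch of those three starter queries as templates the agent can reuse. The table and column names (startups, funding_rounds, company_id) and the Postgres-style interval syntax are assumptions for the example.

STARTER_QUERIES = {
    "entities": "SELECT id, name, industry, founded_date FROM startups LIMIT 50",
    "rounds": (
        "SELECT company_id, round_type, amount_usd, announced_date "
        "FROM funding_rounds "
        "WHERE announced_date >= CURRENT_DATE - INTERVAL '24 months' LIMIT 50"
    ),
    "trends": (
        "SELECT DATE_TRUNC('month', announced_date) AS month, round_type, "
        "COUNT(*) AS rounds, AVG(amount_usd) AS avg_amount "
        "FROM funding_rounds "
        "WHERE announced_date >= CURRENT_DATE - INTERVAL '24 months' "
        "GROUP BY 1, 2 ORDER BY 1"
    ),
}
# Join rule for the agent: startups.id = funding_rounds.company_id,
# restricted to the last 24 months unless the user asks otherwise.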
Define skills and guardrails
Skills are small, reusable behaviors. They keep the agent safe and consistent.
limit_rows: append “LIMIT 50” to any join by default
sample_large: random sample when result > 5,000 rows
rewrite_on_timeout: if query runs > 120 seconds, reduce filters and retry
explain_first: ask the agent to explain the planned query before running it
quote_sources: include links to each dataset or doc used
With these skills, the agent stays within a safe context size and gives clear, repeatable outputs.
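Two of those skills, sketched in Python. The 50-row default and the 5,000-row sampling threshold come from the list above; everything else is an assumption about how you would wire the skills in.

import random

def limit_rows(sql, limit=50):
    # limit_rows: append a LIMIT when the query does not already have one
    return sql if "limit" in sql.lower() else f"{sql} LIMIT {limit}"

def sample_large(rows, max_rows=5000, seed=42):
    # sample_large: take a stable random sample when results exceed the threshold
    if len(rows) <= max_rows:
        return rows
    return random.Random(seed).sample(rows, max_rows)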
Keep the context window under control
Large result sets can blow up the model’s memory and cause confusion. Tame this with simple rules.
Token budget
Give the agent a token budget and make it explain how it will use it.
Schema notes: 10%
Query plans: 10%
Results: 50% (summaries over raw rows)
Writing: 30%
If the agent expects to exceed the budget, it should prune, summarize, or defer.
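A sketch of how that budget could be enforced. The percentages match the split above; the 200,000-token total and the prune-or-defer decision shape are assumptions.

BUDGET_SPLIT = {"schema": 0.10, "plans": 0.10, "results": 0.50, "writing": 0.30}

def allocate(total_tokens=200_000):
    # Turn the percentage split into hard per-part caps
    return {part: int(total_tokens * share) for part, share in BUDGET_SPLIT.items()}

def within_budget(projected_tokens, total_tokens=200_000):
    # If any part projects over its cap, the agent should prune, summarize, or defer
    caps = allocate(total_tokens)
    return all(projected_tokens.get(part, 0) <= cap for part, cap in caps.items())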
Enforce limits and sampling
Default every query to “LIMIT 50” unless the user overrides
Use “SELECT DISTINCT” when counts are enough
Aggregate early (group by) to shrink row count
Sample large tables with a stable hash or time window
Timeouts trigger rewrites
Set a 2–3 minute timeout per query. On timeout, the agent should do the following (a retry sketch follows the list):
Reduce date range
Drop non-essential columns
Swap inner joins for left joins only if needed
Explain the change and try again
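A minimal sketch of that retry behavior. run_query and shrink_query are hypothetical stand-ins for your warehouse wrapper and the agent's own rewrite step; the 150-second timeout sits inside the 2–3 minute range above.

def run_with_rewrites(sql, run_query, shrink_query, timeout_s=150, max_attempts=3):
    for attempt in range(max_attempts):
        try:
            return run_query(sql, timeout=timeout_s)
        except TimeoutError:
            # Explain the change, then retry with a narrower query
            sql = shrink_query(sql)  # shorter date range, fewer columns
            print(f"Attempt {attempt + 1} timed out; retrying with a narrower query")
    raise TimeoutError("Query still too slow after rewrites; escalate to a human")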
Cache and summarize
Cache common aggregates and reuse them. Ask the agent to turn raw results into small, typed summaries (a dataclass sketch follows this list):
Top 5 segments by change and absolute numbers
Sparkline for the last 30 days
One-sentence takeaway per segment
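One way to type those summaries and cache them, sketched as a dataclass; the field names are illustrative.

from dataclasses import dataclass

@dataclass
class SegmentSummary:
    segment: str            # e.g. "EU / version X"
    change_pct: float       # week-over-week change
    absolute: int           # absolute count behind the change
    sparkline: list[float]  # last 30 daily values for a small chart
    takeaway: str           # one-sentence takeaway

# Cache keyed by (metric, as_of_date) so repeat questions reuse the same aggregate
summary_cache: dict[tuple[str, str], list[SegmentSummary]] = {}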
Connect Slack and Drive to enrich analysis
Context often lives in chat and docs. Bring it in, but keep it lean.
Slack threads
Search by keywords, dates, and message authors
Quote only the lines that mention launches, flags, or rollbacks
Link to the thread; do not dump full transcripts
Drive documents
Target design docs, PRDs, and experiment readouts
Pull titles, key bullets, and decision sections
Include doc links and last updated dates
Prompt pattern:
“Cite at most three sources”
“Use quotes under 30 words”
“Prefer docs updated in the last 90 days”
This keeps your context relevant and within budget.
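These rules can live as a short block in the agent's system prompt; the wording below is one illustrative way to phrase them.

CITATION_RULES = """
When citing Slack or Drive:
- Cite at most three sources per write-up.
- Keep quotes under 30 words; link to the thread or doc instead of pasting it.
- Prefer documents updated in the last 90 days and note the last-updated date.
"""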
Predictive example: Which startups might reach Series B?
You can ask the agent to run a lightweight prediction with transparent features. Use this as a teaching case, not as investment advice.
Features and labels
Label: whether a startup raised Series B within 24 months
Candidate features: seed amount, time between rounds, headcount trend, sector, geography, investor network
Data hygiene: remove obvious leakage (e.g., features that directly encode the label)
Method and sanity checks
Start with a simple logistic regression or decision tree
Use a time-based split for train/test
Report AUC, precision/recall at a chosen threshold
List top features and their direction of effect
Create a short list of companies with confidence bands
Have the agent include notes on limits: missing data, selection bias, and external shocks.
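A sketch of the teaching case with pandas and scikit-learn, assuming a dataframe with numeric candidate features and a raised_series_b_24m label. Column names, the date cutoff, and the 0.5 threshold are assumptions; categorical features like sector and geography would need encoding and are omitted for brevity.

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, precision_score, recall_score

def series_b_model(df: pd.DataFrame):
    features = ["seed_amount_usd", "months_between_rounds", "headcount_growth"]
    # Time-based split: train on older cohorts, test on the most recent ones
    train = df[df["seed_date"] < "2021-01-01"]
    test = df[df["seed_date"] >= "2021-01-01"]

    model = LogisticRegression(max_iter=1000)
    model.fit(train[features], train["raised_series_b_24m"])

    probs = model.predict_proba(test[features])[:, 1]
    preds = probs >= 0.5  # chosen threshold; report precision/recall at it
    print("AUC:", roc_auc_score(test["raised_series_b_24m"], probs))
    print("Precision:", precision_score(test["raised_series_b_24m"], preds))
    print("Recall:", recall_score(test["raised_series_b_24m"], preds))
    # Direction of effect per feature: the sign of each coefficient
    print(dict(zip(features, model.coef_[0].round(3))))
    return model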
Operationalize your AI analyst
Once you trust the flow, make it part of daily work.
Alerts and schedules
Daily core metric post at 9am with a two-sentence summary
Weekly deep dive with three insights and two actions
Real-time anomaly alerts for severe drops or spikes
Reviews in code and docs
Have the agent comment on PRs that touch known metrics and loop in their owners
Ask it to check experiment analysis for sample size and power
Let it draft changelog entries with links to data
Governance and privacy
Log every query and file access for audit
Keep a redaction layer for PII
Define allowed channels for posting metrics
Rotate keys and refresh tokens on a schedule
Pitfalls and how to fix them
Context overload
Symptom: the agent loses track and contradicts itself. Fix: stricter limits, tighter summaries, and a smaller schema card.
Wrong joins and double counting
Symptom: numbers do not match dashboards. Fix: assert join keys, test with known cohorts, and pin definitions in the system prompt.
Timeouts and slow queries
Symptom: the agent stalls or drops. Fix: add indexes, reduce scan size, cache aggregates, and use “rewrite_on_timeout.”
Hallucinations
Symptom: confident claims without citations. Fix: enforce “quote_sources,” require links, and block claims with no evidence.
Security drift
Symptom: scope creep into sensitive data. Fix: enforce least privilege, add row-level filters, and put PII behind a separate role the agent cannot use.
Metrics to track success
Speed and quality
Time to first insight after an alert
Share of alerts with clear owners and actions
Reduction in ad-hoc SQL requests
Adoption
Weekly active users of the agent
Replies and reactions on summary posts
Tickets created from agent prompts
Accuracy
Discrepancy vs trusted dashboards within a small band
Escalations due to wrong numbers
Coverage of core metrics and definitions
A simple day-in-the-life flow
Morning check
Agent posts key metrics with deltas and one chart
Flags a drop in activation in EU last week
Exploration
Agent finds a Slack thread about a signup experiment
Runs a filtered query: EU users on version X vs Y
Reports a 3% drop tied to a new step in the flow
Craft and impact
Writes a short summary with two charts and quotes the Slack decision
Estimates weekly revenue risk from the change
Opens a ticket to revert the step for EU while testing a fix
This is the AI analyst you want: fast, cited, and focused on next actions.
Pro tips to level up your agent
Teach it to say “I don’t know.”
Add a rule: when confidence is low or data is missing, the agent should state that and ask for a specific permission or dataset.
Use templates for write-ups
Give the agent a standard outline:
Headline
What changed
Who is affected
Why we think it happened
What we will do next
Sources and links
This makes outputs consistent and scannable.
Start narrow, then grow
Begin with one team or product area. Expand only after accuracy and adoption look good. This prevents noise and builds trust.
Where this fits with your tool stack
The agent does not replace your warehouse or BI. It makes them easier to use.
Warehouse stays the source of truth
BI holds curated dashboards and certified metrics
Claude Code sits in front, answering questions and drafting updates
Slack and Drive hold context that the agent cites
Over time, you can let the agent open tickets, comment on code, and help with A/B test reviews. Add skills slowly and measure impact.
The bottom line: you can build a reliable AI analyst in days, not months, if you keep the scope small, manage context strictly, and connect the right sources. Follow the steps in this Claude Code AI analyst tutorial to get your first wins fast, and then scale with guardrails intact.
(Source: https://creatoreconomy.so/p/build-an-ai-data-analyst-with-claude-code-sumeet)
FAQ
Q: What is the four-step workflow described in the Claude Code AI analyst tutorial?
A: The Claude Code AI analyst tutorial describes a four-step Monitor → Explore → Craft → Impact workflow that mirrors how a human analyst works but with faster loops. Monitor runs scheduled checks, Explore gathers context, Craft writes clear explanations, and Impact sizes business effects and recommends actions.
Q: What tools and access do I need to set up the AI analyst?
A: To follow the Claude Code AI analyst tutorial you need Claude Code with model context tools and MCP support, read access to your data warehouse (Snowflake, BigQuery, Redshift, or Postgres), and connections to Slack and Drive for business context. Optional MCP connectors for issue trackers and analytics are useful, and you should create a least-privilege service account and mask PII as needed.
Q: How does the agent avoid blowing up the context window?
A: The tutorial recommends strict token budgets, query limits, sampling, and timeouts to prevent context overload. Practical guardrails include defaulting joins to LIMIT 50, sampling large results, setting 2–3 minute query timeouts that trigger rewrites, and summarizing results instead of dumping raw rows.
Q: How should I connect Slack and Drive so the agent uses business context without exceeding limits?
A: Connect Slack and Drive using targeted searches and selective quoting so the agent pulls only relevant lines and links rather than full transcripts. The Claude Code AI analyst tutorial suggests citing at most three sources, preferring recent docs, quoting lines under 30 words, and linking threads or docs instead of pasting them.
Q: What starter queries and data model guidance does the Claude Code AI analyst tutorial recommend?
A: Start with a short schema card that lists main entities, key events, join keys and business definitions, then write three starter queries (entities, rounds, and trends) to demonstrate value quickly. The tutorial also advises rules like joining on company_id and restricting to the last 24 months unless explicitly requested.
Q: What skills and guardrails should I define for a safe data agent?
A: Define small reusable skills such as limit_rows (append LIMIT 50), sample_large (random sample when results are huge), rewrite_on_timeout (reduce range on timeouts), explain_first, and quote_sources to keep outputs consistent and auditable. These guardrails enforce token limits, reduce hallucinations, and make query behavior predictable.
Q: How can I use the agent to run a simple prediction like which startups might reach Series B?
A: Use a transparent feature set and a simple model as a teaching case: label Series B within 24 months and candidate features like seed amount, time between rounds, headcount trend, sector, and investor network. The tutorial recommends simple methods (logistic regression or decision tree), a time-based train/test split, and reporting AUC and precision/recall while noting limits like missing data and selection bias.
Q: What metrics should teams track to measure the AI analyst’s success?
A: Track speed and quality metrics such as time to first insight after an alert, share of alerts with clear owners and actions, and reduction in ad-hoc SQL requests, as well as adoption metrics like weekly active users and tickets created from agent prompts. Also measure accuracy by discrepancy versus trusted dashboards, number of escalations due to wrong numbers, and coverage of core metrics and definitions.