AI/DA Prompt Framework¶
The five-step framework: Task → Context → References → Evaluate → Iterate.
1. Task¶
Describe with specificity and format preference.
- What you want
- The output format (table, code, prose, JSON)
- Length (one paragraph, 3 bullets, no longer than X)
Bad: "Help with my data."
Good: "Write a Python function that takes a DataFrame, removes outliers using IQR (1.5×) on thepricecolumn, and returns the cleaned DataFrame plus a count of removed rows."
2. Context¶
Include relevant context.
- The dataset shape, types, sample rows
- Business goal
- Constraints (must run on Python 3.13, must use pandas, must complete in < 5s)
- Audience (technical / non-technical)
"Dataset: 50,000 e-commerce orders with columns
order_id, user_id, total, status, created_at. Goal: identify top 10 customers by lifetime value. Output: SQL for PostgreSQL 16."
3. References¶
Something the AI can use to inform its output.
- Example output ("here's similar code we use elsewhere")
- API docs ("here's the relevant pandas method signature")
- Schema ("here's the table DDL")
- Style guide ("we use snake_case, no abbreviations")
4. Evaluate¶
Ask for opportunities for improvement.
- "What edge cases did you miss?"
- "How would this fail with NULL values?"
- "What's the time complexity? Can it be faster?"
- "Are there alternative approaches I should consider?"
5. Iterate¶
Refine based on the evaluation.
- Apply fixes
- Re-prompt with clarifications
- Don't accept the first answer if it's not clear or correct
Full example¶
TASK
Write a SQL query (PostgreSQL 16) that returns the top 10 customers by
lifetime value (total revenue across all orders). Include customer name and
country. Format as a single CTE-based query.
CONTEXT
Tables:
customers (id PK, name, country, created_at)
orders (id PK, customer_id FK, total NUMERIC, status, created_at)
Only count orders where status = 'paid'. Active customer = at least 1 order
in the last 365 days.
REFERENCES
Style: lowercase keywords, snake_case, comma-leading on long lists.
Match existing query: SELECT c.id, c.name FROM customers c WHERE ...
EVALUATE
After your draft:
1. What happens if a customer has no orders? Are they excluded?
2. Does this handle currency mixing? (Assume USD only for now.)
3. Performance — would this benefit from indexes?
Tips¶
- Be specific about formats — "as Markdown table" / "as JSON with keys X, Y"
- Show, don't tell — give examples of input/output
- Constrain output — "no longer than 5 lines" prevents rambling
- Use roles — "Act as a senior data analyst reviewing this SQL"
- Chain prompts — break complex tasks into steps; let the AI plan first
- Cache schema — paste once at the start, refer back