AI Tools for DA¶
Code-aware AI¶
| Tool | Best for |
|---|---|
| Claude | Long context, code analysis, careful reasoning |
| ChatGPT | General purpose, custom GPT ecosystem |
| GitHub Copilot | In-IDE code completion |
| Cursor | AI-first editor, multi-file edits |
| Claude Code | Terminal/CLI agent, full repo context |
Data-specific AI¶
| Tool | Description |
|---|---|
| Julius AI | Upload CSV → natural language analysis + viz |
| Hex Magic | AI inside notebook environment |
| Mode AI Helper | SQL and Python assist |
| Databricks Genie | NL → SQL on lakehouse |
| Snowflake Cortex | LLM functions in SQL |
| BigQuery data canvas | NL data exploration |
Local LLMs¶
For sensitive data that must stay on your own hardware:
- Ollama: easiest local LLM runner
- llama.cpp: efficient C++ inference
- LM Studio: desktop app
- GPT4All: desktop app for running open models locally
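As a rough illustration of the local route, the sketch below sends a prompt to an Ollama server over its HTTP API using only the standard library. It assumes Ollama is running on its default port (11434) and that a model tagged `llama3.1` has already been pulled; the helper names `build_request` and `ask_local` are hypothetical and chosen for this example.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3.1") -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local(prompt: str, model: str = "llama3.1", timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False the server replies with a single JSON object
    # whose "response" field holds the full completion.
    return body["response"]
```

With the server up, `ask_local("Summarize this table: ...")` returns the completion as a string, and the prompt never leaves the machine.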
Data Cards¶
Google's Data Cards Playbook is a framework for transparent, structured documentation of datasets used in ML.
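To make "structured documentation" concrete, here is a minimal sketch of a data card kept as a Python dataclass and rendered to Markdown. The field names below are illustrative assumptions, not the Playbook's official template, which is considerably more detailed.

```python
from dataclasses import dataclass, field


@dataclass
class DataCard:
    """Hypothetical, pared-down data card; fields are illustrative only."""

    name: str
    summary: str
    source: str
    license: str
    sensitive_fields: list[str] = field(default_factory=list)
    known_limitations: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the card as a short Markdown document."""
        lines = [
            f"# Data Card: {self.name}",
            "",
            self.summary,
            "",
            f"- Source: {self.source}",
            f"- License: {self.license}",
            f"- Sensitive fields: {', '.join(self.sensitive_fields) or 'none'}",
        ]
        if self.known_limitations:
            lines += ["", "## Known limitations"]
            lines += [f"- {item}" for item in self.known_limitations]
        return "\n".join(lines)
```

Keeping the card next to the dataset, in code or as the rendered Markdown, lets it be reviewed and versioned together with the data it describes.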
DASF 2.0¶
The Databricks AI Security Framework (DASF) 2.0 is a security framework for AI systems.
Choose by task¶
| Task | Recommended tool |
|---|---|
| Quick CSV exploration | Julius AI, ChatGPT (Code Interpreter) |
| Codebase-aware refactor | Claude Code, Cursor |
| In-IDE autocomplete | Copilot |
| NL → SQL | Genie, Cortex, custom GPT |
| Sensitive on-prem | Ollama + Llama 3.1 |
| Long-context document analysis | Claude (200K-token context) |
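For the long-context row, a quick pre-check can tell whether a document is likely to fit before sending it. The sketch below assumes roughly 4 characters per token, a common rule of thumb for English text rather than a real tokenizer, and reserves a hypothetical 8,000 tokens for the prompt and the reply.

```python
CONTEXT_WINDOW_TOKENS = 200_000  # window size quoted in the table above
CHARS_PER_TOKEN = 4  # assumption: rough average for English text


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(text: str, reserve_tokens: int = 8_000) -> bool:
    """Return True if the text plus a reserve for prompt and reply fits."""
    return estimate_tokens(text) + reserve_tokens <= CONTEXT_WINDOW_TOKENS
```

The estimate will be off for code, tables, and non-English text, so treat a borderline result as a signal to split the document or count tokens with the provider's own tooling.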