feat: migrate to skills-based approach
.agents/skills/workflow/SKILL.md (new file)
---
name: workflow
description: Create a marimo notebook for data analysis. Use when the request involves analysis over time periods, large data volumes, or when the user asks to "create a workflow".
disable-model-invocation: true
argument-hint: [topic]
---

# Workflow — Create a Marimo Notebook

## When to create a marimo notebook

Any request that involves **analysis over a period of time** (e.g., "meetings this month", "emails since January", "interaction trends") is likely to return a **large volume of data** — too much to process inline. In these cases, **always produce a marimo notebook** (a `.py` file following the patterns in the [notebook-patterns skill](.agents/skills/notebook-patterns/SKILL.md)).

Also create a notebook when the user asks to "create a workflow", "write a workflow", or "build an analysis".

If you're unsure whether a question is simple enough to answer directly or needs a notebook, **ask the user**.

## Always create a new workflow

When the user requests a workflow, **always create a new notebook file**. Do **not** modify or re-run an existing workflow unless the user explicitly asks you to (e.g., "update workflow 001", "fix the sentiment notebook", "re-run the existing analysis"). Each new request gets its own sequentially numbered file — even if it covers a similar topic to an earlier workflow.

## File naming and location

All notebooks go in the **`workflows/`** directory. Use a sequential number prefix so workflows stay ordered by creation:

```
workflows/<NNN>_<topic>_<scope>.py
```

- `<NNN>` — zero-padded sequence number (`001`, `002`, …). Look at existing files in `workflows/` to determine the next number.
- `<topic>` — what is being analyzed, in snake_case (e.g., `greyhaven_meetings`, `alice_emails`, `hiring_discussions`)
- `<scope>` — time range or qualifier (e.g., `january`, `q1_2026`, `last_30d`, `all_time`)

**Examples:**

```
workflows/001_greyhaven_meetings_january.py
workflows/002_alice_emails_q1_2026.py
workflows/003_hiring_discussions_last_30d.py
workflows/004_team_interaction_timeline_all_time.py
```

**Before creating a new workflow**, list existing files in `workflows/` to find the highest number and increment it.
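
The numbering step can be sketched in a few lines of Python. This is illustrative, not part of the skill; it assumes the agent runs from the repository root and that files in `workflows/` follow the naming scheme above:

```python
import re
from pathlib import Path


def next_workflow_number(workflows_dir: str = "workflows") -> str:
    """Return the next zero-padded sequence number, e.g. '005'."""
    numbers = [
        int(m.group(1))
        for p in Path(workflows_dir).glob("*.py")
        if (m := re.match(r"(\d{3})_", p.name))
    ]
    return f"{max(numbers, default=0) + 1:03d}"
```

An empty (or missing) `workflows/` directory yields `001`; gaps in the sequence are ignored and only the highest number is incremented.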

## Plan before you implement

Before writing any notebook, **always propose a plan first** and get the user's approval. The plan should describe:

1. **Goal** — What question are we answering?
2. **Data sources** — Which entity types and API endpoints will be used?
3. **Algorithm / ETL steps** — Step-by-step description of the data pipeline: what gets fetched, how it's filtered, joined, or aggregated, and what the final output looks like.
4. **Output format** — Table columns, charts, or summary statistics the user will see.

Only proceed to implementation after the user confirms the plan.
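
For instance, a plan for the first example filename above might read as follows (every detail here is hypothetical):

```
Plan: 001_greyhaven_meetings_january

1. Goal — How many meetings involved Greyhaven in January, and who attended?
2. Data sources — ContactDB contacts; DataIndex GET /query with entity_types=meeting.
3. Algorithm / ETL steps — Resolve "Greyhaven" to contact_ids; fetch meetings with
   date_from/date_to spanning January; group by participant; count per participant.
4. Output format — Table of participant and meeting count; bar chart of meetings per week.
```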

## Validate before delivering

After writing or editing a notebook, **always run `uvx marimo check`** to verify it has no structural errors (duplicate variables, undefined names, branch expressions, etc.):

```bash
uvx marimo check workflows/NNN_topic_scope.py
```

A clean check (no output, exit code 0) means the notebook is valid. Fix any errors before delivering the notebook to the user.
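
When notebooks are produced programmatically, the check can be wrapped in a small helper. A sketch, assuming `uvx` is on `PATH` (the helper names are illustrative, not part of the skill):

```python
import shutil
import subprocess


def marimo_check_cmd(path: str) -> list[str]:
    """Build the validation command for a notebook path."""
    return ["uvx", "marimo", "check", path]


def validate_notebook(path: str) -> bool:
    """Return True when `uvx marimo check` exits 0 (a clean check prints nothing)."""
    if shutil.which("uvx") is None:
        raise FileNotFoundError("uvx is not on PATH")
    proc = subprocess.run(marimo_check_cmd(path), capture_output=True, text=True)
    return proc.returncode == 0
```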

## Steps

1. **Identify people** — Use ContactDB to resolve names/emails to `contact_id` values. For "me"/"my" questions, always start with `GET /api/contacts/me`.
2. **Find data** — Use DataIndex `GET /query` (exhaustive, paginated) or `POST /search` (semantic, ranked) with `contact_ids`, `entity_types`, `date_from`/`date_to`, `connector_ids` filters.
3. **Analyze** — For simple answers, process the API response directly. For complex multi-step analysis, build a marimo notebook (see the [notebook-patterns skill](.agents/skills/notebook-patterns/SKILL.md) for detailed patterns).

## Quick Example (Python)

> "Find all emails involving Alice since January"

```python
import httpx

CONTACTDB = "http://localhost:42000/contactdb-api"
DATAINDEX = "http://localhost:42000/dataindex/api/v1"
client = httpx.Client(timeout=30)

# 1. Resolve "Alice" to a contact_id
resp = client.get(f"{CONTACTDB}/api/contacts", params={"search": "Alice"})
alice_id = resp.json()["contacts"][0]["id"]  # e.g. 42

# 2. Fetch all emails involving Alice (with pagination)
emails = []
offset = 0
while True:
    resp = client.get(f"{DATAINDEX}/query", params={
        "entity_types": "email",
        "contact_ids": str(alice_id),
        "date_from": "2025-01-01T00:00:00Z",
        "limit": 50,
        "offset": offset,
    })
    data = resp.json()
    emails.extend(data["items"])
    if offset + 50 >= data["total"]:
        break
    offset += 50

print(f"Found {len(emails)} emails involving Alice")
```
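
The pagination loop above generalizes to any DataIndex listing endpoint. A reusable sketch, assuming every paginated response carries `"items"` and `"total"` keys (the helper name and callback shape are illustrative):

```python
from typing import Callable


def fetch_all(fetch_page: Callable[[int, int], dict], page_size: int = 50) -> list:
    """Drain a paginated endpoint.

    fetch_page(limit, offset) must return a dict with "items" (the current
    page) and "total" (the overall result count).
    """
    items: list = []
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        items.extend(page["items"])
        if offset + page_size >= page["total"]:
            break
        offset += page_size
    return items
```

With this helper, step 2 of the example reduces to a single `fetch_all(...)` call whose lambda issues the `GET /query` request with the given `limit` and `offset`.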