Add LLM filtering pattern, .env.example, and workflows/lib
- Add .env.example with LLM_API_URL, LLM_MODEL, LLM_API_KEY
- Add .gitignore to exclude .env
- Add Pattern 5 (LLM filtering) to notebook-patterns.md
- Track workflows/lib with llm_call helper using mirascope
- Update README with LLM setup step and updated project structure
.env.example (new file, 3 lines)
@@ -0,0 +1,3 @@
+LLM_API_URL=https://litellm-notrack.app.monadical.io
+LLM_MODEL=GLM-4.5-Air-FP8-dev
+LLM_API_KEY=xxxxx
.gitignore (vendored, new file, 1 line)
@@ -0,0 +1 @@
+.env
README.md (14 changed lines)
@@ -33,6 +33,14 @@ The goal is to use [opencode](https://opencode.ai) (or any LLM-powered coding to
 
 Replace `xxxxx` with your actual LiteLLM API key.
 
+4. **(Optional) LLM filtering in workflows** — if your workflows need to classify or score entities via an LLM, copy `.env.example` to `.env` and fill in your key:
+
+   ```bash
+   cp .env.example .env
+   ```
+
+   The `workflows/lib` module provides an `llm_call` helper (using [mirascope](https://mirascope.io)) for structured LLM calls — see Pattern 5 in `docs/notebook-patterns.md`.
+
 ## Quickstart
 
 1. Run `opencode` from the project root
@@ -60,11 +68,15 @@ It also includes API base URLs, a translation table mapping natural-language que
 ```
 internalai-agent/
 ├── AGENTS.md                       # LLM agent routing guide (entry point)
+├── .env.example                    # LLM credentials template
 ├── docs/
 │   ├── company-context.md          # Monadical org, tools, key concepts
 │   ├── contactdb-api.md            # ContactDB REST API reference
 │   ├── dataindex-api.md            # DataIndex REST API reference
 │   ├── connectors-and-sources.md   # Connector → entity type mappings
 │   └── notebook-patterns.md        # Marimo notebook templates and patterns
-└── workflows/                      # Generated analysis notebooks go here
+└── workflows/
+    └── lib/                        # Shared helpers for notebooks
+        ├── __init__.py
+        └── llm.py                  # llm_call() — structured LLM calls via mirascope
 ```
docs/notebook-patterns.md
@@ -425,6 +425,89 @@ def display_timeline(timeline_df):
     timeline_df
 ```
 
+## Pattern 5: LLM Filtering with `lib.llm`
+
+When you need to classify, score, or extract structured information from each entity (e.g. "is this meeting about project X?", "rate the relevance of this email"), use the `llm_call` helper from `workflows/lib`. It sends each item to an LLM and parses the response into a typed Pydantic model.
+
+**Prerequisites:** Copy `.env.example` to `.env` and fill in your `LLM_API_KEY`. Add `mirascope` and `pydantic` to the notebook's PEP 723 dependencies.
+
+```python
+# /// script
+# requires-python = ">=3.12"
+# dependencies = [
+#     "marimo",
+#     "httpx",
+#     "polars",
+#     "mirascope",
+#     "pydantic",
+# ]
+# ///
+```
+
+### Setup cell — import `llm_call`
+
+```python
+@app.cell
+def setup():
+    import httpx
+    import marimo as mo
+    import polars as pl
+    from lib.llm import llm_call
+
+    client = httpx.Client(timeout=30)
+    return (client, llm_call, mo, pl)
+```
+
+### Define a response model
+
+Create a Pydantic model that describes the structured output you want from the LLM:
+
+```python
+@app.cell
+def models():
+    from pydantic import BaseModel
+
+    class RelevanceScore(BaseModel):
+        relevant: bool
+        reason: str
+        score: int  # 0-10
+
+    return (RelevanceScore,)
+```
+
+### Filter entities through the LLM
+
+Iterate over fetched entities and call `llm_call` for each one. Since `llm_call` is async, use `asyncio.gather` to process items concurrently:
+
+```python
+@app.cell
+async def llm_filter(meetings, llm_call, RelevanceScore, pl, mo):
+    import asyncio
+
+    _topic = "Greyhaven"
+
+    async def _score(meeting):
+        _text = meeting.get("summary") or meeting.get("title") or ""
+        _result = await llm_call(
+            prompt=f"Is this meeting about '{_topic}'?\n\nMeeting: {_text}",
+            response_model=RelevanceScore,
+            system_prompt="Score the relevance of this meeting to the given topic. Set relevant=true if score >= 5.",
+        )
+        return {**meeting, "llm_relevant": _result.relevant, "llm_reason": _result.reason, "llm_score": _result.score}
+
+    scored_meetings = await asyncio.gather(*[_score(_m) for _m in meetings])
+    relevant_meetings = [_m for _m in scored_meetings if _m["llm_relevant"]]
+
+    mo.md(f"**LLM filter:** {len(relevant_meetings)}/{len(meetings)} meetings relevant to '{_topic}'")
+    return (relevant_meetings,)
+```
+
+### Tips for LLM filtering
+
+- **Keep prompts short** — only include the fields the LLM needs (title, summary, snippet), not the entire raw entity.
+- **Use structured output** — always pass a `response_model` so you get typed fields back, not free-text.
+- **Batch wisely** — `asyncio.gather` sends all requests concurrently. For large datasets (100+ items), process in chunks to avoid rate limits.
+- **Cache results** — LLM calls are slow and cost money. If iterating on a notebook, consider storing scored results in a cell variable so you don't re-score on every edit.
+
 ## Do / Don't — Quick Reference for LLM Agents
 
 When generating marimo notebooks, follow these rules strictly. Violations cause `MultipleDefinitionError` at runtime.
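The "batch wisely" tip in Pattern 5 can be sketched as a small wrapper around `asyncio.gather` that awaits coroutines in fixed-size batches. `gather_chunked` is a hypothetical helper (the `score` stand-in replaces a real `llm_call`-based scorer), a sketch rather than repo code:

```python
import asyncio


async def gather_chunked(coros, chunk_size: int = 20):
    """Await coroutines in fixed-size batches rather than all at once."""
    results = []
    for i in range(0, len(coros), chunk_size):
        # Only chunk_size requests are in flight per batch.
        results.extend(await asyncio.gather(*coros[i:i + chunk_size]))
    return results


async def _demo():
    # Stand-in scorer; a real notebook would await llm_call(...) here.
    async def score(n: int) -> bool:
        await asyncio.sleep(0)
        return n % 2 == 0

    return await gather_chunked([score(n) for n in range(50)], chunk_size=10)


flags = asyncio.run(_demo())
```

An `asyncio.Semaphore` held inside the scorer is an equally valid way to cap concurrency while still issuing one `gather` over everything.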
workflows/lib/__init__.py (new file, 5 lines)
@@ -0,0 +1,5 @@
+"""Library modules for contact analysis workbooks."""
+
+from lib.llm import llm_call
+
+__all__ = ["llm_call"]
workflows/lib/llm.py (new file, 51 lines)
@@ -0,0 +1,51 @@
+"""Simple LLM helper for workbooks using Mirascope."""
+
+import os
+from typing import TypeVar
+
+from mirascope.core import Messages, openai
+from pydantic import BaseModel
+
+T = TypeVar("T", bound=BaseModel)
+
+# Configure from environment (defaults match .env.example)
+_api_key = os.getenv("LLM_API_KEY", "")
+_base_url = os.getenv("LLM_API_URL", "https://litellm-notrack.app.monadical.io")
+_model = os.getenv("LLM_MODEL", "GLM-4.5-Air-FP8-dev")
+
+if _api_key:
+    os.environ["OPENAI_API_KEY"] = _api_key
+if _base_url:
+    base = _base_url.rstrip("/")
+    os.environ["OPENAI_BASE_URL"] = base if base.endswith("/v1") else f"{base}/v1"
+
+
+async def llm_call(
+    prompt: str,
+    response_model: type[T],
+    system_prompt: str = "You are a helpful assistant.",
+    model: str | None = None,
+) -> T:
+    """Make a structured LLM call.
+
+    Args:
+        prompt: The user prompt
+        response_model: Pydantic model for structured output
+        system_prompt: System instructions
+        model: Override the default model
+
+    Returns:
+        Parsed response matching the response_model schema
+    """
+    use_model = model or _model
+
+    @openai.call(model=use_model, response_model=response_model)
+    async def _call(sys: str, usr: str) -> openai.OpenAIDynamicConfig:
+        return {
+            "messages": [
+                Messages.System(sys),
+                Messages.User(usr),
+            ]
+        }
+
+    return await _call(system_prompt, prompt)
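The `/v1` normalization at import time in `llm.py` is the subtle part of this file. Pulled out as a pure function (hypothetical name `normalize_base_url`, mirroring the two lines above), its behavior for the URL forms you might put in `.env` can be checked directly:

```python
def normalize_base_url(url: str) -> str:
    """Strip trailing slashes and ensure the OpenAI-style /v1 suffix."""
    base = url.rstrip("/")
    return base if base.endswith("/v1") else f"{base}/v1"
```

This means `LLM_API_URL` can be given with or without a trailing slash or an explicit `/v1` and the helper lands on the same OpenAI-compatible endpoint either way.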