# InternalAI Workspace

An agent-assisted workspace for working on your own data with InternalAI (ContactDB / DataIndex).

## Things you can do

- **Onboard yourself**: `can you onboard me?` creates your `MYSELF.md`
- **Weekly checkout**: `create my checkout of last week` builds a summary from your activity
- **Data analysis**: `create a workflow that searches all meetings since 2024 where Max is listed as a participant (not a contactdb), and output as csv` creates a marimo notebook in `workflows/`
- **Init a project**: `create the creatrix project` creates `projects/creatrix/` with base information
- **Sync a project**: `sync the creatrix project` runs a full 1-year analysis on the first run, then incremental syncs afterward, producing a live `project.md` document

## Setup

### Prerequisites

- [Greywall](https://gitea.app.monadical.io/monadical/greywall) installed (verify with `greywall --version`)
- [OpenCode](https://opencode.ai) installed as a native binary (not a wrapper via bun/npm/pnpm)

### Greywall sandbox template

Run OpenCode in learning mode so Greywall can observe which files it reads and writes:

```
greywall --learning -- opencode
```

Interact briefly, then exit OpenCode. Greywall generates a sandbox template based on the observed filesystem access. Edit the template if needed.

### MCP configuration

Add the ContactDB and DataIndex MCP servers:

```
greywall -- opencode mcp add
```

Run the command twice, once per server, with these settings:

| Name | Type | URL | OAuth |
|------|------|-----|-------|
| `contactdb` | Remote MCP | `http://caddy/contactdb-api/mcp/` | No |
| `dataindex` | Remote MCP | `http://caddy/dataindex/mcp/` | No |

Verify the servers are registered:

```
greywall -- opencode mcp list
```

Then open your proxy at `http://localhost:42000/proxy` and allow access to Caddy.

### LiteLLM provider

Add a `litellm` provider in `opencode.json`:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "litellm": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Litellm",
      "options": {
        "baseURL": "https://litellm-notrack.app.monadical.io",
        "apiKey": "sk-xxxxx"
      },
      "models": {
        "Kimi-K2.5-sandbox": {
          "name": "Kimi-K2.5-sandbox"
        }
      }
    }
  }
}
```

Replace the `apiKey` value with your own key (check 1Password for "litellm notrack").

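A forgotten placeholder key is an easy mistake to make at this step. The following is a minimal sketch of a sanity check for the provider block shown above. The function name `check_litellm_provider` and the specific checks are illustrative assumptions and are not part of OpenCode or Greywall.

```python
import json

# Placeholder value copied from the example config; a real key must replace it.
PLACEHOLDER_KEY = "sk-xxxxx"


def check_litellm_provider(config_text: str) -> list[str]:
    """Return a list of problems found in the text of an opencode.json file.

    Hypothetical helper: it only inspects the keys used in the example config.
    """
    config = json.loads(config_text)
    provider = config.get("provider", {}).get("litellm")
    if provider is None:
        return ["no 'litellm' provider defined"]

    problems = []
    options = provider.get("options", {})
    if not options.get("baseURL"):
        problems.append("options.baseURL is missing")
    api_key = options.get("apiKey", "")
    if not api_key or api_key == PLACEHOLDER_KEY:
        problems.append("options.apiKey is missing or still the placeholder")
    if not provider.get("models"):
        problems.append("no models listed")
    return problems
```

To use it, pass the file contents, for example `check_litellm_provider(open("opencode.json").read())`, and treat an empty list as "nothing obviously wrong".
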
## Usage

Start OpenCode inside the Greywall sandbox:

```
greywall -- opencode
```

### First-run checklist

1. Select the Kimi K2.5 model under litellm in `/models`, then type "hello" to confirm it responds (if it does not, check the proxy)
2. Test ContactDB access: ask "who am I?" (should trigger `get_me`)
3. Test DataIndex access: ask "what was my last meeting about?"

## Skills

Skills are agent instructions stored in `.agents/skills/`. They follow the [Agent Skills](https://agentskills.io) standard (same structure as `.claude/skills/`). Some are invoked by the user via `/name`, others are background knowledge the agent loads automatically when relevant.

### Task Skills (user-invoked)

These are workflows you trigger explicitly. The agent will not run them on its own.

| Skill | Invocation | Purpose |
|-------|-----------|---------|
| **project-init** | `/project-init [name]` | Set up a new project: create directory structure, discover data sources (Zulip streams, git repos, meeting rooms), write `datasources.md` and `background.md` skeleton. Stops before gathering data so you can review the sources. |
| **project-history** | `/project-history [name] [from] [to]` | Build the initial timeline for a project. Queries all datasources for a date range, creates week-by-week analysis files, builds the timeline index, and synthesizes the background. Requires `project-init` first. |
| **project-sync** | `/project-sync [name]` | Incremental update of a project timeline. Reads the last sync date from `sync-state.md`, fetches new data through today, creates new week files, and refreshes the timeline and background. |
| **checkout** | `/checkout` | Build a weekly review (Sunday through today). Gathers meetings, emails, Zulip conversations, and Gitea activity, then produces a structured checkout summary. |
| **workflow** | `/workflow [topic]` | Create a marimo notebook for data analysis. Use for any request involving analysis over time periods or large data volumes. |
| **self-onboarding** | `/self-onboarding` | Generate a personalized `MYSELF.md` by analyzing 12 months of historical activity (meetings, emails, Zulip, calendar). Runs 19 parallel subagents to build a comprehensive profile. |

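The "Sunday through today" window used by `checkout` can be made concrete with a small date calculation. This is a sketch of one plausible reading, not the skill's actual logic: the function name `checkout_window` is hypothetical, and the behavior when today is itself a Sunday (a one-day window) is an assumption.

```python
from datetime import date, timedelta


def checkout_window(today: date) -> tuple[date, date]:
    """Return (most recent Sunday, today) as an inclusive date range.

    Assumption: when `today` is a Sunday, the window starts on that same day.
    """
    # date.weekday() numbers Monday as 0 and Sunday as 6,
    # so (weekday + 1) % 7 is the number of days elapsed since Sunday.
    days_since_sunday = (today.weekday() + 1) % 7
    return today - timedelta(days=days_since_sunday), today
```

For example, `checkout_window(date(2026, 2, 17))` (a Tuesday) starts the window on Sunday, 2026-02-15.
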
### Reference Skills (agent-loaded automatically)

These provide background knowledge the agent loads when relevant. They don't appear in the `/` menu.

| Skill | What the agent learns |
|-------|----------------------|
| **connectors** | Which data connectors exist and what entity types they produce (reflector, zulip, email, calendar, etc.) |
| **dataindex** | How to query the DataIndex REST API (`GET /query`, `POST /search`, `GET /entities/{id}`) |
| **contactdb** | How to resolve people to contact IDs via the ContactDB REST API |
| **company** | Monadical org structure, Zulip channel layout, communication tools, meeting/calendar relationships |
| **notebook-patterns** | Marimo notebook rules: cell scoping, async patterns, pagination helpers, analysis templates |

## Project Tracking

Project analysis files live in `projects/`. See [projects/README.md](projects/README.md) for the directory structure and categorization guidelines.

**Typical workflow:**

```
/project-init myproject                           # 1. Discover sources, create skeleton
# Review datasources.md, adjust if needed
/project-history myproject 2025-06-01 2026-02-17  # 2. Backfill history
# ... time passes ...
/project-sync myproject                           # 3. Incremental update
```

Each project produces:

```
projects/{name}/
├── datasources.md          # Where to find data (Zulip streams, git repos, meeting rooms)
├── background.md           # Living doc: current status, team, architecture
├── sync-state.md           # Tracks last sync date for incremental updates
└── timeline/
    ├── index.md            # Navigation and milestones
    └── {year-month}/
        └── week-{n}.md     # One week of history (write-once)
```

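To confirm that a project has the layout listed above after a backfill or sync, a short check along these lines may help. It is a sketch only: `missing_project_files` is a hypothetical helper, and it deliberately ignores the per-week files because their numbering scheme is not specified here.

```python
from pathlib import Path

# Fixed files from the documented layout, relative to projects/{name}/.
# Week files under timeline/{year-month}/ are intentionally not checked.
EXPECTED_FILES = (
    "datasources.md",
    "background.md",
    "sync-state.md",
    "timeline/index.md",
)


def missing_project_files(project_dir: Path) -> list[str]:
    """Return the expected files that do not exist under project_dir."""
    return [rel for rel in EXPECTED_FILES if not (project_dir / rel).is_file()]
```

Calling `missing_project_files(Path("projects/myproject"))` returns an empty list when all four files are present.
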
## Data Analysis Workflows

Analysis notebooks live in `workflows/`. Each is a marimo `.py` file.

```
/workflow meetings-with-alice    # Creates workflows/NNN_meetings_with_alice.py
```

See the [workflow skill](.agents/skills/workflow/SKILL.md) for naming conventions and the [notebook-patterns skill](.agents/skills/notebook-patterns/SKILL.md) for marimo coding rules.

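The `NNN_meetings_with_alice.py` pattern suggests a numeric prefix followed by a slug of the topic. The sketch below shows one way such a name could be derived. It assumes `NNN` is a zero-padded sequence number one higher than the largest existing prefix; the authoritative rules are in the workflow skill, and `next_workflow_path` is a hypothetical name.

```python
import re
from pathlib import Path


def next_workflow_path(workflows_dir: Path, topic: str) -> Path:
    """Build the next notebook path, e.g. workflows/001_meetings_with_alice.py.

    Assumption: NNN is a three-digit sequence number, not a date or an ID.
    """
    # Lowercase the topic and collapse every run of non-alphanumerics to "_".
    slug = re.sub(r"[^a-z0-9]+", "_", topic.lower()).strip("_")
    # Collect the numeric prefixes of notebooks that already exist.
    numbers = [
        int(match.group(1))
        for path in workflows_dir.glob("*.py")
        if (match := re.match(r"(\d{3})_", path.name))
    ]
    next_number = max(numbers, default=0) + 1
    return workflows_dir / f"{next_number:03d}_{slug}.py"
```

In an empty `workflows/` directory, the topic `meetings-with-alice` maps to `001_meetings_with_alice.py`.
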
## Data Sources

All data flows through two APIs:

- **DataIndex** (`localhost:42000/dataindex/api/v1` direct, `http://caddy/dataindex/api/v1` via the Greywall sandbox): unified query interface for all entity types
- **ContactDB** (`localhost:42000/contactdb-api` direct, `http://caddy/contactdb-api/` via the Greywall sandbox): people directory, resolves names/emails to contact IDs

Connectors that feed DataIndex: `reflector` (meetings), `zulip` (chat), `mbsync_email` (email), `ics_calendar` (calendar), `hedgedoc` (documents), `browser_history` (web pages), `babelfish` (translations).

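Because each API has a different base URL depending on whether the code runs inside the Greywall sandbox, a notebook or script may want to pick the base in one place. The sketch below is illustrative: `api_url` is a hypothetical helper, the `http://` scheme on the direct URLs is an assumption (the list above omits it), and whether a given path such as `/entities/{id}` sits directly under the DataIndex base is also an assumption.

```python
# Base URLs taken from the list above; the http:// scheme on the
# direct (localhost) entries is assumed.
BASE_URLS = {
    "dataindex": {
        "direct": "http://localhost:42000/dataindex/api/v1",
        "sandbox": "http://caddy/dataindex/api/v1",
    },
    "contactdb": {
        "direct": "http://localhost:42000/contactdb-api",
        "sandbox": "http://caddy/contactdb-api",
    },
}


def api_url(service: str, path: str = "", *, sandboxed: bool) -> str:
    """Join a service base URL with a path, choosing direct or sandbox access."""
    base = BASE_URLS[service]["sandbox" if sandboxed else "direct"]
    if not path:
        return base
    # Normalize slashes so exactly one separates the base from the path.
    return f"{base}/{path.lstrip('/')}"
```

For instance, `api_url("dataindex", "/search", sandboxed=True)` yields `http://caddy/dataindex/api/v1/search`.
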