---
name: connectors
description: Reference for all data connectors and their entity type mappings. Use when determining which connector produces which entity types, understanding connector-specific fields, or choosing the right data source for a query.
user-invocable: false
---
# Connectors and Data Sources
Each connector ingests data from an external source into DataIndex. Connectors run periodic background syncs to keep data fresh.
Use `list_connectors()` at runtime to see which connectors are actually configured. Not every connector listed below is necessarily active in a given deployment.
## Connector → Entity Type Mapping
| Connector ID | Entity Types Produced | Description |
|------------------|-----------------------------------------------------------------|----------------------------------|
| `reflector` | `meeting` | Meeting recordings + transcripts |
| `ics_calendar` | `calendar_event` | ICS calendar feed events |
| `mbsync_email` | `email` | Email via mbsync IMAP sync |
| `zulip` | `conversation`, `conversation_message`, `threaded_conversation` | Zulip chat streams and topics |
| `babelfish` | `conversation_message`, `threaded_conversation` | Chat translation bridge |
| `hedgedoc` | `document` | HedgeDoc collaborative documents |
| `contactdb` | `contact` | Synced from ContactDB (static) |
| `browser_history` | `webpage` | Browser extension page visits (static) |
| `api_document` | `document` | API-ingested documents (static) |
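
The table above can be mirrored as a plain lookup structure, which makes it easy to answer "which connectors produce entity type X?". This is a minimal sketch: the names `CONNECTOR_ENTITY_TYPES` and `connectors_for` are illustrative and not part of DataIndex, and the mapping is a static copy of the table rather than the live configuration.

```python
# Illustrative only: a static copy of the mapping table above.
# The live set of connectors should still come from list_connectors().
CONNECTOR_ENTITY_TYPES: dict[str, list[str]] = {
    "reflector": ["meeting"],
    "ics_calendar": ["calendar_event"],
    "mbsync_email": ["email"],
    "zulip": ["conversation", "conversation_message", "threaded_conversation"],
    "babelfish": ["conversation_message", "threaded_conversation"],
    "hedgedoc": ["document"],
    "contactdb": ["contact"],
    "browser_history": ["webpage"],
    "api_document": ["document"],
}


def connectors_for(entity_type: str) -> list[str]:
    """Return the connector IDs (sorted) that produce the given entity type."""
    return sorted(
        connector_id
        for connector_id, entity_types in CONNECTOR_ENTITY_TYPES.items()
        if entity_type in entity_types
    )
```

For example, `connectors_for("document")` returns both `api_document` and `hedgedoc`, a reminder that some entity types come from more than one source.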
## Per-Connector Details
### `reflector` — Meeting Recordings
Ingests meetings from Reflector, Monadical's meeting recording tool.
- **Entity type:** `meeting`
- **Key fields:** `transcript`, `summary`, `participants`, `start_time`, `end_time`, `room_name`
- **Use cases:** Find meetings someone attended, search meeting transcripts, get summaries
- **Tip:** Filter with `contact_ids` to find meetings involving specific people. The `transcript` field contains speaker-diarized text.
### `ics_calendar` — Calendar Events
Parses ICS calendar feeds (Google Calendar, Outlook, etc.).
- **Entity type:** `calendar_event`
- **Key fields:** `start_time`, `end_time`, `attendees`, `location`, `description`, `calendar_name`
- **Use cases:** Check upcoming events, find events with specific attendees, review past schedule
- **Tip:** Multiple calendar feeds may be configured as separate connectors (e.g., `personal_calendar`, `work_calendar`). Use `list_connectors()` to discover them.
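
Discovering the configured calendar feeds could look roughly like the sketch below. The response shape of `list_connectors()` is not documented here, so the code assumes each connector record is a dict with `id` and `entity_types` keys. Both key names, the helper `connector_ids_producing`, and the sample data are hypothetical.

```python
def connector_ids_producing(connectors: list[dict], entity_type: str) -> list[str]:
    """Return IDs of connectors whose records list the given entity type.

    ASSUMPTION: each record has "id" and "entity_types" keys; adjust to
    whatever list_connectors() actually returns in your deployment.
    """
    return [
        record["id"]
        for record in connectors
        if entity_type in record.get("entity_types", [])
    ]


# Made-up data standing in for a list_connectors() response.
sample_connectors = [
    {"id": "personal_calendar", "entity_types": ["calendar_event"]},
    {"id": "work_calendar", "entity_types": ["calendar_event"]},
    {"id": "mbsync_email", "entity_types": ["email"]},
]

calendar_ids = connector_ids_producing(sample_connectors, "calendar_event")
```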
### `mbsync_email` — Email
Syncs email via mbsync (IMAP).
- **Entity type:** `email`
- **Key fields:** `text_content`, `from_contact_id`, `to_contact_ids`, `cc_contact_ids`, `thread_id`, `has_attachments`
- **Use cases:** Find emails from/to someone, search email content, track email threads
- **Tip:** Use the `from_contact_id` and `to_contact_ids` fields together with the `contact_ids` filter to narrow results by sender or recipient. To group messages into threads, use the `thread_id` field.
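
Thread grouping with `thread_id` can be done client-side once the email entities have been fetched. The sketch below assumes each email is a plain dict carrying the key fields listed above. The helper name `group_by_thread` and the sample values (including the contact and thread IDs) are made up for illustration.

```python
from collections import defaultdict


def group_by_thread(emails: list[dict]) -> dict[str, list[dict]]:
    """Group email entities by their thread_id, preserving input order."""
    threads: dict[str, list[dict]] = defaultdict(list)
    for email in emails:
        threads[email["thread_id"]].append(email)
    return dict(threads)


# Made-up sample entities using the documented field names.
sample_emails = [
    {"thread_id": "t1", "from_contact_id": 7, "to_contact_ids": [3], "text_content": "Draft attached"},
    {"thread_id": "t2", "from_contact_id": 3, "to_contact_ids": [7], "text_content": "Lunch?"},
    {"thread_id": "t1", "from_contact_id": 3, "to_contact_ids": [7], "text_content": "Looks good"},
]

threads = group_by_thread(sample_emails)
```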
### `zulip` — Chat
Ingests Zulip streams, topics, and messages.
- **Entity types:**
  - `conversation` — A Zulip stream/channel with recent messages
  - `conversation_message` — Individual chat messages
  - `threaded_conversation` — A topic thread within a stream
- **Key fields:** `message`, `mentioned_contact_ids`, `recent_messages`
- **Use cases:** Find discussions about a topic, track who said what, find @-mentions
- **Tip:** Use `threaded_conversation` to find topic-level discussions. Use `conversation_message` with `mentioned_contact_ids` to find messages that mention specific people.
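
Finding @-mentions of a specific person amounts to checking `mentioned_contact_ids` on each `conversation_message`. The sketch below assumes the messages are already fetched as plain dicts with the key fields listed above. The helper `messages_mentioning` and the sample contact IDs are hypothetical.

```python
def messages_mentioning(messages: list[dict], contact_id: int) -> list[dict]:
    """Return the messages whose mentioned_contact_ids include contact_id."""
    return [
        message
        for message in messages
        if contact_id in message.get("mentioned_contact_ids", [])
    ]


# Made-up sample conversation_message entities.
sample_messages = [
    {"message": "Can you review this?", "mentioned_contact_ids": [12]},
    {"message": "Deployed to staging.", "mentioned_contact_ids": []},
    {"message": "Thanks both!", "mentioned_contact_ids": [12, 15]},
]

mentions_of_12 = messages_mentioning(sample_messages, 12)
```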
### `babelfish` — Translation Bridge
Ingests translated chat messages from the Babelfish service.
- **Entity types:** `conversation_message`, `threaded_conversation`
- **Use cases:** Similar to Zulip but for translated cross-language conversations
- **Tip:** Query alongside the `zulip` connector for complete conversation coverage.
### `hedgedoc` — Collaborative Documents
Syncs documents from HedgeDoc (collaborative markdown editor).
- **Entity type:** `document`
- **Key fields:** `content`, `description`, `url`, `revision_id`
- **Use cases:** Find documents by content, track document revisions
- **Tip:** Use `search()` for semantic document search rather than the `query_entities` text filter.
### `contactdb` — Contact Sync (Static)
Mirrors contacts from ContactDB into DataIndex for unified search.
- **Entity type:** `contact`
- **Note:** This is a read-only mirror. Use ContactDB MCP tools directly for contact operations.
### `browser_history` — Browser Extension (Static)
Captures visited webpages from a browser extension.
- **Entity type:** `webpage`
- **Key fields:** `url`, `visit_time`, `text_content`
- **Use cases:** Find previously visited pages, search page content
### `api_document` — API Documents (Static)
Documents ingested via the REST API (e.g., uploaded PDFs, imported files).
- **Entity type:** `document`
- **Note:** These are ingested via `POST /api/v1/ingest/documents`, not periodic sync.
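
As a rough illustration of the ingestion path, the sketch below builds (but does not send) a request to `POST /api/v1/ingest/documents` using only the Python standard library. The request body schema is not specified in this reference, so the `title` and `content` fields, the base URL, and the helper name `build_ingest_request` are all assumptions. Check the API documentation for the real payload and any authentication requirements.

```python
import json
import urllib.request

INGEST_PATH = "/api/v1/ingest/documents"  # endpoint path from the note above


def build_ingest_request(base_url: str, document: dict) -> urllib.request.Request:
    """Build a JSON POST request for the document ingest endpoint.

    ASSUMPTION: the endpoint accepts a JSON body; the body schema is not
    documented here, so `document` is passed through unchanged.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + INGEST_PATH,
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical base URL and payload fields, for illustration only.
request = build_ingest_request(
    "http://localhost:8000/",
    {"title": "Quarterly report", "content": "Full text of the document"},
)
# To actually send it: urllib.request.urlopen(request)
```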