Workspace Model
MailAtlas stores email artifacts in one local workspace root. The default workspace root is .mailatlas in the current directory unless you set MAILATLAS_HOME, pass --root, or configure a project root.
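As an illustration of how a root might be chosen, here is a minimal sketch. The helper name `resolve_workspace_root` and the precedence order (`--root` first, then `MAILATLAS_HOME`, then `.mailatlas` in the current directory) are assumptions for this example, not confirmed MailAtlas behavior. The configured project root is left out because its format is not described here.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_workspace_root(
    cli_root: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    cwd: Optional[Path] = None,
) -> Path:
    """Hypothetical helper: choose a workspace root.

    Assumed precedence (not confirmed by MailAtlas):
    an explicit --root value, then MAILATLAS_HOME, then ./.mailatlas.
    """
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else cwd
    if cli_root:
        return Path(cli_root)
    if env.get("MAILATLAS_HOME"):
        return Path(env["MAILATLAS_HOME"])
    return cwd / ".mailatlas"
```

Passing `env` and `cwd` explicitly keeps the function easy to test without touching the real process environment.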
Think of the workspace as the durable boundary between MailAtlas and your application. The CLI, Python API, MCP server, exports, and outbound send workflow all read from or write to this local root.
The built-in workspace uses:
- Files on disk for raw inbound messages, HTML snapshots, extracted assets, exports, and outbound audit artifacts.
- SQLite for metadata, lookup, dedupe, receive accounts, receive cursors, receive runs, IMAP receive cursors, and outbound send records.
This is the default local storage layout. Applications can copy the resulting files and metadata into their own storage systems when needed.
Directory layout
    .mailatlas/
      store.db
      raw/
      html/
      assets/
      exports/
      outbound/
        raw/
        text/
        html/
        attachments/

| Path | Purpose |
|---|---|
| store.db | SQLite index for document metadata, lookup, dedupe, receive account state, receive cursors, receive run history, IMAP receive cursors, and outbound records. |
| raw/ | Original inbound email bytes, usually stored as .eml files. |
| html/ | Normalized HTML body snapshots with local asset references when an inbound message contains HTML. |
| assets/ | Extracted inline images and regular file attachments from inbound messages. |
| exports/ | Default destination for file-based outputs such as PDF exports when --out is omitted. |
| outbound/raw/ | Rendered outbound .eml snapshots. |
| outbound/text/ | Plain-text outbound body files. |
| outbound/html/ | HTML outbound body files. |
| outbound/attachments/ | Copied outbound attachments. |
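Because the layout is plain files plus one SQLite database, it can be inspected with the Python standard library alone. The sketch below counts files under each documented directory and lists whatever tables exist in store.db. The function name `describe_workspace` is hypothetical, and the code makes no assumption about table names. It opens the database read-only so that inspection cannot modify the workspace.

```python
import sqlite3
from pathlib import Path

# Directories named in the layout table above.
EXPECTED_DIRS = [
    "raw", "html", "assets", "exports",
    "outbound/raw", "outbound/text", "outbound/html", "outbound/attachments",
]


def describe_workspace(root):
    """Hypothetical helper: summarize a workspace without changing it."""
    root = Path(root)
    report = {"dirs": {}, "tables": []}
    for rel in EXPECTED_DIRS:
        path = root / rel
        # None means the directory does not exist yet.
        report["dirs"][rel] = (
            sum(1 for p in path.rglob("*") if p.is_file()) if path.is_dir() else None
        )
    db = root / "store.db"
    if db.is_file():
        # mode=ro opens the SQLite file read-only.
        conn = sqlite3.connect(db.resolve().as_uri() + "?mode=ro", uri=True)
        try:
            rows = conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            ).fetchall()
            report["tables"] = [row[0] for row in rows]
        finally:
            conn.close()
    return report
```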
Why this layout exists
The workspace is designed to be inspectable:
- You can inspect every stage of the pipeline.
- Raw messages stay linked to parsed records.
- Assets stay next to the documents that reference them.
- SQLite is enough for document listing, lookup, dedupe, receive state, cursor state, and send records.
- Exported files are ordinary artifacts that can be reviewed, copied, archived, or indexed elsewhere.
What MailAtlas stores
MailAtlas can store raw email bytes, cleaned body text, normalized HTML, extracted inline files, extracted attachments, document metadata, parser notes, exported artifacts, receive account state, receive cursor state, receive run history, IMAP receive cursor state, outbound records, and copied outbound attachments.
BCC recipients are a special case: MailAtlas omits BCC from local raw MIME snapshots while preserving BCC in SQLite for audit.
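The BCC handling can be pictured with the standard library `email` package. This is a sketch of the idea only. The function name `snapshot_without_bcc` is hypothetical, and MailAtlas's own rendering code may differ. It returns the raw bytes with the Bcc header removed, plus the BCC values a caller could record separately.

```python
import copy
from email import policy
from email.message import EmailMessage


def snapshot_without_bcc(msg: EmailMessage):
    """Hypothetical helper: return (raw_bytes_without_bcc, bcc_values)."""
    bcc_values = [str(value) for value in msg.get_all("Bcc", [])]
    # Work on a copy so the caller's message keeps its Bcc header.
    clone = copy.deepcopy(msg)
    del clone["Bcc"]  # removes every Bcc header; no error if none exist
    return clone.as_bytes(policy=policy.SMTP), bcc_values
```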
Document lifecycle
File ingest
- MailAtlas reads an .eml file or mbox archive.
- It parses each message.
- It stores raw bytes in raw/.
- It stores normalized HTML in html/ when available.
- It extracts inline images and attachments into assets/.
- It writes document metadata to store.db.
- It returns document references with IDs.
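The file-based steps above can be approximated with the standard library. This is a simplified sketch, not MailAtlas's parser. The function name `ingest_eml`, the ID scheme (a truncated SHA-256 of the raw bytes), and the per-document subfolder under assets/ are all assumptions made for the example. It skips HTML normalization and the store.db write, and it only picks up top-level attachments, so inline images nested inside multipart/related parts are not extracted.

```python
import hashlib
from email import message_from_bytes, policy
from pathlib import Path


def ingest_eml(raw_bytes: bytes, root: Path) -> dict:
    """Hypothetical sketch of file ingest for one message."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    doc_id = hashlib.sha256(raw_bytes).hexdigest()[:16]  # assumed ID scheme

    # Store the original bytes untouched.
    raw_path = root / "raw" / f"{doc_id}.eml"
    raw_path.parent.mkdir(parents=True, exist_ok=True)
    raw_path.write_bytes(raw_bytes)

    # Store the HTML body when the message has one (no normalization here).
    html_path = None
    html_part = msg.get_body(preferencelist=("html",))
    if html_part is not None:
        html_path = root / "html" / f"{doc_id}.html"
        html_path.parent.mkdir(parents=True, exist_ok=True)
        html_path.write_text(html_part.get_content(), encoding="utf-8")

    # Extract top-level attachments; .name drops any directory components.
    asset_paths = []
    for index, part in enumerate(msg.iter_attachments()):
        name = Path(part.get_filename() or f"part-{index}").name
        target = root / "assets" / doc_id / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(part.get_payload(decode=True) or b"")
        asset_paths.append(target)

    return {"id": doc_id, "raw": raw_path, "html": html_path, "assets": asset_paths}
```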
IMAP receive
- MailAtlas connects to selected IMAP folders.
- It fetches messages not already covered by cursor state when possible.
- It runs the same parsing and storage path as file ingest.
- It stores per-folder cursor state in SQLite.
- It does not store mailbox passwords or OAuth access tokens.
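One common way to implement a per-folder cursor in IMAP is to remember the folder's UIDVALIDITY and the highest UID already processed. The sketch below shows that general technique. It is not confirmed to be how MailAtlas tracks cursor state, and `FolderCursor`, `uid_range`, and `unseen_uids` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class FolderCursor:
    uidvalidity: int  # changes when the server renumbers the folder
    last_uid: int     # highest UID already processed


def _cursor_usable(cursor: Optional[FolderCursor], server_uidvalidity: int) -> bool:
    return cursor is not None and cursor.uidvalidity == server_uidvalidity


def uid_range(cursor: Optional[FolderCursor], server_uidvalidity: int) -> str:
    """Build a UID range; fall back to a full scan if the cursor is unusable."""
    if not _cursor_usable(cursor, server_uidvalidity):
        return "1:*"
    return f"{cursor.last_uid + 1}:*"


def unseen_uids(
    cursor: Optional[FolderCursor],
    server_uidvalidity: int,
    returned_uids: Iterable[int],
) -> List[int]:
    """Filter server results; 'n:*' can return the last message even if its UID < n."""
    floor = cursor.last_uid if _cursor_usable(cursor, server_uidvalidity) else 0
    return sorted(uid for uid in returned_uids if uid > floor)
```

The full-scan fallback corresponds to the "when possible" qualifier above: if the stored cursor no longer matches the folder, everything is refetched and dedupe is left to catch repeats.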
Gmail receive
- MailAtlas reads a short-lived Gmail access token from a flag, environment variable, or the local Gmail token store.
- It lists Gmail message candidates by label, query, or incremental history cursor.
- It fetches full raw messages through the Gmail API.
- It decodes Gmail raw payloads into RFC 2822 bytes.
- It runs the same parsing and storage path as file ingest.
- It stores Gmail provider metadata on the document, including message ID, thread ID, label IDs, history ID, internal date, and receive account ID.
- It updates the receive cursor after a successful pass.
- It stores receive account, cursor, and run records in SQLite.
Receive is read-only. MailAtlas does not mark Gmail messages read, archive them, delete them, or change labels.
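The decoding step in the list above is small enough to show. When a Gmail API message is fetched in raw format, the `raw` field is a base64url-encoded string, which may arrive without trailing padding. The helper name `decode_gmail_raw` is hypothetical, and this is a sketch of the decoding only, not of MailAtlas's receive code.

```python
import base64


def decode_gmail_raw(raw_field: str) -> bytes:
    """Decode a Gmail API 'raw' field into message bytes."""
    # Restore any '=' padding that was stripped from the base64url string.
    padded = raw_field + "=" * (-len(raw_field) % 4)
    return base64.urlsafe_b64decode(padded)
```

The resulting bytes can be handed to the same parsing path as an .eml file.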
Export
- You request an export with mailatlas get <document-id> --format ... or the Python API.
- MailAtlas reads the stored document and local artifacts.
- It writes or returns JSON, Markdown, HTML, or PDF depending on the format.
- It writes to --out if provided.
- For PDF without --out, it writes to exports/<document-id>.pdf.
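The output-path rules above can be summarized in a few lines. The function name `resolve_export_path` is hypothetical. Returning `None` for non-PDF formats without `--out` is an assumption that stands in for "the content is returned rather than written".

```python
from pathlib import Path
from typing import Optional


def resolve_export_path(
    root, document_id: str, fmt: str, out: Optional[str] = None
) -> Optional[Path]:
    """Hypothetical helper: decide where an export is written, if anywhere."""
    if out:
        return Path(out)  # an explicit --out always wins
    if fmt == "pdf":
        return Path(root) / "exports" / f"{document_id}.pdf"
    return None  # assumption: other formats are returned, not written
```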
Outbound send
- MailAtlas validates the outbound message fields.
- It renders a raw .eml snapshot and body files.
- It copies attachments into outbound/attachments/.
- It stores an outbound record in SQLite.
- It contacts the configured provider unless the message is a dry run.
- It updates provider status, provider message ID, error details, timestamps, and retry metadata.
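The last two steps, contacting the provider and updating the record, can be sketched as a small state update. Everything named here is hypothetical: the `OutboundRecord` fields, the status strings, and the `send_outbound` function are illustrations, not MailAtlas's schema or API. The provider is modeled as any callable that returns a provider message ID or raises.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class OutboundRecord:
    subject: str
    status: str = "draft"  # hypothetical status names throughout
    provider_message_id: Optional[str] = None
    error: Optional[str] = None
    updated_at: Optional[datetime] = None


def send_outbound(
    record: OutboundRecord,
    provider: Callable[[OutboundRecord], str],
    dry_run: bool = False,
) -> OutboundRecord:
    """Contact the provider unless dry_run, then record the outcome."""
    if dry_run:
        record.status = "dry_run"  # the provider is never called
    else:
        try:
            record.provider_message_id = provider(record)
            record.status = "sent"
        except Exception as exc:  # keep the failure on the record for audit
            record.status = "error"
            record.error = str(exc)
    record.updated_at = datetime.now(timezone.utc)
    return record
```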
Dedupe
MailAtlas deduplicates by message_id when present and falls back to a normalized content hash otherwise.
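A dedupe key along these lines might look like the following sketch. The function name `dedupe_key`, the key prefixes, and the normalization rule (collapsing runs of whitespace before hashing with SHA-256) are assumptions. This page does not specify how MailAtlas normalizes content or which hash it uses.

```python
import hashlib
import re
from typing import Optional


def dedupe_key(message_id: Optional[str], body_text: str) -> str:
    """Hypothetical helper: prefer message_id, else hash normalized content."""
    if message_id and message_id.strip():
        return "mid:" + message_id.strip()
    # Assumed normalization: collapse whitespace so reflowed copies match.
    normalized = re.sub(r"\s+", " ", body_text).strip()
    return "hash:" + hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

The prefixes keep the two key spaces separate, so a message ID can never collide with a content hash.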
Security note
Treat the workspace as sensitive source data. It can contain raw email, attachments, BCC recipients, outbound drafts, sent-message records, and exported files. Review workspace contents before committing, sharing, or uploading them.
Next step
- Use Document Schema for the stored fields.
- Use Security and Privacy for operational guidance.
- Use CLI Overview to work with the workspace from a shell.