Example: Ingest an mbox Archive
Use this example when you already have a mailbox archive on disk.
An mbox file is a local mailbox file that can contain many messages. It is often created by an export tool or another local mail program.
Ingesting an mbox file is not the same as IMAP receive. Use IMAP Receive when MailAtlas should connect to a live mailbox and fetch selected folders.
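To see what an archive contains before ingesting it, Python's standard-library `mailbox` module can read mbox files directly. The sketch below is independent of MailAtlas and does not show how MailAtlas parses archives. The helper names `write_demo_mbox` and `summarize_mbox` are hypothetical, and the demo archive is built in a temporary directory only so the example runs on its own.

```python
import mailbox
import tempfile
from email.message import EmailMessage
from pathlib import Path


def write_demo_mbox(path):
    """Create a tiny two-message mbox file (illustration only)."""
    box = mailbox.mbox(path)
    try:
        for n in (1, 2):
            msg = EmailMessage()
            msg["From"] = "sender@example.com"
            msg["To"] = "reader@example.com"
            msg["Subject"] = f"Demo message {n}"
            msg.set_content("Hello from the demo archive.\n")
            box.add(msg)
        box.flush()
    finally:
        box.close()


def summarize_mbox(path):
    """Return (message_count, subjects) for an existing mbox file."""
    # create=False raises an error instead of silently creating an empty file.
    box = mailbox.mbox(path, create=False)
    try:
        subjects = [str(msg.get("Subject", "")) for msg in box]
        return len(subjects), subjects
    finally:
        box.close()


with tempfile.TemporaryDirectory() as tmp:
    demo_path = str(Path(tmp) / "demo.mbox")
    write_demo_mbox(demo_path)
    count, subjects = summarize_mbox(demo_path)
    print(count, subjects)
```

Pointing `summarize_mbox` at a real export gives a quick sanity check on the message count before running an ingest.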
Before you start
    python -m pip install mailatlas
    export MAILATLAS_HOME="$PWD/.mailatlas"

If you need sample fixtures:

    git clone https://github.com/mailatlas/sample-data.git

Ingest the archive

    mailatlas ingest sample-data/fixtures/mbox/atlas-demo.mbox

MailAtlas iterates over each message in the archive, parses it, preserves source metadata, extracts assets, deduplicates records, and writes the results into the local filesystem plus SQLite workspace.
Expected output shape:
    {
      "status": "ok",
      "ingested_count": 5,
      "duplicate_count": 0,
      "document_refs": [
        {
          "id": "<document-id>",
          "subject": "<subject>",
          "source_kind": "mbox",
          "created_at": "<timestamp>"
        }
      ]
    }

Inspect and export

    mailatlas list
    mailatlas get <document-id>
    mailatlas get <document-id> --format json --out ./mbox-message.json

Duplicate behavior

If the same message appears more than once, MailAtlas deduplicates by message_id when present and falls back to a normalized content hash otherwise.
A nonzero duplicate_count is expected when an archive overlaps with messages already stored in the workspace.
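The rule described above (message ID first, content hash as a fallback) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not MailAtlas's implementation: the source does not say how content is normalized or which hash is used, so collapsing whitespace over the subject and body and hashing with SHA-256 are assumptions, and `dedup_key`, `count_duplicates`, and `make_message` are hypothetical names.

```python
import hashlib
from email.message import EmailMessage


def dedup_key(msg):
    """Prefer the Message-ID header; otherwise hash normalized content."""
    message_id = str(msg.get("Message-ID", "")).strip()
    if message_id:
        return f"message_id:{message_id}"
    # Assumed normalization: subject plus text body, whitespace collapsed.
    part = msg.get_body(preferencelist=("plain", "html"))
    body = part.get_content() if part is not None else ""
    normalized = " ".join(f"{msg.get('Subject', '')} {body}".split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"content_hash:{digest}"


def count_duplicates(messages):
    """Return (ingested, duplicates) for an iterable of messages."""
    seen = set()
    ingested = duplicates = 0
    for msg in messages:
        key = dedup_key(msg)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
            ingested += 1
    return ingested, duplicates


def make_message(subject, body, message_id=None):
    """Build a small test message (illustration only)."""
    msg = EmailMessage()
    msg["Subject"] = subject
    if message_id is not None:
        msg["Message-ID"] = message_id
    msg.set_content(body)
    return msg


messages = [
    make_message("Report", "Quarterly numbers\n", "<a1@example.com>"),
    make_message("Report", "Quarterly numbers\n", "<a1@example.com>"),  # same ID
    make_message("Notes", "Hello world\n"),
    make_message("Notes", "Hello   world\n"),  # same content after normalization
]
result = count_duplicates(messages)
print(result)
```

In this sketch the second and fourth messages count as duplicates: one matches on its message ID, the other on its normalized content hash.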
When to use this example
Use mbox ingest when you have a mailbox export, want repeatable local parsing, want to build a retrieval corpus from an archive, and do not need MailAtlas to connect to a live mailbox.
Use IMAP receive instead when messages still live in a mailbox and should be fetched over IMAP.