1. What does BICHON_ENCRYPT_PASSWORD encrypt? Are email contents encrypted?
BICHON_ENCRYPT_PASSWORD is used to encrypt sensitive information such as user login passwords, OAuth2 client_id/client_secret, access/refresh tokens obtained during authorization, and IMAP account passwords.
All of this data is stored in encrypted form to prevent potential leakage.
It is important to note that this password does not encrypt the email contents themselves. Bichon does not apply end-to-end encryption to emails. Its goal is secure storage and fast search, not functioning as an encrypted email service.
Because this password does not encrypt email contents, forgetting it does not cost you access to your emails: your archived messages always remain recoverable. Additionally, both Bichon and Tantivy are fully open-source, so how your data is stored and processed is transparent and can be inspected.
2. How are archived emails stored? Why not use PostgreSQL or other databases?
Bichon’s design philosophy is to keep the system simple, stable, easy to deploy, and low-maintenance. Introducing an external database would mean:
- More operational work
- More potential points of failure
- More complex upgrade and backup procedures
- Additional cognitive load for users who don’t need it
For the vast majority of users, backing up personal emails does not require a microservice architecture or a distributed system — just a reliable single-node application. This is why Bichon manages its own metadata and archive storage instead of relying on PostgreSQL or other external databases.
Bichon uses Tantivy, a mature Rust full-text search engine that serves as the core of Quickwit (which has been acquired by Datadog) and is widely used in production. Its reliability is well established.
Bichon uses two Tantivy indexes:
(1) Metadata Index
The metadata index stores far more than just subject and body text. It contains all searchable and filterable information extracted from each email, including:
- Identification fields: Email ID, account ID, mailbox ID, UID, thread ID, Message-ID
- Addresses: From, To, Cc, Bcc (stored as exact-match fields for fast filtering)
- Content fields: Subject (tokenized), full-text body (plain text when available, otherwise extracted from HTML)
- Date fields: Sent date and internal date (stored as numeric values for fast range queries)
- Size information: Email size (for filtering and range queries)
- Attachment information: Attachment names, tags, and whether the email contains any attachments
- Labels / custom tags: Stored as facets for structured filtering
These fields enable:
- Full-text search on subjects and bodies
- Exact match search on addresses
- Fast filtering by mailbox, account, thread, or date range
- Attachment-aware search
- Faceted filtering (e.g., by tags or categories)
In short, the metadata index is a structured, high-performance search layer that supports both full-text queries and rich filtering capabilities across your entire mail archive.
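To make the combination of exact-match, range, tag, and text filters concrete, here is a minimal in-memory sketch. It is not Bichon's schema or Tantivy's API: the `EmailMetadata` class, the `search` function, and the sample records are hypothetical, and the text match is a naive word lookup rather than a real inverted index.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class EmailMetadata:
    """Hypothetical stand-in for one entry of the metadata index."""
    email_id: int
    account_id: int
    mailbox_id: int
    sender: str
    subject: str
    sent_at: int            # Unix timestamp; numeric so range filters are cheap
    size: int               # bytes
    has_attachment: bool
    tags: list = field(default_factory=list)


def ts(year, month, day):
    """Helper: UTC midnight of the given date as a Unix timestamp."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def search(records, *, sender=None, sent_from=None, sent_to=None,
           tag=None, text=None):
    """Return records matching every filter that was supplied."""
    hits = []
    for r in records:
        if sender is not None and r.sender != sender:          # exact match
            continue
        if sent_from is not None and r.sent_at < sent_from:    # range start
            continue
        if sent_to is not None and r.sent_at >= sent_to:       # range end
            continue
        if tag is not None and tag not in r.tags:              # facet-like
            continue
        if text is not None and text.lower() not in r.subject.lower().split():
            continue                                           # naive word match
        hits.append(r)
    return hits


RECORDS = [
    EmailMetadata(1, 1, 10, "alice@example.com", "Quarterly invoice attached",
                  ts(2024, 1, 15), 120_000, True, ["finance"]),
    EmailMetadata(2, 1, 10, "bob@example.com", "Lunch on Friday",
                  ts(2024, 2, 3), 4_000, False),
    EmailMetadata(3, 1, 11, "alice@example.com", "Invoice reminder",
                  ts(2024, 3, 1), 6_000, False, ["finance"]),
]
```

For example, `search(RECORDS, sender="alice@example.com", sent_from=ts(2024, 2, 1))` narrows the archive by address and date in a single pass, which is the kind of combined query the metadata index is meant to answer.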
(2) Email Content Index
The email content index stores the complete raw message data for every email. Each message is saved as the full EML file encoded directly as bytes, without tokenization or the creation of an inverted index. In practice, this index functions as a high-efficiency key–value store, where the email ID serves as the key and the full message content is the value. The content index is linked to the metadata index through this ID, allowing Bichon to quickly retrieve and parse any message on demand.
Because this index preserves the entire original EML as-is, all emails can be fully recovered even if the metadata becomes corrupted or lost, simply by iterating through the stored messages. Both the storage format and the indexing logic are completely open-source and transparent, so there is no vendor lock-in and your data remains accessible and portable at all times.
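The recovery idea can be sketched with Python's standard `email` package: given raw EML bytes keyed by email ID, the searchable fields can be derived again from the messages alone. This is an illustration, not Bichon's code; the `content_store` dictionary stands in for the content index, and `rebuild_metadata` and its output fields are hypothetical names.

```python
from email import message_from_bytes, policy
from email.utils import parsedate_to_datetime


def rebuild_metadata(content_store):
    """Re-derive searchable fields from raw EML bytes keyed by email ID."""
    rebuilt = {}
    for email_id, raw in content_store.items():
        msg = message_from_bytes(raw, policy=policy.default)
        # Prefer the plain-text part, fall back to HTML if that is all there is.
        body = msg.get_body(preferencelist=("plain", "html"))
        sent = msg["Date"]
        rebuilt[email_id] = {
            "message_id": str(msg["Message-ID"]) if msg["Message-ID"] else None,
            "from": str(msg["From"]) if msg["From"] else None,
            "subject": str(msg["Subject"]) if msg["Subject"] else None,
            "sent_at": parsedate_to_datetime(str(sent)).timestamp() if sent else None,
            "body": body.get_content() if body is not None else "",
            "size": len(raw),
        }
    return rebuilt


# Hypothetical sample message used only for illustration.
RAW = (
    b"From: alice@example.com\r\n"
    b"To: bob@example.com\r\n"
    b"Subject: Hello\r\n"
    b"Date: Mon, 15 Jan 2024 10:00:00 +0000\r\n"
    b"Message-ID: <abc123@example.com>\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Archived body text.\r\n"
)

recovered = rebuild_metadata({42: RAW})
```

The key (`42` here) is the only link needed between the two indexes, which is why keeping the raw message intact is enough to reconstruct everything else.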
3. Why not store each email as an individual .eml file?
Because storing every message as its own file has several drawbacks:
- Large numbers of small files consume a huge number of inodes
- They put heavy pressure on the filesystem and cause fragmentation
- Small files are difficult to compress efficiently
- The approach does not scale to large email archives
Tantivy, on the other hand, can:
- Compress email contents
- Manage segments transparently
- Perform automatic merges and optimization
- Control compression levels
Bichon compresses emails in 2 MB compression blocks using zstd level 6. Since most emails are only a few hundred KB, multiple messages are packed into each block. In real usage, this reduces storage by about 50% compared to raw EML files. The block size is part of Bichon’s own design and may become configurable in future versions.
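The effect of packing several messages into one compression block can be sketched as follows. This is not how Tantivy lays out its store on disk: `zlib` from the standard library is used as a stand-in for zstd, and `pack_blocks`, `read_message`, and the location tuples are hypothetical names chosen for the illustration.

```python
import zlib

BLOCK_SIZE = 2 * 1024 * 1024   # 2 MB, the block size described above
LEVEL = 6                      # zlib level 6 here; Bichon uses zstd level 6


def pack_blocks(messages, block_size=BLOCK_SIZE):
    """Group raw messages into blocks and compress each block as a whole.

    Returns (blocks, locations), where locations[i] is the
    (block_number, offset, length) of message i inside its block.
    A message larger than block_size ends up alone in its own block.
    """
    blocks, locations = [], []
    current = bytearray()
    for raw in messages:
        # Close the current block if this message would overflow it.
        if current and len(current) + len(raw) > block_size:
            blocks.append(zlib.compress(bytes(current), LEVEL))
            current = bytearray()
        locations.append((len(blocks), len(current), len(raw)))
        current += raw
    if current:
        blocks.append(zlib.compress(bytes(current), LEVEL))
    return blocks, locations


def read_message(blocks, locations, index):
    """Decompress the containing block and slice out one message."""
    block_no, offset, length = locations[index]
    data = zlib.decompress(blocks[block_no])
    return data[offset:offset + length]
```

Messages in the same archive tend to share headers, signatures, and quoted text, so compressing them together usually finds more redundancy than compressing each one alone. The trade-off is that reading one message means decompressing its whole block.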
4. How does Bichon sync emails?
When you add a new email account, Bichon starts a scheduled task for it:
- Initial task: Downloads emails from the IMAP server in batches of 50, parses them, and stores them in a local index on your hard drive.
- Incremental sync: After the initial download, Bichon checks the largest UID in the local index at the configured interval (default 10 minutes) and downloads any emails with a higher UID.
Please note that the term “sync” in Bichon may be misleading; “download” might be more accurate. Bichon works differently from a typical email client:
- An email client syncs changes in email flags (e.g., read/unread, custom flags) and deletions.
- Bichon does not sync these changes; it only downloads new emails to the local index.
- Bichon will never delete anything from your IMAP server.
In short, Bichon is an email archiver and indexer, not a full-featured email client.
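The download-only behavior described in this section can be sketched as a single sync pass. This is a simplified model under stated assumptions, not Bichon's implementation: `FakeImapServer` is a hypothetical in-memory stand-in for an IMAP mailbox, `local_index` is a plain dictionary keyed by UID, and scheduling, parsing, and error handling are left out.

```python
BATCH_SIZE = 50  # batch size mentioned above


class FakeImapServer:
    """Hypothetical stand-in for an IMAP mailbox: maps UID -> raw bytes."""

    def __init__(self, messages):
        self.messages = dict(messages)

    def uids_after(self, uid):
        """Return all UIDs greater than `uid`, in ascending order."""
        return sorted(u for u in self.messages if u > uid)

    def fetch(self, uids):
        """Return (uid, raw_message) pairs for the requested UIDs."""
        return [(u, self.messages[u]) for u in uids]


def sync_once(server, local_index):
    """Download messages whose UID exceeds the local maximum, in batches.

    Only adds to local_index. It never removes local entries and never
    modifies the server, mirroring the download-only behavior above.
    """
    highest = max(local_index, default=0)
    pending = server.uids_after(highest)
    downloaded = 0
    for start in range(0, len(pending), BATCH_SIZE):
        batch = pending[start:start + BATCH_SIZE]
        for uid, raw in server.fetch(batch):
            local_index[uid] = raw
            downloaded += 1
    return downloaded
```

Running `sync_once` on an empty local index plays the role of the initial task, and every later call acts as an incremental pass. A message deleted on the server after it was archived stays in `local_index`, because the function only compares UIDs upward.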