Guide

Email Collection for E-Discovery: PST, IMAP, and Best Practices

April 4, 2026 · 8 min read

Email remains the single most important source of electronically stored information (ESI) in litigation. In most civil cases, the key evidence — the conversations that prove intent, establish timelines, and reveal decision-making — lives in email. Getting email collection right is one of the most critical steps in the e-discovery process.

Why email dominates e-discovery

Email typically accounts for the largest share of ESI in litigation. There are several reasons:

  • Volume — The average business professional sends and receives over 100 emails per day. Over the course of a multi-year business relationship, that adds up to tens of thousands of messages per custodian, many of them potentially relevant.
  • Candor — People tend to be less guarded in email than in formal documents. Emails often contain the unfiltered thoughts, opinions, and decisions that are most relevant to litigation.
  • Attachments — Emails carry documents, spreadsheets, presentations, and images as attachments, making them a hub for other ESI.
  • Metadata — Email inherently includes timestamps, sender/recipient information, CC/BCC fields, and threading data that establishes timelines and communication patterns.

Preservation obligations

Before collecting a single email, you need to ensure preservation obligations are met. Once litigation is reasonably anticipated, the duty to preserve attaches. For email, this means:

  • Issue a litigation hold — Notify all relevant custodians that they must preserve their email and must not delete, modify, or move messages related to the matter.
  • Suspend auto-delete policies — Many organizations have email retention policies that automatically purge messages after a set period. These must be suspended for custodians under hold.
  • Preserve mailbox state — Document the state of each custodian's mailbox at the time of the hold, including folder structure and approximate message counts.
  • Include all accounts — Don't overlook personal email accounts, shared mailboxes, distribution lists, or archived mailboxes that may contain relevant messages.

Failure to preserve email can result in spoliation sanctions. Courts have imposed severe penalties — including adverse inference instructions and default judgments — for email destruction after the duty to preserve attached.
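
The "preserve mailbox state" step above can be partly automated. The following is a minimal sketch, assuming the mailbox is reachable over IMAP and that `conn` is an already authenticated `imaplib` connection. The function names and the shape of the record are illustrative assumptions, not part of any particular platform.

```python
import re
from datetime import datetime, timezone

# Matches the message count in an IMAP STATUS reply such as
# b'"INBOX" (MESSAGES 231)'.
_MESSAGES_RE = re.compile(rb"\(MESSAGES (\d+)\)")


def parse_message_count(status_line: bytes) -> int:
    """Extract the MESSAGES count from one IMAP STATUS response line."""
    match = _MESSAGES_RE.search(status_line)
    if match is None:
        raise ValueError(f"unexpected STATUS response: {status_line!r}")
    return int(match.group(1))


def snapshot_mailbox(conn, custodian: str, folders: list[str]) -> dict:
    """Record per-folder message counts for one custodian at hold time.

    `conn` is assumed to be an authenticated imaplib.IMAP4 connection
    (or any object with a compatible status() method). Folder names
    containing double quotes are not handled in this sketch.
    """
    counts = {}
    for folder in folders:
        # Quote the name so folders containing spaces are sent correctly.
        typ, data = conn.status(f'"{folder}"', "(MESSAGES)")
        if typ != "OK":
            raise RuntimeError(f"STATUS failed for {folder!r}: {data!r}")
        counts[folder] = parse_message_count(data[0])
    return {
        "custodian": custodian,
        "captured_at_utc": datetime.now(timezone.utc).isoformat(),
        "folders": counts,
    }
```

Storing the returned dictionary alongside the hold notice gives a dated record of folder names and message counts that can later be compared against what was actually collected.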

Collection method 1: PST export

A PST (Personal Storage Table) file is a Microsoft Outlook data file that contains a copy of a user's mailbox — emails, attachments, calendar items, and contacts. PST export is the most common method for collecting email from Microsoft environments.

How it works:

  1. An administrator (or the custodian) exports the relevant mailbox or folders to a .pst file using Outlook or Exchange admin tools
  2. The PST file is transferred to the e-discovery platform for processing
  3. The platform parses the PST, extracting individual emails, attachments, and metadata

Advantages: Well-understood format, widely supported, captures folder structure, works offline once exported.

Disadvantages: Can be very large (10+ GB per custodian), export process can be slow, requires Outlook or Exchange access, point-in-time snapshot only (new emails after export are missed).

Collection method 2: IMAP direct connection

IMAP (Internet Message Access Protocol) allows an e-discovery tool to connect directly to a custodian's mailbox and pull emails without requiring a manual export step.

How it works:

  1. You provide the e-discovery platform with the IMAP server address, port, and credentials (or OAuth token) for the custodian's account
  2. The platform connects to the mailbox and downloads messages from specified folders or date ranges
  3. Emails and attachments are processed and indexed automatically
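
The steps above can be sketched with Python's standard `imaplib` module. This is an illustration under stated assumptions rather than how any specific platform implements collection: the host, credentials, and folder name are placeholders, password login is used for brevity (OAuth flows are provider-specific and not shown), and `collect_folder` is a hypothetical helper name.

```python
import imaplib
from datetime import date

_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(d: date) -> str:
    """Format a date the way IMAP SEARCH expects, independent of locale."""
    return f"{d.day:02d}-{_MONTHS[d.month - 1]}-{d.year}"


def date_range_criteria(start: date, end: date) -> list[str]:
    """SEARCH criteria for messages whose internal date is in [start, end).

    SINCE/BEFORE test the server's internal (received) date; SENTSINCE and
    SENTBEFORE would test the Date header instead.
    """
    return ["SINCE", imap_date(start), "BEFORE", imap_date(end)]


def collect_folder(host, user, password, folder, start, end, port=993):
    """Yield (sequence_number, raw_message_bytes) for each message in range."""
    with imaplib.IMAP4_SSL(host, port) as conn:
        conn.login(user, password)
        # readonly=True opens the folder without changing message flags.
        typ, _ = conn.select(f'"{folder}"', readonly=True)
        if typ != "OK":
            raise RuntimeError(f"cannot select folder {folder!r}")
        typ, data = conn.search(None, *date_range_criteria(start, end))
        if typ != "OK":
            raise RuntimeError("SEARCH failed")
        for num in data[0].split():
            # BODY.PEEK[] fetches the full message without marking it read.
            typ, parts = conn.fetch(num, "(BODY.PEEK[])")
            if typ == "OK" and isinstance(parts[0], tuple):
                yield num.decode(), parts[0][1]
```

Opening the folder read-only and fetching with `BODY.PEEK[]` matters for defensibility: the collection should not change the read/unread state of the custodian's messages. Each yielded byte string is a complete message that can be written out as an .eml file.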

Advantages: No manual export step, can target specific folders or date ranges, works with any IMAP-compatible email provider (Gmail, Outlook, Yahoo, etc.), can be re-run to capture new messages.

Disadvantages: Requires network access to the mail server, authentication setup may involve IT coordination, download speed depends on mailbox size and network conditions.

Metadata considerations

Email metadata is often as important as the message content itself. During collection, ensure your method preserves:

  • Date/time fields — Sent date, received date, and time zone information
  • Participants — From, To, CC, and BCC fields (BCC is only available from the sender's copy)
  • Subject line — Including RE:/FW: prefixes that indicate threading
  • Attachment names — Original filenames of all attachments
  • Message ID and threading — The Message-ID, In-Reply-To, and References headers, which establish the conversation thread and parent-child relationships
  • Folder path — Which folder the message was stored in (Inbox, Sent, custom folders)

Collecting email as screenshots or printed PDFs strips this metadata and is generally not considered defensible. Always collect in native format (PST, EML, or MSG) or via direct IMAP connection.
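
To make the list above concrete, here is a small sketch that reads those fields from a message collected in native form (raw EML bytes), using Python's standard `email` package. The `extract_metadata` name and the output keys are assumptions chosen for illustration. Folder path is not stored in the message headers, so it has to be recorded separately at collection time.

```python
from email import message_from_bytes, policy
from email.utils import getaddresses, parsedate_to_datetime


def extract_metadata(raw: bytes) -> dict:
    """Pull the review-relevant header fields out of one raw message."""
    msg = message_from_bytes(raw, policy=policy.default)

    def text(name: str):
        value = msg[name]
        return str(value).strip() if value is not None else None

    def addresses(name: str) -> list[str]:
        values = [str(v) for v in msg.get_all(name, [])]
        return [addr for _, addr in getaddresses(values) if addr]

    date_header = text("Date")
    return {
        "message_id": text("Message-ID"),
        "in_reply_to": text("In-Reply-To"),
        # References lists the ancestors of this message, oldest first.
        "references": (text("References") or "").split(),
        # Timezone-aware when the Date header carries a UTC offset.
        "sent": parsedate_to_datetime(date_header) if date_header else None,
        "from": addresses("From"),
        "to": addresses("To"),
        "cc": addresses("Cc"),
        # Usually empty unless this is the sender's own copy.
        "bcc": addresses("Bcc"),
        "subject": text("Subject"),
        "attachments": [p.get_filename() for p in msg.iter_attachments()],
    }
```

A screenshot or printed PDF of the same message would carry none of the `message_id`, `in_reply_to`, or `references` values, which is why thread reconstruction depends on native collection.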

Chain of custody

Maintaining chain of custody for collected email means documenting:

  • Who performed the collection and when
  • What source was collected (which mailbox, which folders, what date range)
  • How the data was transferred to the review platform
  • Hash values of the collected files to show they haven't been altered (SHA-256 is preferred; MD5 is still widely used but is no longer collision-resistant)

This documentation protects you if opposing counsel challenges the authenticity or completeness of your email production.
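
As an illustration of the record described above, the following sketch hashes a collected file with SHA-256 and bundles the hash with the who, what, and how of the collection. The field names and the `custody_record` and `verify_unchanged` helpers are assumptions for this example; adapt them to whatever your collection log already captures.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte PSTs do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_record(path: Path, collected_by: str, source: str,
                   transfer_method: str) -> dict:
    """Build one chain-of-custody entry for a collected file."""
    return {
        "file": path.name,
        "size_bytes": path.stat().st_size,
        "sha256": sha256_file(path),
        "collected_by": collected_by,
        "collected_at_utc": datetime.now(timezone.utc).isoformat(),
        "source": source,                    # mailbox, folders, date range
        "transfer_method": transfer_method,  # how it reached the platform
    }


def verify_unchanged(path: Path, record: dict) -> bool:
    """Re-hash the file and compare against the hash recorded at collection."""
    return sha256_file(path) == record["sha256"]
```

Running `verify_unchanged` after each transfer, and again before production, gives a simple check that the file reviewed is byte-for-byte the file that was collected.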

How Athens Search handles email import

Athens Search supports both major email collection methods, making it simple to get email into your review workflow:

  • PST upload — Upload PST files directly. Athens Search parses them automatically, extracting individual emails, attachments, and full metadata. Each email becomes a searchable, reviewable document.
  • IMAP connection — Connect directly to any IMAP-compatible mailbox. Specify folders and date ranges, and Athens Search pulls the messages directly into your case.
  • Automatic processing — Emails are indexed for full-text search immediately upon import. Attachments are extracted and processed separately, linked to their parent email.
  • Custodian assignment — Imported emails are automatically associated with the custodian they belong to, keeping your review organized.

No third-party collection tools. No separate processing vendors. Import email, search it, review it, and produce it — all in one platform.

Need to collect email for a case?

Athens Search handles PST and IMAP email import with full metadata preservation. See it in action.
