Build EDRM XML Load File (Interchange Package)
Skill: Convert a document collection into an EDRM XML interchange package
Region: United States
Category: Legal / eDiscovery
Does: Takes a processed ESI document set (metadata + native/text/image references) and assembles an EDRM XML interchange package, the vendor-neutral load file used to move documents and their metadata, tags, and relationships between eDiscovery platforms (Relativity, Nuix, Everlaw, Reveal).
Spec: EDRM XML Interchange Format v1.2 (edrm.net / LegalXML)
EDRM XML is an industry interchange standard, not a government schema — its authority is the EDRM/LegalXML working group. It travels with an external file store (the natives/images/text referenced by path), so the XML + the file payload must stay together. Element names below follow the EDRM XML v1.2 structure; confirm against the published schema and the target platform's import mapping before delivery.
When this applies
- Moving a processed collection between review platforms or vendors without re-processing — EDRM XML preserves field metadata, families, tags, and custodian assignments.
- An alternative to flat Concordance DAT/OPT load files when relationships (parent/child, duplicates) and typed fields must survive the hand-off.
Structure
<Root DataInterchangeType="Update|Append">
  <Batch>
    <Documents>
      <Document DocType="Message|File|..." MimeType="..." DocID="DOC000001">
        <Tags>
          <Tag TagName="#Custodian" TagDataType="Text" TagValue="Doe, Jane"/>
          <Tag TagName="#DateSent" TagDataType="DateTime" TagValue="2025-01-15T09:30:00"/>
          <Tag TagName="#BegBates" TagDataType="Text" TagValue="ABC000001"/>
          ...
        </Tags>
        <Files>
          <File FileType="Native">
            <ExternalFile FilePath="\NATIVE\001\" FileName="DOC000001.msg" FileSize="..." Hash="<MD5/SHA1>"/>
          </File>
          <File FileType="Text"><ExternalFile FilePath="\TEXT\001\" FileName="DOC000001.txt"/></File>
          <File FileType="Image"><ExternalFile FilePath="\IMAGES\001\" FileName="ABC000001.tif"/></File>
        </Files>
        <Locations/>
      </Document>
    </Documents>
    <Relationships>
      <Relationship Type="Attachment" ParentDocID="DOC000001" ChildDocID="DOC000002"/>
      <Relationship Type="Duplicate" .../>
    </Relationships>
  </Batch>
</Root>
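The nesting above can be assembled programmatically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `build_package` and the input dictionary keys (`doc_id`, `doc_type`, `mime_type`, `tags`, `files`) are hypothetical, and the element and attribute names simply mirror the skeleton above rather than a verified schema.

```python
import xml.etree.ElementTree as ET


def build_package(documents, relationships, interchange_type="Update"):
    """Assemble Root > Batch > Documents/Relationships from plain Python data.

    The input shapes are illustrative assumptions, not part of the EDRM spec:
      documents     -- list of dicts with doc_id, doc_type, mime_type,
                       tags [(name, datatype, value)], files [(type, path, name)]
      relationships -- list of (type, parent_doc_id, child_doc_id) tuples
    """
    root = ET.Element("Root", DataInterchangeType=interchange_type)
    batch = ET.SubElement(root, "Batch")

    docs_el = ET.SubElement(batch, "Documents")
    for doc in documents:
        doc_el = ET.SubElement(
            docs_el,
            "Document",
            DocType=doc["doc_type"],
            MimeType=doc["mime_type"],
            DocID=doc["doc_id"],
        )
        tags_el = ET.SubElement(doc_el, "Tags")
        for name, datatype, value in doc["tags"]:
            ET.SubElement(tags_el, "Tag", TagName=name, TagDataType=datatype, TagValue=value)
        files_el = ET.SubElement(doc_el, "Files")
        for file_type, path, filename in doc["files"]:
            file_el = ET.SubElement(files_el, "File", FileType=file_type)
            ET.SubElement(file_el, "ExternalFile", FilePath=path, FileName=filename)

    rels_el = ET.SubElement(batch, "Relationships")
    for rel_type, parent_id, child_id in relationships:
        ET.SubElement(rels_el, "Relationship", Type=rel_type,
                      ParentDocID=parent_id, ChildDocID=child_id)
    return root


# Example input: one email with a native and a text file, no relationships yet.
example_docs = [{
    "doc_id": "DOC000001",
    "doc_type": "Message",
    "mime_type": "application/vnd.ms-outlook",
    "tags": [("#Custodian", "Text", "Doe, Jane")],
    "files": [("Native", "\\NATIVE\\001\\", "DOC000001.msg"),
              ("Text", "\\TEXT\\001\\", "DOC000001.txt")],
}]
package = build_package(example_docs, [])
xml_text = ET.tostring(package, encoding="unicode")
```

Building through an XML library rather than string concatenation means attribute values such as `Doe, Jane` or values containing `&` and quotes are escaped automatically.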
Key elements
| Element | Carries |
|---|---|
| Root@DataInterchangeType | Update (full) vs Append (add to existing) load semantics |
| Document@DocID | the stable unique key used by relationships and the file store |
| Tag | one metadata field: TagName / TagDataType (Text, DateTime, Number, Boolean, LongText) / TagValue |
| File@FileType | Native, Text (extracted), or Image (TIFF/PDF) |
| ExternalFile | path + filename + size + hash of the actual file on disk |
| Relationship@Type | Attachment (family), Duplicate, NearDuplicate, EmailThread |
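As an illustration of the ExternalFile row, here is a rough sketch of deriving path, filename, size, and hash from a file on disk. The helper name `external_file_attrs` is hypothetical, and the backslash-delimited, payload-relative `FilePath` format is an assumption copied from the sample paths in this document; check the receiving platform's expectations before relying on it.

```python
import hashlib
from pathlib import Path


def external_file_attrs(file_path, payload_root, algorithm="md5"):
    """Return ExternalFile attribute values for one file inside the payload.

    Assumes FilePath is written relative to the payload root with backslash
    separators (for example \\NATIVE\\001\\), as in this document's samples.
    """
    file_path = Path(file_path)

    # Hash in chunks so large natives are not read into memory at once.
    digest = hashlib.new(algorithm)
    with file_path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)

    # Directory of the file relative to the payload root, as path segments.
    segments = file_path.parent.relative_to(payload_root).parts
    windows_dir = "\\" + "\\".join(segments) + "\\" if segments else "\\"

    return {
        "FilePath": windows_dir,
        "FileName": file_path.name,
        "FileSize": str(file_path.stat().st_size),
        "Hash": digest.hexdigest(),
    }
```

Pass `algorithm="sha1"` to produce SHA-1 digests instead of MD5; whichever is used, keep it consistent across the package so the importer compares like with like.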
Data rules
- DocID is the join key: every Relationship ParentDocID/ChildDocID must reference an existing Document@DocID; orphaned references fail import.
- Family integrity: a parent email and its attachments are linked by Type="Attachment"; keep families together so review coding (and privilege) propagates correctly.
- Field datatypes must match the target workspace field types: a DateTime tag delivered as free text imports as text and breaks date sorting/searching.
- Hashes (MD5/SHA-1) on ExternalFile let the importer verify file integrity and support dedup.
- Bates fields (#BegBates/#EndBates/#BegAttach/#EndAttach) are conventional tag names; align them with the production Bates plan.
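The first and third rules lend themselves to an automated check. The sketch below is one possible approach, not a schema validator: `check_data_rules` is a hypothetical helper, and it assumes DateTime values use the ISO 8601 form shown in this document's examples (`2025-01-15T09:30:00`), which the target platform may or may not require.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def check_data_rules(root):
    """Return a list of human-readable problems found in a parsed package."""
    problems = []

    # Rule: every relationship endpoint must resolve to a Document@DocID.
    doc_ids = {doc.get("DocID") for doc in root.iter("Document")}
    for rel in root.iter("Relationship"):
        for attr in ("ParentDocID", "ChildDocID"):
            ref = rel.get(attr)
            if ref not in doc_ids:
                problems.append(f"orphaned {attr}: {ref}")

    # Rule: DateTime tags must hold a parseable timestamp, not free text.
    # Assumption: ISO 8601 values, as in the examples in this document.
    for tag in root.iter("Tag"):
        if tag.get("TagDataType") == "DateTime":
            value = tag.get("TagValue", "")
            try:
                datetime.fromisoformat(value)
            except ValueError:
                problems.append(f"{tag.get('TagName')}: not a DateTime: {value!r}")

    return problems
```

An empty list means both checks passed; it does not mean the package conforms to the schema or to the receiving workspace's field mapping.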
Worked example (one email + one attachment — skeleton)
<Root DataInterchangeType="Update">
  <Batch>
    <Documents>
      <Document DocType="Message" MimeType="application/vnd.ms-outlook" DocID="DOC000001">
        <Tags>
          <Tag TagName="#Custodian" TagDataType="Text" TagValue="Doe, Jane"/>
          <Tag TagName="#From" TagDataType="Text" TagValue="jane@acme.com"/>
          <Tag TagName="#DateSent" TagDataType="DateTime" TagValue="2025-01-15T09:30:00"/>
          <Tag TagName="#BegBates" TagDataType="Text" TagValue="ABC000001"/>
          <Tag TagName="#EndBates" TagDataType="Text" TagValue="ABC000001"/>
        </Tags>
        <Files>
          <File FileType="Native"><ExternalFile FilePath="\NATIVE\001\" FileName="DOC000001.msg" Hash="9e107d9d..."/></File>
          <File FileType="Text"><ExternalFile FilePath="\TEXT\001\" FileName="DOC000001.txt"/></File>
        </Files>
      </Document>
      <Document DocType="File" MimeType="application/pdf" DocID="DOC000002">
        <Tags><Tag TagName="#BegBates" TagDataType="Text" TagValue="ABC000002"/></Tags>
        <Files><File FileType="Native"><ExternalFile FilePath="\NATIVE\001\" FileName="DOC000002.pdf"/></File></Files>
      </Document>
    </Documents>
    <Relationships>
      <Relationship Type="Attachment" ParentDocID="DOC000001" ChildDocID="DOC000002"/>
    </Relationships>
  </Batch>
</Root>
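To show how the relationship in this example can be consumed, the sketch below groups documents into families and derives a Bates range per family, which is the usual source for #BegAttach/#EndAttach. It rests on several assumptions: `family_ranges` and `tag_value` are hypothetical helpers, `SAMPLE` is a trimmed copy of the example above, Bates numbers are fixed-width so string comparison orders them correctly, every document carries #BegBates, and a missing #EndBates is treated as a one-page document.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed copy of the worked example (tags and relationship only).
SAMPLE = """<Root DataInterchangeType="Update"><Batch>
<Documents>
  <Document DocType="Message" DocID="DOC000001"><Tags>
    <Tag TagName="#BegBates" TagDataType="Text" TagValue="ABC000001"/>
    <Tag TagName="#EndBates" TagDataType="Text" TagValue="ABC000001"/>
  </Tags></Document>
  <Document DocType="File" DocID="DOC000002"><Tags>
    <Tag TagName="#BegBates" TagDataType="Text" TagValue="ABC000002"/>
  </Tags></Document>
</Documents>
<Relationships>
  <Relationship Type="Attachment" ParentDocID="DOC000001" ChildDocID="DOC000002"/>
</Relationships>
</Batch></Root>"""


def tag_value(doc, name):
    """Return the TagValue of the first Tag with the given TagName, or None."""
    for tag in doc.iter("Tag"):
        if tag.get("TagName") == name:
            return tag.get("TagValue")
    return None


def family_ranges(root):
    """Map each parent DocID to (first Bates, last Bates) across its family."""
    docs = {doc.get("DocID"): doc for doc in root.iter("Document")}
    children = defaultdict(list)
    for rel in root.iter("Relationship"):
        if rel.get("Type") == "Attachment":
            children[rel.get("ParentDocID")].append(rel.get("ChildDocID"))

    ranges = {}
    for parent_id, child_ids in children.items():
        members = [docs[parent_id]] + [docs[child_id] for child_id in child_ids]
        begins = [tag_value(m, "#BegBates") for m in members]
        # Fall back to #BegBates when #EndBates is absent (assumed single page).
        ends = [tag_value(m, "#EndBates") or tag_value(m, "#BegBates") for m in members]
        ranges[parent_id] = (min(begins), max(ends))
    return ranges


ranges = family_ranges(ET.fromstring(SAMPLE))
```

For the sample, the family under DOC000001 spans ABC000001 through ABC000002. The function raises `KeyError` on an orphaned reference, so it assumes relationship references have already been confirmed to resolve.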
Validation checklist
- DataInterchangeType (Update/Append) set to the intended load semantics
- Every Document has a unique DocID; all Relationship references resolve to existing DocIDs
- Email/attachment families linked via Type="Attachment"; families kept intact
- Each document has the expected File types (Native/Text/Image) and the paths/files exist in the payload
- Tag datatypes match the target workspace field types (esp. DateTime, Number)
- Bates tag names align with the production Bates plan; #BegBates/#EndBates populated
- File Hash values present for integrity/dedup
- Test-loaded into the target platform (or against its import map) before bulk delivery
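Several of these items can be checked mechanically before a test load. The following pre-flight sketch covers DataInterchangeType, DocID uniqueness, file presence in the payload, and Hash presence. `preflight` is a hypothetical helper, and resolving `FilePath` by splitting on backslashes relative to a payload root is an assumption based on the sample paths in this document.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def preflight(root, payload_root):
    """Return a list of findings for a parsed package and its file payload."""
    findings = []

    if root.get("DataInterchangeType") not in ("Update", "Append"):
        findings.append("DataInterchangeType missing or unexpected")

    # DocID must be unique across the package.
    counts = Counter(doc.get("DocID") for doc in root.iter("Document"))
    for doc_id, count in counts.items():
        if count > 1:
            findings.append(f"duplicate DocID: {doc_id}")

    for ext in root.iter("ExternalFile"):
        # Assumption: FilePath is backslash-delimited and payload-relative.
        segments = [s for s in ext.get("FilePath", "").split("\\") if s]
        target = Path(payload_root, *segments, ext.get("FileName", ""))
        if not target.is_file():
            findings.append(f"missing file: {target}")
        if not ext.get("Hash"):
            findings.append(f"no Hash on {ext.get('FileName')}")

    return findings
```

A clean result from a script like this complements, but does not replace, the final checklist item: a test load into the target platform remains the only confirmation that the field mapping works.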
Last updated: 2026-05-31 — EDRM XML is an industry interchange format; confirm element/attribute usage against the published EDRM XML v1.2 schema and the receiving platform's (Relativity/Nuix/Everlaw/Reveal) import mapping before use.