Overview
Go module for parsing email files in both .msg (Windows Outlook OLE format) and .eml (macOS Outlook text/MIME) formats. Read more on our blog.
I've almost entirely rewritten this GitHub module as msgparse, initially to extract attachments from msg files and then to get all the email fields out reliably. The original repo hasn't been touched in seven years, so I figure a PR is pointless; it is also licensed under GPL3.
Using the Modules
MSG Files
See the example file in tests.
First read the file:
msg, err := msgparse.ReadMsgFile(inputFilePath, verbose)
if err != nil {
	// handle error
}
If you set verbose to true, it will print out unrecognised fields contained in the message:
All 58 Decoded but Unknown Fields:
ID Value
0x1042 <[email protected]>
0x8013 "BT=5;II=<invalid>;SBMID=84;SBT=51;TFR=NotForking;Version=Version 15.20 (Build 6678.0), Stage=H8;UP=1..."
0x802d 09 Aug 2023 04:31:38.9045 (UTC)
0x4035 439CC38A41E94454952571BF8C41B540-SCOTT
0x801e LO4GBR01FT019.eop-gbr01.prod.protection.outlook.com
We're still collecting standard mappings to improve parsing - they are all kept in this Go struct.
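To illustrate how known and unknown fields might be told apart, here is a minimal, self-contained sketch. It does not reproduce the module's actual mapping: `knownProperties`, `propertyName` and `describe` are hypothetical names, the table is a tiny assumed subset, and the name "Subject" is chosen for illustration. The three IDs themselves are standard MAPI property IDs.

```go
package main

import "fmt"

// knownProperties is a tiny illustrative subset of MAPI property IDs.
// Assumption: the real module keeps a much larger mapping in its own struct.
var knownProperties = map[uint16]string{
	0x0037: "Subject",         // PidTagSubject
	0x007D: "Message Headers", // PidTagTransportMessageHeaders
	0x1000: "Message body",    // PidTagBody
}

// propertyName returns the mapped name for id and whether the ID is known.
func propertyName(id uint16) (string, bool) {
	name, ok := knownProperties[id]
	return name, ok
}

// describe formats a property: known IDs are shown by name,
// unknown IDs are shown as a hex ID followed by the value.
func describe(id uint16, value string) string {
	if name, ok := propertyName(id); ok {
		return fmt.Sprintf("%s: %s", name, value)
	}
	return fmt.Sprintf("0x%04x %s", id, value)
}

func main() {
	fmt.Println(describe(0x0037, "Hello"))
	fmt.Println(describe(0x801e, "LO4GBR01FT019.eop-gbr01.prod.protection.outlook.com"))
}
```

Keeping the mapping in one table means that promoting an unknown field to a known one is a single-line change.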
You can get the headers and body out using:
// authResults is assumed to hold the name of the header to look up; it is defined elsewhere.
authHeader, err := msgparse.GetHeaderByName(msg.GetPropertyByName("Message Headers"), authResults)
if err != nil {
	// handle error
}
body := msg.GetPropertyByName("Message body")
Or you can request a specific header by name (as defined in the big map in message.go):
value := email.Header.Get("Return-Path")
Any attachments are available as a byte slice using msg.Attachments.
EML Files
See the example file in tests.
First read the file:
emlFile, err := emlparse.ReadFromFile(inputFilePath)
if err != nil {
	// handle error
}
The returned emlFile is a struct holding the message (which contains the headers), the body, and the attachments (a slice of msgparse.Attachment):
type Eml struct {
	Message     *mail.Message
	Body        string
	Attachments []msgparse.Attachment
}
With Make you can build email-dump, which processes either msg or eml files provided with -i:
./email-dump -i tests/Defender.msg -v
go build -ldflags="-s -w" -o email-dump dump/*.go
2024/03/04 16:16:55 Reading input file "tests/Microsoft.msg"
2024/03/04 16:16:55 Unknown entry type: "__nameid_version1.0"
2024/03/04 16:16:55 Found unknown field of type OLEGUID, ID: 0x7BA7
2024/03/04 16:16:55 Found unknown field of type OLEGUID, ID: 0x8005
2024/03/04 16:16:55 Unknown entry type: "__properties_version1.0"
2024/03/04 16:16:55 Unknown entry type: "__recip_version1.0_#00000000"
2024/03/04 16:16:55 Unknown entry type: "__properties_version1.0"
2024/03/04 16:16:55 Parsed 48 known properties and 90 unknown properties from email.
2024/03/04 16:16:55 Wrote 49 rows of data to sheet "Known Fields".
2024/03/04 16:16:55 Wrote 91 rows of data to sheet "Unknown Fields".
2024/03/04 16:16:55 Results output to 20240304_161655_msg_dump.xlsx
2024/03/04 16:16:55 Fin.