
gmail-mbox-stats is a simple tool to analyze Gmail MBOX file.


leodevbro/gmail-mbox-stats


gmail-mbox-stats


gmail-mbox-stats is a very simple tool to analyze your Gmail mailbox.

  • Find the senders who sent you the most mail.
  • Find the receivers you sent the most mail to.

  • Find the senders with the largest total attachment size (MB).
  • Find the receivers you sent the largest total attachment size (MB) to.

  • Find the domains (@gmail.com, @live.com ...) that appear most often in sender addresses.
  • Find the receivers that other senders (not you) sent the most mail to.

  • Find the addresses you most often place in CC.
  • Find the addresses others most often place in CC.

  • Find the addresses you most often place in BCC.
  • Find the addresses others most often place in BCC.

  • And more.
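To illustrate the kind of counting behind the first bullet, here is a minimal Python sketch that uses only the standard library. It is not how gmail-mbox-stats itself is implemented; the function name `count_senders` is a hypothetical helper chosen for this example.

```python
import mailbox
from collections import Counter
from email.utils import parseaddr


def count_senders(mbox_path: str) -> Counter:
    """Count messages per sender address in an MBOX file (illustrative only)."""
    counts: Counter = Counter()
    # create=False: fail instead of silently creating an empty mailbox.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            # parseaddr splits 'Name <addr@host>' into (name, address).
            _name, address = parseaddr(str(message.get("From", "")))
            # Messages without a usable From header are grouped as unknown.
            counts[address.lower() or "(unknown)"] += 1
    finally:
        box.close()
    return counts
```

Calling `count_senders(path).most_common(10)` on an extracted MBOX file would list the ten most frequent sender addresses. On a mailbox of several gigabytes this simple approach may be slow.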


Quick intro of the CSV result files



Video instruction:



Textual instruction:

  • Download your Gmail data from Google Takeout. Preferably select 'Include all messages in Mail' so that the export covers all mail, not just Inbox or Sent/Spam/Archive/Trash. If your mailbox has 100K mails, the download can be 10 GB or more, so be ready to handle a large file. A very large export may arrive as multiple archive files (for example, split ZIP files) instead of a single archive.

  • Extract the MBOX file from the downloaded archive file(s).

  • Make sure Node.js is installed. It is available for Windows, Mac, and Linux.

  • Open a terminal (preferably in the folder where the MBOX file is located) on Windows, Mac, or Linux. On Windows, use PowerShell, not CMD.

  • Run a command with this syntax:
    npx gmail-mbox-stats mymail="<your email address>" mboxpath="<mbox file path>"

    For example:
    npx gmail-mbox-stats mymail="[email protected]" mboxpath="./All mail Including Spam and Trash.mbox"
    The ./ prefix means the file All mail Including Spam and Trash.mbox is looked up in the terminal's current folder.
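If you would rather launch the tool from a script than type the command, the documented syntax can be assembled as an argument list. The sketch below assumes the quotes in the syntax above are ordinary shell quoting, so they are not needed when arguments are passed as a list. The helper names `build_command` and `run_stats` are hypothetical, and the `npx.cmd` launcher name on Windows is an assumption about a typical Node.js install.

```python
import subprocess
import sys


def build_command(my_mail: str, mbox_path: str) -> list[str]:
    """Build the argument list for the documented npx invocation."""
    # Assumption: on Windows the npx launcher is a .cmd shim.
    npx = "npx.cmd" if sys.platform == "win32" else "npx"
    return [npx, "gmail-mbox-stats", f"mymail={my_mail}", f"mboxpath={mbox_path}"]


def run_stats(my_mail: str, mbox_path: str) -> int:
    """Run the tool and return its exit code (requires Node.js on PATH)."""
    return subprocess.run(build_command(my_mail, mbox_path)).returncode
```

Because each argument is a separate list element, a path containing spaces, such as `./All mail Including Spam and Trash.mbox`, needs no extra escaping.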









That's it.
Now just see the results:








Analysis will probably take 5 to 15 seconds for 1,000 mails (messages),
about 100 seconds for 10K mails,
about 1,000 seconds (roughly 10 to 20 minutes) for 100K mails, and so on.
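The figures above scale roughly linearly, so a rough estimate for any mailbox size can be sketched as follows. The default of 10 seconds per 1,000 messages is taken from the "100 seconds for 10K mails" figure; actual speed depends on your machine and mailbox, and the function names are hypothetical.

```python
def estimate_seconds(message_count: int, seconds_per_1000: float = 10.0) -> float:
    """Linear estimate of analysis time based on the rough figures above."""
    return message_count / 1000 * seconds_per_1000


def observed_rate(total_seconds: float, message_count: int) -> float:
    """Seconds per 1,000 messages for a finished run."""
    return total_seconds / message_count * 1000


# The sample log in this README reports 18686 messages in 2:00.411.
rate = observed_rate(2 * 60 + 0.411, 18686)
print(f"{rate:.2f} seconds per 1,000 messages")
```

For that sample run the rate works out to a little under 6.5 seconds per 1,000 messages, which falls inside the 5 to 15 second range quoted above.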

  • When it finishes, the terminal will log basic information like this:
Success.
Full count of messages: 18686

Messages where sender is --> me: 495
                                 Count of mails with at least one attachment: 148
                                 Total count of attachments: 231
                                 Total size of attachments: 120.800749 MB => Million Bytes
                                 Unique sender addresses: 1
                                 Unique sender domains: 1
                                 Unique receiver addresses: 285

Messages where sender is not me: 18191
                                 Count of mails with at least one attachment: 850
                                 Total count of attachments: 2409
                                 Total size of attachments: 231.40515899999997 MB => Million Bytes
                                 Unique sender addresses: 1473
                                 Unique sender domains: 804
                                 Unique receiver addresses: 94


Created new folder "mailStats_2024-11-15_21-28-13"


Start datetime: 2024-11-15_21-28-13
->End datetime: 2024-11-15_21-30-13

Full Execution Time: 2:00.411 (m:ss.mmm)


gmail-mbox-stats v1.2.3
Created by leodevbro (Levan Katsadze)
* [email protected]
* linkedin.com/in/leodevbro
* github.com/leodevbro
* facebook.com/leodevbro

If you feel like donating
* buymeacoffee.com/leodevbro
* ko-fi.com/leodevbro
  • A new folder is also created in the same folder as the MBOX file. Its name is "mailStats" followed by the execution start datetime,
    like this: mailStats_2024-11-15_21-28-13

  • The 'mailStats' folder contains a generalStats.csv file. If you import it into Google Sheets, it will look like this:


Example generalStats CSV In Google Sheets


  • The 'mailStats' folder also contains two folders:
    forMailsWhereSenderIsMe - stats for only the mails where the sender is you.
    forMailsWhereSenderIsNotMeOrIsUnknown - stats for only the mails where the sender is not you, or the sender is unknown.
    Both folders contain .csv stats files. You can import them into Google Sheets one by one.
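Since every run creates a new timestamped folder next to the MBOX file, several result folders can pile up after repeated runs. The following sketch finds the most recent one by parsing the folder names. It assumes the name format matches the example mailStats_2024-11-15_21-28-13 exactly; the function names are hypothetical.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

# Assumed from the example folder name: mailStats_YYYY-MM-DD_HH-MM-SS
FOLDER_FORMAT = "mailStats_%Y-%m-%d_%H-%M-%S"


def parse_start_time(folder_name: str) -> Optional[datetime]:
    """Return the start datetime encoded in a folder name, or None."""
    try:
        return datetime.strptime(folder_name, FOLDER_FORMAT)
    except ValueError:
        return None


def latest_stats_folder(mbox_path: str) -> Optional[Path]:
    """Find the newest mailStats_* folder next to the given MBOX file."""
    parent = Path(mbox_path).resolve().parent
    dated = []
    for entry in parent.iterdir():
        started = parse_start_time(entry.name) if entry.is_dir() else None
        if started is not None:
            dated.append((started, entry))
    # Compare by the parsed start time only.
    return max(dated, key=lambda pair: pair[0])[1] if dated else None
```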

Here is what the full folder structure looks like:

▨ All mail Including Spam and Trash.mbox

📂 mailStats_2024-11-15_21-28-13
    ▦ generalStats.csv

    📂 forMailsWhereSenderIsMe
        ▦ me_attachmSizeReceiver.csv
        ▦ me_attachmSizeSender.csv
        ▦ me_attachmSizeSenderDomain.csv
        ▦ me_freqBcc.csv
        ▦ me_freqCc.csv
        ▦ me_freqReceiver.csv --- Here you can find the receivers you sent the most mail to
        ▦ me_freqSender.csv
        ▦ me_freqSenderDomain.csv
        ▦ me_freqSenderPlusName.csv

    📂 forMailsWhereSenderIsNotMeOrIsUnknown
        ▦ notMeOrUnkn_attachmSizeReceiver.csv
        ▦ notMeOrUnkn_attachmSizeSender.csv
        ▦ notMeOrUnkn_attachmSizeSenderDomain.csv
        ▦ notMeOrUnkn_freqBcc.csv
        ▦ notMeOrUnkn_freqCc.csv
        ▦ notMeOrUnkn_freqReceiver.csv
        ▦ notMeOrUnkn_freqSender.csv --- Here you can find the senders who sent the most mail
        ▦ notMeOrUnkn_freqSenderDomain.csv
        ▦ notMeOrUnkn_freqSenderPlusName.csv
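To check which result files a run actually produced, the folder tree can be walked with a few lines of Python. This sketch discovers files by extension rather than hard-coding the names listed above; `list_csv_files` is a hypothetical helper.

```python
from pathlib import Path


def list_csv_files(stats_folder: str) -> list[str]:
    """Return all .csv files under a mailStats folder as sorted relative paths."""
    root = Path(stats_folder)
    # rglob descends into forMailsWhereSenderIsMe and the other subfolder too.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.csv"))
```

Passing the path of a folder such as mailStats_2024-11-15_21-28-13 should return generalStats.csv plus the files from both subfolders.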


Now, for example, let's import the file notMeOrUnkn_freqSender.csv into Google Sheets:

exampleCsv__notMeOrUnkn_freqSender In Google Sheets

Also, some other files:

me_freqReceiver.csv

exampleCsv__me_freqReceiver In Google Sheets


notMeOrUnkn_freqSenderDomain.csv

exampleCsv__notMeOrUnkn_freqSenderDomain In Google Sheets


notMeOrUnkn_freqSenderPlusName.csv

exampleCsv__notMeOrUnkn_freqSenderPlusName In Google Sheets


notMeOrUnkn_freqReceiver.csv

exampleCsv__notMeOrUnkn_freqReceiver In Google Sheets


notMeOrUnkn_freqCc.csv

exampleCsv__notMeOrUnkn_freqCc In Google Sheets
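If you prefer a quick look in the terminal before importing into Google Sheets, any of these files can be previewed with Python's csv module. The sketch makes no assumption about the column layout and simply returns the first rows as lists of strings. The UTF-8 encoding is an assumption, and `preview_csv` is a hypothetical helper.

```python
import csv
from itertools import islice


def preview_csv(csv_path: str, limit: int = 10) -> list[list[str]]:
    """Return the first `limit` rows of a CSV file as lists of strings."""
    # utf-8-sig reads plain UTF-8 and also tolerates a leading byte-order mark.
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        return list(islice(csv.reader(handle), limit))
```

For example, `preview_csv("notMeOrUnkn_freqSender.csv", 5)` would return the first five rows of that file, whatever columns it contains.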



Thank you.

My name is Levan Katsadze (ლევან კაცაძე), 1995-03-03, from Tbilisi, Georgia (Not USA).


If you feel like donating:

Buy Me A Coffee ko-fi



