Raw vs. Encoded Email Message Size — What’s the Difference?

January 21st, 2009

We are often asked why our email message size limits are based on the “encoded message size” and not the “raw message size”. To answer this question, let’s first look at the difference between the two concepts.

Raw Message Size

An email message is generally composed of:

  • The message content
  • Some file attachments
  • “Metadata”: who it’s going to, who it’s from, what the subject is, etc.

The raw message size is the sum of the file sizes of all of the attachments, plus the size of your message content.

Consider a message containing a paragraph of about 150 words and 2 attached files that are each 1 megabyte (MB) in size. The “raw size” would be 2 MB plus the size needed for the 150 words (which is probably around 0.001 MB).

Encoded Message Size

The “encoded message size” is the size of the resulting email message with all of the attachments added and encoded using MIME. This encoded size is the actual size of the message as it travels over the Internet. It is always larger than the raw size because of the MIME overhead and because binary attachments are generally encoded using base64 encoding. Base64-encoded files are usually about 137% of the size of the original files: base64 turns every 3 bytes into 4 characters (a 33% increase), and the line breaks inserted into the encoded text add a few percent more. In the example above, the encoded message would probably be in the neighborhood of 2.7 MB in size.
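To make the difference concrete, here is a minimal Python sketch that builds the example message with the standard library's email package and compares the two sizes. The addresses, file names, and placeholder body text are made up for illustration, and the exact ratio depends on how a given mail program wraps and terminates lines.

```python
import os
from email.message import EmailMessage
from email.policy import SMTP

MB = 1024 * 1024

def build_message(body: str, attachments: list[bytes]) -> EmailMessage:
    """Build a MIME message with a text body and binary attachments."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"      # placeholder address
    msg["To"] = "recipient@example.com"     # placeholder address
    msg["Subject"] = "Size comparison"
    msg.set_content(body)
    for i, data in enumerate(attachments, start=1):
        # Binary attachments are base64-encoded by default.
        msg.add_attachment(
            data,
            maintype="application",
            subtype="octet-stream",
            filename=f"file{i}.bin",
        )
    return msg

# Roughly 150 words of body text and two 1 MB attachments.
body = "\n".join(["lorem ipsum dolor sit amet"] * 30)
attachments = [os.urandom(MB), os.urandom(MB)]

# Raw size: message content plus the attachment files themselves.
raw_size = len(body.encode("utf-8")) + sum(len(a) for a in attachments)

# Encoded size: the full MIME message as it would be sent (CRLF line endings).
encoded_size = len(build_message(body, attachments).as_bytes(policy=SMTP))

ratio = encoded_size / raw_size
print(f"raw:     {raw_size / MB:.2f} MB")
print(f"encoded: {encoded_size / MB:.2f} MB")
print(f"ratio:   {ratio:.2f}")
```

With these settings, each 57 bytes of attachment data becomes a 76-character line plus a 2-byte line ending, which is where the figure of roughly 137% comes from.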

Additionally, the encoded message contains all of the metadata about the message — and this information grows as the message travels from the sender to the recipient.

Why? Because each email server adds information to the metadata (the header of the message) that indicates what servers were passed through and documents anything that the server did to the message (such as scanning it for viruses or spam).
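As a rough illustration of that growth, the following Python sketch adds one "Received" header per hop to a small message and records the size after each step. The host names and header values are hypothetical, and real servers add more detail than this (and put each new header at the top rather than the bottom, which does not change the size).

```python
from email.message import EmailMessage
from email.policy import SMTP

msg = EmailMessage(policy=SMTP)
msg["From"] = "sender@example.com"       # placeholder address
msg["To"] = "recipient@example.com"      # placeholder address
msg["Subject"] = "Hello"
msg.set_content("Hi there.\n")

# Size of the message as the sender's mail program produced it.
sizes = [len(msg.as_bytes())]

# Hypothetical servers the message passes through on its way.
hops = ["mx1.example.net", "scan.example.net", "mail.example.org"]

previous = "client.example.com"
for hop in hops:
    # Each server records where the message came from and when.
    msg["Received"] = (
        f"from {previous} by {hop}; Wed, 21 Jan 2009 10:00:00 -0500"
    )
    sizes.append(len(msg.as_bytes()))
    previous = hop

for step, size in enumerate(sizes):
    print(f"after {step} hop(s): {size} bytes")
```

The growth is only a few hundred bytes per hop, so it matters for tiny messages far more than for messages with large attachments.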

So, what encoded size can you expect your messages to be?

Overall, you can expect a large encoded message to be at least 33% larger than the original files, and typically closer to 37% larger. For small messages, the ratio of encoded size to raw size can vary widely, because the headers make up a much bigger share of the total, but such messages fall far short of any realistic size limit.

Then why do servers limit messages based on encoded size?

The encoded size is the size that email servers see when transporting email messages across the Internet, so this is the natural size for a server to place limits on. To place limits on the “raw size” instead, email servers would first have to accept and decode every large message they come across (or at least the ones within a certain size range) to determine whether the raw size is acceptable. While a limit on raw size may be conceptually easier for the end user, it would require so much extra processing by email servers that the small gain in usability is not worth it. In fact, it’s easier to just make the encoded limit larger so that larger raw messages can make it through!
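The difference in effort can be sketched in Python. This is only an illustration, not how any particular mail server is implemented, and the function names are invented for the example: checking the encoded size means counting the bytes received, while checking the raw size means parsing the whole message and decoding every part.

```python
from email import message_from_bytes
from email.message import EmailMessage
from email.policy import default

def encoded_size(wire_bytes: bytes) -> int:
    """Encoded size: just the number of bytes that arrived."""
    return len(wire_bytes)

def raw_size(wire_bytes: bytes) -> int:
    """Raw size: parse the message and decode each part to measure it."""
    msg = message_from_bytes(wire_bytes, policy=default)
    total = 0
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold other parts, not content
        # decode=True reverses base64 or quoted-printable encoding.
        total += len(part.get_payload(decode=True) or b"")
    return total

# A small demonstration message with one binary attachment.
demo = EmailMessage()
demo["Subject"] = "demo"
demo.set_content("See attached.\n")
demo.add_attachment(
    bytes(300_000),                      # 300,000 zero bytes as stand-in data
    maintype="application",
    subtype="octet-stream",
    filename="zeros.bin",
)
wire = demo.as_bytes()

print(f"encoded size: {encoded_size(wire)} bytes")
print(f"raw size:     {raw_size(wire)} bytes")
```

The first function never looks inside the message, which is why a limit on encoded size is so cheap to enforce.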

LuxSci has recently increased the encoded message limit on its servers from 50MB to 100MB because people are sending larger and larger messages these days. LuxSci’s WebMail system limits the size of sent messages to 70MB of raw size. It uses the raw size because WebMail already knows the raw size when you are sending a message, so this limit is conceptually easier for the user. The 70MB limit is well below 100MB so that once these messages are encoded (70MB at about 137% is roughly 96MB), they will still have an encoded size of less than 100MB, the encoded-message limit across LuxSci’s systems.
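The arithmetic behind that headroom can be checked with a short Python sketch. It assumes the common MIME layout of 76-character base64 lines with 2-byte line endings and ignores header overhead, so treat the results as estimates; the function names are invented for this example.

```python
# 57 raw bytes become 76 base64 characters plus a 2-byte line ending.
BASE64_MIME_FACTOR = 78 / 57  # about 1.37, under the assumption above

def estimated_encoded_mb(raw_mb: float) -> float:
    """Estimate the encoded size of attachments of a given raw size."""
    return raw_mb * BASE64_MIME_FACTOR

def max_raw_mb(encoded_limit_mb: float) -> float:
    """Estimate the largest raw size that fits under an encoded limit."""
    return encoded_limit_mb / BASE64_MIME_FACTOR

webmail_raw_limit_mb = 70
encoded_limit_mb = 100

estimate = estimated_encoded_mb(webmail_raw_limit_mb)
headroom = max_raw_mb(encoded_limit_mb)

print(f"{webmail_raw_limit_mb} MB raw is roughly {estimate:.1f} MB encoded")
print(f"A {encoded_limit_mb} MB encoded limit fits about {headroom:.1f} MB raw")
```

Under these assumptions, 70MB of raw attachments encodes to a little under 96MB, and a 100MB encoded limit leaves room for about 73MB of raw data.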