SPF and DKIM: The State of Domain-based Email Authentication – Part 1
Recent reports on cyber-security threats in the healthcare sector by Verizon, Symantec and Ponemon consistently make several observations:
- Email-borne malware is on the rise, with such malware delivered via spam or phishing;
- Small-to-medium sized businesses (from all sectors) have the highest rate of email-delivered malware;
- Most breaches are caused by negligent employees or contractors.
These conclusions are hardly surprising: email is an increasingly common channel of communication, and protected health information (PHI) is frequently exchanged by email amongst employees and patients within a practice, between medical providers, and between medical providers and their business associates. The concern for the healthcare industry is the potential violation of the HIPAA privacy rule caused by email-related (and other) breaches, leading to disruptions from loss of data, compliance audits and possibly hefty fines.
We wrote about obvious measures medical providers can take to avoid HIPAA non-compliance in email exchanges such as opt-out email security. That addresses only one aspect of the threat landscape, though – the protection of PHI in email exchanges. Another aspect is more sinister, as it deals with external, malicious actors. These actors use various spoofing techniques to trick patients or employees of a medical practice into reacting incautiously, often impulsively, to emails supposedly coming from valid sources. Such attacks often lead to identity theft, where the damage is more far-reaching because the information given up is longer-lived and more widely used, and cannot simply be cancelled the way a misused credit card can.
Examples of Email Spoofing
- As is often the case after the widespread news of a breach at a major healthcare insurer revealing patient data, a consumer receives an email supposedly from a well-known credit monitoring service offering to monitor his accounts and report suspicious financial transactions. The email contains a link to sign up. The consumer reaches what seems like a perfectly valid site (visually what he expects, no security problems noted by the browser), but it is run by a rogue operator who obtains a lot of personal details (DOB, SSN, etc.).
- An employee at a medical practice receives an email from an insurance company with an attachment supposedly containing information on a patient’s claim. She notices nothing wrong with the email and opens the attachment. Malware is installed. Or worse, the malware is ransomware that encrypts the employee’s computer hard drive[1].
In both cases, the received email appears to be from a legitimate source. After all, the visible From: address seems to match the company’s name and there are no other obvious visual or textual discrepancies that a harried office worker can be expected to notice. Spoofing the sender’s address and creating an authentic-looking message is an example of phishing – that is, tricking a user into revealing sensitive or confidential information.
This is the complementary side of the issue to which we devoted several posts in the past months. In those posts, we concentrated on authentication of servers/receivers[2] via TLS. We described the threats to server authentication, the steps to ensure that your computer is connecting with the correct end point/service, and how to ensure that the data path to the authenticated server is secured against surveillance or tampering. The next series of posts, including this one, will be about the status of techniques for authenticating senders, specifically those of emails.
Several years ago, we wrote extensively about the email sender authentication problem and its solutions. We described how spammers/hackers spoof the sender’s address, ways to protect yourself from forged/fake emails, and three technologies – Sender Policy Framework (SPF), DomainKeys Identified Mail (DKIM), and Domain-based Message Authentication, Reporting and Conformance (DMARC) – which address this problem.
In this new series of posts, we’ll pick up where we left off. After a description of each of these solutions, we’ll provide a status report on the state of their deployment. Also, we’ll look at any new technologies that are being added to the arsenal to fight email-based spam and phishing.
Sender Policy Framework (SPF)
Sender Policy Framework (SPF), IETF RFC 7208, is the means by which a sending user’s domain asserts the allowed IP addresses of servers (formally, Mail Transfer Agents (MTA)) that are legitimate senders of emails for that domain.
Recall that the user-visible From: address in an email (specifically, the From: header defined in IETF RFC 5322) is entirely up to the user, and hence can be changed to say anything. Users should not be expected to validate this. However, the other “from” parameter, the MAIL FROM: field defined in IETF RFC 5321 and carried in the message envelope, is the responsibility of the sending mail server. SPF ties that server’s unforgeable IP address to the IP addresses of the mail servers allowed to send for the sender’s domain. The domain owner publishes the valid IP addresses of mail senders for its domain in a TXT resource record (RR) in the DNS. A receiving mail server checks this TXT RR for the MAIL FROM: domain to determine whether the sending MTA’s IP address matches any entry there. The domain owner indicates in the DNS TXT RR how a receiver may interpret a validation failure.
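As a rough illustration of the check described above, the sketch below tests a connecting IP address against the ip4/ip6 entries of an SPF record. The record string, the addresses (taken from documentation ranges) and the function name are assumptions made for illustration. A real validator also has to fetch the TXT RR from the DNS and handle the other mechanisms (a, mx, include and so on) defined in RFC 7208.

```python
import ipaddress


def ip_is_listed(spf_record: str, sender_ip: str) -> bool:
    """Return True if sender_ip falls inside any ip4:/ip6: entry of the record.

    Simplified sketch: only ip4 and ip6 mechanisms are examined; a, mx,
    include, redirect and the other RFC 7208 terms are ignored.
    """
    ip = ipaddress.ip_address(sender_ip)
    for term in spf_record.split()[1:]:  # skip the leading "v=spf1"
        term = term.lstrip("+-~?")  # drop an optional qualifier
        if term.startswith(("ip4:", "ip6:")):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip.version == network.version and ip in network:
                return True
    return False


# Hypothetical record that a domain owner might publish as a TXT RR.
EXAMPLE_RECORD = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.25 ip6:2001:db8::/32 ~all"

print(ip_is_listed(EXAMPLE_RECORD, "192.0.2.17"))   # address inside the first range
print(ip_is_listed(EXAMPLE_RECORD, "203.0.113.5"))  # address not listed anywhere
```

When no entry matches, the outcome is decided by the trailing "all" term of the record.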
Despite SPF being implemented in almost all mail servers, it remains a fragile authentication check. That’s because sending domains have, for the most part, chosen not to require the rejection of messages based on failing the validation check. The domain owner has several choices for setting the SPF RR to indicate the treatment of authentication failure, as shown below using a somewhat simpler language than that in the specification:
Setting | Result | Meaning |
-all | Fail | The email is a fake. |
~all | Soft Fail | Allow the email, but do something more. |
?all | Neutral | Nothing can be said about the email. |
+all | Pass | Allow all mail. |
Most sending domains set their preference to “~all” or “?all”, which causes receivers not to reject messages automatically on an SPF authentication failure. The soft fail “~all” means that most mail is expected to come from the servers listed in the DNS RR, but there may be instances when other servers send legitimate mail. In other words, the sender is not willing to provide a definitive policy statement and the receiver is asked to treat this mail with caution – mark it for further scrutiny[3] before delivery. Of course, receivers can have their own, stricter policies but that might lead to legitimate mail not being delivered. Mail providers, on the whole, prefer to err on the side of delivery.
Many domain owners feel that an SPF soft fail is a safer choice, as there are several reasons why SPF authentication might fail even for legitimate mail. These include:
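To make the four settings concrete, here is a minimal sketch that reads the "all" term of a record and reports which result applies to a sender that matched nothing else. The result names follow the table above. The handling notes are only an assumption about one possible receiver policy, since each receiver decides for itself, and the function and dictionary names are hypothetical.

```python
# Result names follow the table in the text.
QUALIFIER_RESULTS = {
    "-": "Fail",
    "~": "Soft Fail",
    "?": "Neutral",
    "+": "Pass",
}

# Illustrative assumption: one way a receiver might act on each result.
HYPOTHETICAL_HANDLING = {
    "Fail": "reject",
    "Soft Fail": "deliver, but flag for spam filtering",
    "Neutral": "deliver, SPF gives no signal",
    "Pass": "deliver",
}


def default_result(spf_record: str) -> str:
    """Return the result for a sender not matched by any earlier term."""
    for term in spf_record.split()[1:]:  # skip the leading "v=spf1"
        if term.lstrip("+-~?") == "all":
            # A bare "all" with no qualifier is treated as "+all".
            qualifier = term[0] if term[0] in QUALIFIER_RESULTS else "+"
            return QUALIFIER_RESULTS[qualifier]
    return "Neutral"  # RFC 7208: no match and no "all" term gives neutral


result = default_result("v=spf1 ip4:192.0.2.0/24 ~all")
print(result, "->", HYPOTHETICAL_HANDLING[result])
```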
- The inability of an organization to identify all its mail servers. (This is particularly true for large organizations with complex organizational structures and far-flung sites, as well as the use of third parties to send marketing or bulk mail on their behalf.)
- Individuals who use their organization’s email address but send mail from another ISP. (Consider a doctor who responds to her practice’s emails from home, thus using her broadband provider’s MTA but using her practice’s domain name in the email address.)
- Emails which are forwarded, either deliberately by the recipient or because the recipient has turned on mail forwarding, or which are addressed to a mailing list. (The above-mentioned doctor might, for instance, forward all practice-related emails to her home email for reading/responding when at home.) When a message is forwarded, it arrives from the IP address of the original recipient’s MTA, not from one of the original sender’s servers.
Google, as one of the world’s largest email providers, has provided some statistics on SPF adoption. Based on incoming mail to Gmail users in 2016, they find that “95.3% of incoming emails we receive come from SMTP servers that are authenticated using the SPF standard (up from 89.1% in 2013). Over 7.8 million domains (weekly active) have adopted the SPF standard (up from 3.5 million domains in 2013)”. Unfortunately, the view on a world-wide basis does not appear to be quite so rosy. A site, aptly named spf-all.com and devoted to encouraging the “-all” option for SPF records, reports a much lower uptake of SPF records. However, it is not clear when the data was published and the statistics may well have improved.
In our next post, we will describe how another standard, Domain-based Message Authentication, Reporting and Conformance (DMARC), allows the sender to provide a more definitive mail handling policy directive in case of SPF validation failure. As we shall show, DMARC provides for reporting SPF failures back to the sending domain. This feedback allows the sending domain to modify the SPF record and adjust its validation policy, thereby reducing the chance of improper rejections.
DomainKeys Identified Mail (DKIM)
The DomainKeys Identified Mail (DKIM) protocol, IETF RFC 6376, tackles a different issue – the forging of message headers and body by rogue intermediaries. In other words, unlike SPF, it does not authenticate a sender but rather asserts that all significant parts of a message that was issued from that domain remain unchanged. That is, by using DKIM the sending domain takes responsibility for the integrity of key parts of the message – the parts that can be forged.
It does so by having the sending MTA create a digital signature (typically RSA applied to a SHA-256 hash) over selected headers and the body of an email and place the signature in a new email header, DKIM-Signature. The receiving MTA validates the signature using the public key published in a DNS TXT RR under the signing domain, which is named in the header along with a key selector. (Of course, the sender’s domain placed the public key needed to verify the DKIM signature in its DNS TXT RR prior to its use.)
DKIM allows a domain to vouch for a message it did not directly send. This is where the usefulness of DKIM reveals itself. Many domains have other domains deliver emails for them – for example, a large medical insurance corporation might use an approved marketing firm to send health newsletters to consumers. The insurance company’s DNS record will contain the marketer’s public key so that the marketer can both create the newsletter and sign it using its private key. The receiver of the email notes the marketer’s public key in the insurance company’s DNS RR, which implies a business relationship. Thus, the validation of a DKIM signature in a message with a From: address from the insurance company’s domain, even when sent by a third party, provides the reassurance that the message was not forged and was probably sent with the insurance company’s blessing[4].
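As an illustration of the lookup step, the sketch below splits a DKIM-Signature header value into its tag=value pairs and builds the DNS name where the receiver would look for the public key. RFC 6376 places the key at selector._domainkey.domain, using the s= and d= tags. The header value, domain and selector are made up, the bh= and b= values are placeholders rather than real hashes or signatures, and the function names are hypothetical. Actually verifying the signature would need a cryptographic library and is not shown.

```python
def parse_dkim_tags(header_value: str) -> dict[str, str]:
    """Split a DKIM-Signature header value into its tag=value pairs."""
    tags = {}
    for part in header_value.split(";"):
        if "=" in part:
            name, value = part.split("=", 1)
            # Whitespace from header folding is dropped from the value.
            tags[name.strip()] = "".join(value.split())
    return tags


def key_query_name(tags: dict[str, str]) -> str:
    """Build the DNS name whose TXT RR holds the signer's public key."""
    return f"{tags['s']}._domainkey.{tags['d']}"


# Hypothetical header value; bh= and b= are placeholders, not real data.
EXAMPLE_HEADER = (
    "v=1; a=rsa-sha256; c=relaxed/simple; d=example.com; s=news2017; "
    "h=from:to:subject:date; bh=PLACEHOLDER; b=PLACEHOLDER"
)

tags = parse_dkim_tags(EXAMPLE_HEADER)
print(tags["a"])             # signing algorithm named in the header
print(key_query_name(tags))  # DNS name to query for the TXT RR
```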
Unfortunately, as with everything related to email, there are several “gotchas”. DKIM breaks when messages are sent via mailing lists. Most lists add to the message body (typically some information about the list and a way to unsubscribe) before forwarding to list members, which breaks the original DKIM signature. We’ll describe a new technique in our next post, Authenticated Received Chain (ARC), which is being developed to offer a solution for this.
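The mailing list problem can be seen with a small sketch of the body hash that a DKIM signature carries. It assumes the "simple" body canonicalization of RFC 6376 and CRLF line endings. The message text, the footer and the function name are invented for illustration.

```python
import base64
import hashlib


def simple_body_hash(body: bytes) -> str:
    """Base64 SHA-256 hash of a body after "simple" canonicalization.

    Simple canonicalization only normalizes the end of the body: trailing
    empty lines are dropped and a single CRLF terminates it. This sketch
    assumes the body already uses CRLF line endings.
    """
    canonical = body.rstrip(b"\r\n") + b"\r\n"
    return base64.b64encode(hashlib.sha256(canonical).digest()).decode("ascii")


original = b"Your claim summary is attached.\r\n"
# A list server appends its own footer before redistributing the message.
with_footer = original + b"--\r\nTo unsubscribe, visit the list page.\r\n"

# Extra blank lines at the end do not change the hash...
print(simple_body_hash(original) == simple_body_hash(original + b"\r\n\r\n"))  # True
# ...but an appended footer does, so the signed body hash no longer matches.
print(simple_body_hash(original) == simple_body_hash(with_footer))  # False
```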
Unlike SPF, DKIM by itself does not provide for any policy directive from the sender in the DNS RR for what to do if the signature calculated by the receiver does not match that provided in the DKIM header. A validated signature offers the reassurance that the signed headers (and, optionally, the body) were not modified in transit, and that the domain that signed the headers/body has the domain owner’s permission to send the message. However, a lack of validation proves nothing beyond an uneasy feeling! As with SPF, most providers err on the side of a soft fail.
Another issue is that it cannot be determined, simply by looking at a message without a DKIM header, whether the sender intended to send only DKIM-signed messages that should be checked before delivery. A malicious intermediary can easily remove the DKIM header, and the receiver would be none the wiser! This obvious flaw is mitigated by making use of DMARC, which, as we shall show in our next post, allows a sending domain to indicate if it provides DKIM signatures on all its messages and how to react to a missing DKIM signature or an invalid one.
Again, Google’s 2016 statistics on DKIM usage in inbound Gmail show that “86.8% of the emails we received are signed according to the (DKIM) standard (up from 76.9% in 2013). Over two million domains (weekly active) have adopted this standard (up from 0.5 millions 2013).” OpenDKIM.org also provides a wealth of data on DKIM support. Another site with up-to-date data on DKIM usage shows domains in the healthcare vertical to be among the poorest adopters of DKIM.
Next time – Tying these all together!
In our next post, we’ll describe the technology and status of DMARC – Domain-based Message Authentication, Reporting and Conformance – which ties sender policies for handling SPF and DKIM validation results to a feedback loop that allows senders to know the effects of their policies and how these can be improved.
We’ll also touch on a new technology, Authenticated Received Chain (ARC), that deals with the mailing list problem which affects both SPF- and DKIM-based solutions. Unlike these two technologies, which are both IETF standards-track specifications with many years of deployment experience, ARC is still a work in progress but is gaining industry attention.
Stay tuned for more posts in the coming months on further techniques being progressed by the industry to systematically address many of the remaining weaknesses of the email system.
Notes
[1] The Department of Health and Human Services now requires ransomware incidents to be treated as reportable breaches of the HIPAA security rules. The Verizon threat report notes that ransomware accounts for 72% of malware incidents in the Healthcare industry.
[2] While the examples in those posts were about web server authentication, the threats and issues are exactly the same when using SSL/TLS to connect with an SMTP server to send mail or a POP/IMAP server to retrieve it. The use of SSL/TLS between the sender’s and receiver’s MTAs is different, though, because the two ends do not really validate each other’s certificates. But this last point is not really germane to the topic of this article and ways to address this will be covered in a future post.
[3] For instance, such a mail might be passed into a spam filter with an indication of the SPF validation failure. Or the message might be directly placed in a junk email folder rather than the Inbox, if the receiver wants a quick (but user friendly) solution.
[4] Note that DKIM validation cannot vouch that the actual content of the message was what the insurance company wanted sent!