What’s the latest with HTTPS and SSL/TLS Certificates?
We’ve written quite a lot in past FYI Blog posts about SSL/TLS certificates, the critical building block to secure communication on the Internet. We described what such certificates were, their use in securing the communications channel between a client (browser) and a server, different types of certificates and the pros and cons of using each.
Given the changes in the Internet landscape over the past five years, we feel it is time to revisit these topics. The technical details described in the earlier posts remain unchanged. What has changed, though, are the traffic patterns for HTTPS-based communications, additional vulnerabilities arising as a consequence and ways to mitigate these. This post will provide a general overview of certain changes in the Internet landscape over the past few years, while subsequent blog posts will describe some of the topics identified here in greater detail.
HTTPS Everywhere
There has been a growing movement to encrypt all communications on the Web, not just those that require user credentials for access to content and services. As early as 2014, Google moved access to its own services (Gmail, Search, etc.) to HTTPS only, and announced that it would use HTTPS as an input to its search ranking algorithm (as a minor factor initially, but broadly hinting at a greater role in future). Google’s stated motive was to “make the Internet safer more broadly”, which is plausible as most people access content via a Google search. Popular sites such as Facebook and Twitter also took up this cause, leading to the steady penetration of HTTPS-by-default access. ISPs and other intermediaries were initially cynical about this trend, believing that the move was self-serving rather than altruistic because it allows these large players to hide valuable analytics data on user behavior[1]. That cynicism was overtaken by the Snowden leak and other high-profile revelations of pervasive surveillance of end-to-end communications, which helped make the public, both in the US and abroad, generally supportive of the all-encrypted Web program.
Google and other major industry players further smoothed the path towards fully encrypted communications by standardizing the next generation of HTTP, called HTTP/2, which removes major performance inefficiencies identified with HTTP over the years. While HTTP/2 does not require TLS, all browser vendors have chosen to implement HTTP/2 with mandatory TLS usage, so web sites that wish to provide the performance advantages offered by the new protocol must use TLS by default. The share of HTTP/2-enabled sites shows an upward trajectory and currently stands at about 15.5%, including most of the popular web sites.
Thus, in early 2017, the Electronic Frontier Foundation (EFF) was able to report that half of the Web’s traffic is now encrypted. EFF cites data from major browser vendors (Firefox, Chrome) which show that ~50% of web pages loaded using such browsers are now protected by HTTPS. There is more than one reason for this startling growth over the preceding three years. Larger players have, of course, the resources needed to change their backends to make HTTPS the default, while smaller players have been helped by organizations such as the EFF and the Internet Security Research Group (ISRG), which provide tools and best practices to help webmasters make the necessary conversions. The ISRG, in particular, created Let’s Encrypt, an open and automated Certificate Authority (CA) which allows anyone to easily obtain a Domain Validated (DV) certificate, and for free! This last point removes one major excuse for small web sites: the cost argument can no longer be used to avoid getting on board with the program. In fact, the growth of Let’s Encrypt certificates has been phenomenal, making its volume comparable (depending on how one counts the issued/active certificates) to that of the industry giants which, however, also offer more robust forms of certificates (Extended Validation (EV), Organization Validation (OV)). However, the proliferation of such easily obtainable DV certificates adds a level of vulnerability that we shall discuss later in this post.
But what about the other 50% of the surveyed Internet, comprising servers which do not support HTTPS? Very few web sites are self-contained and provide all the necessary content to the end user. It is quite common for a single page request to require access to third-party servers to pull in images, scripts, and, most important, advertisements. Thus, even if the initial page requested is over HTTPS, the other connections to retrieve external content that populates the page might not be. A correctly configured browser will penalize such sites and mark them as non-secure, or refuse to download additional content from external unsecured servers. With cost no longer an excuse, the main reason for not migrating to HTTPS, beyond laziness, is the requirement to ensure that all of a site’s content is delivered via HTTPS. This requires a site’s content manager to work with partners hosting additional content to ensure that those partner sites also deploy HTTPS. This manual process, often time-consuming, is one reason for the slow uptake by the other half. These are often small to medium-sized sites with less traffic and fewer resources to spare for such a migration in the near term. (As an aside, such sites are also the sort that primarily benefit from a simple and free DV certificate.)
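To make the mixed-content problem concrete, here is a minimal Python sketch that lists the resources a page would fetch over plain HTTP. It is an illustration only: the function and class names are ours, the set of tags checked is deliberately small, and real browsers apply far more detailed rules.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class SubresourceCollector(HTMLParser):
    """Collects absolute URLs of images, scripts, frames and stylesheets."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script", "iframe"):
            url = attrs.get("src")
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower():
            url = attrs.get("href")
        else:
            url = None
        if url:
            # Resolve relative and scheme-relative URLs against the page URL.
            self.resources.append(urljoin(self.page_url, url))


def find_insecure_resources(page_url, html):
    """Return the subresource URLs that would be fetched over plain HTTP."""
    collector = SubresourceCollector(page_url)
    collector.feed(html)
    return [u for u in collector.resources if urlparse(u).scheme == "http"]


if __name__ == "__main__":
    sample = (
        '<img src="http://cdn.example.net/banner.png">'
        '<script src="/static/app.js"></script>'
    )
    for url in find_insecure_resources("https://www.example.com/", sample):
        print("insecure:", url)
```

A site owner could run something like this over each page template to see which partners still need to move to HTTPS.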
Returning now to the proliferation of free DV certificates, it is important to remember that this type of certificate requires only minimal checks by the issuer (the CA) on the organization requesting the certificate, with no warranties provided against misrepresentation to those who rely on such certificates. (Indeed, all that is required to obtain a Let’s Encrypt DV certificate is proof that the requesting organization “owns” the domain, which can be shown either by provisioning a DNS record for the domain in question or by showing the ability, at the time of requesting the certificate, to add a given HTTP resource under that domain name.)
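As a rough sketch of what those two proofs look like in practice, the Python below computes where a CA would look for each one. The locations and encoding follow our reading of the ACME protocol (RFC 8555) that Let’s Encrypt uses, which is not described in this post; the function names are ours, and the token and key authorization are placeholder values that a real ACME client would receive from, or derive for, the CA.

```python
import base64
import hashlib


def http01_url(domain, token):
    """URL the CA fetches to check the HTTP-based proof (plain HTTP, port 80)."""
    return f"http://{domain}/.well-known/acme-challenge/{token}"


def dns01_record(domain, key_authorization):
    """Name and value of the TXT record the CA looks up for the DNS-based proof."""
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # The value is the unpadded base64url encoding of the SHA-256 digest.
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"_acme-challenge.{domain}", value


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(http01_url("example.com", "placeholder-token"))
    print(dns01_record("example.com", "placeholder-token.placeholder-thumbprint"))
```

Note that neither check involves the identity of the requester: control of the web server or of the DNS zone is all that is demonstrated.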
It is worth recalling that a properly signed DV certificate presented by a server during the initial SSL/TLS handshake asserts only that the given public key belongs to the specific domain, and that this binding is valid for a given time period. It says nothing about the organization which hosts or owns the domain name. While the cryptographic techniques used by HTTPS to set up a secure connection with a site can detect forged or fake certificates, they cannot detect perfectly valid certificates issued to fake organizations or those issued by a compromised CA. An example of this is the alarming number of Let’s Encrypt DV certificates issued to domains containing the word “paypal”. (PayPal appears to be most phishers’ go-to choice, but other popular domains are also a target.) Such certificates pass all the necessary cryptographic checks, and a browser will display the reassuring green lock icon[2] when displaying a rogue page which has been formatted to resemble the PayPal portal. Very few users would look under the hood for the site’s certificate and run some checks. To counter the charge of laziness for not doing more to prevent such malicious domains from obtaining certificates, Let’s Encrypt has offered a rationale for its copious provisioning of DV certificates, arguing for the importance of meeting its stated objective of ensuring greater penetration of HTTPS and noting the complexities of monitoring site content and ownership, which, it believes, a CA is ill equipped to do.
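The limited scope of a DV certificate can be seen by looking at what it contains. The sketch below assumes a certificate already parsed into the dictionary shape that Python’s `ssl.SSLSocket.getpeercert()` returns; the function name, the sample certificate and the domain in it are made up for illustration, and hostname matching is simplified to exact names (no wildcards).

```python
import ssl


def describe_certificate(cert, hostname, now):
    """Summarize what a parsed certificate does and does not assert.

    `cert` has the shape returned by ssl.SSLSocket.getpeercert();
    `now` is a time in seconds since the epoch.
    """
    names = [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
    not_before = ssl.cert_time_to_seconds(cert["notBefore"])
    not_after = ssl.cert_time_to_seconds(cert["notAfter"])
    # The subject is a tuple of relative distinguished names, each a tuple of pairs.
    subject = {key: value for rdn in cert.get("subject", ()) for key, value in rdn}
    return {
        # Simplified: exact match only, wildcard names are not handled.
        "covers_hostname": hostname.lower() in (n.lower() for n in names),
        "within_validity": not_before <= now <= not_after,
        # Typically absent from a DV certificate.
        "organization": subject.get("organizationName"),
    }


if __name__ == "__main__":
    # Hypothetical DV certificate for a look-alike domain.
    sample = {
        "subject": ((("commonName", "paypal-login.example"),),),
        "subjectAltName": (("DNS", "paypal-login.example"),),
        "notBefore": "Jan  1 00:00:00 2017 GMT",
        "notAfter": "Apr  1 00:00:00 2017 GMT",
    }
    now = ssl.cert_time_to_seconds("Feb  1 00:00:00 2017 GMT")
    print(describe_certificate(sample, "paypal-login.example", now))
```

Every check passes for the look-alike domain, yet the certificate carries no organization at all, which is precisely the gap described above.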
To be fair, other, well-known CAs have also unintentionally provided certificates to malicious sites. We recently wrote a post about Google reducing its trust in certificates issued by Symantec and its subsidiaries. And these include EV certificates too, which require a much more rigorous level of verification of the requesting organization before such certificates are issued! As we pointed out in that post, the entire system of encrypted communications on the Web is dependent on the trust we place in the CAs. If the CA’s certificate issuing processes are compromised, the end user lives in a fool’s paradise, believing that green lock icons are always indicative of safety.
What, then, can be done to restore the trust in the SSL/TLS certificate ecosystem that is so essential in this ever-growing world of HTTPS Everywhere?
Quis custodiet ipsos custodes?
Who will guard the guardians themselves?
That question is as pertinent to us in this new world of HTTPS Everywhere and compromised (or, as some might suggest, lazy) CAs as it was in Juvenal’s times.
A key concern with cryptographically valid certificates issued by a compromised CA is the length of time it takes for word of the compromise to spread and for the certificates to be revoked and placed on blacklists. Our post on the Google-Symantec altercation describes how this matter dragged on for 3 years before the current resolution. One way of improving the timeliness of discovery of mis-issued certificates is Certificate Transparency, an experimental proposal by Google. Google has implemented it, and it has since been placed on the path to becoming an Internet Standard via the open, consensus-based process of the Internet Engineering Task Force (IETF), the technical forum that creates standards (such as HTTP, TLS, etc.) for use on the Internet. Google has since created an eponymous organization and open source project to promote the implementation of Certificate Transparency.
The details of how Certificate Transparency works are complex, and we shall devote a future post to describing it in greater depth and to the changes it imposes on the SSL/TLS ecosystem: the CAs, browsers and other involved (some new) parties. Until then, a brief description of the overall architecture for Certificate Transparency (CT) follows.
In essence, Certificate Transparency works by having each CA publish its newly issued certificates to a publicly visible and auditable log. Any party, especially one whose domain name is apt to be spoofed, can study these logs and check whether any issued certificate uses the domain name (or parts thereof) without authorization. Such transparency is expected to lead to quicker mitigation of the harmful effects of mis-issued certificates, such as rapid updates of revocation lists. The normal resolution processes already in place are not changed, but the time lag to realize the error and take action is considerably shortened, ideally to hours instead of days, weeks or months. One can expect new business entities to arise which, as a service, monitor such logs on behalf of paying customers for unauthorized certificates mis-issued for a domain.
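A monitoring service of this kind could, at its simplest, scan the domain names appearing in newly logged certificates for a customer’s brand. The Python sketch below is our own illustration, not part of any Certificate Transparency tooling: the function name is hypothetical, the input is assumed to be a plain list of names already extracted from log entries, and a substring match is a crude stand-in for real look-alike detection.

```python
def find_suspicious_names(logged_names, brand, owned_domains):
    """Return logged names that mention `brand` but fall outside `owned_domains`."""
    suspicious = []
    for name in logged_names:
        lowered = name.lower().rstrip(".")
        if brand not in lowered:
            continue
        # A name is legitimate if it is an owned domain or a subdomain of one.
        legitimate = any(
            lowered == domain or lowered.endswith("." + domain)
            for domain in owned_domains
        )
        if not legitimate:
            suspicious.append(name)
    return suspicious


if __name__ == "__main__":
    # Hypothetical names extracted from recent log entries.
    names = [
        "www.paypal.com",
        "paypal.com.secure-login.example",
        "shop.example.org",
        "paypal-support.example.net",
    ]
    for name in find_suspicious_names(names, "paypal", {"paypal.com"}):
        print("review:", name)
```

Anything flagged would still need human review before a revocation request is made to the issuing CA.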
The critical component in the system, therefore, is the log. Much of the technical work is devoted to its structure and to ensuring that its data remains cryptographically assured as entries are appended, so that the log cannot be retroactively manipulated. Another entity, called an auditor, can verify whether a log has corrupted data and also query whether a particular certificate has been logged. If the latter proves negative, this can be an indication that the certificate is fraudulent, as all newly issued certificates should (if this system is to work as intended) be logged.
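The Certificate Transparency specification (RFC 6962) builds the log as a Merkle hash tree, a detail this post does not go into. The Python sketch below follows our reading of that construction to show how a single root hash commits to every entry, and how an auditor can check that one entry is included using only a short list of sibling hashes. The function names are ours, the entries are placeholder byte strings rather than real certificates, and the consistency proofs that compare two versions of a log are omitted.

```python
import hashlib


def leaf_hash(entry):
    # Leaves are prefixed with 0x00 so they cannot be confused with inner nodes.
    return hashlib.sha256(b"\x00" + entry).digest()


def node_hash(left, right):
    # Inner nodes are prefixed with 0x01.
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n):
    """Largest power of two strictly smaller than n (for n > 1)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(entries):
    """Root hash committing to every entry, in order."""
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    k = split_point(n)
    return node_hash(merkle_root(entries[:k]), merkle_root(entries[k:]))


def audit_path(index, entries):
    """Sibling hashes, ordered from the leaf upwards, proving one entry's inclusion."""
    n = len(entries)
    if n <= 1:
        return []
    k = split_point(n)
    if index < k:
        return audit_path(index, entries[:k]) + [merkle_root(entries[k:])]
    return audit_path(index - k, entries[k:]) + [merkle_root(entries[:k])]


def root_from_path(index, size, leaf, path):
    """Recompute the root from a leaf hash and its audit path."""
    if size == 1:
        if path:
            raise ValueError("audit path too long")
        return leaf
    if not path:
        raise ValueError("audit path too short")
    k = split_point(size)
    if index < k:
        return node_hash(root_from_path(index, k, leaf, path[:-1]), path[-1])
    return node_hash(path[-1], root_from_path(index - k, size - k, leaf, path[:-1]))


if __name__ == "__main__":
    log = [b"cert-a", b"cert-b", b"cert-c", b"cert-d", b"cert-e"]
    root = merkle_root(log)
    proof = audit_path(2, log)
    print(root_from_path(2, len(log), leaf_hash(log[2]), proof) == root)
```

Because every entry feeds into the root, altering or removing an old entry changes the root hash, which is what makes retroactive manipulation detectable.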
From a client’s point of view, an additional parameter, obtained during the logging process, is included in the TLS handshake. It allows the client to verify that the offered certificate was added to the log at a certain time, and that the parameter was indeed issued for that particular certificate.
As you can see, the devil’s in the details, which requires a dedicated post to more fully explain the nitty-gritty as well as nuances of the solution.
To go back to the quotation that heads this section, Certificate Transparency provides at least one mechanism for oversight of the actions of the CAs, the guardians of the trust system that underlies the entire certificate ecosystem. Over time, it is hoped that its use will become commonplace, making it a well-controlled and audited additional tool in the security infrastructure that so critically underpins the functioning of the web.
[1] Note that HTTPS-based communications allow intermediaries (such as your ISP) to know only which domain (and IP address) was visited, and not much else beyond that. ISPs can do deep packet inspection of unencrypted channels to provide what are, from their perspective, “value-added” services.
[2] Some show the word “Secure”. We shall write a future post on the ways different browsers make the results of the certificate check visible.