How did code signing become important?
Let’s look at three interesting trends that have emerged in software development over the last decade.
One is that the number of companies that develop and release software has increased enormously. A key trigger for this has been the smartphone revolution and the associated need for companies to have their own mobile apps for their customers, as well as employees.
The second trend is that the internet has become the primary means of distributing software, along with its patches and upgrades. The advantages of online distribution over traditional methods such as CDs (Compact Discs) are significant: large-scale, near-instant software distribution at extremely low cost.
The third trend is that the number of Independent Software Vendors (ISVs), companies whose primary business is building software applications, has grown steadily over the years, with some research reports indicating ten-fold growth over the last decade.
These trends explain how online software distribution has become the preferred method for companies developing and selling software. But does the convenience of this method also introduce business risk?
How does a user know whether the software being downloaded is from the original author (and not an impersonator)? How does the user know that the software has not been tampered with and that no malicious code has been inserted into it? Code signing answers these questions by helping companies secure the software they release.
Apart from software developers and enterprise security specialists, it is critical for project managers, product managers, engineering managers and even senior management to be familiar with code signing. The reason is simple: code signing is an excellent safeguard against malware attacks, and malware is among the most expensive types of attack an enterprise can face. According to a recent research report from IBM Security, the average cost of a malware attack is $239 million, roughly 60 times the average cost of a data breach.
How code signing works
The “code” in code signing can mean executables, archives, drivers, firmware, libraries, packages and essentially any software that is intended for release and distribution to another party or user.
To understand how code signing works, it is important to have a basic understanding of Public Key Infrastructure (PKI). As defined in earlier articles on this blog, PKI is a set of roles, policies, hardware, software and procedures needed to create, manage, distribute, use, store and revoke digital certificates and manage public-key encryption. It is also important to understand the two primary objectives of code signing:
- Code Ownership:
Proving that the software code being downloaded and installed is from the original, authentic owner.
- Code Integrity:
Proving that the code has not been tampered with or changed in any way, e.g. with some malicious code (malware) being inserted in it.
The process of code signing involves four main steps which are described below.
- Key Generation
The basic requirement for code signing is a private key and its corresponding public key. The key pair can be generated using a trusted third-party tool or software. A detailed explanation of key generation is beyond the scope of this article.
- Code Signing Certificate
You then need to apply for a code signing certificate from a Certificate Authority (CA), which is a trusted entity that issues digital certificates. The application needs to include your public key along with your organization's identity details. Well-known CAs include DigiCert, GlobalSign, GoDaddy, and Sectigo (formerly Comodo CA); the Verisign and Symantec certificate businesses are now part of DigiCert. Let’s Encrypt, although a widely used CA, issues TLS certificates only and does not offer code signing certificates. The certificate that the CA issues includes information such as your organization's identity, your public key, the certificate validity period, the digital signature of the CA, and other details.
- Hashing
The next step is hashing your code. Hashing is a one-way process in which a mathematical algorithm converts data of any size and type into fixed-size data. The algorithm is called a hash function, and its output, the fixed-size data, is called a hash value or hash. The hash value bears no resemblance to the original data, and the original data cannot be deduced from the hash.
- Signing
The hash value of the software is then encrypted, or “signed”, using the private key. The encrypted hash (the digital signature), along with the code signing certificate, is added to the software package, which is now ready to be shipped or distributed.
The hash value is signed, rather than the original software, because the hash is a small amount of data (typically up to 512 bits) that can be encrypted very quickly, whereas the original software might be very large and take a long time to encrypt. There is also no real need to encrypt the software code itself: if the original software code is modified by even a single bit, the hash function produces a completely different hash value.
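The two hash properties mentioned above, fixed-size output and sensitivity to a single changed bit, can be seen in a minimal Python sketch. It assumes SHA-256 from the standard `hashlib` module as the hash function; the helper names and the sample inputs are made up for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def flip_lowest_bit(data: bytes) -> bytes:
    """Return a copy of data with the lowest bit of the first byte flipped."""
    return bytes([data[0] ^ 0x01]) + data[1:]

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

small = b"print('hello')"
large = b"x" * 1_000_000            # stand-in for a large software package
tampered = flip_lowest_bit(large)   # differs from `large` by exactly one bit

# Every digest is 256 bits (64 hex characters), whatever the input size.
print(len(sha256_hex(small)), len(sha256_hex(large)))

# A one-bit change in the input produces a different digest.
print(sha256_hex(large) != sha256_hex(tampered))
print(differing_bits(sha256_hex(large), sha256_hex(tampered)), "of 256 bits differ")
```

Running it should show that both digests have the same length and that the one-bit change alters a large share of the digest's bits, which is why comparing hashes is enough to detect tampering.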
What happens when the software is downloaded?
At the receiving end, the browser or operating system used to download the software first checks that the certificate attached to the code is authentic and was issued by a trustworthy CA. This is possible because the public keys of most well-known CAs come pre-installed, as root certificates, with most browsers and operating systems.
If the certificate cannot be authenticated, the browser will alert you and, depending on its security settings, may or may not allow the download. If the user ignores this warning and attempts to install the software, the operating system will issue an alert indicating that the software publisher could not be verified, effectively discouraging the user from installing the software. This addresses the first objective of code signing, i.e. establishing code ownership.
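The trust decision described above can be sketched as a lookup against a pre-installed list of CAs. This is a deliberately simplified model, not how any particular browser or operating system implements it: the `Certificate` fields, the `TRUSTED_ISSUERS` set and all names are hypothetical, and real validation also verifies the CA's signature on the certificate and checks for revocation.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Certificate:
    subject: str          # publisher identity named in the certificate
    issuer: str           # CA that issued the certificate
    not_before: datetime  # start of the validity period
    not_after: datetime   # end of the validity period

# Hypothetical stand-in for the CA list shipped with a browser or OS.
TRUSTED_ISSUERS = {"Example Root CA"}

def publisher_status(cert: Certificate, now: datetime) -> str:
    """Classify a certificate the way a download prompt might."""
    if cert.issuer not in TRUSTED_ISSUERS:
        return "unknown publisher"
    if not (cert.not_before <= now <= cert.not_after):
        return "certificate expired or not yet valid"
    return "verified publisher: " + cert.subject

start = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = datetime(2025, 1, 1, tzinfo=timezone.utc)
good = Certificate("Acme Software Ltd", "Example Root CA", start, end)
rogue = Certificate("Acme Software Ltd", "Self-Made CA", start, end)

today = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(publisher_status(good, today))    # verified publisher: Acme Software Ltd
print(publisher_status(rogue, today))   # unknown publisher
```

The second certificate claims the same publisher name but fails the check because its issuer is not in the trusted list, which is the case that triggers the "publisher could not be verified" alert.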
If the certificate is authenticated, the public key is extracted from the certificate and used to decrypt the encrypted hash included in the package. Next, the downloaded software itself (minus the certificate and hash) is hashed again using the same hash function. This hash value is compared with the decrypted hash value. If they match, the software has not been altered. If an attacker has changed the software (e.g. by adding some malware), the hashes will not match, and the operating system will raise an alert and refuse to install the software. This addresses the second objective, i.e. ensuring code integrity.
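The sign-then-verify round trip can be illustrated with a toy example. The sketch below uses textbook RSA with no padding and a key pair built from two publicly known Mersenne primes, so it is insecure by construction and only meant to show the mechanics; the function names and the sample package are hypothetical, and real code signing relies on vetted tools and libraries with proper padding schemes.

```python
import hashlib

# Toy key pair built from two publicly known Mersenne primes. ASSUMPTION for
# illustration only: real keys use secret random primes and a padding scheme.
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q                                # public modulus
E = 65537                                # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent (needs Python 3.8+)

def digest_as_int(code: bytes) -> int:
    """Hash the code with SHA-256 and interpret the digest as an integer."""
    return int.from_bytes(hashlib.sha256(code).digest(), "big")

def sign(code: bytes) -> int:
    """Publisher side: hash the code, then transform the hash with the private key."""
    return pow(digest_as_int(code), D, N)

def verify(code: bytes, signature: int) -> bool:
    """Receiver side: recover the signed hash with the public key and
    compare it with a fresh hash of the received code."""
    return pow(signature, E, N) == digest_as_int(code)

package = b"installer bytes v1.0"
signature = sign(package)

print(verify(package, signature))                  # True: untouched package
print(verify(package + b" + malware", signature))  # False: contents changed
```

Only the holder of the private exponent can produce a signature that the public key accepts, and any change to the package changes its hash, so the comparison fails for tampered code.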
Today, code signing is an essential part of the software development lifecycle. Without code signing, enterprises risk losing users and face enormous financial and reputational damage in the event of a malware attack. It is therefore critical for software line managers as well as senior management to understand code signing and its importance to their organizations.