Gang members, drug dealers, and even protestors have been quick to adopt ways to screen their communications. This is why law enforcement agencies are seeing a rapid rise in the adoption of highly encrypted apps like Signal, which incorporate capabilities like image blurring to stop police from reviewing data. 

Decrypting messages and attachments sent with Signal has been all but impossible…until now.

Why Signal Is So Popular

Signal is an encrypted communication application designed to keep messages and attachments as safe as possible from third-party programs. Signal not only uses end-to-end encryption for the data it sends, but also employs its own open-source encryption protocol, the “Signal Protocol™”. Apps like this make parsing data for forensic analysis extremely difficult.

The pandemic and its effects have driven unprecedented sign-ups for Signal, as people started communicating more online. According to analytics firm App Annie, Signal was downloaded more than one million times worldwide in May 2020.

Protestors have embraced the app to communicate securely with their teams when marshaling protestors, discussing tactics, and liaising with the police. Criminals also use it to communicate, send attachments, and make illegal deals that they want to keep discreet and out of sight of law enforcement. Because Signal encrypts virtually all of its metadata to protect its users, legal authorities have pushed to require developers of encrypted software to build in a “backdoor” that would give them access to people’s data.

Until such agreements are reached, Cellebrite continues to work diligently with law enforcement to enable agencies to decrypt and decode data from the Signal app using Cellebrite Physical Analyzer and following extractions performed by Cellebrite Advanced Services. Let’s take a closer look at how Cellebrite is making this possible.

Cracking The Code

Signal stores its data in the following structure:

Signal keeps its database encrypted using SQLCipher, so reading it requires a key. We found that acquiring the key requires reading a value from the shared preferences file and decrypting it using a key called “AndroidSecretKey”, which is saved by an Android feature called “Keystore”.

Once we had the decrypted key, we needed to know how to decrypt the database. To do this, we used Signal’s open-source code and looked for any call to the database. After reviewing dozens of classes, we finally found what we were looking for:

After finding this, we simply ran SQLCipher on the database with the decrypted key, a page size of 4096, and a KDF iteration count of 1. By doing so, we managed to decrypt the database.
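The parameters above correspond to SQLCipher PRAGMA statements. The following is a minimal sketch, not the tooling used in this research: it only builds the statements one might feed to the sqlcipher command-line shell. The function name, the output file name, and the choice to pass the key as a quoted passphrase are assumptions made for illustration. Other cipher settings, which differ between SQLCipher versions, are not covered.

```python
from typing import List


def sqlcipher_decrypt_script(key_text: str,
                             output_path: str = "signal.db.decrypted",
                             page_size: int = 4096,
                             kdf_iter: int = 1) -> List[str]:
    """Build SQLCipher statements that open a database and export a plaintext copy."""

    def quote(value: str) -> str:
        # Escape single quotes for use inside a SQL string literal.
        return "'" + value.replace("'", "''") + "'"

    return [
        # Assumption: the key is supplied as a passphrase string.
        f"PRAGMA key = {quote(key_text)};",
        f"PRAGMA cipher_page_size = {page_size};",
        f"PRAGMA kdf_iter = {kdf_iter};",
        # Export the opened database to an unencrypted file.
        f"ATTACH DATABASE {quote(output_path)} AS plaintext KEY '';",
        "SELECT sqlcipher_export('plaintext');",
        "DETACH DATABASE plaintext;",
    ]


if __name__ == "__main__":
    # "<decrypted key>" is a placeholder, not a real key.
    print("\n".join(sqlcipher_decrypt_script("<decrypted key>")))
```

The printed statements could be pasted into a sqlcipher shell session opened on the encrypted database file.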

A new sub-node called “signal.db.decrypted” then appeared in Physical Analyzer under “signal.db”. This is what our new database looks like now.

The messages are stored under the “signal.db.decrypted” file in a table called “sms” and the attachments are stored under the “app_parts” folder.

Linking the messages and the attachments requires parsing both the “sms” table and another table called “part.”
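To illustrate the linking step, here is a minimal sketch using Python's built-in sqlite3 module on a tiny in-memory database. Only the table names “sms” and “part” come from the text above. The column names (`_id`, `body`, `mid`, `_data`), the sample rows, and the file name are hypothetical and chosen purely for illustration, so the real schema may differ.

```python
import sqlite3


def build_demo_database() -> sqlite3.Connection:
    """Create an in-memory stand-in for the decrypted database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sms (_id INTEGER PRIMARY KEY, body TEXT);
        CREATE TABLE part (_id INTEGER PRIMARY KEY, mid INTEGER, _data TEXT);
        INSERT INTO sms VALUES (1, 'hello'), (2, 'see attached');
        INSERT INTO part VALUES (10, 2, 'app_parts/part1.mms');
    """)
    return conn


def link_messages_to_attachments(conn: sqlite3.Connection) -> dict:
    """Map each message id to its body and the attachment paths that reference it."""
    query = """
        SELECT sms._id, sms.body, part._data
        FROM sms
        LEFT JOIN part ON part.mid = sms._id   -- assumed foreign key column
        ORDER BY sms._id, part._id
    """
    linked = {}
    for msg_id, body, path in conn.execute(query):
        entry = linked.setdefault(msg_id, {"body": body, "attachments": []})
        if path is not None:  # messages without attachments yield a NULL path
            entry["attachments"].append(path)
    return linked


if __name__ == "__main__":
    print(link_messages_to_attachments(build_demo_database()))
```

A LEFT JOIN is used so that messages without attachments still appear in the result.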

After linking the attachment files and the messages, we found that the attachments are also encrypted. This time, the encryption is even harder to crack. We looked again into the shared preferences file and found a value under “pref_attachment_encrypted_secret” that has “data” and “iv” fields under it.

The “data” field contains an encrypted JSON object that, once decrypted, contains the decryption keys of the sent attachments. This JSON contains three keys: “ClassicCipherKey”, “ClassicMacKey”, and “ModernKey”.
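As a small illustration of working with that JSON once it has been decrypted, the sketch below reads the three keys into raw bytes. The field names follow the text above. The assumption that each value is stored as a Base64 string, the function name, and the sample values are ours and may not match the real format.

```python
import base64
import json

KEY_NAMES = ("ClassicCipherKey", "ClassicMacKey", "ModernKey")


def load_attachment_keys(decrypted_json: str) -> dict:
    """Return the three attachment keys as bytes, keyed by field name."""
    fields = json.loads(decrypted_json)
    # Assumption: each key is serialized as a Base64-encoded string.
    # A missing field raises KeyError rather than being silently skipped.
    return {name: base64.b64decode(fields[name]) for name in KEY_NAMES}


if __name__ == "__main__":
    # Made-up sample values, not real key material.
    sample = json.dumps({
        "ClassicCipherKey": base64.b64encode(b"a" * 16).decode(),
        "ClassicMacKey": base64.b64encode(b"b" * 20).decode(),
        "ModernKey": base64.b64encode(b"c" * 32).decode(),
    })
    keys = load_attachment_keys(sample)
    print({name: len(value) for name, value in keys.items()})
```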

The newer versions of Signal use the “ModernKey”. After getting the “ModernKey”, we read the “data_random” field of each row in the “part” table.

Now we needed to turn the “ModernKey” and “data_random” values into a decryption key and an “IV” for the decryption to work. There are some cases where the “IV” has a value, but in this blog, we will only specify the common case, in which the “IV” is empty. So once again, we took a look into Signal’s open-source code, and found this:

This little piece of code told us exactly what we were looking for: how the decryption key and “IV” are generated from the “ModernKey” and “data_random”. The decryption key is derived with the HmacSHA256 algorithm: a new MAC is initialized with the “ModernKey” as its secret key, and the MAC is then computed over the “data_random” value.
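The derivation described above can be sketched with Python's standard hmac module. This is an illustrative sketch, not Signal's or Cellebrite's code. The function name is ours, and reading the “empty” IV as 16 zero bytes is an assumption.

```python
import hashlib
import hmac
from typing import Tuple


def derive_attachment_key_and_iv(modern_key: bytes,
                                 data_random: bytes) -> Tuple[bytes, bytes]:
    """Derive a per-attachment key and IV from the ModernKey and data_random."""
    # HMAC-SHA256 keyed with the ModernKey, computed over data_random.
    key = hmac.new(modern_key, data_random, hashlib.sha256).digest()
    # Assumption: the "empty" IV of the common case is 16 zero bytes.
    iv = bytes(16)
    return key, iv


if __name__ == "__main__":
    # Made-up inputs, not real key material.
    key, iv = derive_attachment_key_and_iv(b"\x01" * 32, b"\x02" * 32)
    print(len(key), len(iv))
```

Because “data_random” differs per row, each attachment ends up with its own 32-byte key even though all of them share one “ModernKey”.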

After getting the decryption key, we now needed to know what decryption algorithm to use. We went back to Signal’s open-source code and found this:

Seeing that told us that Signal uses AES encryption in CTR mode. We applied AES in CTR mode with our derived decryption key and decrypted the attachment files.

After decrypting the files, we got a new set of sub-nodes under each node in the “app_parts” folder.

These files are the attachments that were sent in the Signal messages. Now, using the link between messages and attachments we created when we parsed the messages, we can add the attachments to the conversation and see the chats as they were seen by the chat participants.

Decrypting Signal messages and attachments was not an easy task. It required extensive research on many different fronts to create new capabilities from scratch. At Cellebrite, however, finding new ways to help those who make our world a safer place is what we’re dedicated to doing every day.

To learn more about our Digital Intelligence solutions, visit Cellebrite.com.
