We’re not quite sure what to call it right now, so we referred to it in the headline by the hybrid name Microsoft Office 365.
(“Office” as the collective term for Microsoft’s word processing, spreadsheet, presentation and collaboration apps is slated to be phased out in the next month or two to become just “Microsoft 365”.)
We’re sure people will carry on using the individual app names (Word, Excel, PowerPoint and friends) and the suite’s moniker Office for many years, though newcomers to the software will probably end up knowing it as 365, after dropping the ubiquitous Microsoft prefix.
As you may know, Office standalone apps (the ones you actually install locally so you don’t have to go online to work on your stuff) include their own option to encrypt saved documents.
This adds an extra layer of security in case you later share any of those files, by accident or design, with someone who wasn’t supposed to receive them, which is surprisingly easy to do by mistake when sharing attachments via email.
Unless and until you also give the recipient the password they need to unlock the file, it’s just so much shredded cabbage to them.
Of course, if you include the password in the body of the email alongside the encrypted attachment, you’ve gained nothing, but if you’re even slightly cautious about sharing the password via a different channel, you’ve bought yourself some extra safety and security against rogues, snoops and ne’er-do-wells getting easy access to confidential content.
OME under the spotlight
Or have you?
According to researchers at Finnish cybersecurity company WithSecure, your data may be enjoying less protection than you might expect.
The feature the researchers tested is what they refer to as Office 365 Message Encryption, or OME for short.
We haven’t reproduced their experiments here, for the simple reason that the core Office, sorry, 365 products don’t run natively on Linux, which we use for work. The web-based versions of the Office tools don’t have the same feature set as the full apps, so any results we might obtain are unlikely to match how most business users of Office, ah, 365 have configured Word, Excel, Outlook and friends on their Windows laptops.
As the researchers describe:
This feature is advertised to allow organizations to securely send and receive encrypted email messages between people inside and outside your organization.
But they also indicate that:
Unfortunately OME messages are encrypted in an insecure Electronic Codebook (ECB) mode of operation.
ECB explained
Many encryption algorithms, notably the Advanced Encryption Standard or AES, which OME uses, are what’s known as block ciphers, which scramble multi-byte chunks of data at a time rather than processing individual bits or bytes in sequence.
Generally speaking, this is supposed to help both efficiency and security, because each turn of the cryptographic crank-handle that drives the algorithm has more input data to mix into the cipher, and each turn gets you further through the data you want to encrypt.
The core AES algorithm, for example, uses 16 input plaintext bytes (128 bits) at a time and scrambles the data under the encryption key to produce 16 encrypted ciphertext output bytes.
(Don’t confuse block size with key size: AES encryption keys can be 128 bits, 192 bits or 256 bits long, but all three key sizes work on 128-bit blocks each time the algorithm is “cranked”.)
What this means is that if you pick an AES key (regardless of length) and then use the AES cipher directly on a chunk of data…
…then every time you get the same input chunk, you’ll get the same output chunk.
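To make that concrete, here is a minimal Python sketch of ECB mode. Python's standard library has no AES, so the block cipher below is a hypothetical stand-in: a toy 4-round Feistel cipher built from HMAC-SHA256, with the same 16-byte block size as AES but none of its security. All the function names are ours and are not part of any library.

```python
import hashlib
import hmac

BLOCK = 16  # bytes per block, the same block size as AES


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Keyed round function: HMAC-SHA256 truncated to half a block.
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Toy Feistel cipher: a stand-in for AES, NOT AES and NOT secure.
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    # Undo the rounds in reverse order.
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    # ECB: each block is encrypted on its own, with no IV and no chaining.
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be a multiple of 16 bytes")
    return b"".join(toy_encrypt_block(key, plaintext[i:i + BLOCK])
                    for i in range(0, len(plaintext), BLOCK))


def ecb_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    return b"".join(toy_decrypt_block(key, ciphertext[i:i + BLOCK])
                    for i in range(0, len(ciphertext), BLOCK))


key = b"THEPASSWORDIS123"
plaintext = b"A" * 16 + b"B" * 16 + b"A" * 16
ciphertext = ecb_encrypt(key, plaintext)
blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
print(blocks[0] == blocks[2])  # True: same input block, same output block
print(blocks[0] == blocks[1])  # False: different input block
```

The first and third plaintext blocks are identical, so their ciphertext blocks are identical too, whatever key is used.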
Like a really massive codebook
That’s why this direct mode of operation is called ECB, short for electronic code book, because it’s rather like having an enormous code book that could be used as a lookup table for encrypting and decrypting.
(A complete codebook could never actually be constructed in real life, because you’d need to store a database of 2^128 16-byte entries for each possible key.)
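As a rough back-of-the-envelope check on that claim, the arithmetic can be sketched in a few lines of Python (the variable names are ours):

```python
BLOCK_BYTES = 16                      # AES block size in bytes

entries = 2 ** (8 * BLOCK_BYTES)      # one entry per possible 128-bit block
total_bytes = entries * BLOCK_BYTES   # each entry stores a 16-byte output

print(f"{entries:.3e} entries per key")
print(f"{total_bytes:.3e} bytes per key")
```

That comes to roughly 3.4 x 10^38 entries per key, and the whole table would have to be rebuilt for every one of the possible keys.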
Unfortunately, especially with computer-formatted data, repetition of parts of the data is often unavoidable, thanks to the file format used.
For example, file formats that routinely pad out data sections so they line up on 512-byte boundaries (a common sector size when writing to disk) or on 4096-byte boundaries (a common allocation unit size when reserving memory) will often produce files with long runs of zero bytes.
Likewise, text documents that contain lots of boilerplate, such as headers and footers on every page, or repeated mentions of the full company name, will contain plenty of repeats.
Each time a repeated plaintext chunk lines up on a 16-byte boundary in the AES-ECB encryption process, it will show up in the encrypted output as exactly the same ciphertext.
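One way to look for this effect in a file that claims to be encrypted is simply to count repeated 16-byte aligned blocks. The sketch below is an illustration only: repeated_blocks is a hypothetical helper name, and the sample data is hand-made rather than real ciphertext.

```python
from collections import Counter


def repeated_blocks(data: bytes, block_size: int = 16) -> dict:
    """Return {block: count} for every aligned block seen more than once."""
    usable = len(data) - len(data) % block_size  # ignore a trailing partial block
    counts = Counter(data[i:i + block_size]
                     for i in range(0, usable, block_size))
    return {block: n for block, n in counts.items() if n > 1}


# Hand-made sample standing in for ECB ciphertext: blocks 2 and 4 are identical.
sample = bytes(range(16)) + b"\xaa" * 16 + bytes(range(16, 32)) + b"\xaa" * 16
for block, count in repeated_blocks(sample).items():
    print(block.hex(), "appears", count, "times")
```

Repeats that don't line up on a block boundary are not reported, which mirrors the alignment condition described above.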
So, even if you can’t formally decrypt a ciphertext file, you may be able to draw immediate, security-damaging inferences from it, thanks to the fact that patterns in the input (which you might know, or be able to infer, or to guess) are preserved in the output.
Here’s an example based on an article we published about nine years ago when we explained why Adobe’s now-infamous use of ECB-mode encryption to “hash” users’ passwords was not a good idea:
Note how the pixels that are solid white in the input reliably produce a repeating pattern in the output, and how the blue areas remain somewhat regular, so that the structure of the original data is obvious.
In this example, each pixel in the original file takes up exactly 4 bytes, so each left-to-right 4-pixel run in the input data is 16 bytes long, which aligns exactly with each 16-byte AES encryption block, thus accentuating the “ECB effect”.
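The alignment arithmetic can be checked with a short Python sketch. It assumes the 330x72 RGBA image size given in the appendix at the end of the article; the variable names are ours.

```python
WIDTH, HEIGHT = 330, 72      # image size in pixels, taken from the appendix
BYTES_PER_PIXEL = 4          # R, G, B and A, one byte each
AES_BLOCK_BYTES = 16

pixels_per_block = AES_BLOCK_BYTES // BYTES_PER_PIXEL
image_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
total_blocks = image_bytes // AES_BLOCK_BYTES

# Four solid-white, fully opaque pixels make up exactly one AES block.
white_run = bytes([255, 255, 255, 255]) * pixels_per_block

print(pixels_per_block, image_bytes, total_blocks)  # 4 95040 5940
print(len(white_run))                               # 16
```

Every aligned run of four identical pixels therefore becomes one repeated plaintext block, which is why flat areas of colour survive ECB encryption as visible patterns.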
Matching ciphertext patterns
Even worse, if you have two documents that you know are encrypted with the same key, and you happen to have the plaintext of one of them, then you can look through the ciphertext that you can’t decrypt, and try to match sections of it with patterns in the ciphertext that you can decrypt.
Given that you already have the decrypted form of the first document, this approach is known, unsurprisingly, as a known-plaintext attack.
Even if the matches involve only apparently innocent text, the inferences that adversaries can draw in this way can be a goldmine for intellectual property spies, social engineers, forensic investigators, and more.
For example, even if you have no idea what the details of a document refer to, by matching known plaintext segments in several files, you can determine that an apparently random collection of documents:
- Were all sent to the same recipient, if there’s a common salutation at the top of each one.
- Refer to the same project, if there’s a unique identifying text string that keeps popping up.
- Have the same security classification, for example if the repeated text Company Confidential appears throughout, which probably marks the files as being of special interest.
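The matching process can be sketched in Python as follows. This is an illustration under stated assumptions, not the researchers' tooling: stand_in_ecb is a hypothetical deterministic per-block mapping (HMAC-SHA256 truncated to 16 bytes) used in place of AES-ECB, and the document contents are invented. The attacker code never needs the key, only one plaintext/ciphertext pair.

```python
import hashlib
import hmac

BLOCK = 16


def split_blocks(data: bytes) -> list:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def stand_in_ecb(key: bytes, data: bytes) -> bytes:
    # Stand-in for AES-ECB: the same key and block always give the same output.
    # (Not decryptable, which doesn't matter here: the attacker never decrypts.)
    return b"".join(hmac.new(key, b, hashlib.sha256).digest()[:BLOCK]
                    for b in split_blocks(data))


def build_codebook(known_plain: bytes, known_cipher: bytes) -> dict:
    # Map each ciphertext block of the known document to its plaintext block.
    return dict(zip(split_blocks(known_cipher), split_blocks(known_plain)))


def match_blocks(codebook: dict, target_cipher: bytes) -> list:
    # Recognised blocks come back as plaintext, unknown blocks as None.
    return [codebook.get(block) for block in split_blocks(target_cipher)]


key = b"never seen by the attacker"
doc1 = b"COMPANY CONFIDEN" + b"Dear Dr Example," + b"Project Zebra OK"
doc2 = b"COMPANY CONFIDEN" + b"Totally new text" + b"Project Zebra OK"

codebook = build_codebook(doc1, stand_in_ecb(key, doc1))
print(match_blocks(codebook, stand_in_ecb(key, doc2)))
```

The middle block of the second document stays unknown, but the classification marking and the project name are recognised without any decryption.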
What to do?
Do not use ECB mode!
If you’re using a block cipher, pick a block cipher operating mode that:
- Includes what’s known as an IV, or initialization vector, chosen randomly and uniquely for each message.
- Deliberately arranges the encryption process so that repeated inputs come out differently every time.
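The two properties in the list above can be sketched with a toy counter-mode keystream. This is an assumption-laden illustration, not AES: HMAC-SHA256 stands in for the block cipher as the keyed function, and the names keystream, encrypt and decrypt are ours.

```python
import hashlib
import hmac
import os


def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    # Counter mode: keyed function of (IV, counter) for counter = 0, 1, 2, ...
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(key, iv + counter.to_bytes(8, "big"),
                        hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, plaintext: bytes):
    iv = os.urandom(12)  # fresh random IV for every message
    stream = keystream(key, iv, len(plaintext))
    return iv, bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    stream = keystream(key, iv, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


key = b"THEPASSWORDIS123"
message = b"\x00" * 64  # four identical 16-byte blocks

iv1, ct1 = encrypt(key, message)
iv2, ct2 = encrypt(key, message)

print(ct1 != ct2)                    # differs unless the random IVs collide
print(ct1[:16] != ct1[16:32])        # repeated input blocks no longer match
print(decrypt(key, iv1, ct1) == message)
```

The IV is not secret and travels with the ciphertext; what matters is that it is never reused with the same key.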
If you’re using AES, the mode you probably want to choose these days is AES-GCM (Galois Counter Mode), which not only uses an IV to create a different stream of encryption data every time, even if the key remains the same, but also calculates what’s known as a Message Authentication Code (MAC), or keyed cryptographic hash, at the same time as scrambling or unscrambling the data.
AES-GCM means not only do you avoid repeating ciphertext patterns, but you always have a “checksum” that tells you if the data you just decrypted has been tampered with.
Remember that a crook who has no idea what the ciphertext means might nevertheless be able to trick you into trusting an incorrect decryption, without ever knowing (or caring) what sort of wrong output you end up with.
A MAC that is calculated during the decryption process, based on the same key and IV, helps to ensure that the ciphertext you received is valid, and therefore that you have indeed extracted the plaintext you expected.
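The idea of checking a keyed "checksum" before trusting a decryption can be sketched as follows. Note the assumptions: AES-GCM and ChaCha20-Poly1305 compute their tags internally (with GHASH and Poly1305 respectively), whereas this illustration uses HMAC-SHA256 from Python's standard library as a stand-in, and the names make_tag and verify_tag, the key and the sample bytes are all invented.

```python
import hashlib
import hmac


def make_tag(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    # Keyed hash over the IV and ciphertext, truncated to 16 bytes.
    return hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()[:16]


def verify_tag(mac_key: bytes, iv: bytes, ciphertext: bytes, tag: bytes) -> bool:
    # Constant-time comparison, so the check itself doesn't leak information.
    return hmac.compare_digest(make_tag(mac_key, iv, ciphertext), tag)


mac_key = b"placeholder MAC key, kept secret"
iv = b"twelve-byte!"             # placeholder 12-byte IV
ciphertext = bytes(range(32))    # placeholder ciphertext
tag = make_tag(mac_key, iv, ciphertext)

# An attacker flips one bit without knowing what the ciphertext means.
tampered = bytes([ciphertext[0] ^ 0x01]) + ciphertext[1:]

print(verify_tag(mac_key, iv, ciphertext, tag))  # True
print(verify_tag(mac_key, iv, tampered, tag))    # False
```

Because the tag depends on a secret key, an attacker who modifies the ciphertext can't simply recompute a matching tag.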
If you don’t want to use a block cipher like AES, you can choose a stream cipher algorithm instead, which creates a pseudorandom byte-by-byte keystream so you can encrypt data without having to process 16 bytes (or whatever the block size might be) at a time.
Technically, AES-GCM converts AES into a stream cipher and adds authentication in the form of a MAC, but if you’re looking for a dedicated stream cipher designed specifically to work this way, we suggest Daniel Bernstein’s ChaCha20-Poly1305 (the Poly1305 part is the MAC), as detailed in RFC 8439.
Below, we show what we got using AES-128-GCM and ChaCha20-Poly1305 (we’ve discarded the MAC code here), along with an “image” consisting of 95,040 RGBA bytes (330×72 at 4 bytes per pixel) from the Linux kernel pseudorandom generator.
Remember that just because data looks unstructured doesn’t mean that it’s truly random, but if it doesn’t look random, yet claims to be encrypted, you should assume that at least some structure has been left behind, and thus that the encryption is suspect:
What will happen next?
According to WithSecure, Microsoft has no plans to fix this “vulnerability”, apparently for reasons of backward compatibility with Office 2010…
Legacy versions of Office (2010) require AES 128 ECB, and Office documents are still protected this way by Office apps.
The [WithSecure researchers’] report was not considered to meet the bar for security servicing, nor is it considered a breach. No code change was made, and so no CVE was issued for this report.
In short, if you currently rely on OME, you may want to consider replacing it with a third-party encryption tool for sensitive messages, one that encrypts your data independently of the apps that created those messages, and thus works independently of the internal encryption code in the Office range.
That way, you can choose a modern cipher and a modern mode of cipher operation, without having to fall back on the old-school decryption code built into Office 2010.
HOW WE MADE THE IMAGES IN THE ARTICLE

Start with sop330.png, which you can create for yourself by cropping the cleaned-up SOPHOS logo from the topmost image, removing the 2-pixel blue boundary, and saving in PNG format. The image should end up at 330x72 pixels in size.

Convert to RGBA using ImageMagick:

$ convert sop330.png sop.rgba

Output is 330x72 pixels x 4 bytes/pixel = 95,040 bytes.

===

Encrypt using Lua and the LuaOSSL library (Python has a very similar OpenSSL binding):

-- load data
> fdat = misc.filetostr('sop.rgba')
> fdat:len()
95040

-- create cipher objects
> aes = openssl.cipher.new('AES-128-ECB')
> gcm = openssl.cipher.new('AES-128-GCM')
> cha = openssl.cipher.new('ChaCha20-Poly1305')

-- initialise passwords and IVs
-- AES-128-ECB needs a 128-bit password, but no IV
-- AES-128-GCM needs a 128-bit password and a 12-byte IV
-- ChaCha20 needs a 256-bit password and a 12-byte IV
> aes:encrypt('THEPASSWORDIS123')
> gcm:encrypt('THEPASSWORDIS123','andkrokeutiv')
> cha:encrypt('THEPASSWORDIS123THEPASSWORDIS123','qlxmtosh476g')

-- encrypt the file data with the three ciphers
> aesout = aes:final(fdat)
> gcmout = gcm:final(fdat)
> chaout = cha:final(fdat)

-- a stream cipher produces output byte-by-byte,
-- so ciphertext should be same length as plaintext
> gcmout:len()
95040
> chaout:len()
95040

-- we won't be using the MAC codes from GCM and Poly1305 here,
-- but each cipher produces a 128-bit (16-byte) "checksum"
-- used to authenticate the decryption after it's finished,
-- to detect if the input ciphertext gets corrupted or hacked
-- (the MAC depends on the key, so an attacker can't forge it)
> base.hex(gcm:getTag(16))
a70f204605cd5bd18c9e4da36cbc9e74
> base.hex(cha:getTag(16))
a55b97d5e9f3cb9a3be2fa4f040b56ef

-- create a 95040-byte "image" straight from /dev/random
> rndout = misc.filetostr('/dev/random',#fdat)

-- save them all - note that we explicitly truncate the AES-ECB
-- block cipher output to the exact image length required, because
-- ECB needs padding to match the input size with the block size
> misc.strtofile(aesout:sub(1,#fdat),'aes.rgba')
> misc.strtofile(gcmout,'gcm.rgba')
> misc.strtofile(chaout,'cha.rgba')
> misc.strtofile(rndout,'rnd.rgba')

===

To load the files in a regular image viewer, you may need to convert them losslessly back into PNG format:

$ convert -depth 8 -size 330x72 aes.rgba aes.png
$ convert -depth 8 -size 330x72 gcm.rgba gcm.png
$ convert -depth 8 -size 330x72 cha.rgba cha.png
$ convert -depth 8 -size 330x72 rnd.rgba rnd.png

===

Given that the encryption process scrambles all four bytes in each RGBA pixel, the resulting image has variable transparency (A = alpha, or transparency). Your image viewer may decide to display this sort of image with a checkerboard background, which confusingly looks like part of the image, but isn't. We therefore used the Sophos blue from the original image as a background for the encrypted files to make them easier to view. The overall blue hue is therefore not part of the image data. You can use any solid colour you like.