Codegate CTF Preliminary 2019: PyProt3ct (Reversing 27.8)
Writeup
What We Got
The provided zip archive contains three files:
chal.py
: a wrapper that reads the filecode
, a flag from stdin and calls a function fromplay.py
with those as argumentsplay.py
: an obfuscated script with multiple functions, all names are a combination of0
s andO
s (likeO0O0OOO00OO00O000
)code
: a 600 kB file with mixed binary and ASCII characters, a quick peek shows that there are also strings likeO0O0OOO00OO00O000
Tidying Up 🧹
As a first step I started to deobfuscate play.py
manually by renaming variables to meaningful names. It turned out that it is a virtual machine implementation that executes the code in the code
file. The last function loops over the instructions and executes them. Each of the other functions is an implementation of an instruction. They all receive a memory object which is a python dict
. There are special memory slots, for example the program counter sits at key 1000
and the user input sits at flag
. There is also a memory slot that holds the return value when our code returns. The overall goal will most likely be to make the code return a truthy value. We then know that the user input we gave it is the correct flag.
The VM Architecture
To get an idea of what the actual program does, we first have to understand how the VM works, what instructions there are, etc.
Instruction Format
Each instruction consists of a 1 byte opcode and 1 byte arguments count. There can be up to 4 arguments which are then read one after another by first reading 1 byte that describes the argument’s length and then actual argument itself. Arguments are stored in the memory slots 2001
to 2004
.
Instructions
Opcode | My name for it | Args | Meaning (pseudo code) |
---|---|---|---|
0x00 | const | 2 | mov <mem_slot_dst>, <literal_value> |
0x02 | set_item | 3 | mov <mem_slot_dst>[<offset>], <mem_slot_src> |
0x08 | jge | 3 | cmp <mem_slot_a>, <mem_slot_b>; jge <address> |
0x0b | jmp | 1 | jmp <address> |
0x22 | get_item | 3 | mov <mem_slot_dst>, <mem_slot_src>[<offset>] |
0x29 | eval_call_w_arg | 2 | evals the first argument, calls the result with the second argument and saves the return value in the result slot |
0x2c | eval_call | 1 | same as above, but calling without an argument |
0x31 | jl | 3 | cmp <mem_slot_a>, <mem_slot_b>; jl <address> |
0x48 | get_result | 1 | mov <mem_slot_dst>, result_slot |
0x4a | return | 1 | return <mem_slot_src> |
0x51 | je | 3 | cmp <mem_slot_a>, <mem_slot_b>; je <address> |
0x52 | mov | 2 | mov <mem_slot_dst>, <mem_slot_src> |
0x5b | jne | 3 | cmp <mem_slot_a>, <mem_slot_b>; jne <address> |
0x63 | subrange | 4 | mov <mem_slot_dst>, <mem_slot_src>[<offset_start> : <offset_end>] |
0x66 | add | 3 | add <mem_slot_dst>, <mem_slot_a>, <mem_slot_b> |
0x6f | or | 3 | or <mem_slot_dst>, <mem_slot_a>, <mem_slot_b> |
0x83 | and | 3 | and <mem_slot_dst>, <mem_slot_a>, <mem_slot_b> |
0x97 | shl | 3 | shl <mem_slot_dst>, <mem_slot_a>, <mem_slot_b> |
0xab | shr | 3 | shr <mem_slot_dst>, <mem_slot_a>, <mem_slot_b> |
0xba | xor | 3 | xor <mem_slot_dst>, <mem_slot_a>, <mem_slot_b> |
A good thing to note here is that the VM apparently is a Harvard architecture (data and code are stored separately) and there is no direct way of modifying the code, so we won’t have to deal with self modifying code. The only exceptions are the eval/call instructions which could also be dangerous because they can execute arbitrary python code, so I decided to only run the whole thing in a docker container.
The Actual Program
To get some insight into the code, I wrote a simple disassembler based on the code of the VM. The first lines of its output looked like this:
1 | 0x00000000: OOOOO0OOO0O0OOOO00O0O00OOO0OO0O0000O0OOO000000OOO0OO00OOOO00O0OO0000OOOO0OOO000 = int(0x348c0) |
Taking Out the Trash 🗑️
Here we go again… I made the disassembler name the memory slots var_*
instead of that 💩. I then used some bash voodoo to find slots that are in fact constants because they are only assigned once and renamed them to const_*
. I can now finally look at the code without my eyes starting to bleed 🙂.
1 | 0x00000000: const_0 = 0x348c0 |
Looking at the disassembly (4072 lines), I noticed that there are many constant assignments and conditional jumps based on constant values (which makes them unconditional jumps). Since that is basically all dead code, I filtered out those instructions. That leaves us with the core code (I already renamed some mem slots here):
1 | 0x0007e160: total = const_3066 + const_3068 |
I also put log statements into the VM code to see the memory values during runtime. The value in parentheses behind a memory slot is its current value.
1 | 0x0007e160: total = const_3066 (0x0) + const_3068 (0x0) |
The Logic 💡
The core logic of the program looks like this:
1 | if len(flag) != 8: |
The Solution
At this point I probably stared too many hours into the PyProt3ct abyss, so I decided that in order to solve the challenge I had to write a Z3 script. So I did that. And it said unsatisfiable. Which was quite demotivating.
Fortunately, @tihmstar came by right in that moment and quickly saw that all the operations can be reversed and we could simply use that to compute the flag from the expected value at the end:
1 | #!/usr/bin/env python |
That gave us the flag d34dPY27
. Using it as the user input for the original VM script gave us a :)
, so our calculation was right!