verify that the write caused by `set_tid_address` is constrained by pkeys #292

fw-immunant · 2023-09-25T23:17:48Z

Split out of #233.

DESCRIPTION
For each thread, the kernel maintains two attributes (addresses) called set_child_tid and clear_child_tid.
These two attributes contain the value NULL by default.
   set_child_tid
          If a thread is started using clone(2) with the CLONE_CHILD_SETTID flag, set_child_tid is set to the
          value passed in the ctid argument of that system call.

          When set_child_tid is set, the very first thing the new thread does is to write its thread ID at
          this address.

My concern is that if a hostile compartment A starts a thread with clone() passing the address of memory owned by victim compartment B as the set_child_tid address, the write to this address may succeed even though the clone() syscall was issued by compartment A. I believe this write is performed by the kernel inside the implementation of clone(), which means that it may or may not respect pkeys depending on how it is implemented.

We should test this; if the write does ignore pkeys, we need to filter calls to clone() to either ensure that the set_child_tid and clear_child_tid addresses are owned by the calling compartment (and that the latter address stays owned by it until the thread ends) or simply forbid the relevant flags (CLONE_CHILD_SETTID/CLONE_CHILD_CLEARTID).
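A rough sketch of what such a filter predicate could look like, in case it helps the discussion. `struct region`, `addr_in_region` and `clone_permitted` are hypothetical names, not existing code in this project, and the sketch assumes each compartment can be described by a single contiguous address range. It does not cover keeping the clear_child_tid address owned for the lifetime of the thread, nor later calls to set_tid_address(2), which can change clear_child_tid after the clone.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical description of the memory owned by one compartment. */
struct region {
	uintptr_t base;
	size_t len;
};

/* True if [addr, addr + n) lies entirely inside r; written to avoid overflow. */
static bool addr_in_region(const struct region *r, uintptr_t addr, size_t n)
{
	assert(r != NULL);
	return addr >= r->base && n <= r->len && addr - r->base <= r->len - n;
}

/*
 * Hypothetical policy check for a clone() call made by a compartment.
 * clone() takes a single child-tid pointer that serves both
 * CLONE_CHILD_SETTID and CLONE_CHILD_CLEARTID, so one address is checked.
 */
static bool clone_permitted(unsigned long flags, uintptr_t child_tid,
			    const struct region *owned)
{
	/* Neither flag set: the kernel performs no child-tid write. */
	if ((flags & (CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID)) == 0)
		return true;
	/* Otherwise the kernel will write a pid_t at child_tid. */
	return addr_in_region(owned, child_tid, sizeof(pid_t));
}
```

The stricter alternative (forbidding the flags outright) is the same function with the final `addr_in_region` call replaced by `return false`.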


fw-immunant · 2023-09-25T23:41:06Z

It looks like these writes are performed by put_user; grep the kernel for {set,clear}_child_tid, e.g.: https://elixir.bootlin.com/linux/latest/source/kernel/sched/core.c#L5314

Anyone know whether put_user respects pkeys? I think given that it's not doing a complex dance to circumvent the MMU like /proc/self/mem does (described here) that it likely does respect them.
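One way to probe this without involving clone() at all is to ask the kernel to copy data into a page whose pkey disables writes and see whether the syscall fails. The sketch below does that with read(2) on a pipe. Note the assumptions: read(2) goes through copy_to_user rather than put_user, so this is an analogous path and not the exact one used for the tid writes; `kernel_write_blocked_by_pkey` is a made-up helper name; and it relies on the glibc pkey_alloc/pkey_mprotect wrappers and on hardware with pkey support.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 1 if a kernel-side copy into a pkey write-disabled page failed
 * with EFAULT, 0 if the copy succeeded, and -1 if the probe could not be
 * set up (for example, no pkey support on this machine).
 */
static int kernel_write_blocked_by_pkey(void)
{
	int result = -1;
	int pkey = -1;
	int fds[2] = { -1, -1 };
	ssize_t n;
	long page = sysconf(_SC_PAGESIZE);

	if (page <= 0)
		return -1;

	char *buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* The page stays PROT_READ | PROT_WRITE; only the pkey denies writes. */
	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
	if (pkey < 0)
		goto out;
	if (pkey_mprotect(buf, (size_t)page, PROT_READ | PROT_WRITE, pkey) != 0)
		goto out;

	if (pipe(fds) != 0)
		goto out;
	if (write(fds[1], "x", 1) != 1)
		goto out;

	/* Ask the kernel to write one byte into the protected page. */
	n = read(fds[0], buf, 1);
	if (n == -1 && errno == EFAULT)
		result = 1;
	else if (n == 1)
		result = 0;

out:
	if (fds[0] >= 0)
		close(fds[0]);
	if (fds[1] >= 0)
		close(fds[1]);
	munmap(buf, (size_t)page);
	if (pkey >= 0)
		pkey_free(pkey);
	return result;
}
```

Because the mapping keeps its normal read/write page permissions, a result of 1 here would be attributable to the pkey rather than to the page protection bits.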

fw-immunant · 2024-01-11T21:02:06Z

I just wrote a test program (below):

#define _GNU_SOURCE

#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

void print_unix(char* s) {
	write(1, s, strlen(s));
}

static int thread_body(void* arg)
{
	print_unix("child thread ran\n");
	return 0;
}

#define PAGE_SIZE 4096
#define STACK_SIZE (PAGE_SIZE * 1024)	/* Stack size for cloned child */

/* circumvent pkeys for debugging */
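unsigned char read_proc_self_mem_byte(void* ptr)
{
	unsigned char byte = 0;
	int fd = open("/proc/self/mem", O_RDONLY);
	assert(fd >= 0);
	/* /proc/self/mem is addressed by virtual address */
	ssize_t n = pread(fd, &byte, 1, (off_t)(uintptr_t)ptr);
	assert(n == 1);
	/* close the fd so repeated calls do not leak descriptors */
	close(fd);
	return byte;
}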

int read_proc_self_mem_int(void* ptr) {
	char out[sizeof(int)];
	for (size_t i = 0; i < sizeof(out); i++) {
		out[i] = read_proc_self_mem_byte((char*)ptr+i);
	}
	int read = -1;
	memcpy(&read, &out, sizeof(int));
	return read;
}

int main(int argc, char** argv)
{
	/* allocate memory to protect with pkey */
	void *mem = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0);
	if (mem == MAP_FAILED)
		return 1;
	memset(mem, 0x5a, PAGE_SIZE);
	int pkey = pkey_alloc(0 /* reserved */, PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE);
	/* without a real pkey, pkey_mprotect(..., -1) would degrade to a plain mprotect() */
	if (pkey < 0)
		return 1;
	/* comment out this line to see the program fail its assertions */
	pkey_mprotect(mem, PAGE_SIZE, PROT_NONE, pkey);

	/* allocate thread stack */
	char* stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
	if (stack == MAP_FAILED)
		return 1;

	char* stack_top = stack + STACK_SIZE;

	int clone_flags = CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SYSVSEM
		| CLONE_SIGHAND | CLONE_THREAD
		| CLONE_SETTLS | CLONE_PARENT_SETTID
		| CLONE_CHILD_CLEARTID;

	int tid = 500;
	int* tid_clear_addr = (int*)mem;
	int* tid_addr = (int*)((char*)mem+8);
	char* tls = malloc(4096 * 64);

	pid_t pid = clone(thread_body, stack_top, clone_flags, argv[1], tid_addr, tls, tid_clear_addr);
	printf("clone() pid %d\n", (int)pid);
	if (pid < 0)
		return 1;

	usleep(1000);

	printf("*tid_clear_addr=%08x\n", read_proc_self_mem_int(tid_clear_addr));
	assert(read_proc_self_mem_int(tid_clear_addr) == 0x5a5a5a5a);
	printf("*tid_addr=%08x\n", read_proc_self_mem_int(tid_addr));
	assert(read_proc_self_mem_int(tid_addr) == 0x5a5a5a5a);

	return 0;
}

Looks like we're safe; these writes silently fail. Comment out the pkey_mprotect call and see the program fail its assertions due to the writes succeeding.

fw-immunant mentioned this issue Sep 25, 2023

Syscall prioritization #233

fw-immunant added security syscalls threads labels Sep 25, 2023

fw-immunant closed this as completed Jan 11, 2024
