Xous Process Creation

May 30, 2022

A fundamental feature of modern computing is the ability to run programs. In many embedded systems, the list of processes is fixed at startup. In other systems, including mobile phones and desktop computers, new processes can be spawned on a running system.

Until now, Xous has been unable to create new processes. Instead, it relied on the bootloader to create memory maps and load processes from disk into those newly-created memory maps. All the kernel had to do was import that mapping information from the bootloader and set up the main thread.

However, with the latest version of Xous, the RISC-V implementation of the CreateProcess syscall has landed, giving us the ability to create processes at runtime.

What is a Process?

A process consists of two fundamental pieces of data:

A memory space
At least one thread context

Each process has its own memory space, which prevents one process from crashing another by corrupting its program. No Core Wars going on here. Giving each process a unique memory space also lets every process use the same virtual addresses. Unlike most embedded platforms, where code is relocated at compile time or must be built as position-independent code, processes in Xous effectively have the entire 32-bit address space to themselves.

Thread contexts define one unit of execution within a process. A process with no threads cannot do any useful work. Contexts include the entire register state, allowing the operating system to switch a thread in and out of execution. As part of a context, the thread must have an area of memory dedicated to the stack, as well as an area of memory from which to execute code.

A thread context begins execution at the “entrypoint” and continues indefinitely.

What is a Memory Space?

A memory space is an isolated mapping of memory that is unique per-process. The mapping scheme is defined in the RISC-V Privileged Spec v1.10. This is a page-table approach, meaning the 32-bit address space is chopped up into about 1 million four-kilobyte pages. For efficiency, this is split into a two-tier page system, because the vast majority of the pages are unmapped and it doesn’t make sense to allocate 4 megabytes of table entries to describe the entire address space.
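The arithmetic behind the two-tier split can be sketched in a few lines of Rust. This is only an illustration of the numbers involved (a 32-bit address space, 4 KiB pages, 4-byte entries); the function names are made up for this post and are not part of the Xous kernel.

```rust
/// Sv32 basics: 32-bit virtual addresses, 4 KiB pages, 4-byte table entries.
const PAGE_SIZE: u64 = 4096;
const ENTRY_SIZE: u64 = 4;
const ADDRESS_SPACE: u64 = 1 << 32;

/// Number of 4 KiB pages needed to cover the whole 32-bit address space.
const fn total_pages() -> u64 {
    ADDRESS_SPACE / PAGE_SIZE
}

/// Bytes a single flat (one-tier) table would need to describe every page.
const fn flat_table_bytes() -> u64 {
    total_pages() * ENTRY_SIZE
}

/// Split a virtual address into (root index, leaf index, offset in page).
/// (Illustrative helper, not a kernel function.)
fn split_vaddr(vaddr: u32) -> (usize, usize, usize) {
    let root_index = (vaddr >> 22) as usize; // top 10 bits
    let leaf_index = ((vaddr >> 12) & 0x3ff) as usize; // middle 10 bits
    let offset = (vaddr & 0xfff) as usize; // low 12 bits
    (root_index, leaf_index, offset)
}
```

With two tiers, a process only pays 4096 bytes for the root table plus 4096 bytes for each four-megabyte region it actually uses.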

The “mapping” is stored in the CPU’s satp register as a truncated 22-bit address with some extra information tacked on. This address must point to a physical page of memory where the Level 1 page table is. This table has 1024 entries, each of which is four bytes, taking a total of 4096 bytes.
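As a rough sketch of that “extra information”, here is how the 32-bit satp value is laid out according to the privileged spec: one mode bit, a nine-bit address space ID, and the 22-bit physical page number of the root pagetable. The helper names below are invented for illustration and are not the kernel’s own.

```rust
/// Field layout of the RV32 `satp` register when Sv32 paging is used.
const SATP_MODE_SV32: u32 = 1 << 31; // bit 31: paging enabled
const SATP_ASID_SHIFT: u32 = 22; // bits 30..22: address space ID
const SATP_ASID_MASK: u32 = 0x1ff;
const SATP_PPN_MASK: u32 = 0x003f_ffff; // bits 21..0: root table page number

/// Build a `satp` value from an address space ID and the physical address
/// of a page-aligned root pagetable. (Hypothetical helper.)
fn make_satp(asid: u32, root_phys: u32) -> u32 {
    assert_eq!(root_phys & 0xfff, 0, "root pagetable must be page-aligned");
    let ppn = (root_phys >> 12) & SATP_PPN_MASK;
    SATP_MODE_SV32 | ((asid & SATP_ASID_MASK) << SATP_ASID_SHIFT) | ppn
}

/// Recover the physical address of the root pagetable from `satp`.
/// The result can be up to 34 bits wide, hence the `u64`.
fn satp_root_phys(satp: u32) -> u64 {
    ((satp & SATP_PPN_MASK) as u64) << 12
}
```

The address is “truncated” because the low 12 bits of a page-aligned address are always zero, so only the page number needs to be stored.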

Each entry in this Level 1 “root” page table can point to a Level 0 “leaf” page table, a four megabyte “Megapage”, or it can be invalid and point to nothing.

Each entry in a Level 0 “leaf” pagetable can point to a four-kilobyte page, or it can be invalid and point to nothing.

Additionally, each page table entry carries flags that mark the page as read-only or read-write, and as accessible to the “kernel” or to “userspace”.
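To make the three cases concrete, here is a small sketch that classifies a 32-bit page table entry using the flag bits from the privileged spec. The `Entry` enum and `decode_pte` function are illustrative names, not kernel code, and only a few of the flags are shown.

```rust
const PTE_V: u32 = 1 << 0; // valid
const PTE_R: u32 = 1 << 1; // readable
const PTE_W: u32 = 1 << 2; // writable
const PTE_X: u32 = 1 << 3; // executable
const PTE_U: u32 = 1 << 4; // accessible from userspace

/// What a single page table entry can describe. (Hypothetical type.)
#[derive(Debug, PartialEq)]
enum Entry {
    /// Valid bit clear: points to nothing.
    Invalid,
    /// Valid, but no R/W/X bits: points at the next-level pagetable.
    NextTable { phys: u64 },
    /// Valid with R/W/X bits: a mapped page. In the root table this is a
    /// four-megabyte megapage, in a leaf table a four-kilobyte page.
    Leaf { phys: u64, writable: bool, user: bool },
}

fn decode_pte(pte: u32) -> Entry {
    if (pte & PTE_V) == 0 {
        return Entry::Invalid;
    }
    // The physical page number lives in bits 31..10.
    let phys = ((pte >> 10) as u64) << 12;
    if (pte & (PTE_R | PTE_W | PTE_X)) == 0 {
        Entry::NextTable { phys }
    } else {
        Entry::Leaf {
            phys,
            writable: (pte & PTE_W) != 0,
            user: (pte & PTE_U) != 0,
        }
    }
}
```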

Normally the kernel takes care of setting up level 0 pagetables for us when we call MapMemory. If a program requests access to a new area of memory, the kernel will allocate the appropriate pages and construct the chain for us.

Bare Minimum Memory Space

The bare minimum memory space contains four pages of correctly-configured memory:

1. The root pagetable, pointed to by satp
2. A leaf pagetable covering the megapage at 0xff80_0000 where the root pagetable and thread context live
3. A leaf pagetable covering the megapage at 0xff40_0000 where the leaf pagetables reside, granting read/write access to pages (1), (2), and (3)
4. A shared entry in the final megapage of the root pagetable mapping the kernel into 0xffc0_0000

This final entry allows us to actually activate the new mapping without a crash, since changing mappings also changes the memory we’re executing from.

Note that for completeness we also map in the thread context.
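The addresses above fall directly out of the root pagetable indices, since each root entry covers four megabytes. The sketch below shows that relationship. The constant and function names are hypothetical, and `leaf_table_vaddr` assumes that the leaf pagetable for root entry N is visible at page N of the 0xff40_0000 megapage, which is my reading of the layout rather than something taken from the kernel source.

```rust
const MEGAPAGE_SIZE: u32 = 4 * 1024 * 1024;
const PAGE_SIZE: u32 = 4096;

// Root pagetable indices used by the minimum memory space.
const PAGES_L0_INDEX: u32 = 1021; // where the leaf pagetables are visible
const PROCESS_L0_INDEX: u32 = 1022; // root pagetable and thread context
const KERNEL_INDEX: u32 = 1023; // shared kernel mapping

/// First virtual address covered by a given root pagetable entry.
const fn megapage_base(root_index: u32) -> u32 {
    root_index * MEGAPAGE_SIZE
}

/// Virtual address at which the leaf pagetable for `root_index` would be
/// visible, ASSUMING leaf table N is mapped at page N of the pages_l0
/// megapage.
const fn leaf_table_vaddr(root_index: u32) -> u32 {
    megapage_base(PAGES_L0_INDEX) + root_index * PAGE_SIZE
}
```

Under that assumption, the root pagetable would appear at 0xff80_0000 and the thread context one page later, at 0xff80_1000.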

Creating a Memory Space

Creating a memory space poses a challenge. The system allocator only knows how to allocate memory inside of an existing process. In order to create a new process, we must manually build up a minimum set of pagetables within the parent process and then “detach” those pages.

There’s a neat little diagram lurking in the kernel source that describes the minimum memory mapping. I created this because I was having trouble keeping the mapping straight in my mind, and drawing diagrams always seems to help:

///                             +----------------+
///                             | Root Pagetable |
///                             |      root      |
///                             +----------------+
///                                      |
///                  +-------------------+----------------+
///                  |                   |                |
///               [1021]              [1022]           [1023]
///                  v                   v                v
///          +--------------+    +--------------+     +--------+
///          | Level 0/1021 |    | Level 0/1022 |     | Kernel |
///          |   pages_l0   |    |  process_l0  |     |        |
///          +--------------+    +--------------+     +--------+
///                  |                       |
///          +-------+---------+             +-------+---------------+
///          |                 |                     |               |
///       [1021]            [1022]                  [0]             [1]
///          v                 v                     v               v
///  +--------------+  +--------------+     +----------------+  +---------+
///  | Level 0/1021 |  | Level 0/1022 |     | Root Pagetable |  | Context |
///  +--------------+  +--------------+     +----------------+  +---------+

The Kernel entry is the easiest: All we have to do is copy entry 1023 from our mapping into the new one:

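// Read entry 1023 (the kernel megapage) from our own root pagetable...
let krn_pg1023_ptr = (PAGE_TABLE_ROOT_OFFSET as *const usize)
    .add(1023)
    .read_volatile();
// ...and copy it into the same slot of the new root pagetable.
root_temp_virt.add(1023).write_volatile(krn_pg1023_ptr);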

Even if we make no other changes, this will allow us to activate this memory space and continue execution within the kernel. However, since we have not yet mapped the pagetables themselves, this process will not be able to map any additional memory. To fix that, we need to grant the new process read/write access to the root pagetable at 0xff80_0000, as well as to the individual pagetables at 0xff40_0000.

Once we do that, we have a fully-functional memory space we can have the kernel switch into and continue setting up the process.
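Putting the pieces together, the following host-side model fills in the three pagetable pages the way the diagram describes. It is a sketch only: the real kernel works on raw pages with volatile writes, the struct and function names here are invented, the physical addresses are made up, and flags beyond valid/read/write are left out.

```rust
const PTE_V: u32 = 1 << 0; // valid
const PTE_R: u32 = 1 << 1; // readable
const PTE_W: u32 = 1 << 2; // writable

type Table = [u32; 1024];

/// Encode an entry from a page-aligned physical address and flags.
fn pte(phys: u32, flags: u32) -> u32 {
    ((phys >> 12) << 10) | flags | PTE_V
}

/// The three pagetable pages of a minimal memory space (hypothetical model).
struct MinimalSpace {
    root: Table,
    pages_l0: Table,
    process_l0: Table,
}

/// Made-up physical addresses of the four pages involved.
struct PhysPages {
    root: u32,
    pages_l0: u32,
    process_l0: u32,
    context: u32,
}

fn build_minimal_space(parent_root: &Table, phys: &PhysPages) -> MinimalSpace {
    let mut space = MinimalSpace {
        root: [0; 1024],
        pages_l0: [0; 1024],
        process_l0: [0; 1024],
    };
    // Kernel: share the parent's entry 1023 unchanged.
    space.root[1023] = parent_root[1023];
    // Entries with no R/W bits point at the next-level (leaf) pagetables.
    space.root[1021] = pte(phys.pages_l0, 0);
    space.root[1022] = pte(phys.process_l0, 0);
    // pages_l0 exposes the leaf pagetables themselves as read/write pages.
    space.pages_l0[1021] = pte(phys.pages_l0, PTE_R | PTE_W);
    space.pages_l0[1022] = pte(phys.process_l0, PTE_R | PTE_W);
    // process_l0 exposes the root pagetable and the thread context.
    space.process_l0[0] = pte(phys.root, PTE_R | PTE_W);
    space.process_l0[1] = pte(phys.context, PTE_R | PTE_W);
    space
}
```

Every other entry stays zero, which the hardware treats as invalid, so the new process starts out with nothing mapped except its own bookkeeping and the kernel.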

Creating a Thread Context

A thread context contains all the information necessary to run a thread. Processes must have at least one thread. That thread needs some code to run. And we’ll probably want other nice features like arguments and environment variables someday.

Like most other things in Xous, we solve these problems with the use of Messages.

First, the CreateProcess call includes a pointer to some memory to be Sent to the child process. Unlike normal MemoryMessages, the CreateProcess call not only includes the source pointer and length, but also the destination address. This allows the parent process to load the stub at a known, fixed address. Additionally, the CreateProcess call includes parameters for the size and location of the stack.

When the kernel creates the process it needs to fill in all of the register values. Most of these will be zero, but a few notable registers must be nonzero. The stack pointer, for example, needs to point at the end of the stack. Furthermore, the stack needs to actually be allocated.

The program counter register should point at the entrypoint, which is taken from the process args.

Finally, there are arguments to the entrypoint function. These are stored in the argument registers a0 through a7. Xous only sends four arguments, meaning a0 through a3 contain values, and the rest will be set to zero.
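As a sketch of what “filling in the register values” amounts to, here is a simplified model of an initial context. The `ThreadContext` layout and `initial_context` function are hypothetical and much simpler than the kernel’s real structure. The register numbers are the standard RISC-V ones: x2 is the stack pointer and x10 through x17 are a0 through a7.

```rust
const REG_SP: usize = 2; // x2: stack pointer
const REG_A0: usize = 10; // x10..x17: a0..a7

/// Simplified saved state: 32 integer registers plus the program counter.
/// (Hypothetical layout for illustration.)
#[derive(Debug, Clone, PartialEq)]
struct ThreadContext {
    registers: [u32; 32],
    pc: u32,
}

fn initial_context(
    entrypoint: u32,
    stack_base: u32,
    stack_size: u32,
    args: [u32; 4],
) -> ThreadContext {
    // Everything starts out as zero...
    let mut ctx = ThreadContext {
        registers: [0; 32],
        pc: entrypoint,
    };
    // ...except the stack pointer, which starts at the end (highest
    // address) of the stack region because the stack grows downward...
    ctx.registers[REG_SP] = stack_base + stack_size;
    // ...and a0..a3, which carry the four arguments. a4..a7 stay zero.
    ctx.registers[REG_A0..REG_A0 + 4].copy_from_slice(&args);
    ctx
}
```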

Everything is a Message

The CreateProcess call only supports sending a single .text section. This is far from sufficient even for the most basic of programs. What about other data sections such as .data or .bss? What about fancy things like C++ constructors or multiple memory regions? What about loading more complicated binary formats such as ELF or PE?

The Xous kernel doesn’t contain any logic to implement any sort of loader. All it knows how to do is to create a memory space and specify the initial program section. The kernel also has a trick up its sleeve: It creates a Server in the newly-created child process, connects it to the parent, and passes the Server ID as arguments to the entrypoint.

The idea is you create a simple stub program that contains an entrypoint that looks like:

#[no_mangle]
pub extern "C" fn init(a1: u32, a2: u32, a3: u32, a4: u32) -> ! {
    let server = xous::SID::from_u32(a1, a2, a3, a4);
    while let Ok(xous::Result::Message(envelope)) =
        xous::rsyscall(xous::SysCall::ReceiveMessage(server))
    {
        match envelope.id().into() {
            StartupCommand::WriteMemory => write_memory(envelope.body.memory_message()),
            StartupCommand::FinishStartup => finish_startup(server, envelope),
            StartupCommand::PingResponse => ping_response(envelope),

            _ => panic!("unsupported"),
        }
    }
    panic!("parent exited");
}

Then you write pages of memory into the target program by repeatedly sending the WriteMemory opcode. When you’re done, call FinishStartup to shut down the server and jump to the entrypoint.
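From the parent’s side, this boils down to chopping the program image into page-sized pieces and sending them one at a time. The sketch below models that plan on the host without any actual syscalls. The `LoaderMessage` type, its fields, and `plan_load` are assumptions made for illustration. In particular, whether FinishStartup carries the entrypoint address is a guess.

```rust
const PAGE_SIZE: usize = 4096;

/// Hypothetical model of one message sent by the parent to the stub.
#[derive(Debug, PartialEq)]
enum LoaderMessage<'a> {
    /// Copy `data` into the child at virtual address `dest`.
    WriteMemory { dest: usize, data: &'a [u8] },
    /// Shut down the startup server and jump to `entrypoint`.
    FinishStartup { entrypoint: usize },
}

/// Plan the sequence of messages needed to load `image` at `load_addr`.
fn plan_load<'a>(
    image: &'a [u8],
    load_addr: usize,
    entrypoint: usize,
) -> Vec<LoaderMessage<'a>> {
    // One WriteMemory per page; the final chunk may be shorter than a page.
    let mut messages: Vec<LoaderMessage<'a>> = image
        .chunks(PAGE_SIZE)
        .enumerate()
        .map(|(i, data)| LoaderMessage::WriteMemory {
            dest: load_addr + i * PAGE_SIZE,
            data,
        })
        .collect();
    // Once every page has been written, tell the stub to start the program.
    messages.push(LoaderMessage::FinishStartup { entrypoint });
    messages
}
```

A real loader would first parse its binary format of choice to decide which bytes go where, which is exactly the logic the kernel gets to leave out.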

By taking this approach the kernel can avoid needing to know anything about any sort of binary format. This will make it easier to develop loaders in the future, because the entire kernel will not need to be recompiled. It also keeps the amount of code in the kernel low.

Further Work

While we are now able to spawn processes, there is still more work to do. For starters, all newly-spawned processes are currently listed as being child processes of PID 1. The kernel only schedules children of PID 1, so they would otherwise never get scheduled. The idea here is to have a parent process use some of its quantum to schedule its own children, for example by having a thread that does nothing but loop through its child processes yielding execution to those processes. This is a simple change to make, which requires modifying existing opcodes.

A bigger issue is how thread descheduling works. Descheduling a thread immediately returns control to its parent, which as of right now is always PID 1. This has not been a problem since PID 1 is always able to be run, but what happens if we try to schedule a process that has no available threads? Currently, the system deadlocks.

These two issues are not insurmountable, and fixing them will make for a more interesting system in the future.

Conclusion

In the end, Xous has now gained the ability to launch new processes. This has been a great validation of the “Everything-is-a-message” approach, and has been an interesting dive into the depths of the kernel.

The Betrusted project, including the Xous operating system, is made possible thanks to financial assistance from NLNet and the NGI0 Privacy & Trust Enhancing Technologies Fund. Thank you to them for their support.

This work was funded by NLNet.