Writing a Wayland client from scratch in Zig

Lately I've been enjoying writing things from scratch. Writing a Wayland client from scratch has been the most frustrating and rewarding experience of my last two weeks (it may well have been longer; I've lost track). The number of new things I learned was more than worth all the frustration. It's part of my building a debugger in Zig series.

If you'd like to skip the explanation and see the code, you can find it here. It's a work in progress but it works. You'll also find links to all the resources that helped me in the comments at the top of the file.

What is Wayland?

It's a communication protocol for serving windows on displays on Linux, designed to be an improvement on X11. X11 was the original window system for my debugger, but I have since upgraded my OS, which now uses Wayland instead. You can render X11 windows in Wayland (I think), but I saw an opportunity to write my own client in Zig, just like the rest of the code, which makes things a lot easier for me.

Connecting to the Wayland server

Wayland uses a client-server model similar to X11's, although don't take my word for it as there are quite a few differences. We use a Unix socket to connect to the Wayland server. The file path for the socket is a combination of two environment variables: XDG_RUNTIME_DIR and WAYLAND_DISPLAY. Once you have these you can prepare the address and connect to the socket. This is how I did it:

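    // The socket lives at $XDG_RUNTIME_DIR/$WAYLAND_DISPLAY.
    const xdg_runtime = std.posix.getenv("XDG_RUNTIME_DIR") orelse return error.NoXdgRuntimeDir;
    const wayland_display = std.posix.getenv("WAYLAND_DISPLAY") orelse return error.NoWaylandDisplay;

    const socket_path = try std.fs.path.joinZ(allocator, &.{ xdg_runtime, wayland_display });
    defer allocator.free(socket_path);

    var addr = std.posix.sockaddr.un{ .path = undefined };
    // Leave room for the null terminator at the end of the path.
    if (socket_path.len >= addr.path.len) {
        std.log.err("Path len: {d} | Socket path len: {d}", .{ addr.path.len, socket_path.len });
        return error.PathTooLong;
    }

    // Start from an all-zero path so the copied string stays null-terminated.
    var path = [_]u8{0} ** addr.path.len;
    @memcpy(path[0..socket_path.len], socket_path);
    addr.path = path;

    const socket = try std.posix.socket(std.posix.AF.UNIX, std.posix.SOCK.STREAM, 0);
    try std.posix.connect(socket, @ptrCast(&addr), @sizeOf(@TypeOf(addr)));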

There are a few important things to keep in mind; I'm saying this because I forgot them one too many times. If you do something wrong, Wayland usually just disconnects the socket, but sometimes it doesn't, which was the root cause of all my frustrations. They were entirely my fault, as I had forgotten some important details from the documentation.

Requests & Events

Whenever you send a request to Wayland there are a few things to keep in mind. These vary based on the data type. The same rules apply to the events that are received from the server.

Data Types

The wire protocol is a stream of 32-bit values, encoded with the host's byte order (e.g. little-endian on my x86 machine). There are a few data types, but these are the ones I've needed so far, and they are enough to get a window up on your screen.

int, uint: both 32 bits, i32 and u32 in Zig
fd: sending file descriptors is not straightforward as they are process specific (one of the many things I learned the hard way), I'll explain this later
string: any time you send or receive a string it's preceded by the length of the string (including the null terminator), then comes the string itself, followed by padding up to a multiple of 32 bits
array: exactly like a string, except without the null terminator

These are the ones I needed to get a window on the screen.
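
The string rule is the easiest one to get wrong, so here is a minimal sketch of it. The helper names (encodedStringSize, encodeString) are made up for this example and are not from my client; it assumes a Zig version where std.mem.writeInt takes an endian argument.

```zig
const std = @import("std");
const native_endian = @import("builtin").cpu.arch.endian();

/// Bytes a string argument occupies on the wire: a u32 length, the string,
/// a null terminator, then padding up to a multiple of 4 bytes.
fn encodedStringSize(str: []const u8) usize {
    return 4 + std.mem.alignForward(usize, str.len + 1, 4);
}

/// Writes `str` as a string argument into `out` and returns the number of
/// bytes written. `out` must hold at least encodedStringSize(str) bytes.
fn encodeString(out: []u8, str: []const u8) usize {
    const total = encodedStringSize(str);
    std.debug.assert(out.len >= total);

    // The length prefix counts the null terminator but not the padding.
    std.mem.writeInt(u32, out[0..4], @intCast(str.len + 1), native_endian);
    @memcpy(out[4..][0..str.len], str);
    // Zero the terminator and the padding bytes in one go.
    @memset(out[4 + str.len .. total], 0);
    return total;
}
```

For "wl_shm" (6 characters) this produces 12 bytes: the length 7, the 6 characters, the terminator and one byte of padding.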

Sending/Receiving data to/from the server

Each request and event starts off with a header. The first field is the object_id, which usually lets Wayland know which interface we are talking about. You can find the entire protocol here. There is also a copy on your machine but I can't remember where you get it from. The next field is the opcode, which specifies the operation within that interface. This is just the index of the request in the protocol, counting from 0. Example: in the interface wl_display the opcode for the request get_registry is 1, as it's the second request. I'm not sure how I got this information but it is correct. The last field is the message_size, which includes the size of the header. This is my header type:

    const Header = packed struct {
        object_id: u32,
        // Note: opcode is the index of the request within the interface in the protocol xml
        op: u16,
        // Note: size includes size of header and message
        message_size: u16,
    };
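
On the receiving side the same header has to be read back before you know how long an event is. The sketch below is one way to split the first message off a buffer; readHeader, firstMessage and Message are hypothetical names for this example. It reads the second 32-bit word as a whole (opcode in the low 16 bits, size in the high 16 bits) so that it does not depend on the host's byte order.

```zig
const std = @import("std");
const native_endian = @import("builtin").cpu.arch.endian();

const Header = packed struct {
    object_id: u32,
    op: u16,
    message_size: u16,
};

const Message = struct {
    header: Header,
    payload: []const u8,
};

/// Reads the 8-byte header at the start of a message.
fn readHeader(bytes: *const [8]u8) Header {
    // Second word: opcode in the low 16 bits, size in the high 16 bits.
    const word = std.mem.readInt(u32, bytes[4..8], native_endian);
    return .{
        .object_id = std.mem.readInt(u32, bytes[0..4], native_endian),
        .op = @truncate(word),
        .message_size = @intCast(word >> 16),
    };
}

/// Returns the first complete message in `bytes`, or null when the buffer
/// does not hold a whole message yet.
fn firstMessage(bytes: []const u8) ?Message {
    if (bytes.len < 8) return null;
    const header = readHeader(bytes[0..8]);
    if (header.message_size < 8 or header.message_size > bytes.len) return null;
    return Message{ .header = header, .payload = bytes[8..header.message_size] };
}
```

A single read from the socket can contain several events back to back, so in practice you would call something like this in a loop, advancing by message_size each time.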

Errors

Some requests can make the server send back an error. Errors arrive as events, and events have their own opcodes, based on their index among the events of the interface in the protocol. Reading the error itself is quite straightforward. This is the error type:

    const Args = struct {
        object_id: u32,
        code: u32,
        message: []const u8,
    };
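
To make "quite straightforward" concrete, here is a rough sketch of reading those three fields out of the payload of an error event (the bytes after the header). parseError and ErrorArgs are names chosen for this example, and the single Malformed error is a simplification.

```zig
const std = @import("std");
const native_endian = @import("builtin").cpu.arch.endian();

const ErrorArgs = struct {
    object_id: u32,
    code: u32,
    message: []const u8,
};

/// Parses the payload of an error event. The returned message is a slice
/// into `payload`, so it is only valid while `payload` is.
fn parseError(payload: []const u8) error{Malformed}!ErrorArgs {
    // Two u32 fields plus the u32 length prefix of the string.
    if (payload.len < 12) return error.Malformed;
    const len = std.mem.readInt(u32, payload[8..12], native_endian);
    if (len == 0 or payload.len - 12 < len) return error.Malformed;
    return ErrorArgs{
        .object_id = std.mem.readInt(u32, payload[0..4], native_endian),
        .code = std.mem.readInt(u32, payload[4..8], native_endian),
        // Drop the null terminator that the length prefix counts.
        .message = payload[12..][0 .. len - 1],
    };
}
```

Printing the message is usually the quickest way to find out which request the server objected to.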

Sending the very first request

The very first request you send to the server is the get_registry request in the wl_display interface. It takes one argument of type new_id, which I set to 2, as object ID 1 is pre-allocated as the Wayland display singleton. Clients can assign IDs in the range [1, 0xFEFFFFFF]. I tried setting the new_id to 3 but the server did not like that.

The server can return an error if you messed something up here. If there are no errors, it'll send back the list of interfaces that are available on your server, one event per interface.
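
Put together, the whole request is only 12 bytes. This is a minimal sketch of building it by hand; getRegistryRequest and the constant names are assumptions for this example rather than the code from my client.

```zig
const std = @import("std");
const native_endian = @import("builtin").cpu.arch.endian();

const display_object_id: u32 = 1; // pre-allocated wl_display object
const get_registry_opcode: u32 = 1; // second request of wl_display
const message_size: u32 = 12; // 8-byte header + one u32 argument

/// Builds wl_display.get_registry with `registry_id` as the new_id.
fn getRegistryRequest(registry_id: u32) [message_size]u8 {
    var msg: [message_size]u8 = undefined;
    std.mem.writeInt(u32, msg[0..4], display_object_id, native_endian);
    // Size goes in the high 16 bits, opcode in the low 16 bits.
    std.mem.writeInt(u32, msg[4..8], (message_size << 16) | get_registry_opcode, native_endian);
    std.mem.writeInt(u32, msg[8..12], registry_id, native_endian);
    return msg;
}
```

Calling getRegistryRequest(2) gives the bytes to write to the socket; the id you pass is the object_id that the registry's events will carry afterwards.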

Binding objects

Before you can use an object you need to let the server know what ID you intend to use for its interface. So far I've only needed to bind three interfaces: wl_compositor, xdg_wm_base and wl_shm. You simply check whether the server returned these interface names and, in response, send a new_id back to bind it to the object. From then on, any time you call a request within one of these interfaces you'll need to send its bound object_id to the server.
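
The bind request has one quirk worth spelling out: because its new_id argument has no fixed interface in the protocol XML, it goes on the wire as the interface name (a string), the version, and only then the new id. Below is a sketch under that reading of the protocol; bindRequest and putU32 are hypothetical helpers, and opcode 0 for wl_registry.bind is taken from wayland.xml rather than from the text above.

```zig
const std = @import("std");
const native_endian = @import("builtin").cpu.arch.endian();

const registry_bind_opcode: u32 = 0; // first request of wl_registry

fn putU32(out: []u8, offset: *usize, value: u32) void {
    std.mem.writeInt(u32, out[offset.*..][0..4], value, native_endian);
    offset.* += 4;
}

/// Builds wl_registry.bind into `out` and returns the used part of it.
/// `name` is the number the server sent with the interface,
/// `new_id` is the id the client picks for the bound object.
fn bindRequest(
    out: []u8,
    registry_id: u32,
    name: u32,
    interface: []const u8,
    version: u32,
    new_id: u32,
) []const u8 {
    const padded = std.mem.alignForward(usize, interface.len + 1, 4);
    // header + name + string length + string + version + new_id
    const total = 8 + 4 + 4 + padded + 4 + 4;
    std.debug.assert(out.len >= total);
    @memset(out[0..total], 0);

    var offset: usize = 0;
    putU32(out, &offset, registry_id);
    putU32(out, &offset, (@as(u32, @intCast(total)) << 16) | registry_bind_opcode);
    putU32(out, &offset, name);
    putU32(out, &offset, @intCast(interface.len + 1));
    @memcpy(out[offset..][0..interface.len], interface);
    offset += padded; // terminator and padding are already zero
    putU32(out, &offset, version);
    putU32(out, &offset, new_id);
    return out[0..total];
}
```

The version should not be higher than the one the server advertised for that interface.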

Getting a window

By this point you'll have everything set up, and the rest is fairly easy, except for sending a file descriptor. We'll get to that shortly. You can read the details about the requests here, and there are multiple ways to do this depending on the window manager on your system too.

Here is the order of steps:

Create a surface
Get an xdg_surface for it (as recommended by the Wayland documentation)
Get a toplevel from the xdg_surface (this is where you'll see your pixels)
Set window geometry
Commit surface

This is the first commit we will be doing. After the commit you'll get a few events for each of the objects above. The main one that you need to respond to for the rest to work is the configure event. You need to acknowledge this event with an ack_configure request. If you don't get this event it means you've messed something up, so be prepared to go through the hex dump of your requests.
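
The acknowledgement simply echoes back the serial number that came with the configure event. Here is a sketch of that round trip; ackForConfigure is a made-up name, and the opcodes (configure is event 0 and ack_configure is request 4 of xdg_surface) are my reading of xdg-shell.xml, so double check them against your copy.

```zig
const std = @import("std");
const native_endian = @import("builtin").cpu.arch.endian();

const configure_event_opcode: u16 = 0; // xdg_surface event (assumed from xdg-shell.xml)
const ack_configure_opcode: u32 = 4; // xdg_surface request (assumed from xdg-shell.xml)
const message_size: u32 = 12; // 8-byte header + u32 serial

/// Returns the ack_configure request for a configure event on our
/// xdg_surface, or null when the event is something else.
fn ackForConfigure(
    xdg_surface_id: u32,
    event_object_id: u32,
    event_opcode: u16,
    payload: []const u8,
) ?[message_size]u8 {
    if (event_object_id != xdg_surface_id) return null;
    if (event_opcode != configure_event_opcode) return null;
    if (payload.len < 4) return null;

    // The event's only argument is the serial we have to send back.
    const serial = std.mem.readInt(u32, payload[0..4], native_endian);

    var msg: [message_size]u8 = undefined;
    std.mem.writeInt(u32, msg[0..4], xdg_surface_id, native_endian);
    std.mem.writeInt(u32, msg[4..8], (message_size << 16) | ack_configure_opcode, native_endian);
    std.mem.writeInt(u32, msg[8..12], serial, native_endian);
    return msg;
}
```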

Creating a shared memory file

After you have acknowledged this configure event, we can send the server the shared memory that it will draw onto the window. Using shared memory is really efficient as you avoid copying memory. In order to do that we need to create a shared memory file. This is quite easy: you simply create a new file in /dev/shm with some specific flags and permissions, which I blatantly copied from the C implementation of shm_open so that I did not need to import any C code into Zig. Once you have the file you can ftruncate it to twice the buffer size (one buffer is WindowHeight * Stride bytes) and then mmap it. Make sure to set the SHARED flag.

    const shm_fd = try std.posix.open(
        "/dev/shm/handmade.debugger",
        // Note: just doing what shm_open does here so that I don't need to use the c lib: https://codebrowser.dev/glibc/glibc/rt/shm_open.c.html#55
        .{ .ACCMODE = .RDWR, .NOFOLLOW = true, .CLOEXEC = true, .CREAT = true, .TRUNC = true },
        0o600,
    );

    try std.posix.ftruncate(shm_fd, SHMPoolSize);

    const pool_data = try std.posix.mmap(
        null,
        SHMPoolSize,
        std.posix.PROT.READ | std.posix.PROT.WRITE,
        .{ .TYPE = .SHARED },
        shm_fd,
        0,
    );
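
The pool size passed to ftruncate and mmap comes from simple arithmetic. As an illustration only, with a made-up 640x480 window and 4 bytes per pixel (these numbers are assumptions, not the values my client uses):

```zig
const std = @import("std");

// Hypothetical window dimensions, chosen just for this example.
const window_width: u32 = 640;
const window_height: u32 = 480;
const bytes_per_pixel: u32 = 4; // assuming a 32-bit pixel format

// Bytes in one row of pixels.
const stride: u32 = window_width * bytes_per_pixel;
// Bytes in one full frame.
const buffer_size: u32 = window_height * stride;
// Room for two frames in the pool.
const shm_pool_size: u32 = buffer_size * 2;
```

With these numbers the stride is 2560 bytes, one buffer is 1228800 bytes and the pool is 2457600 bytes.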

Now we can create a shared memory pool, which is slightly different from the other requests because of the file descriptor that we need to send, which, as I mentioned before, is process specific. I don't have the faintest clue how this works, but you need to create an io vector and a posix.msghdr_const, and then send the fd using posix.sendmsg. You'll need to find out why elsewhere, as I'm not fully certain how this actually works. You can find the method here.
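
For what it's worth, the general Unix mechanism behind this is fairly well documented: a file descriptor is just an index into a per-process table, so the number itself means nothing to the server. Instead, the descriptor travels as ancillary (control) data of type SCM_RIGHTS next to the normal message bytes, and the kernel installs a duplicate of it in the receiving process. The sketch below only builds that control message. The struct layout and the constant values are assumptions for 64-bit x86 Linux, and CmsgHdr, FdControlMessage and fdControlMessage are names invented for this example.

```zig
const std = @import("std");

// Assumed layout of C's `struct cmsghdr` on 64-bit Linux.
const CmsgHdr = extern struct {
    len: usize, // header size + payload size, without trailing padding
    level: i32,
    kind: i32, // called cmsg_type in C
};

// Assumed values on x86-64 Linux.
const SOL_SOCKET: i32 = 1;
const SCM_RIGHTS: i32 = 1;

/// A control message that carries exactly one file descriptor.
const FdControlMessage = extern struct {
    header: CmsgHdr,
    fd: i32,
    // Pads the message to a multiple of the header's alignment.
    padding: [4]u8 = [_]u8{0} ** 4,
};

fn fdControlMessage(fd: i32) FdControlMessage {
    return .{
        .header = .{
            .len = @sizeOf(CmsgHdr) + @sizeOf(i32),
            .level = SOL_SOCKET,
            .kind = SCM_RIGHTS,
        },
        .fd = fd,
    };
}
```

A pointer to this struct and its size would go into the control and controllen fields of the posix.msghdr_const, while the io vector points at the ordinary request bytes (header, new_id and pool size). The fd itself is not part of those bytes.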

Finally

That was the last hard bit. The rest is quite easy:

Create a wl_shm_pool buffer
Write some data to your buffer (output of mmap earlier), these are the pixels you'll see in your window later
Attach the buffer to the surface
Mark the surface as damaged. This lets the server know which part of the window has new data that is ready for drawing
Commit the surface
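
As an example of the "write some data" step, here is a small sketch that fills a buffer with one colour. It assumes the buffer was created with the XRGB8888 format, which stores each pixel as a little-endian 32-bit value; fillSolid is a hypothetical helper and `pixels` stands in for (part of) the slice returned by mmap.

```zig
const std = @import("std");

/// Fills a width x height XRGB8888 image with a single colour.
/// `stride` is the number of bytes per row, `xrgb` is 0x00RRGGBB.
fn fillSolid(pixels: []u8, width: usize, height: usize, stride: usize, xrgb: u32) void {
    for (0..height) |y| {
        // Each row starts `stride` bytes after the previous one.
        const row = pixels[y * stride ..][0 .. width * 4];
        for (0..width) |x| {
            std.mem.writeInt(u32, row[x * 4 ..][0..4], xrgb, .little);
        }
    }
}
```

For instance, fillSolid(pixels, width, height, stride, 0x00FF0000) should give you a solid red window once the surface is committed.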

That's it! If you have made it this far you should see a window on your screen. If not, don't worry: it took me a lot of attempts to get this working. You can refer to my code if you get stuck; just note that it will be updated in the coming days.