Lately I've been doing some network programming in C, just 'cause. I'm enjoying the language a lot more than I did 12 years ago. It's my new hobby language and, surprisingly, it's indirectly making me a better Go programmer.
Huge shout out to Low Level Learning & Beej's Guide to Network Programming, without which it would have taken me months to figure half these things out.
I got the idea for this post while using strace on some network code I was writing in C. You can run man strace to learn more about it. It's a powerful tool that lets you trace system calls, and it was the main tool I used to see what was happening when I ran my network code. You run strace by passing it the executable built from your code, so here you can run strace on the binary produced by go build *.go.
I've used the simplest server I could to understand what really happens under the hood when you call ListenAndServe. I copied the example from the standard library. Here it is, in case you want to follow along:
package main
import (
	"io"
	"log"
	"net/http"
)

func main() {
	helloHandler := func(w http.ResponseWriter, req *http.Request) {
		io.WriteString(w, "Hello, world!\n")
	}
	http.HandleFunc("/hello", helloHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
I'm using Go version go1.22.3, although it shouldn't matter here. We'll mostly just be running the executable with strace.
There is a lot going on here and I won't go through all of it, but the underlying principles are quite simple, and I'll use the simplest versions of the system calls to explain them. If you are curious, you can call man on any of the system calls you see in the strace output to get a much better understanding of what they do.
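Before dropping down to system calls, it may help to see ListenAndServe split into its two halves. In the standard library it boils down to net.Listen followed by Serve on the resulting listener. The sketch below does that by hand; startServer and fetchHello are hypothetical helper names, and the loopback address with port 0 (let the kernel pick a free port) is an assumption chosen so the example doesn't clash with anything already on :8080.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
)

// startServer opens a listening TCP socket on addr and hands it to the
// HTTP server, which is roughly what http.ListenAndServe does in one call.
func startServer(addr string) (net.Listener, error) {
	// The socket, bind, and listen system calls happen inside net.Listen.
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	mux := http.NewServeMux()
	mux.HandleFunc("/hello", func(w http.ResponseWriter, req *http.Request) {
		io.WriteString(w, "Hello, world!\n")
	})
	// The accept loop runs inside http.Serve.
	go http.Serve(ln, mux)
	return ln, nil
}

// fetchHello requests /hello from the listener's address and returns the body.
func fetchHello(ln net.Listener) (string, error) {
	resp, err := http.Get("http://" + ln.Addr().String() + "/hello")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	ln, err := startServer("127.0.0.1:0")
	if err != nil {
		fmt.Println("listen:", err)
		return
	}
	defer ln.Close()

	body, err := fetchHello(ln)
	if err != nil {
		fmt.Println("get:", err)
		return
	}
	fmt.Print(body)
}
```

Running this binary under strace should show the same family of calls discussed in the rest of the post, just with a different port number.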
System calls
At the system level, these are the most basic calls that need to happen to listen for and serve requests on an IP address and port:
- Get a Socket file descriptor (fd)
- Bind the socket fd to a port
- Listen & Accept
- Poll
- Close
These system calls give you access to the network functionality of any Unix-like OS; Windows is not Unix-like, but it exposes a similar socket API through Winsock. Once you call the functions above, the kernel takes over and does most of the work for you.
Very brief look at data encapsulation
A quick look at the layers of data encapsulation. The kernel does most of the work for you, but it's still good to know.
When your computer receives a packet (HTTP data wrapped in TCP, IP, and Ethernet headers):
- The hardware strips the Ethernet header
- The kernel strips the IP and TCP header
- The Go program receives the HTTP headers and data
There's a lot more to it than that, but the overview is enough for us to move on to what happens when you call ListenAndServe.
1. Socket fd
Everything in Unix is a file. So when you call the socket(...) function, it returns a file descriptor that lets you read and write to the socket just like a file. A return value of -1 indicates an error.
If you follow the ListenAndServe function down, you'll eventually reach the socket system call, and you'll see that it takes three parameters:
s, err := sysSocket(family, sotype, proto)
If you run man socket, the parameters are named socket(int domain, int type, int protocol).
i. Family or domain: you can specify AF_INET (IPv4), AF_INET6 (IPv6) or AF_UNSPEC (you don't care whether it's IPv4 or IPv6), where AF stands for address family. There are a lot more address families but I have no idea what they do.
ii. Sotype or just type: SOCK_STREAM for TCP, SOCK_DGRAM for UDP.
iii. Protocol: the particular protocol to be used with the socket. I honestly don't know exactly what it does, but I use the output of getaddrinfo() to map protocols. There is a file with a list of protocols; you can check them out in /etc/protocols.
You'll also notice a call to setsockopt, which lets you set certain options on the socket, such as being able to reuse the port right after the application has exited. That's the only thing I've used it for anyway, but it does more.
2. Bind to port
In the strace output there is a call to bind, which binds the socket to an address and port for your application. Bind has the following signature:
func Bind(fd int, sa Sockaddr) (err error)
Where fd is the socket fd from the previous step. Sockaddr holds mainly three pieces of information:
a. the address family
b. the IP address
c. the port in network byte order (see man htons, host to network short, for more details).
3. Listen & Accept
This is where you wait for incoming connections and handle them as your application needs. It is a two-step process:
Step 1: func Listen(sfd int, backlog int) (err error)
sfd is the socket file descriptor from the socket() call, and backlog is the number of connections allowed in the incoming queue. In the strace output it is 4096.
Step 2: func Accept(fd int) (nfd int, sa Sockaddr, err error)
Accept takes a connection from the queue and, well, accepts it. Notice that it returns another file descriptor, which is used only for this single connection. Any data you want to read from or write to this client goes through this file descriptor.
If you check man accept, you'll notice that the second argument is a sockaddr, an empty variable used to capture the address details of the incoming connection. In the Go standard library this is captured using var rsa RawSockaddrAny. You can find the method here.
4. Poll
If you run the accept function above, you'll notice that it blocks. This means the calling thread can only wait on one connection at a time. This is where poll comes in; it is an improved version of select and a lot easier to use. In the strace output you'll notice the use of epoll instead, which according to the man page scales better than poll.
Poll allows you to watch multiple file descriptors at once and tells you which ones are ready.
5. Close
Similar to regular files, you close the socket and connection file descriptors when you are done. You do this by calling close and passing the fd that you wish to close. On closing, the resources associated with the fd are freed.
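A tiny sketch of what closing means in practice, assuming Linux and the syscall package. useAfterClose is a made-up helper that closes a socket and then tries to use the same fd number again; in a small single-purpose program like this one the second call is expected to fail, typically with EBADF (bad file descriptor).

```go
package main

import (
	"fmt"
	"syscall"
)

// useAfterClose opens a socket, closes it, and then tries to query the
// closed fd. It returns the error from that last call.
func useAfterClose() error {
	fd, err := syscall.Socket(syscall.AF_INET, syscall.SOCK_STREAM, 0)
	if err != nil {
		return err
	}
	if err := syscall.Close(fd); err != nil {
		return err
	}
	// The fd number no longer refers to our socket.
	_, err = syscall.Getsockname(fd)
	return err
}

func main() {
	fmt.Println("after close:", useAfterClose())
}
```

In a larger program the kernel may hand the same number to the next file or socket that gets opened, which is one reason to stop using an fd as soon as it is closed.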
Finally
This is a simplified explanation of the system calls that are made to receive connections on a port. There are parts of the process that I don't fully understand yet, like the use of fcntl to manipulate the file descriptor.
This guide by Brian “Beej Jorgensen” Hall is one of the best detailed introductions to network programming I've ever read. If you find this stuff interesting, I'd highly recommend it. He uses C to explain the concepts, but you can find the equivalent code for Go in the standard library; the function names are nearly identical and so is the functionality.