It's been a common theme as I've been learning languages that each go to a lower level than the one before. For instance, when I learned Go and started writing it more often, I started to understand things like memory and pointers in a lot more depth. It was also thanks to this GigaChad of a paper. At the same time, I'm glad I started out with JavaScript: early on I wanted to be able to see things happening on the screen, and it helped me understand how I could make things like the websites that I myself use daily. It helped me get interested. That was not the case when I tried writing a lot of C six years before that. I simply could not understand how it related to things in the real world.
Similarly, as I have been writing Zig for the past six or seven months, my understanding of Go has been getting a lot deeper. The first thing you learn quite quickly is that strings in Go are just syntactic sugar over bytes. It sounds obvious, but I had to solve a lot of problems in Zig to really understand that. Two weeks ago I finally understood how interfaces might actually be implemented in Zig. I'll do my best to explain this with an example. I've been working on an interpreter again, but this time in Zig. Just for fun. That's when I stumbled on the discovery of how interfaces could be implemented in Go. I don't think this is how they are actually implemented, but I have a mental model that helps me understand them now.
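Going back to the point about strings for a moment, here is a minimal sketch of what "strings are bytes" looks like in practice. The helper names byteAt and counts are made up for this example; the only library call is utf8.RuneCountInString from the standard library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAt returns the raw byte at index i. Indexing a Go string
// yields a byte, not a character.
func byteAt(s string, i int) byte {
	return s[i]
}

// counts returns the number of bytes and the number of runes
// (Unicode code points) in s. The two differ for non-ASCII text.
func counts(s string) (nBytes int, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "h\u00e9llo" // "héllo": the é takes two bytes in UTF-8

	nBytes, nRunes := counts(s)
	fmt.Println(nBytes, nRunes) // 6 5

	fmt.Println(byteAt(s, 1)) // 195, the first byte of é

	// Converting to []byte copies out the underlying bytes.
	fmt.Println([]byte("go")) // [103 111]
}
```

len counts bytes, and indexing hands back a single byte, which is pretty much what you work with directly in Zig when a string is a []const u8.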
The interface
This might make more sense by following an actual example, so I'll take this one from the Interpreter book. At some point when you are writing an interpreter, you need to implement a parser. Don't worry, this post is not about parsers. It's an example I have readily available, that's all.
Before parsing tokens (like let, return, ; etc.), the node types of the AST need to be defined. To do this we separate the pieces of code into statements and expressions. This is what those interfaces would look like:
type Node interface {
	TokenLiteral() string
}

type Statement interface {
	Node
	statementNode()
}

type Expression interface {
	Node
	expressionNode()
}
If you remember, in Go you implement an interface by having methods with the same signatures as the ones defined in the interface. For example, a LetStatement would look something like this:
// Ex: let x = 5; let -> token | x -> identifier | value -> 5
type LetStatement struct {
	Token token.Token
	Name  *Identifier
	Value Expression // anoth
}
func (ls *LetStatement) statementNode() {}
func (ls *LetStatement) TokenLiteral() string { return ls.Token.Literal }
You don't need to know what all the types are, but you get the idea. LetStatement implements Statement as it has the methods statementNode and TokenLiteral defined on it. It's quite straightforward and something you do in Go quite often. I've written and implemented countless interfaces in Go, but when I had to do the same in Zig, I had no idea how to actually implement it. This blog post was extremely useful, along with looking at the source code for the implementation of the mem.Allocator interface in Zig.
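Staying on the Go side for one more moment, here is a small self-contained sketch of the "same method signatures" rule. The Token struct is a simplified stand-in for token.Token, and literalOf is a helper invented for this example; neither comes from the book's code.

```go
package main

import "fmt"

// Token is a simplified stand-in for token.Token (an assumption
// made so the example compiles on its own).
type Token struct {
	Type    string
	Literal string
}

type Node interface {
	TokenLiteral() string
}

type Statement interface {
	Node
	statementNode()
}

// LetStatement is trimmed down to the one field this example needs.
type LetStatement struct {
	Token Token
}

func (ls *LetStatement) statementNode()       {}
func (ls *LetStatement) TokenLiteral() string { return ls.Token.Literal }

// Compile-time check: this line fails to build if *LetStatement
// ever stops satisfying Statement.
var _ Statement = (*LetStatement)(nil)

// literalOf is a hypothetical helper that accepts any Node.
func literalOf(n Node) string {
	return n.TokenLiteral()
}

func main() {
	stmt := &LetStatement{Token: Token{Type: "LET", Literal: "let"}}
	fmt.Println(literalOf(stmt)) // let
}
```

Nothing in LetStatement mentions Statement or Node. The compiler works out the relationship from the method set alone, which is exactly the part that has to be spelled out by hand in Zig.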
Turns out there are a few ways to do this in Zig. I tried all of them but found tagged unions to be the easiest. To get the same behavior in Zig, you need to let the compiler know which method to call when it encounters ?.TokenLiteral(). Let's look at what this looks like. This is my Zig implementation of the above:
pub const Node = union(enum) {
    statement: Statement,
    expression: Expression,
    program: Program,

    // Forward the call to whichever variant is currently active.
    fn token_literal(self: Node) []const u8 {
        switch (self) {
            inline else => |impl| return impl.token_literal(),
        }
    }
};

pub const Statement = union(enum) {
    let_statement: LetStatement,
    return_statement: ReturnStatement,

    fn token_literal(self: Statement) []const u8 {
        switch (self) {
            inline else => |impl| return impl.token_literal(),
        }
    }
};

pub const Expression = union(enum) {
    identifier: Identifier,

    fn token_literal(self: Expression) []const u8 {
        switch (self) {
            inline else => |impl| return impl.token_literal(),
        }
    }
};

pub const LetStatement = struct {
    token: token.Token,
    name: Identifier,
    value: Expression,

    pub fn token_literal(self: LetStatement) []const u8 {
        return self.token.literal;
    }
};
As you might notice, both Node and Statement have a few types registered as variants. This lets me call token_literal on a Node or a Statement and have the call forwarded to whichever type is currently stored in the union. The switch statement resolves the active variant that token_literal was called on. inline else works like a catch-all: the compiler generates one prong per variant, so impl has the concrete type in each of them. As I've named all the methods the same, it just saves me some typing. If I did not want to use that, I would explicitly call the method for each variant of Node.
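To connect this back to Go, here is a rough sketch of the same tagged-union dispatch written by hand in Go. To be clear, this is only my mental model spelled out, not a description of how the Go runtime implements interfaces. All the names here (statementKind, kindLet, kindReturn and the trimmed-down structs) are invented for the example.

```go
package main

import "fmt"

// Trimmed-down statement types, each with its own tokenLiteral method.
type LetStatement struct{ Literal string }
type ReturnStatement struct{ Literal string }

func (ls LetStatement) tokenLiteral() string    { return ls.Literal }
func (rs ReturnStatement) tokenLiteral() string { return rs.Literal }

// statementKind plays the role of the enum tag in Zig's union(enum).
type statementKind int

const (
	kindLet statementKind = iota
	kindReturn
)

// Statement is a hand-rolled tagged union: kind says which field is
// the active one. Go has no real unions, so both fields always exist.
type Statement struct {
	kind statementKind
	let  LetStatement
	ret  ReturnStatement
}

// tokenLiteral dispatches on the tag, one case per variant. This is
// the explicit version of what `inline else` writes for me in Zig.
func (s Statement) tokenLiteral() string {
	switch s.kind {
	case kindLet:
		return s.let.tokenLiteral()
	case kindReturn:
		return s.ret.tokenLiteral()
	default:
		panic("unknown statement kind")
	}
}

func main() {
	stmts := []Statement{
		{kind: kindLet, let: LetStatement{Literal: "let"}},
		{kind: kindReturn, ret: ReturnStatement{Literal: "return"}},
	}
	for _, s := range stmts {
		fmt.Println(s.tokenLiteral())
	}
}
```

Every new statement type means another constant, another field and another case. That repetition is what a Go interface hides from me, and what inline else hides in the Zig version.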
I don't know about you but I absolutely love such moments when I get to see under the hood of something so unassuming. Something I've been using for so long but never peeked under the hood of. I'm not sure my explanation was clear enough but hopefully it makes some sense.