I’m in cognitive overload. At work I’ve been solo on a project that’s big enough that every time I turn to look at a different piece of it, the last thing I worked on gets swapped out of my memory and I have to relearn the new current thing.
At the turn of the month we drove a total of 28 hours over 3 days to Oklahoma and back for a wedding. With four childrens. I’m ready for more! Almost recovered. This was massively disruptive as everything I was thinking about with regard to evalrus got swapped out of my brain completely, including enthusiasm for it.
I now have my very own not-for-sale-outside-the-Indian-subcontinent copy of The Dragon Book. It’s a lot more accessible than I remember! I often think I’d get a lot more out of my CS degree now than I did 18 years ago.
And now, some thoughts on managed memory and Rust semantics.
If anybody cared to look at memory.rs
they might have considered that
Ptr<T>
references memory in an Arena
but the lifetime of Ptr<T>
is
not limited to the lifetime of the Arena
it is connected to. I’ve let
some possible use-after-free unsafety leak out.
I thought about this, and tried adding an explicit lifetime to Ptr<T>
, and
thought about it some more. These lifetimes are viral and start cluttering
everything up. I don’t like it, yet it would be the right thing to do.
I’m not going to do it. Here’s why:
If I tie a Ptr<T>
to the lifetime of an Arena
, the compiler can
reasonably assume that a borrow of a Ptr<T>
can last the lifetime of
the Arena
.
If I want to implement a copying collector, an object that is moved has an unpredictable lifetime from Rust’s point of view. The object continues to exist but any references to it would be invalid pointers.
If I implement a copying collector, I don’t want to be able
to take long term references to Ptr<T>
s anywhere - I have to be able to identify
every Ptr<T>
and update it to point to the new object location after
it has been moved.
It seems to me that there’s something of a fundamental incompatibility between lifetimes and runtime garbage collection, especially if objects can be relocated. I don’t know what the answer is, if any. A compromise that leaks unsafety under specific circumstances may be the best outcome.
Part of my problem here is that I still don’t fully grasp the power of lifetimes and Rust’s type system. I come from a C/C++ and Python background so I’m used to unsafety. Creating safe abstractions is still a new challenge.
Here’s a light memory management problem that had my brain tied in pretzels for a bit.
A symbol has a name represented by a string, but should be refered to in the interpreter by an address for simplicity and performance sake. Each symbol should be unique, there shouldn’t be a duplicate of any, so comparing any two symbols of the same name should, under the hood, compare their pointers to find equality.
The simple problem in my code is that a Symbol
should be stored in an
Arena
- runtime managed memory. But where should it’s str
representation
live? Additionally, I need to map str
s to Symbol
s bidirectionally. That
suggests a HashMap
but a HashMap
is entirely Rust-managed.
I finally arrived at a solution. There are probably others, possibly better ones.
where a Symbol
holds a copy of the raw &str
fat pointer representation of the
String
key.
The entire impl
of Symbol
is
SymbolMap
s implementation is also simple:
As the comments say, I decided to make name lookups the fast path and creating new symbols the slow path.
Symbols are helpfully immutable - SymbolMap
doesn’t allow modifying a Symbol
name after it has been created. This means that the internal pointer and size
of the name won’t ever change and we can safely take copies of them for
the Symbol
type. So long as the HashMap
outlives any Arena
s containing
Symbol
s we should be ok. Enforcing that relationship at compile time?
Your suggestions most desirous.
At least now the RPL prints out symbol names correctly and that is very good!
If you look through the source code, you’ll see that I abstracted the Arena
interface out into an Allocator
trait. This will make it easier to refactor
memory management down the road.
Not sure. Still trying to regain momentum.