Ok, I've been reading this (cs.bgu.ac.il/~comp151/wiki.fil) and I think I'm starting to get it. So I have my rules, but I want them to not have any regex-like parts to them. That's ok, because LR doesn't care about recursion.

But after trying to write out my own explanation to confirm I understand it I'm just confusing myself rip. MOAR READING. I also need to look up "clojure" in a dictionary...

I'm determined to understand this stuff eventually.

Ohh ok, so it's a state machine, and each state holds a set of items (rule + location within it). You start at the start, then look at the next token and move to the state containing all the items that match gaining that token. When you get to an end, you reduce the stack of tokens you've built up into the rule they match and then use the result as the first action when starting again.

So what ends up happening is you kind of bunch up the list of tokens into a tree until there's the one root node.

Show thread

Going to make a test project for this and see if I can evolve it enough for me to just drop it into the O project and stuff my hideous AST spec into it.

Show thread

Current progress: Have a tiny AST spec with 4 rules. I build the rule list, then generate all the items from the rules, and I've just generated state 0. Next will be to write the bit that'll build up all the next states. I'm expecting the tricky bit will be to detect duplicates and re-route to them. Hopefully that'll be as easy as comparing the items.

Show thread

Another update: I've got it generating the states and connecting them up.
The issue: I seem to be getting some duplicate options e.g. if the result of state 1 is an expression then it can go to either state 5 or 3, but they're both different.

So next I need to add an "end" item so mitigate that.

Show thread

Ok, so now I'm noticing that every rule in this document has a concrete symbol at the end, rather than just another rule. However, I can't tell if that's a requirement from the docs (still reading) so I either need to figure out if it is (and if so, panic), or figure out what logic is avoiding them.

Show thread

Oh fuck ok if I put in the same rules as in the example I get an infinite loop...

Show thread

Aaaaaahhh right, so what's wrong with the program at the moment is that it generates options per item but doesn't consider that there will inevitably be duplicate options, so it generates the results for an option, then considers the duplicate separately.

So I need to remove duplicates when generating options, and fix how it's finding what items correspond to each option.

Show thread

Hoooollly shiiiiit I feel like a fucking genius now.

The state numbers don't match, but I've otherwise got my program generating a duplicate state graph to the example!

Now I'm going to test non-static rule ends to see if anything happens there.

Show thread

Oh this is starting to get better and better. So there do indeed end up states where some items are terminal and others aren't, but those states would default to a reduce action rather than an error, because in these states, not having a valid next option is fine.

I just need to make sure to catch the edge case in which multiple rules complete in the same state, but that should theoretically never happen. If it does though, I can have it take the longest match I guess.

Show thread

Next up:
- More tea
- Put this in the
- Start coming up with the parsing bit rather than the generating bit.

Sign in to participate in the conversation
Mastodon for Tech Folks

The social network of the future: No ads, no corporate surveillance, ethical design, and decentralization! Own your data with Mastodon!