Follow

Retro computing programming language stuff 

A fun project I've done some thought on is a small JavaScript-like REPL for the Atari 800, like interacting with old BASIC or LOGO.

The 6502 CPU and its surrounding machine are very limited to modern eyes :) but I think it's possible to make a workable toy implementation.

RAM constraints are key, as is the lack of { curly braces } in the ATASCII character set. ;)

Retro computing programming language stuff 

Still just collecting notes on this one. I expect runtime performance to suffer in favor of RAM usage when it comes to data structures.

Retro computing programming language stuff 

classic Atari LOGO implements garbage collected list, number, and symbol/string types as well as bytecode functions, which is not too far off from the base needs of something on the rough level of ES3.

IIRC LOGO implemented lists in the classic LISP way (linked lists); adding mutable arrays & hashmaps seems doable.

it's a lot of memory indirection, which is *slow* on the 6502, but what you gonna do :D

Retro computing programming language stuff 

anyway, I promised myself that when Portland Retro Gaming Expo came back I'd buy a vintage Atari 800 or 800 XL ;)

it's on for October:
retrogamingexpo.com/

we'll see what the covid/masking situation looks like by then wooo *urgh*

Retro computing programming language stuff 

(it's a nice large well ventilated space though. woo oregon convention center)

Retro computing programming language stuff 

the reference manual for atari logo describes the memory layout, even!

archive.org/details/AtariLOGOR

complex memory objects like "floating point numbers" or "symbols/strings" are split up over linked lists of fixed-size nodes :D

bad thing: need for lots of pointers wastes total available memory

good thing: no worries about fragmentation after GC

Retro computing programming language stuff 

fixed-size nodes are great for LISP-y linked lists, but the lack of constant-time indexing for strings and arrays is bad for a JS-like language.

i'm considering variable-sized allocations with a simple compacting GC, which should be straightforward to implement.

making the GC smarter could perform better but at greater code size. maybe a simple separation between fixed-size (number, object struct) and variable-size allocations (string, hashmap, array)?

Retro computing programming language stuff 

I looked into a compacting GC for a variable-sized heap, and while it sounds doable it feels a) really fiddly and b) like it's gonna bloat my code size, and I don't have a lotta bytes to spare ;)

Instead of spending bytes on large allocations to expand property lists and arrays into, then compacting stuff, maybe just stay simple and use fixed-size nodes and binary trees...

Retro computing programming language stuff 

Preserving original property insertion order for enumeration *and* having a hash-based semi-random-access lookup feels ugly though.

Either I have to give up the ordering property, or revise my data structure ideally without doubling its size. ;)

Retro computing programming language stuff 

A list of property descriptors (in insertion order for enumeration) and a binary tree (in key hash order) probably isn't _too_ awful; I don't have to repeat the property descriptors, just the pointers to them.

Retro computing programming language stuff 

(This stuff is kinda fun both to relive the limited computers of my childhood *and* to explore the low-level data structures I didn't get as much detailed introduction to in my self-taught programming career compared to what you'd get in a uni CS program)

Retro computing programming language stuff 

In the node-based model, a 2-property object {x:160, y:120} takes 35 bytes, plus the storage for the actual strings and numbers.

5 * (2 + log2(n) + 2n) bytes

Hash collisions cost extra nodes.

obj - props - proptree - propdesc(1) - string "x"
\ \ \ number 160
\ \ propdesc(2) - string "y"
\ \ number 120
\ list - propdesc(1)
\ list - propdesc(2)

Retro computing programming language stuff 

Using variable length structures would be only 20 bytes, but when adding a third property, the hashmap & list would have to be resized and the first two props copied over. If adjacent RAM isn't available, that leaves a 15-byte hole in memory which might have to be compacted later.

Retro computing programming language stuff 

can shave off some more nodes here:
* list doesn't need a list node at the tail, can save 5 bytes
* tree isn't needed if only 1 or 2 props
* list isn't needed if only 1 prop

Retro computing programming language stuff 

im in ur lisp nodes, making javascript objects

Retro computing programming language stuff 

Alas, following pointers is a tedious process on the 6502: 16-bit addresses require two separate 8-bit reads, and the dearth of registers means you have to copy the address into the zero page (lowest 256 bytes of memory, thus addressable with an 8-bit pointer). Then you can use the indirect indexed addressing mode to load bytes out of the referenced object.

Roughly the zero-page area gets used kinda like a large, slow register file. ;)

Retro computing programming language stuff 

pointers being bigger than the word size suuuuuuuuuucks

Retro computing programming language stuff 

ooh i just got an evil idea

instead of a 5-byte node with a GC marker bit and 7-bit type tag, have 4-byte nodes; move the GC bits to a bitfield keyed off the pointer, and store a 2-bit context-specific type tag *in the pointers*

JS values could distinguish between primitive, number, string, and object/function.

Object internals would have propdesc, proplist, proptree, and propmeta ([tree,list] pair).

might work?

Retro computing programming language stuff 

IMO this is too clever for its own good. ;) Still gonna explore it a bit but leaning towards a 5-byte node.

I can also simplify the GC by using two more bits of the tag byte to indicate whether bytes 2/3 & 4/5 are pointers, so it doesn't have to grok all the types.

I can then make a sort of tiered bitfield out of the rest of the type tags, so eg 'object' and 'function' can be both distinguished easily (for calls) and identified together easily (for prop access)

Retro computing programming language stuff 

so first: i can use a free list in the free blocks to allocate new nodes very fast

second: if i take a second bit for GC state to do tri-color marking, i can do the mark phase in chunks, without needing a large stack for worst-case traversals

this may be beneficial for interactive performance :D

however it leaves me with only 4 bits for the type tag, which is tight, or drop the ref tag bits and distinguish pointers some other way (or safely do imprecise gc)

Retro computing programming language stuff 

only the sweep needs to pause the world, doing a full walk of the heap (load one byte per node, check the top bits) then for each if it's newly freed add it to the free list by saving the old free position and resetting the type/gc tag

In a pinch, the free list could be supplemented in chunks, but the type tag must be resaved as free space.

Retro computing programming language stuff 

first draft of a sweep routine comes out to 1,030 cycles per page (256 bytes, with 51 5-byte nodes), or about 66,000 cycles for a 16 KiB heap.

That should be about 27 ms on the Atari (1.79 MHz for NTSC), which is a touch under two 60 Hz video frames.

Not bad, but maybe improvable.

Retro computing programming language stuff 

untested 6502 assembly for the sweep func :D

gist.github.com/brion/95e8b78c

Retro computing programming language stuff 

whoops got division wrong lol

currently at 37ms for 16 KiB heap

Retro computing programming language stuff 

down to about 28 ms with an unrolled loop and self-modifying code to shave a few cycles off the hot zone ;)

gist.github.com/brion/95e8b78c

Retro computing programming language stuff 

agh i left some of that in inconsistent state, i don't think the loop works quite right. :D

i'll fiddle with it later, it's late. ;)

night night all

Show newer

Retro computing programming language stuff 

@brion Hoping you can find a decent unit to bring home.

Also I remember the C compilers on the Atari 8bit used $( and $) for the curly braces, but that was less-than-ideal.

Retro computing programming language stuff 

@craigmaloney I kinda want to make a custom font (you can swap out the character bitmaps) but that doesn't help the keyboard ;)

Retro computing programming language stuff 

@brion Heh, do a $) and have it auto-correct to }. That would be ... something.

Retro computing programming language stuff 

@craigmaloney @brion Trigraphs!

Retro computing programming language stuff 

@mansr @craigmaloney ooh that reminds me -- tilde is also missing in ATASCII :D

Retro computing programming language stuff 

@brion PHP says Hi.

Retro computing programming language stuff 

@saramg luv 2 use associative arrays :D

Retro computing programming language stuff 

@brion distinct late 90s vibe to that!

Retro computing programming language stuff 

@brion cut my teeth on 6502 assembler in the early 80s. Very fond memories.

Retro computing programming language stuff 

@brion also, you don’t have to use zero page to index off a pointer, but you might find the other way … upsetting.

Retro computing programming language stuff 

@GoatSarah i sincerely hope this involves self-modifying code ;)

Retro computing programming language stuff 

@brion That's exactly what it involves!

We didn't have a lot of RAM in a VIC-20, and it wasn't very fast. We made do with what we had.

Retro computing programming language stuff 

@brion a while ago, i wrote a GC with the mark bits pulled out, inspired by the one in micropython... here's the design in case it gives you more ideas: github.com/robey/mwgc#how-it-w

downside: bit manipulation on the 6502 is LSR LSR LSR hell

my other loose idea (not implemented) was to have a strict layout for objects: all pointers up front, no guessing...for example, the first byte of an object could be 2 bits for GC, 3 for (scaled) len, 3 for pointer count

Retro computing programming language stuff 

@robey thanks, that sent me down some further paths that might let me do incremental garbage collection instead of what I expect will be second-long pauses from a 'stop-the-world' mark-and-sweep :D

Show newer

Retro computing programming language stuff 

@brion Have you thought about CDR coding at all? That would save you following pointers at least some the time, as well as the space for the pointers. It was originally used for linked lists, but it should be possible to do similar for a tree, though the coding would need to be more complex.

Retro computing programming language stuff 

@freakazoid hmm, possibly interesting. i guess i'd want to pre-allocate blocks expecting expansion (which can be potentially freed in a GC if needed elsewhere).

worth considering!

Retro computing programming language stuff 

@brion i'm scratching my head at this... are they saying each node is 3 bytes of overhead and 2 bytes of data?

Retro computing programming language stuff 

@robey in worst case yes! i believe it breaks down to:

1 byte for type tag & a GC marker bit
2 bytes data payload
2 bytes tail pointer (or if type cannot have a tail, additional 2 bytes of payload)

so atari floating-point numbers are 6 bytes, split over two node types: one with the 2-byte exponent and a pointer to the tail, and one with the 4-byte mantissa

this takes 10 bytes for 2 nodes, so 40% used on administrative overhead and 60% on payload :D

Retro computing programming language stuff 

@brion wow yeah there are so many possibilities for optimizing that! :D

Retro computing programming language stuff 

@brion Fixed size nodes sounds like PicoLisp. Having only one allocation size makes GC a lot easier.

Retro computing programming language stuff 

@freakazoid oh yeah i was dreading the GC times i was gonna get doing a compacting collector at 1.79 MHz ;)

@brion @nGFX has a similar project for this for Commodore/VIC: genesis64.ngfx.de/

@vwbusguy @nGFX that's cool too! I'm hoping to run on the vintage hardware tho :)

@vwbusguy @nGFX call me a superscalar, cause i gotta keep all my pipelines full with silly project ideas ;)

@brion @nGFX one of my goal projects that I haven't done yet is writing a web server in QBasic (via #QB64 ), then containerizing it and porting my blogs to it.

@vwbusguy @nGFX nice :D i had a lot of good times with QBasic back in the day!

@vwbusguy @nGFX hmm, I should see if some combination of these things can get some of my old QuickBASIC and QBasic programs online without manually porting to JS ;)

*adds to the pipeline*

@brion @nGFX It should almost certainly work with #QB64. If not, you might also check out FreeBASIC, but if it's QuickBASIC/QBasic, you'll likely have the best luck with qb64.

@brion @nGFX qb64 also has a project called qbjs that converts #QBasic to #Javascript in browser, but it's still pretty early in development.

Sign in to participate in the conversation
Mastodon for Tech Folks

This Mastodon instance is for people interested in technology. Discussions aren't limited to technology, because tech folks shouldn't be limited to technology either!