Specification for the Funcy Virtual Machine (FVM), a stack-based bytecode
interpreter for Funcy. Format version `2`.
Copyright © 2022-2023 Chris Roberts (Krobbizoid).
- Architecture
- Data Types
- Bytecode Files
- Execution
- Opcodes
  - HALT (`0x00`)
  - NO_OPERATION (`0x01`)
  - JUMP (`0x02`)
  - JUMP_NOT_ZERO (`0x03`)
  - JUMP_ZERO (`0x04`)
  - CALL (`0x05`)
  - RETURN (`0x06`)
  - DROP (`0x07`)
  - DUPLICATE (`0x08`)
  - PUSH_U8 (`0x09`)
  - PUSH_S8 (`0x0a`)
  - PUSH_U16 (`0x0b`)
  - PUSH_S16 (`0x0c`)
  - PUSH_U32 (`0x0d`)
  - PUSH_S32 (`0x0e`)
  - LOAD_LOCAL (`0x0f`)
  - STORE_LOCAL (`0x10`)
  - UNARY_DEREFERENCE (`0x11`)
  - UNARY_NEGATE (`0x12`)
  - UNARY_NOT (`0x13`)
  - BINARY_ADD (`0x14`)
  - BINARY_SUBTRACT (`0x15`)
  - BINARY_MULTIPLY (`0x16`)
  - BINARY_DIVIDE (`0x17`)
  - BINARY_MODULO (`0x18`)
  - BINARY_EQUALS (`0x19`)
  - BINARY_NOT_EQUALS (`0x1a`)
  - BINARY_GREATER (`0x1b`)
  - BINARY_GREATER_EQUALS (`0x1c`)
  - BINARY_LESS (`0x1d`)
  - BINARY_LESS_EQUALS (`0x1e`)
  - BINARY_AND (`0x1f`)
  - BINARY_OR (`0x20`)
  - PUT_CHR (`0x21`)

The FVM uses several regions of memory to execute FVM bytecode:

- Program memory, or `pm`, is a read-only array of bytes that stores FVM bytecode and program data. The size of `pm` may be variable, but it should be at least the size of the currently executed program. The program must originate at index `0`.
- The instruction pointer, or `ip`, is an integer that stores an index of `pm`. It is used for fetching FVM opcodes and program data.
- Stack memory, or `sm`, is a stack of signed integer words that stores locals and the inputs and outputs of most operations. The size of a stack word is undefined, but 32 bits is recommended. There is no defined limit for the size of `sm`. The index of the top word of `sm` increases as words are pushed to it.
- The frame pointer, or `fp`, is an integer that stores an index of `sm` and defines the base of the current call frame. `fp` stores the index of a word in `sm`, not a byte. Offsets from `fp` in `sm` are used to access locals.
- The execution flag, or `ef`, is a boolean value that stores whether the FVM is executing a program.
- The exit code, or `ec`, is an integer that stores an exit code for when execution stops.
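
The memory regions above could be grouped as follows. This is only a sketch in Python: the class name `FVMState` and its `push` and `pop` helpers are hypothetical, and a Python list stands in for `sm`, so words here are unbounded integers rather than the recommended 32 bits.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FVMState:
    """Hypothetical container for the FVM memory regions."""

    pm: bytes = b""                              # program memory (read-only bytes)
    ip: int = 0                                  # index into pm
    sm: List[int] = field(default_factory=list)  # stack of signed words
    fp: int = 0                                  # word index into sm
    ef: bool = False                             # execution flag
    ec: int = 0                                  # exit code

    def push(self, word: int) -> None:
        # The index of the top word increases as words are pushed.
        self.sm.append(word)

    def pop(self) -> int:
        # Popping an empty stack is an illegal state; this sketch lets
        # Python raise IndexError in that case.
        return self.sm.pop()


state = FVMState(pm=bytes([0x09, 0x2A]))
state.push(7)
state.push(9)
top_index = len(state.sm) - 1  # index of the top word after two pushes
```
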

A fetch is a fundamental operation of the FVM that is used to read data from `pm`. There are several types of data named by this specification:

| Name | Size | Description |
|---|---|---|
| `u8` | 1 byte | An 8-bit unsigned integer. The type of FVM opcodes. |
| `s8` | 1 byte | An 8-bit two's complement-signed integer. |
| `u16` | 2 bytes | A 16-bit little-endian unsigned integer. |
| `s16` | 2 bytes | A 16-bit little-endian two's complement-signed integer. |
| `u32` | 4 bytes | A 32-bit little-endian unsigned integer. |
| `s32` | 4 bytes | A 32-bit little-endian two's complement-signed integer. |

To fetch data, the byte at `pm[ip]` is read. `ip` is then incremented. This is
repeated until the required number of bytes have been read. The result of this
is that `ip` will point to the byte immediately after the fetched data.
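
As an illustration, a fetch can be sketched in Python with `int.from_bytes`, which decodes little-endian, two's complement values directly. The function name `fetch` and its `(value, new_ip)` return shape are assumptions of this sketch; raising `IndexError` on an out-of-bounds fetch is one possible choice for a case this specification leaves undefined.

```python
def fetch(pm: bytes, ip: int, size: int, signed: bool):
    """Read `size` bytes from pm starting at ip; return (value, new_ip)."""
    if ip < 0 or ip + size > len(pm):
        # Out-of-bounds fetches are an illegal state; this sketch raises.
        raise IndexError("fetch out of bounds of pm")
    value = int.from_bytes(pm[ip:ip + size], byteorder="little", signed=signed)
    # The returned ip points to the byte immediately after the fetched data.
    return value, ip + size


pm = bytes([0xFE, 0xFF, 0x34, 0x12])
s16_value, ip = fetch(pm, 0, 2, signed=True)    # bytes fe ff read as s16
u16_value, ip = fetch(pm, ip, 2, signed=False)  # bytes 34 12 read as u16
```

Here `s16_value` is `-2`, `u16_value` is `0x1234`, and `ip` ends at `4`.
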

To store FVM bytecode in a file, a header is included to identify the file as FVM bytecode, ensure the file is being read as binary data, and identify the version of FVM bytecode being used. Bytecode files store the following sequence of data:

| Type | Description |
|---|---|
| `u8` | `0x83`: Ensures bit 7 is set. Function symbol in ANSI. |
| `3 * u8` | `0x46 0x56 0x4d`: `FVM` identifier. |
| `2 * u8` | `0x0d 0x0a`: `\r\n`, tests for line ending conversion. |
| `u8` | `0x1a`: Stops file display on some systems. |
| `u8` | `0x0a`: `\n`, tests for reverse line ending conversion. |
| `u32` | Format version. `0x02 0x00 0x00 0x00` (`2`) for this version. |
| `u32` | `size` value. The number of bytes of FVM bytecode. |
| `size * u8` | The FVM bytecode to load into `pm`. |

Any trailing data is unused and has no effect. If the bytecode file is too
small for the `size` value, it will fail to load.
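
The header layout above could be checked with a loader along these lines. This is a sketch, not a reference implementation: the names `MAGIC`, `load_bytecode` and `build_bytecode_file` are hypothetical, and raising `ValueError` is just one way to "fail to load".

```python
import struct

# The first eight header bytes: 0x83, "FVM", "\r\n", 0x1a, "\n".
MAGIC = bytes([0x83, 0x46, 0x56, 0x4D, 0x0D, 0x0A, 0x1A, 0x0A])
FORMAT_VERSION = 2
HEADER_SIZE = len(MAGIC) + 4 + 4  # magic + u32 version + u32 size


def load_bytecode(data: bytes) -> bytes:
    """Validate the header and return the bytecode to load into pm."""
    if len(data) < HEADER_SIZE or data[:len(MAGIC)] != MAGIC:
        raise ValueError("not an FVM bytecode file")
    # Both fields are little-endian u32 values.
    version, size = struct.unpack_from("<II", data, len(MAGIC))
    if version != FORMAT_VERSION:
        raise ValueError(f"unsupported format version {version}")
    code = data[HEADER_SIZE:HEADER_SIZE + size]
    if len(code) < size:
        raise ValueError("file is too small for its size value")
    return code  # any trailing data is ignored


def build_bytecode_file(code: bytes) -> bytes:
    """Prepend a header to raw bytecode (helper for the example below)."""
    return MAGIC + struct.pack("<II", FORMAT_VERSION, len(code)) + code


# PUSH_U8 0, HALT, followed by trailing data that the loader ignores.
example = build_bytecode_file(bytes([0x09, 0x00, 0x00])) + b"trailing"
loaded = load_bytecode(example)
```
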

Any file extension may be used for FVM bytecode files, but `.fyc` is
recommended for compiled Funcy code. The file extension `.fvm` is recommended
for other uses that target the FVM.

The following sequence is used to execute a program. The FVM should only begin
executing a program if `ef` is `false`:

- The program is loaded into `pm` starting at index `0`. Header data from bytecode files is excluded.
- `ip` is set to `0`.
- `sm` is cleared.
- `fp` is set to `0`.
- `ec` is set to `0`.
- `ef` is set to `true`.

Then, the FVM steps through 0 or more of the following cycles until `ef` is
`false`:

- Fetch a `u8` opcode.
- Execute the opcode. See Opcodes for details.

When execution finishes, the value of `ec` may be used as an exit code.
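
A minimal sketch of the start-up sequence and the fetch/execute cycle, in Python. It handles only four opcodes (HALT, NO_OPERATION, PUSH_U8 and BINARY_ADD, using the values this specification assigns them). The function name `run` and the `ValueError` raised for any other opcode are assumptions of the sketch.

```python
HALT, NO_OPERATION, PUSH_U8, BINARY_ADD = 0x00, 0x01, 0x09, 0x14


def run(program: bytes) -> int:
    """Execute a program that uses only the four opcodes above."""
    # Start-up sequence: load the program at index 0 and reset all state.
    pm = bytes(program)
    ip = 0
    sm = []
    fp = 0  # set as the sequence requires; unused by this opcode subset
    ec = 0
    ef = True
    while ef:
        opcode = pm[ip]  # fetch a u8 opcode
        ip += 1
        if opcode == HALT:
            ec = sm.pop()
            ef = False
        elif opcode == NO_OPERATION:
            pass
        elif opcode == PUSH_U8:
            sm.append(pm[ip])  # fetch a u8 operand
            ip += 1
        elif opcode == BINARY_ADD:
            y = sm.pop()
            x = sm.pop()
            sm.append(x + y)
        else:
            raise ValueError(f"opcode {opcode:#04x} is not handled by this sketch")
    return ec


# Push 2, push 3, add them, then halt with the sum as the exit code.
exit_code = run(bytes([PUSH_U8, 2, PUSH_U8, 3, BINARY_ADD, HALT]))
```

The example program halts with an exit code of `5`.
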

There are several illegal states that may be encountered during execution. These are undefined and depend on the FVM's implementation:

- Data is fetched from out of bounds of `pm`.
- An undefined opcode is fetched.
- `sm` grows to an unsupported size.
- A word is popped from `sm` while it is empty.
- A word is accessed from out of bounds of `sm`.
- A modulo or divide by `0` operation is performed.

An opcode is a `u8` value in FVM bytecode that represents an operation. Each
opcode executes a sequence of operations. Opcodes have defined values, but
these values may change between format versions:

HALT (`0x00`):

- Pop a word, `exitCode`, from `sm`.
- Set `ec` to `exitCode`.
- Set `ef` to `false`.

NO_OPERATION (`0x01`):

- Do nothing.

JUMP (`0x02`):

- Pop a word, `jumpAddress`, from `sm`.
- Set `ip` to `jumpAddress`.

JUMP_NOT_ZERO (`0x03`):

- Pop a word, `jumpAddress`, from `sm`.
- Pop a word, `compareValue`, from `sm`.
- Set `ip` to `jumpAddress` if `compareValue` is not equal to `0`.

JUMP_ZERO (`0x04`):

- Pop a word, `jumpAddress`, from `sm`.
- Pop a word, `compareValue`, from `sm`.
- Set `ip` to `jumpAddress` if `compareValue` is equal to `0`.
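
The three jump opcodes can be sketched in Python as functions that take the stack and the current `ip` and return the new `ip`. The function names and this calling convention are assumptions of the sketch. Note the pop order: the jump address is popped first, so the compare value must have been pushed before it.

```python
def jump(sm, ip):
    """JUMP: always continue at the popped address."""
    return sm.pop()


def jump_not_zero(sm, ip):
    """JUMP_NOT_ZERO: jump only if the compare value is not 0."""
    jump_address = sm.pop()
    compare_value = sm.pop()
    return jump_address if compare_value != 0 else ip


def jump_zero(sm, ip):
    """JUMP_ZERO: jump only if the compare value is 0."""
    jump_address = sm.pop()
    compare_value = sm.pop()
    return jump_address if compare_value == 0 else ip


# Stack holds compareValue 0, then jumpAddress 40 on top; ip is 10.
ip_after_jnz = jump_not_zero([0, 40], 10)  # not taken, ip stays 10
ip_after_jz = jump_zero([0, 40], 10)       # taken, ip becomes 40
```

Both conditional jumps pop two words whether or not the jump is taken.
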

CALL (`0x05`):

- Pop a word, `argCount`, from `sm`.
- Pop a word, `callAddress`, from `sm`.
- Pop the top `argCount` words from `sm` in order as `args`.
- Push the value of `fp` to `sm`.
- Set `fp` to the index of the top word of `sm`.
- Push the value of `ip` to `sm`.
- Set `ip` to `callAddress`.
- Push `args` back onto the top of `sm` in their original order.
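
The CALL steps can be traced with a short Python sketch. The function name `call` and its `(ip, fp)` return value are assumptions; the sketch also reads "in order" as keeping the arguments in the order they were pushed, which is an interpretation of the steps above.

```python
def call(sm, ip, fp):
    """Apply the CALL steps to the stack sm; return the new (ip, fp)."""
    arg_count = sm.pop()
    call_address = sm.pop()
    # Take the top arg_count words off the stack, keeping their order.
    split = len(sm) - arg_count
    args = sm[split:]
    del sm[split:]
    sm.append(fp)        # saved frame pointer, stored at sm[new fp]
    fp = len(sm) - 1     # fp now indexes the saved frame pointer
    sm.append(ip)        # return address, stored at sm[fp + 1]
    ip = call_address
    sm.extend(args)      # arguments become locals at sm[fp + 2] onwards
    return ip, fp


# One unrelated word (99), two arguments (10, 20), callAddress 100, argCount 2.
sm = [99, 10, 20, 100, 2]
ip, fp = call(sm, ip=7, fp=0)
```

After the call, `sm` is `[99, 0, 7, 10, 20]`, `fp` is `1` and `ip` is `100`, so the first argument is reachable as `sm[fp + 2]`.
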

RETURN (`0x06`):

- Read a value, `oldFP`, from `fp`'s value.
- Set `ip` to `sm[oldFP + 1]`.
- Set `fp` to `sm[oldFP]`.
- Pop a word, `returnValue`, from `sm`.
- Discard all words from `sm` with an index greater than or equal to `oldFP`.
- Push `returnValue` to `sm`.
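
The RETURN steps can be sketched the same way. The function name `ret` and its `(ip, fp)` return value are assumptions of this sketch, and the example stack assumes a frame laid out as saved `fp`, return address, then locals.

```python
def ret(sm, fp):
    """Apply the RETURN steps to the stack sm; return the new (ip, fp)."""
    old_fp = fp
    ip = sm[old_fp + 1]      # return address saved by CALL
    fp = sm[old_fp]          # caller's frame pointer saved by CALL
    return_value = sm.pop()
    del sm[old_fp:]          # discard the whole call frame
    sm.append(return_value)
    return ip, fp


# Frame at fp = 1: saved fp 0, return address 7, locals 10 and 20,
# with the return value 30 on top of the stack.
sm = [99, 0, 7, 10, 20, 30]
ip, fp = ret(sm, fp=1)
```

After the return, `sm` is `[99, 30]`, `ip` is `7` and `fp` is `0`.
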

DROP (`0x07`):

- Pop and discard a word from `sm`.

DUPLICATE (`0x08`):

- Peek a value, `value`, from the top of `sm`.
- Push `value` to `sm`.

PUSH_U8 (`0x09`):

- Fetch a `u8` value.
- Push the value to `sm`.

PUSH_S8 (`0x0a`):

- Fetch an `s8` value.
- Push the value to `sm`.

PUSH_U16 (`0x0b`):

- Fetch a `u16` value.
- Push the value to `sm`.

PUSH_S16 (`0x0c`):

- Fetch an `s16` value.
- Push the value to `sm`.

PUSH_U32 (`0x0d`):

- Fetch a `u32` value.
- Push the value to `sm`.

PUSH_S32 (`0x0e`):

- Fetch an `s32` value.
- Push the value to `sm`.
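
Since the six push opcodes differ only in the type they fetch, they can be driven from one table. The sketch below uses Python's `struct` format codes for the little-endian types; the names `PUSH_FORMATS`, `encode_push` and `execute_push` are hypothetical, and `execute_push` expects `ip` to point at the opcode byte, which is a convenience of this sketch.

```python
import struct

# Opcode value -> struct format for its little-endian operand.
PUSH_FORMATS = {
    0x09: "<B",  # PUSH_U8
    0x0A: "<b",  # PUSH_S8
    0x0B: "<H",  # PUSH_U16
    0x0C: "<h",  # PUSH_S16
    0x0D: "<I",  # PUSH_U32
    0x0E: "<i",  # PUSH_S32
}


def encode_push(opcode: int, value: int) -> bytes:
    """Assemble a push opcode followed by its operand."""
    return bytes([opcode]) + struct.pack(PUSH_FORMATS[opcode], value)


def execute_push(pm: bytes, ip: int, sm: list) -> int:
    """Execute the push opcode at pm[ip]; return ip after the operand."""
    fmt = PUSH_FORMATS[pm[ip]]
    (value,) = struct.unpack_from(fmt, pm, ip + 1)
    sm.append(value)
    return ip + 1 + struct.calcsize(fmt)


program = encode_push(0x0C, -2) + encode_push(0x0B, 0x1234)
sm = []
ip = execute_push(program, 0, sm)
ip = execute_push(program, ip, sm)
```

The assembled program is the bytes `0c fe ff 0b 34 12`, and executing it leaves `-2` and `0x1234` on the stack.
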

LOAD_LOCAL (`0x0f`):

- Pop a word, `offset`, from `sm`.
- Push `sm[fp + offset]` to `sm`.

STORE_LOCAL (`0x10`):

- Pop a word, `offset`, from `sm`.
- Peek a value, `value`, from the top of `sm`.
- Set `sm[fp + offset]` to `value`.
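
A Python sketch of the two local-access opcodes follows. The function names are hypothetical. The explicit bounds check is an assumption of the sketch: out-of-bounds access is an illegal state, and without the check a negative index would silently wrap around in a Python list. Note that STORE_LOCAL only peeks the value, so the value stays on the stack.

```python
def local_index(sm, fp, offset):
    """Resolve fp + offset, rejecting indices outside sm."""
    index = fp + offset
    if not 0 <= index < len(sm):
        raise IndexError("local is out of bounds of sm")
    return index


def load_local(sm, fp):
    offset = sm.pop()
    sm.append(sm[local_index(sm, fp, offset)])


def store_local(sm, fp):
    offset = sm.pop()
    value = sm[-1]  # peek: the stored value remains on the stack
    sm[local_index(sm, fp, offset)] = value


# Frame at fp = 0: saved fp, return address, then locals 10 and 20.
sm = [0, 7, 10, 20]
fp = 0
sm.append(2)         # offset of the first local
load_local(sm, fp)   # pushes a copy of sm[fp + 2]
sm.append(3)         # offset of the second local
store_local(sm, fp)  # overwrites sm[fp + 3] with the top word
```

The stack ends as `[0, 7, 10, 10, 10]`: the second local now holds `10`, and the stored value is still on top.
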

UNARY_DEREFERENCE (`0x11`):

- Pop a word, `address`, from `sm`.
- Push `pm[address]` to `sm`.

UNARY_NEGATE (`0x12`):

- Pop a word, `value`, from `sm`.
- Push `-value` to `sm`.

UNARY_NOT (`0x13`):

- Pop a word, `value`, from `sm`.
- Push `int(value == 0)` to `sm`.
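
BINARY_ADD (`0x14`):

- Pop a word, `y`, from `sm`.
- Pop a word, `x`, from `sm`.
- Push `x + y` to `sm`.

BINARY_SUBTRACT (`0x15`):

- Pop a word, `y`, from `sm`.
- Pop a word, `x`, from `sm`.
- Push `x - y` to `sm`.

BINARY_MULTIPLY (`0x16`):

- Pop a word, `y`, from `sm`.
- Pop a word, `x`, from `sm`.
- Push `x * y` to `sm`.

BINARY_DIVIDE (`0x17`):

- Pop a word, `y`, from `sm`.
- Pop a word, `x`, from `sm`.
- Push `x // y` to `sm`.

BINARY_MODULO (`0x18`):

- Pop a word, `y`, from `sm`.
- Pop a word, `x`, from `sm`.
- Push `x % y` to `sm`.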
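
BINARY_EQUALS (`0x19`):

- Pop a word, `y`, from `sm`.
- Pop a word, `x`, from `sm`.
- Push `int(x == y)` to `sm`.

BINARY_NOT_EQUALS (`0x1a`):

- Pop a word, `y`, from `sm`.
- Pop a word, `x`, from `sm`.
- Push `int(x != y)` to `sm`.

BINARY_GREATER (`0x1b`):

- Pop a word, `y`, from `sm`.
- Pop a word, `x`, from `sm`.
- Push `int(x > y)` to `sm`.

BINARY_GREATER_EQUALS (`0x1c`):

- Pop a word, `y`, from `sm`.
- Pop a word, `x`, from `sm`.
- Push `int(x >= y)` to `sm`.

BINARY_LESS (`0x1d`):

- Pop a word, `y`, from `sm`.
- Pop a word, `x`, from `sm`.
- Push `int(x < y)` to `sm`.

BINARY_LESS_EQUALS (`0x1e`):

- Pop a word, `y`, from `sm`.
- Pop a word, `x`, from `sm`.
- Push `int(x <= y)` to `sm`.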

BINARY_AND (`0x1f`):

- Pop a word, `y`, from `sm`.
- Pop a word, `x`, from `sm`.
- Push `int(x != 0 and y != 0)` to `sm`.

BINARY_OR (`0x20`):

- Pop a word, `y`, from `sm`.
- Pop a word, `x`, from `sm`.
- Push `int(x != 0 or y != 0)` to `sm`.
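
All of the binary opcodes share the same shape (pop `y`, pop `x`, push a result), so they can be dispatched from one table. The Python sketch below makes two assumptions worth flagging: it reads `x // y` and `x % y` with Python's floor semantics, which this specification does not state explicitly, and it leaves results as unbounded Python integers rather than wrapping them to a fixed word size. The names `BINARY_OPS` and `execute_binary` are hypothetical.

```python
import operator

# Opcode value -> function of (x, y), using the values from this specification.
BINARY_OPS = {
    0x14: operator.add,
    0x15: operator.sub,
    0x16: operator.mul,
    0x17: operator.floordiv,  # assumption: floor division, as in Python
    0x18: operator.mod,       # assumption: result takes the sign of y
    0x19: lambda x, y: int(x == y),
    0x1A: lambda x, y: int(x != y),
    0x1B: lambda x, y: int(x > y),
    0x1C: lambda x, y: int(x >= y),
    0x1D: lambda x, y: int(x < y),
    0x1E: lambda x, y: int(x <= y),
    0x1F: lambda x, y: int(x != 0 and y != 0),
    0x20: lambda x, y: int(x != 0 or y != 0),
}


def execute_binary(opcode, sm):
    """Pop y, then x, and push the result of the binary opcode."""
    y = sm.pop()
    x = sm.pop()
    if opcode in (0x17, 0x18) and y == 0:
        # Divide or modulo by 0 is an illegal state; this sketch raises.
        raise ZeroDivisionError("divide or modulo by 0")
    sm.append(BINARY_OPS[opcode](x, y))


sm = [7, 2]
execute_binary(0x15, sm)  # BINARY_SUBTRACT: x is 7, y is 2
```

Because `y` is popped first, the operand pushed last is the right-hand side: the example leaves `5` on the stack, not `-5`.
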

PUT_CHR (`0x21`):

- Peek a value, `value`, from the top of `sm`.
- Put the character with the value `value` to standard output.
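
A Python sketch of PUT_CHR follows. Mapping the word to a character with `chr` (that is, treating it as a Unicode code point) is an assumption, since this specification does not name a character encoding. The function name `put_chr` and its optional `out` parameter are hypothetical and exist only so the output can be captured.

```python
import io
import sys


def put_chr(sm, out=None):
    """Write the character for the top word of sm without popping it."""
    out = sys.stdout if out is None else out
    value = sm[-1]  # peek: the word stays on the stack
    out.write(chr(value))


buffer = io.StringIO()
sm = [72, 105]
put_chr(sm, buffer)
printed = buffer.getvalue()
```

The top word `105` is written as `i`, and `sm` is left unchanged, so a program that no longer needs the word would follow PUT_CHR with DROP.
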