Add source location info to operations
This allows us to carry the source filename, the source line and, most
importantly, the line number forward to the second pass so that semantic errors
have some useful debugging information.

The implementation, adding a method to the Operation interface and embedding
SourceInfo in each operation, is... kind of messy. I'll have to think about how
to pass the data around more cleanly. In particular, I'm not sure how well it
works for manually created operations without source code.

Squashed commit of the following: acee70b...28b2a50
smoynes committed Oct 8, 2023
1 parent 4386b54 commit 570320d
Showing 14 changed files with 394 additions and 173 deletions.
137 changes: 76 additions & 61 deletions README.md
@@ -1,80 +1,96 @@
# `ELSIE`: A pedagogical LC-3 emulator #

> The path is made in the walking of it. -- Zhuangzi
This is `ELSIE`, a virtual machine for the LC-3: a little computer that is
simple, comprehensive, and imaginary.

The computer was designed as a learning tool for undergraduate
computer-engineering students. It is described in detail in an excellent
computer-engineering students and is described in detail in an excellent
textbook, Patt and Patel's *Introduction to Computing Systems: From Bits & Gates
to C/C++ and Beyond*[^1], 3rd ed.

## LC-3 Background ##

The LC-3 instruction set architecture includes:

- a single data type: signed integers stored as 16-bit words
- word-addressable RAM with 16-bit address space
- several general purpose registers
- rudimentary arithmetic and logic operations
- memory-mapped I/O
- hard- and software interrupts
- a privileged mode with a basic operating system
- an instruction set compact enough to fit on a single page

While similar in many respects to the x86 ISA, the LC-3 is radically simpler in
almost every way. Unlike the sprawling x86, with thousands of instructions,
dozens of addressing modes, multicore execution, an intricate memory model, many
advanced features, and over 40 years of history etched into silicon, the LC-3
remains a tractable system that is comprehensible by an individual. It is a lot
closer to PDP-11 machines than anything you have in your home or pocket.
- a single data type: signed integers stored as 16-bit words;
- word-addressable RAM with 16-bit address space;
- several general purpose registers;
- rudimentary arithmetic and logic operations;
- memory-mapped I/O;
- hard- and software interrupts; and
- an instruction set compact enough to fit on a single page.

Far from an abstract machine, the text begins with transistors and digital logic
and describes in detail the entire computer architecture including the
control-unit state-machine, data and I/O paths. It is fascinating.

While similar in many respects to the x86 or ARM ISAs, the LC-3 is radically
simpler in almost every way. Unlike the sprawling x86, with thousands of
instructions, dozens of addressing modes, multicore execution, an intricate
memory model, and over 40 years of history etched into silicon, the LC-3 remains
a tractable system that is comprehensible by an individual. It is a lot closer
to PDP-11 machines than anything you have in your home or pocket.

## Project Goals ##

Personally, there remain many computer things that baffle me. Despite ten
thousand hours of computing, I feel lost and uncertain when it comes to some of
the rudiments of the field:

- computer architecture
- assembly programming
- operating systems
- computing history

This slightly absurd project is for experiential learning[^facecloud]: reading,
writing, conversations, asking questions, finding answers, and solving problems.
It is to be hoped that by using the old ways, by holding the master craftsman's
tools, and building something cute and useless, I will gain a better
understanding of computing.

Hardware simulators already exist for the LC-3 architecture, of course. The
textbook publishers provide one and there are many others freely available
online. This one is admittedly a mere reinvention of the wheel. Nevertheless,
the design and engineering process sometimes reveals something fundamental about
either ourselves or our world.

I have lots of ideas for experiments to bring into the lab. I've thought about:

- simulating the LC-3 ISA in software
- writing some programs in assembly, translating to machine code
- running some programs written by others
- building development tools like an assembler, linker, loader, and step-wise
debugger
- building a simple compiler for a high-level language[^2]
- extending the ISA with new instructions, data types, or a math co-processor
- adding new I/O devices like a serial console, a tape storage, block storage,
or network emulators and adapters
- expanding the operating system, with `TRAP` extensions, runtime library,
application services, or even a microkernel
- concurrency and parallelism, _e.g._ communicating sequential processes,
preemptive multitasking, multicore execution
- computer architecture;
- assembly programming;
- operating systems;
- computing history.

This project is not novel: hardware simulators already exist for the LC-3
architecture, of course. The textbook publishers provide one and there are many
others freely available online.[^4] This one is admittedly a mere reinvention of
the wheel. Nevertheless, the design and engineering process sometimes reveals
something fundamental about either ourselves or our world, so it is worth
retreading the path.

This slightly absurd project's purpose is to explore these topics through
experiential learning: gaining a deeper understanding of how a thing is done by
simply doing the thing. It is to be hoped that by trying the old methods, by
holding the master craftsman's tools, by building something both cute and
useless, I will gain a better understanding of the essence of computing. If
nothing more is achieved than learning a bit, exploring some ideas, and hearing
a few good stories, it will have been worth it.

I have lots of ideas for experiments and projects to bring to my workbench. I've
thought about:

- simulating the LC-3 ISA in software;
- building development tools like an assembler, linker, loader, and step-wise
debugger;
- writing some programs in assembly, translating to machine code;
- running some programs written by others;
- building a simple compiler for a high-level language[^2];
- extending the ISA with new instructions, data types, or a math co-processor;
- adding new I/O devices like tape storage, block storage, or network emulators and
adapters;
- expanding the operating system, with new system calls, a runtime library,
IPC services, or even a microkernel;
- concurrency and parallelism, _e.g._ co-operative sequential processes,
preemptive multitasking, multicore execution.

Admittedly, some of these experiments are pretty straightforward, while others
appear daunting and complex; some are immediate goals, but most are mere thought
experiments. Trying to do all of that work might be a path to madness.

In the meantime, feel free to browse the code or follow the project if you enjoy
the absurdist theatrics of a curious software engineer. As ever, I seek to
understand the essence of computing and to embody _Shokunin Kishitsu_ (職人気質),
the artisan's spirit.
## Get in Touch ##

You are welcome to reach out if you'd like to join me on this exploration.
You are welcome to reach out if you're a fellow learner, if you find this
project useful (or buggy), or you have any questions or feedback. You can start
a [discussion](https://github.com/smoynes/elsie/discussions) on this project or
contact me directly through my GitHub profile.

> The path is made in the walking of it. -- Zhuangzi
Please feel free to browse the code or follow the project if you enjoy the
absurdist theatrics of a curious software engineer. As ever, I simply seek to
understand the essence of computing and to embody _Shokunin Kishitsu_ (職人気質),
the artisan's spirit.

----

@@ -100,14 +116,14 @@ Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License
Focus right now:

ASM: assembler
- have a very rough parser
- generates code for a few opcodes
- not much error handling
- compatible with textbook
- not enough error handling
- missing some opcodes

On deck:

- KBD: interrupts
- BIOS:
- BIOS: interrupt routines

See [TODO.md](`TODO.md`) for more ideas.

@@ -126,6 +142,5 @@ This work is dedicated to the MCM/70[^3] and its pioneering designers.
[^1]: https://www.mheducation.com/highered/product/introduction-computing-systems-bits-gates-c-c-beyond-patt-patel/M9781260150537.html
[^2]: Leading to the creation of another dynamically-typed, interpreted language -- it is inevitable.
[^3]: https://en.wikipedia.org/wiki/MCM/70
[^facecloud]: With some dedication and blind ambition, I will finally realize my lifelong ambition: to develop
FACECLOUD™️, an ad-supported, privacy-preserving, social-media smart-contract for personal clouds
based, of course, on the LC-3 ISA! 💰💰💰
[^4]: You may find references to some other tools and some useful resources in
[./REFERENCES.txt](`REFERENCES.txt`)
8 changes: 5 additions & 3 deletions TODO.md
@@ -6,19 +6,21 @@
- [.] ASM:
- [.] code generation: ~7/16
- [.] directives
- [ ] .STRINGZ: strangz
- [ ] .END
- [ ] .EXTERNAL
- [x] .STRINGZ: strangz
- [x] .BLKW: block words
- [x] .DW: define word
- [x] .FILL: fill word
- [x] .ORIG: origin
- [.] cli command
- [.] document grammar
- [x] cli command
- [.] document grammar more completely
- [x] memory layout
- [x] cleaner error handling
- [x] simple parser: regexp
- [x] symbol table
- [ ] LOAD: object loader
- [ ] LINK: code linker
- [ ] DUMP: hex encoder
- [ ] CLI: sub commands for vm, tools, terminal, shell
- [ ] LOG: program output to STDOUT, logging output to STDERR (unless in
29 changes: 25 additions & 4 deletions internal/asm/asm.go
@@ -109,13 +109,21 @@ const badSymbol uint16 = 0xffff

// SyntaxError is a wrapped error returned when the parser encounters a syntax error.
type SyntaxError struct {
Loc, Pos uint16
Line string
Err error
File string // Source file name.
Loc uint16 // Location counter.
Pos uint16 // Line counter, zero value if not known.
Line string // Source code line, zero value if not known.
Err error // Error cause.
}

func (pe *SyntaxError) Error() string {
return fmt.Sprintf("syntax error: %s: line: %d %q", pe.Err, pe.Pos, pe.Line)
if pe.Err == nil && pe.Line == "" {
return fmt.Sprintf("syntax error: loc: %0#4x", pe.Loc)
} else if pe.Err == nil && pe.Line != "" {
return fmt.Sprintf("syntax error: line: %q", pe.Line)
} else {
return fmt.Sprintf("syntax error: %s: line: %0#4X %q", pe.Err, pe.Pos, pe.Line)
}
}

// OffsetError is a wrapped error returned when an offset value exceeds its range.
@@ -174,4 +182,17 @@ type Operation interface {
// Generate encodes an operation as machine code. Using the values from Parse, the operation is
// converted to one (or more) words.
Generate(symbols SymbolTable, pc uint16) ([]uint16, error)

// Source returns information about the source code location of an operation.
Source() SourceInfo
}

// SourceInfo holds information on the source of an operation.
type SourceInfo struct {
Filename string
Pos uint16
Line string
}

// Source returns a copy of the source information.
func (s SourceInfo) Source() SourceInfo { return s }
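
A minimal sketch of the embedding approach described in the commit message, and of what happens for a manually created operation. `SourceInfo` and its `Source` method are copied from the diff above; the `sourcer` interface, the `NOP` type and the `describe` helper are hypothetical names used only for illustration and are not part of the repository.

```go
package main

import "fmt"

// SourceInfo mirrors the struct added in this commit.
type SourceInfo struct {
	Filename string
	Pos      uint16
	Line     string
}

// Source returns a copy of the source information.
func (s SourceInfo) Source() SourceInfo { return s }

// sourcer is a hypothetical, cut-down stand-in for the Operation interface;
// only the new Source method is kept.
type sourcer interface {
	Source() SourceInfo
}

// NOP is a hypothetical operation. Embedding SourceInfo promotes the Source
// method, so NOP satisfies sourcer without any extra code.
type NOP struct {
	SourceInfo
}

// describe formats the source location, or reports that none is available
// when the operation carries the zero value of SourceInfo.
func describe(op sourcer) string {
	src := op.Source()
	if src.Filename == "" && src.Line == "" {
		return "no source information"
	}
	return fmt.Sprintf("%s:%d: %q", src.Filename, src.Pos, src.Line)
}

func main() {
	// An operation as the parser might create it, with source attached.
	parsed := &NOP{SourceInfo: SourceInfo{Filename: "prog.asm", Pos: 3, Line: "NOP"}}
	// An operation created by hand has no source; its SourceInfo is zero.
	manual := &NOP{}

	fmt.Println(describe(parsed)) // prog.asm:3: "NOP"
	fmt.Println(describe(manual)) // no source information
}
```

The zero value is at least well defined for hand-built operations, which is one way to read the "zero value if not known" comments on `SyntaxError`; whether that is good enough is the open question raised in the commit message.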
14 changes: 11 additions & 3 deletions internal/asm/gen.go
@@ -12,8 +12,12 @@ import (
)

// Generator controls the code generation pass of the assembler. The generator starts at the
// beginning of the parsed-syntax table, generates code fore each operation, and then writes the
// beginning of the parsed-syntax table, generates code for each operation, and then writes the
// bytes to the output (usually, a file).
//
// During the generation pass, any syntax or semantic errors that prevent generating machine code
// are immediately returned from WriteTo. The errors are wrapped SyntaxErrors and may be tested and
// retrieved using the errors package.
type Generator struct {
pc uint16
symbols SymbolTable
@@ -63,9 +67,13 @@ func (gen *Generator) WriteTo(out io.Writer) (int64, error) {
encoded, err = code.Generate(gen.symbols, gen.pc)

if err != nil {
src := code.Source()
err = &SyntaxError{
Loc: gen.pc,
Err: err,
File: src.Filename,
Loc: gen.pc,
Pos: src.Pos,
Line: src.Line,
Err: err,
}

break