Install | Play | Test | References
Warning: This implementation is experimental and incomplete. Everything is subject to change without warning.
is an experimental implementation of OMeta with Racket as a host. OMeta is a beautiful instrument for experimenting with designs for programming languages - a convenient way to implement lexers, parsers, tree-transformers and extensions thereof.
From original 2007 paper by Alessandro Warth and Ian Piumarta:
OMeta, a new object-oriented language for pattern matching. OMeta is based on a variant of Parsing Expression Grammars (PEGs) - a recognition-based foundation for describing syntax - which we have extended to handle arbitrary kinds of data.
This here is a recursive interpreter that closely follows operational semantics described in Alessandro Warth's dissertation. It dispatches on tagged lists that represent core OMeta grammar - a deliberately simpleminded approach - think exploratory programming to help one grok and intern ideas. Toy or not you'll find most features you'd expect having read original papers:
- matching on strings
- matching on structured data (lists)
- semantic actions and predicates
- parameterized rules
- inheritance
- foreign rule invocation
- directly left-recursive rules (with nesting)
- higher-order apply
- pattern-matching on rule parameters
- indirectly left-recursive rules
given the experimental nature of this repo it is unlikely to make appearance in Racket packages or collections. Simply clone it and require ometa.rkt from your source files. Assuming your ometa-racket files and your source file are in the same directory:
#lang racket
(require "ometa.rkt")
ometa | define-ometa | omatch | define-ometa-namespace
Technically ometa-racket has enough meat to write a lexer+parser combo for OMeta proper, which uses C-like syntax. I'm working on it. Until then we'll stick with s-expressions. Some of you may even notice just how much it resembles Olin Shivers' SRE regular-expression notation. Such is the power of truly beautiful ideas - we keep reinventing them. Current syntax is verbose because I'm only using a low-level core language.
Core grammar for OMeta parsing expressions (e):
e == (empty)
(atom a)
(apply A)
(seq e1 e2)
(alt e1 e2)
(many e)
(~ e)
(bind x e)
(-> t)
(list e)
Extended grammar:
(apply A) => (apply A arg ...) ; where non-terminal A == (^ RULE [PARENT])
(seq e1 e2) => (seq* e ...) ; sequence of arbitrary length
(alt e1 e2) => (alt* e ...) ; arbitrary number of alternatives
(many e) => (many+ e) ; one or more
OMeta code is a first-class value, which you can bind to variables, pass around and close over in data:
(define test-program
(ometa
(rule-name e)
...))
Or use a shorthand notation:
(define-ometa test-program
(rule-name e)
...)
Match against a stream. Stream can be a string or a list. For now these are not lazy sequences and will be fully consumed before matching. Namespace argument is optional:
(omatch test-program
rule-name
stream
namespace)
Extend OMeta namespace if you want to refer to your top-level bindings. Don't forget to pass it to omatch. You definitely want this when inheriting from other parsers or invoking foreign rules:
(define-ometa-namespace ns)
std | integers | words and decimals | flatten a list | scanners |
Without a standard library we'd have to tediously copy/paste the most basic rules to every OMeta project. Let's not do that! Why else have inheritance and access to foreign rules:
#lang racket
(require "ometa.rkt")
;; a reflective hook so that you have access
;; to your top-level bindings from ometa code
(define-ometa-namespace ns)
(define-ometa std
(char (seq* (bind c (apply anything))
(->? (char? c))
(-> c)))
;; why bake character-classes into a language when
;; its so expressive that adding them is trivial
(char-range x y
(seq* (bind c (apply anything))
(->? (and (char? c)
(char<=? x c y)))
(-> c)))
(letter (alt* (apply char-range #\a #\z)
(apply char-range #\A #\Z)))
(digit (apply char-range #\0 #\9))
(number (many+ (apply digit)))
(spaces (many+ (atom #\space))))
Let's write a left-recursive rule for parsing integers:
(define char->number (compose string->number list->string list))
(define-ometa integers (<< std) ;inherit from std
(int (alt* (seq* (bind n (apply int))
(bind d (apply (^ digit))) ;invoke a `digit' parent rule
(-> (+ (* n 10) (char->number d))))
(seq* (bind d (apply (^ digit)))
(-> (char->number d))))))
(car (omatch integers int "567" ns)) ;;=> 567
Let's extend rules in std. We should be able to parse decimals and identifiers with _
:
(define-ometa token (<< std)
(letter (alt* (atom #\_) ; accept underscore
(apply (^ letter)))) ; invoke parent rule
(id (many+ (apply letter)))
(number (alt* (seq* (bind pre (apply (^ number)))
(atom #\.)
(bind post (apply (^ number)))
(-> `(,@pre #\. ,@post)))
(apply (^ number)))))
(car (omatch token id "hello_Id" ns)) ;;=> "hello_Id"
(car (omatch token number "57.877" ns)) ;;=> "57.877"
Let's pattern-match on structured data:
(define input '(1 (2 (3 4)) (((5)) 6)))
(define-ometa flat
(flatten (seq*
(list (bind xs (apply inside)))
(-> xs)))
(inside (alt*
(seq* (list (bind xs (apply inside)))
(bind ys (apply inside))
(-> (append xs ys)))
(seq* (bind x (apply anything))
(bind xs (apply inside))
(-> (cons x xs)))
(seq* (apply end)
(-> null))))
(end (seq*
(~ (apply anything))
(-> null))))
(car (omatch flat flatten input)) ;;=> '(1 2 3 4 5 6)
Let's go mental and combine scannerless and scannerful parsing to parse assignments:
(define-ometa toks (<< std)
(eq (seq* (atom #\=)
(-> (make-immutable-hash `((kind . =) (value . "="))))))
(num (seq* (bind n (apply (^ number)))
(-> (make-immutable-hash `((kind . num) (value . ,(list->string n)))))))
;; not just rules inherited from `std'
;; let's invoke one from `token'
(id (seq* (bind ls (apply (^ id token)))
(-> (make-immutable-hash `((kind . id) (value . ,(list->string ls)))))))
(scanner (seq* (apply (^ spaces))
(alt* (apply eq)
(apply num)
(apply id)))))
(define-ometa assignments (<< toks)
(token k (seq* (bind t (apply (^ scanner)))
(->? (equal? (hash-ref t 'kind #f) k))
(-> (hash-ref t 'value))))
(assign (seq* (bind a (apply token 'id))
(bind b (apply token '=))
(bind c (apply token 'num))
(-> (string-append a b c)))))
(car (omatch assignments assign " my_var = 56" ns)) ;;=> "myvar=56"
>$ racket ometa-tests.rkt
will run rackunit test-suites.
- OMeta - original paper by Alessandro Warth and Ian Piumarta
- Experimenting with Programming Languages - a dissertation by Alessandro Warth
- ometa-js playground - try OMeta in the browser
- Alessandro Warth's OMeta page