"The Tsetlin machine is a new universal artificial intelligence (AI) method that learns simple logical rules to understand complex things, similar to how an infant uses logic to learn about the world. Being logical, the rules become understandable to humans. Yet, unlike all other intrinsically explainable techniques, Tsetlin machines are drop-in replacements for neural networks by supporting classification, convolution, regression, reinforcement learning, auto-encoding, language models, and natural language processing. They are further ideally suited for cutting-edge hardware solutions of low cost, enabling nanoscale intelligence, ultralow energy consumption, energy harvesting, unrivaled inference speed, and competitive accuracy."
This project implements the Graph Tsetlin Machine.
- Features
- Installation
- Tutorial
- Demos
- Example Use Case
- Graph Tsetlin Machine Basics
- Paper
- CUDA Configurations
- Roadmap
- Licence
- Processes directed and labeled multigraphs
- Vector symbolic node properties and edge types
- Nested (deep) clauses
- Arbitrarily sized inputs
- Incorporates Vanilla, Multiclass, Convolutional, and Coalesced Tsetlin Machines
- Rewritten faster CUDA kernels
pip3 install graphtsetlinmachine
or
python ./setup.py sdist
pip3 install dist/GraphTsetlinMachine-0.3.1.tar.gz
In this tutorial, you create graphs for the Noisy XOR problem and then train and test the Graph Tsetlin Machine on these.
Noisy XOR gives four kinds of graphs, shown below:
Observe how each node has one of two properties: A or B. If both of the graph's nodes have the same property, the graph gets the class label 0. Otherwise, the class label is 1.
The task of the Graph Tsetlin Machine is to assign the correct class label to each graph when the labels used for training are noisy.
Start by creating the training graphs using the Graphs construct:
from GraphTsetlinMachine.graphs import Graphs

graphs_train = Graphs(
    10000,
    symbols=['A', 'B'],
    hypervector_size=32,
    hypervector_bits=2
)
You initialize the graphs as follows:
- Number of Graphs. The first argument sets how many graphs you are going to create. Here, you prepare for creating 10,000 graphs.
- Symbols. Next, you define the symbols A and B. You use these symbols to assign properties to the nodes of the graphs. You can define as many symbols as you like. For the Noisy XOR problem, you only need two.
- Vector Symbolic Representation (Hypervectors). You also decide how large the hypervectors that store the symbols should be. Larger hypervectors have room for more symbols. Since you only have two symbols, set the size to 32. Finally, you decide how many bits to use for representing each symbol. Use 2 bits for this tutorial. You then get 32*31/2 = 496 unique bit pairs - plenty of space for two symbols!
- Generation and Compilation. The generation and compilation of hypervectors happen automatically during initialization, using sparse distributed codes.
The next step is to set how many nodes you want in each of the 10,000 graphs you are building. For the Noisy XOR problem, each graph has two nodes:
for graph_id in range(10000):
    graphs_train.set_number_of_graph_nodes(graph_id, 2)
After doing that, you prepare for adding the nodes:
graphs_train.prepare_node_configuration()
You add the two nodes to the graphs as follows, giving them one outgoing edge each:
for graph_id in range(10000):
    number_of_outgoing_edges = 1
    graphs_train.add_graph_node(graph_id, 'Node 1', number_of_outgoing_edges)
    graphs_train.add_graph_node(graph_id, 'Node 2', number_of_outgoing_edges)
You are now ready to prepare for adding edges:
graphs_train.prepare_edge_configuration()
Next, you connect the two nodes of each graph:
for graph_id in range(10000):
    edge_type = "Plain"
    graphs_train.add_graph_node_edge(graph_id, 'Node 1', 'Node 2', edge_type)
    graphs_train.add_graph_node_edge(graph_id, 'Node 2', 'Node 1', edge_type)
You need two edges because you build directed graphs, and with two edges you cover both directions. Use only one edge type, named Plain.
In the last step, you randomly assign property A or B to each node.
import random
import numpy as np

Y_train = np.empty(10000, dtype=np.uint32)
for graph_id in range(10000):
    x1 = random.choice(['A', 'B'])
    x2 = random.choice(['A', 'B'])
    graphs_train.add_graph_node_property(graph_id, 'Node 1', x1)
    graphs_train.add_graph_node_property(graph_id, 'Node 2', x2)
Based on this assignment, you set the class label of the graph. If both nodes get the same property, the class label is 0. Otherwise, it is 1.
    if x1 == x2:
        Y_train[graph_id] = 0
    else:
        Y_train[graph_id] = 1
Finally, the class label is randomly inverted with 1% probability to introduce noise.
    if np.random.rand() <= 0.01:
        Y_train[graph_id] = 1 - Y_train[graph_id]
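With the graphs and labels in place, you can train and evaluate the machine. The following is a minimal sketch, assuming the constructor parameters and the encode/fit/score interface used in the bundled demos; the hyperparameter values are illustrative, not tuned.

from GraphTsetlinMachine.tm import MultiClassGraphTsetlinMachine

# Finalize the internal bit representation of the graphs before training.
graphs_train.encode()

# Illustrative hyperparameters (assumed parameter names, following the demos).
tm = MultiClassGraphTsetlinMachine(
    number_of_clauses=2000,
    T=200,
    s=1.0,
    depth=2  # one round of message passing lets clauses relate the two nodes
)

for epoch in range(20):
    tm.fit(graphs_train, Y_train, epochs=1, incremental=True)
    accuracy = 100 * (tm.score(graphs_train) == Y_train).mean()
    print("Epoch %d, training accuracy: %.2f%%" % (epoch + 1, accuracy))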
See the Noisy XOR Demo in the example folder for further details.
The Graph Tsetlin Machine supports rich data (images, video, text, spectrograms, sound, etc.). One can, for example, add an entire image to a graph node, illustrated for MNIST images below:
Here, you define an image by adding its white pixels as properties to the node. Each white pixel in the grid of 28x28 pixels gets its own symbol Wx,y.
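For instance, a binarized image could be attached to a single node along the following lines. This is a hypothetical sketch: the array X_images, the node name 'Image', and the binarization threshold are assumptions for illustration.

import numpy as np

# Assume X_images holds 28x28 grayscale MNIST images, and each graph has a
# single node named 'Image'. The Wx,y symbols must also be listed when
# constructing the Graphs object.
for graph_id in range(X_images.shape[0]):
    # Binarize: treat sufficiently bright pixels as white (threshold assumed).
    for y, x in np.argwhere(X_images[graph_id] > 128):
        graphs_train.add_graph_node_property(graph_id, 'Image', 'W%d,%d' % (x, y))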
Note that with only a single node, you obtain a Coalesced Vanilla Tsetlin Machine. See the Vanilla MNIST Demo in the example folder for further details.
By using many nodes to capture rich data, you can exploit inherent structure in the data. Below, each MNIST image is broken down into a grid of 19x19 image patches. A patch then contains 10x10 pixels:
Again, white pixel symbols Wx,y define the image content. Additionally, this example uses a node's location inside the image to enhance the representation. You do this by introducing row Ry and column Cx symbols.
These symbols allow the Graph Tsetlin Machine to learn and reason about pixel patterns as well as their location inside the image.
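As an illustration, the position symbols could be attached along these lines. The node naming, the loop bounds, and the presence of one node per patch are assumptions for this sketch.

# Hypothetical sketch: one node per 10x10 patch in the 19x19 patch grid.
for py in range(19):
    for px in range(19):
        node_id = 'Patch %d,%d' % (px, py)
        graphs_train.add_graph_node_property(graph_id, node_id, 'R%d' % py)  # row symbol
        graphs_train.add_graph_node_property(graph_id, node_id, 'C%d' % px)  # column symbol
        # The white-pixel symbols Wx,y of the patch content are added similarly.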
Without adding any edges, the result is a Coalesced Convolutional Tsetlin Machine. See the Convolutional MNIST Demo in the example folder for further details.
The above two examples did not require edges. Here is an example where the edges are essential.
The task is to decide how many 'A's occur in sequence. The 'A's can appear at any time, preceded and followed by spaces. The below graphs model the task:
From the perspective of a single node, the three classes Y=0 (one 'A'), Y=1 (two 'A's), and Y=2 (three 'A's) all look the same. Each node only sees an 'A' or a space. By considering the nodes to its left and right as well, however, each node obtains the context needed to tell the classes apart.
Remark. Notice the two types of edges: they encode the direction of each neighbour relation, so a node can tell its left neighbour from its right one.
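A sketch of how such a sequence could be wired follows, assuming edge types named 'Left' and 'Right' and nodes named by position; all names and the sequence length are illustrative.

number_of_nodes = 6  # illustrative sequence length
for position in range(number_of_nodes - 1):
    # Directed edges in both directions, typed by the side the neighbour is on.
    graphs_train.add_graph_node_edge(graph_id, 'Node %d' % position, 'Node %d' % (position + 1), 'Right')
    graphs_train.add_graph_node_edge(graph_id, 'Node %d' % (position + 1), 'Node %d' % position, 'Left')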
See the Sequence Classification Demo in the example folder for further details.
This example increases the challenge of the Noisy XOR problem by using images of handwritten '0's and '1's instead of the symbols A and B.
Again, the white pixels of the images become the node properties (illustrated by the images themselves above). To solve this task, the Graph Tsetlin Machine must both learn the appearance of handwritten '0's and '1's and relate them according to the XOR relation, under the guidance of noisy class labels.
See the Noisy XOR MNIST Demo in the example folder for further details.
Graph Tsetlin Machines process multimodal data in complex structures. Here is an envisioned example use case from a hospital:
The nodes in the figure capture various kinds of health data, such as ECG and the medical narrative in Electronic Health Records. The different types of edges specify the relationships between the data: Measurement edges relate medical tests to a patient, Condition edges relate diseases to patients, and so on. Machine learning tasks in this setting include: forecasting, alerting, decision-making, situation assessment, risk mitigation, knowledge discovery, and optimization.
The Graph Tsetlin Machine is based on message passing. As illustrated below, a pool of clauses examines each node in the graph. Whenever a clause matches the properties of a node, it sends a message about its finding through the node's outgoing edges.
When a node receives a message, it appends the message to its properties. In this manner, the messages supplement the node properties with contextual information.
The above message passing enables logical reasoning with nested (deep) clauses. We here use the Sequence Classification Demo to study the reasoning procedure step-by-step, employing a single clause.
1) Input Graph. The example input is a graph with three consecutive 'A's.
2) Initial Features. The Graph Tsetlin Machine next describes each node using Boolean features derived from the node's properties.
3) Clause Without Message Literals. To produce the first round of messages, the clause is first evaluated without its message literals. The reason is that the truth values of the message literals are unknown before any messages have been passed.
4) Partial Clause Matching; 5) Message Passing; 6) Updated Features. In these steps, the Graph Tsetlin Machine matches the partial clause against the nodes. This matching gives one truth value per node. Whenever the partial clause matches a node, the clause sends a message through the node's outgoing edges, and each receiving node appends the message to its features. Note that the message literals are left out of this first, partial matching; their truth values only become available once the messages have been passed.
7) Full Clause With Message Literals; 8) Full Clause Matching; 9) Evaluation; 10) Classification. The message literals of the clause (marked in red) can now be activated for full clause matching:
The result is an updated truth value per node. Finally, the truth values are ORed together to give the final classification.
The number of message rounds decides the depth of the reasoning. Three layers of reasoning, for instance, consist of local reasoning, followed by two rounds of message passing, illustrated below:
- Initially, the clauses only consider the nodes' properties (marked in black).
- In the first round of message passing, matching clauses send out their messages. These messages supplement the receiving node's properties (marked in red).
- In the second round, the clauses examine the nodes again, now taking into account the first round of messages. Based on this revisit, the clauses produce the second round of messages, marked in blue.
This process continues until reaching the desired depth of reasoning, in this case depth three.
Finally, the Tsetlin Automata Teams update their states based on how the clauses handled the classification task at hand. Notice from the figure how each team operates across the nodes' properties as well as the incorporated messages. In this manner, they are able to build nested clauses. That is, a clause can draw upon the outcomes of other clauses to create hierarchical clause structures, centered around the various nodes. Hence, the power of the scheme!
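In code, you set the reasoning depth when constructing the machine. Below is a sketch, assuming the depth, message_size, and message_bits constructor parameters used in the bundled demos; all values are illustrative.

tm = MultiClassGraphTsetlinMachine(
    number_of_clauses=2000,
    T=200,
    s=1.0,
    depth=3,           # local reasoning plus two rounds of message passing
    message_size=256,  # hypervector size for clause messages (assumed parameter)
    message_bits=2     # bits set per message symbol (assumed parameter)
)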
A Tsetlin Machine for Logical Learning and Reasoning With Graphs. Ole-Christoffer Granmo, Youmna Abdelwahab, Per-Arne Andersen, Paul F. A. Clarke, Kunal Dumbre, Ylva Grønninsæter, Vojtech Halenka, Runar Helin, Lei Jiao, Ahmed Khalid, Rebekka Omslandseter, Rupsa Saha, Mayur Shende, and Xuan Zhang, 2024. (Forthcoming)
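The grid and block arguments of MultiClassGraphTsetlinMachine set the CUDA launch configuration, that is, the number of thread blocks and the number of threads per block used by the kernels. One possible configuration: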
tm = MultiClassGraphTsetlinMachine(
    ...
    grid=(16*13, 1, 1),
    block=(128, 1, 1)
)
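A larger grid can, for instance, be used to increase parallelism on bigger GPUs: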
tm = MultiClassGraphTsetlinMachine(
    ...
    grid=(16*13*4, 1, 1),
    block=(128, 1, 1)
)
- Rewrite graphs.py in C or numba for much faster construction of graphs
- Add autoencoder
- Add regression
- Add multi-output
- Graph initialization with adjacency matrix
Copyright (c) 2024 Ole-Christoffer Granmo
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.