Enhanced Neural Arithmetic Logic Unit
It is a modified implementation of the Neural Arithmetic Logic Unit, as discussed in this paper.
- PyTorch
- NumPy
- Torchviz
The Neural Accumulator (NAC) is used specifically for addition or subtraction.
It learns the add/sub operation as a feed-forward layer.
no_input : 2
no_output : 1
w_hat.shape : (1, 2)
m_hat.shape : (1, 2)
x.shape : (n, 2)
W.shape : (1, 2)
return : (x @ W.T) + bias
return.shape : (n, 1)
sigmoid(m_hat)-------converges to---->(1, 1)
sigmoid(m_hat) converges to (1, 1), so that the matrix product acts like a signed summation of the inputs.
tanh(w_hat)-------converges to---->(1, 1)--------addition
tanh(w_hat)-------converges to---->(1, -1)-------subtraction
tanh(w_hat) converges to either (1, 1) or (1, -1), depending on whether the operation is addition or subtraction, respectively.
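The convergence behaviour described above can be checked numerically. The following is a minimal sketch in plain Python (no PyTorch required). The large values chosen for `w_hat` and `m_hat` are illustrative assumptions standing in for trained parameters, and `nac_weights` and `nac_forward` are hypothetical helper names, not part of this repository.

```python
import math


def sigmoid(v):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-v))


def nac_weights(w_hat, m_hat):
    """Effective weights W = tanh(w_hat) * sigmoid(m_hat), element-wise."""
    return [math.tanh(w) * sigmoid(m) for w, m in zip(w_hat, m_hat)]


def nac_forward(x, weights):
    """Dot product of one input row with the effective weights (no bias)."""
    return sum(xi * wi for xi, wi in zip(x, weights))


# Hypothetical saturated parameters, standing in for trained values.
m_hat = [20.0, 20.0]       # sigmoid(m_hat) is approximately (1, 1)
w_hat_add = [20.0, 20.0]   # tanh(w_hat) is approximately (1, 1)
w_hat_sub = [20.0, -20.0]  # tanh(w_hat) is approximately (1, -1)

x = [7.0, 3.0]
added = nac_forward(x, nac_weights(w_hat_add, m_hat))       # close to 7 + 3
subtracted = nac_forward(x, nac_weights(w_hat_sub, m_hat))  # close to 7 - 3
```

With saturated parameters the layer has no free scaling left, which is why it behaves like exact addition or subtraction rather than an arbitrary linear map.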
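import torch

# Assumption: DEVICE is not defined in the code shown here; this is one common way to set it.
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class NAC(torch.nn.Module):
    def __init__(self, parameter):
        super().__init__()
        self.no_inp = parameter.get("no_input")
        self.no_out = parameter.get("no_output")
        # Unconstrained parameters; tanh and sigmoid are applied in forward().
        self.w_hat = torch.nn.Parameter(torch.Tensor(self.no_out, self.no_inp).to(DEVICE))
        self.m_hat = torch.nn.Parameter(torch.Tensor(self.no_out, self.no_inp).to(DEVICE))
        torch.nn.init.xavier_normal_(self.w_hat)
        torch.nn.init.xavier_normal_(self.m_hat)
        self.bias = None  # the NAC has no bias term

    def forward(self, x):
        # Effective weights lie in (-1, 1) and are pushed towards -1, 0 or 1.
        W = torch.tanh(self.w_hat) * torch.sigmoid(self.m_hat)
        return torch.nn.functional.linear(x, W, self.bias)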
The MU, with the help of the NAC, performs higher-order operations such as multiplication and division.
F = sigmoid(f) converges to either 0 or 1, selecting the add/sub output or the mul/div output, respectively.
It acts as a gate that chooses between the add/sub and mul/div operations.
a : Stores output of add/sub of given inputs
m : Stores output of mul/div of given inputs
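The gating described above can be sketched as follows. The source does not show the blending formula, so the expression `(1 - F) * a + F * m` is an assumption inferred from the stated ordering (F near 0 gives add/sub, F near 1 gives mul/div). `gated_output` is a hypothetical name, and the values of `f` are illustrative rather than trained.

```python
import math


def sigmoid(v):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-v))


def gated_output(f, a, m):
    """Blend the add/sub result a and the mul/div result m with F = sigmoid(f).

    Assumed convention: F near 0 passes a, F near 1 passes m.
    """
    gate = sigmoid(f)
    return (1.0 - gate) * a + gate * m


a = 7.0 + 3.0  # result of the add/sub path
m = 7.0 * 3.0  # result of the mul/div path

near_add = gated_output(-20.0, a, m)  # F near 0: output close to a
near_mul = gated_output(20.0, a, m)   # F near 1: output close to m
```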
m = self.nac(torch.log(torch.abs(x) + self.eps))
m = torch.exp(m)
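The two lines above rely on the identity that a sum of logarithms is the logarithm of a product. A minimal plain-Python sketch of the idea follows. `EPS` and `log_space_product` are hypothetical names, the value of `EPS` is an assumption (the value of `self.eps` is not given here), and the weights are fixed by hand rather than learned.

```python
import math

EPS = 1e-7  # assumed small constant that keeps log() away from zero


def log_space_product(x, weights):
    """Weighted sum in log space, mapped back with exp.

    Weights (1, 1) give a product; weights (1, -1) give a quotient.
    """
    log_sum = sum(w * math.log(abs(xi) + EPS) for xi, w in zip(x, weights))
    return math.exp(log_sum)


product = log_space_product([6.0, 3.0], [1.0, 1.0])    # close to 6 * 3
quotient = log_space_product([6.0, 3.0], [1.0, -1.0])  # close to 6 / 3
```

Because the inputs pass through an absolute value before the logarithm, their signs are discarded on this path.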
The PU performs even higher-order operations such as powers (including roots).
p = self.mu(torch.stack((torch.log(torch.abs(x[:, 0])), x[:, 1])).T)
p = torch.exp(p)
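The PU feeds the pair (log|x0|, x1) into the MU and exponentiates the result. Assuming the MU has learned to multiply its two inputs, this yields exp(x1 * log|x0|), which equals |x0| raised to the power x1. A minimal plain-Python sketch of that identity, with `power_via_logs` as a hypothetical helper name:

```python
import math


def power_via_logs(base, exponent):
    """Compute |base| ** exponent as exp(exponent * log|base|)."""
    return math.exp(exponent * math.log(abs(base)))


squared = power_via_logs(3.0, 2.0)  # close to 3 ** 2
root = power_via_logs(16.0, 0.5)    # close to the square root of 16
```

A root is simply a power with a fractional exponent, which is why the same unit covers both cases.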
For all basic mathematical operations, we sample numbers from a uniform distribution.
import numpy as np


def data_generator(min_val, max_val, num_obs, op):
    # Each row holds the two operands for one observation.
    data = np.random.uniform(min_val, max_val, size=(num_obs, 2))
    if op == '+':
        targets = data[:, 0] + data[:, 1]
    elif op == '-':
        targets = data[:, 0] - data[:, 1]
    elif op == '*':
        targets = data[:, 0] * data[:, 1]
    elif op == '/':
        targets = data[:, 0] / data[:, 1]
    elif op == 'p':
        targets = np.power(data[:, 0], data[:, 1])
    else:
        # Fail early instead of hitting an undefined `targets` below.
        raise ValueError(f"Unsupported operation: {op!r}")
    return data, targets
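The choice of `min_val` and `max_val` matters for the power operation: `np.power` returns NaN for a negative base with a fractional exponent. The sketch below illustrates this with NumPy. The ranges and the seeded generator `rng` are assumptions for illustration, not settings taken from this repository.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed here for reproducibility

# Strictly positive range: power targets stay finite.
safe = rng.uniform(1.0, 5.0, size=(1000, 2))
safe_pow = np.power(safe[:, 0], safe[:, 1])

# Negative base with a fractional exponent: np.power yields NaN.
with np.errstate(invalid="ignore"):
    risky_pow = np.power(np.array([-2.0]), np.array([0.5]))

all_finite = bool(np.isfinite(safe_pow).all())
has_nan = bool(np.isnan(risky_pow).any())
```

Sampling from a strictly positive range is one way to keep the targets for `'p'` well defined. Keeping the range away from zero likewise avoids very large targets for `'/'`.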