ExtremeTech Explains: What is a Neural Net?

As Moore’s Law approaches its endgame, the technosphere has looked to different and more diverse approaches to computing. We can’t just depend on driving clock speeds higher and higher. Nor can we continue making transistors increasingly smaller. But comparisons abound between computers and the human brain. After all, it’s all about computational power, and that’s one area where our brains are still better than computers. So, researchers have started designing computer systems that look and act more like brains. We call these systems neuromorphic systems, neural networks, or simply neural nets.

As we increase our understanding of the human connectome, our ability to understand the brain’s phenomenal information throughput grows in tandem. We now have a partial network diagram of the human brain. Consequently, neuromorphic computing tools have exploded in popularity. As we predicted back in 2016, artificial intelligence (AI) is powering the current technological revolution in human healthcare. It also appears that neural nets may be the next frontier in the advance of computing technology as a whole.

This guide is intended to be a brief but thorough primer on neural networks. If you’re left with questions after reading, please let us know in the comments: we may address your questions in an update or a future article.

So, without further ado…

What is a Neural Network?

A neural network, also known as a neuromorphic system, is a system of hardware and/or software that emulates some aspect of the brain. Neural nets typically have three core components. First, there is a system of neurons, or nodes. Second, there are the connections between those nodes, each with its own weight. Third, there must be some kind of propagation function, which governs how signals pass from one node to the next. Some neural networks are also capable of observing their own results and changing their approach to a task.
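To make those components concrete, here is a minimal sketch in Python. The layer sizes, the weight values, and the choice of a sigmoid activation are all made up for illustration; the function names `propagate` and `forward` are likewise hypothetical, not part of any particular library.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def propagate(inputs, weights, biases):
    """Compute one layer's outputs: weighted sum of inputs, then activation.

    weights[j][i] is the strength of the connection from input i to node j.
    """
    outputs = []
    for node_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(sigmoid(total))
    return outputs

# Hypothetical network: 2 inputs, 2 hidden nodes, 1 output node.
# These weights are arbitrary; a real network would learn them.
hidden_weights = [[0.5, -0.6], [0.1, 0.8]]
hidden_biases = [0.0, 0.0]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]

def forward(inputs):
    """Pass a list of input values through both layers."""
    hidden = propagate(inputs, hidden_weights, hidden_biases)
    return propagate(hidden, output_weights, output_biases)

result = forward([1.0, 2.0])  # a list holding one number between 0 and 1
```

The nodes are the entries in each list, the connections and weights are the nested lists, and `propagate` plays the role of the propagation function.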

Why Mimic the Brain?

There are two big reasons to choose the brain as a template for information processing. Firstly, brains and computers are already similar in a lot of ways. For example, brains and computers are both mesh networks of multiple layers, responsible for high-volume information processing. They both have working memory whose buffer can falter or overflow. They both have “cold storage” — the brain organizes information in a rough semantic hierarchy across the surface of the brain, similar to how a hard drive stores data in different places on its platter. And they both consist of a network of nodes, whose importance is individually expressed by the “weights” of their connections. Their nodes communicate using a series of electrical spikes that has data embedded in the sequence.

Secondly, brains are really good at what they do. They’re fast, light, and very low-power. The human brain is thought to have a petascale information throughput — much greater than any single PC we’ve ever built. (Distributed supercomputers don’t count.) They’re also efficient: the human brain operates on about twenty watts.

Jack of All Trades, Master of Many

This winning combination is partially because of the brain’s physical structure, and partially because of the unique way information travels through the nervous system. Brains work in series and parallel, binary and analog, all at the same time. When neurons fire, those impulses travel down axons in series. But neurons are arranged in cortical columns, and cortical columns are arranged into brain regions. In this way, whole brain regions of many cells can work on the same task, in parallel.

Neurons are strongly binary when it comes to how they convey their messages. A neuron’s signal is made of electrical spikes organized in the time domain. A neuron is either spiking or it is not. (The frequency varies, but the amplitude does not, just like an FM radio.) Spikes are composed of a tiny traveling zap. Spike trains are time-delineated sequences of electrical waves that contain fragments of data. It’s almost like a Morse code receiver. But neurons also do analog uptake by tallying up all the overlapping, real-time flickers of input from dendrites. It’s like a physiological Fourier transform.

Neurons communicate with one another by receiving, interpreting, and then propagating a tiny wave of electricity down the length of the axon. It’s extremely low-power, because that little sizzle of depolarization across the membrane is performed by simply cooperating with the electrochemical gradient outside the cell. Cortical neurons, in particular, have a profoundly branching structure. They connect in a “many-to-one” fashion to their neighbors who are physically next to them, and to “upstream” neurons who come before them in the information flow. They also connect in a “one-to-many” manner when communicating to downstream colleagues. To manage all these connections, neurons express the importance, or “weight,” of each one by building and strengthening synapses.

Neuromorphic Hardware

Neuromorphic chips approach the brain’s function as an emergent property of its physical structure. But neurons in neuromorphic hardware are not the same as neurons in a brain. One neuron is not the same as one transistor. Instead, a hardware neuron is made of many transistors, like a biological neuron has many ports in its cell membrane.

TrueNorth

IBM’s TrueNorth, produced in 2014, was a manycore neuromorphic CMOS chip hosting a convolutional neural net. TrueNorth had its own software ecosystem, including a bespoke programming language, libraries, and a whole IDE.

Loihi

Intel’s Loihi ecosystem includes a hardware neural net and its associated software framework. Currently, we’re on the second generation of Loihi, the neuromorphic chip. Lava is the software access path to Loihi’s powers.

While it is made using conventional semiconductor materials and will be manufactured in the future on the Intel 4 process node, Loihi is organized very differently than the silicon we’re all used to. Loihi’s physical architecture mimics the physical organization of the brain, but on a smaller scale. The chip has up to a million neurons: individual entities in the network, each with 128kB of memory attached. That pool of information is the chip’s analog for synapses; it reflects the state of the neuron’s connectivity at any given time. It is supervised by adjacent x86 cores, which impose an external clock to correct the neurons’ firing rhythms. The supervisor cores also periodically force the neurons to check their memory against the rest of the group, or recalculate the strength of their connections.

Intel Loihi 2

This structure matches the hierarchical and parallel aspects of the brain’s organization. In addition, Loihi 2 revised its approach to firing. Loihi 1 fired its neurons in binary: one or zero, nothing in between. But Loihi 2 encodes its spikes as integers, which lets the spikes carry metadata. Spikes with integer values can emulate the catalogue of different electrochemical signals a neuron can send or receive. Furthermore, upstream neurons, closer to the input layer, can exert some influence on neurons downstream.

Developers can interface with Loihi and Lava using Python. (This is starting to sound like quite a tropical adventure.) The Loihi system will eventually be available to researchers, but consumer applications are a low priority.

Major Types of Neural Nets

Neural nets, with their layers and redundancy, excel at handling highly parallel tasks. They also help with tasks that require the user to ingest a huge amount of data in order to identify patterns within it — this is often called “drinking from the firehose.” To get the benefits of big data, we have to be able to process it at a useful speed. Neural nets also excel at manipulating data with metadata or many dimensions.

There are many different individual neural net projects, but they all fall within a few different families of function. Each algorithm is built for a different type of problem, and they all engage in subtly different kinds of machine learning. Here, we’ll discuss four major subtypes of neural networks: convolutional, recurrent, generative adversarial, and spiking.

Convolutional Neural Nets

Convolutional neural nets (CNNs) are “feed-forward” systems, which means that the flow of information is one-way, from input to output. This is the type of neural network that turns up in media reports about AIs that can perform a task but can’t explain how they arrived at their answers. It’s not because they’re being stubborn; convolutional neural nets simply aren’t built to show their work. They consist of an input layer, one or more hidden layers, and an output layer or node.

CNNs are often used for processing images. Their namesake operation, convolution, slides a small grid of weights across the input, which is why CNNs do great work on two-dimensional arrays of data, such as images and other matrices. Under the hood, these neural nets are applying a long mathematical formula that lets them perform operations on not just two numbers or terms or equations, but a whole body of data, like upscaling an image.
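The core operation of a CNN can be sketched in a few lines of Python. This is an illustration under simplifying assumptions (no padding, a stride of one, a single hand-picked kernel), not how any production library implements it. The function name `convolve2d`, the sample image, and the kernel values are all invented for the example.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    As in most CNN libraries, the kernel is not flipped, so strictly speaking
    this is a cross-correlation.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map

# Made-up 4x4 "image": dark on the left half, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# A hand-picked kernel that responds to left-to-right jumps in brightness.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
# feature_map == [[0, 18, 0], [0, 18, 0], [0, 18, 0]]
```

The middle column lights up because that is where the dark region meets the bright one. In a trained CNN the kernel values are learned rather than chosen by hand, and many kernels run side by side.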

Being feed-forward, however, entails a certain amount of dead reckoning. One way this shows itself is in the way a CNN can perform image recognition and then use its newfound understanding to produce distorted, trippy images derived from its training dataset. In 2016, MIT released an AI that could harness this runaway behavior to “spookify” images, producing a torrent of nightmare fuel just in time for Halloween.

Recurrent Neural Nets

In contrast to the feed-forward approach, recurrent neural nets contain feedback loops. Information is relayed from deeper to shallower levels in the neural net, so that the results of one pass can inform the next. (This is not the same thing as back-propagation, the training method that sends error signals backward through a network; feed-forward nets are trained that way too.) This type of algorithm is capable of self-improvement.

Recurrent neural nets create this feedback by making connections to other neurons in the system, on a scale up to and including having every neuron connected to every other neuron. This redundancy can produce highly accurate results, but there is a ceiling of diminishing returns, not unlike super-sampling anti-aliasing (SSAA). As the algorithm makes more and more passes over data it’s already processed, there’s less and less it can do. In anti-aliasing, going from 2xAA to 4xAA can produce clearly noticeable results, but it’s tough to tell the difference between 8x and 16xAA without a practiced eye.
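What sets a recurrent net apart is that part of its output is fed back in as input on the next step. Here is a minimal sketch of that loop, assuming a single recurrent unit with a tanh activation. The function names and weight values are hypothetical, chosen only to make the effect visible.

```python
import math

def rnn_step(x, h_prev, w_in, w_rec, bias):
    """One time step of a single-unit recurrent cell.

    The new state depends on the current input x AND the previous state h_prev.
    """
    return math.tanh(w_in * x + w_rec * h_prev + bias)

def run_sequence(xs, w_in=1.0, w_rec=0.5, bias=0.0):
    """Feed a sequence through the cell, carrying the state forward each step."""
    h = 0.0
    states = []
    for x in xs:
        h = rnn_step(x, h, w_in, w_rec, bias)
        states.append(h)
    return states

# A single pulse of input followed by silence.
with_feedback = run_sequence([1.0, 0.0, 0.0])
without_feedback = run_sequence([1.0, 0.0, 0.0], w_rec=0.0)
```

With the recurrent weight switched on, the state keeps echoing the first input after the input itself has gone quiet, fading a little each step. With `w_rec=0.0`, the cell forgets the pulse immediately.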

This type of neural network can be trained using gradient descent, a method of analysis often pictured as a three-dimensional landscape of possibilities. Each location in the landscape is one possible setting of the network’s weights, and desired or undesired results make the “terrain”: low error forms the valleys, high error forms the peaks. Training nudges the weights downhill, one small step at a time. As we’ve said before, gradient descent isn’t the best neural network training method, but it is a powerful tool. Recurrent neural nets can give gradient descent a boost by maintaining some memory of what the changing landscape used to be.
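The landscape idea is easier to see with a toy example. The sketch below assumes a made-up error function with just two adjustable values, `w` and `b`, whose lowest point sits at w = 3, b = -1. A real network has vastly more weights and a far bumpier landscape, but the downhill-stepping logic is the same.

```python
def loss(w, b):
    """A made-up bowl-shaped error landscape with its minimum at w=3, b=-1."""
    return (w - 3.0) ** 2 + (b + 1.0) ** 2

def gradient(w, b):
    """The slope of the landscape in the w and b directions."""
    return 2.0 * (w - 3.0), 2.0 * (b + 1.0)

def gradient_descent(w, b, learning_rate=0.1, steps=100):
    """Repeatedly step downhill, against the slope."""
    for _ in range(steps):
        dw, db = gradient(w, b)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b

w, b = gradient_descent(0.0, 0.0)
# w ends up very close to 3.0 and b very close to -1.0
```

The `learning_rate` sets the stride length. Too small and the descent crawls; too large and it can overshoot the valley entirely.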

Spiking Neural Nets

As we’ve seen above, neuromorphic design comes in both physical and digital formats. Instead of a stream of binary data running constantly through a single CPU, spiking neural nets can be software, hardware or both. Decentralized cores, whether physical or logical, fire in a cadence called a “spike train” to convey their signal.

Spiking neural nets often move information using the “leaky integrate-and-fire” model. Each neuron in a spiking net keeps a running value, the counterpart of a biological neuron’s membrane potential, which reflects that neuron’s recent input. More incoming activity pushes the value higher, and when it crosses a threshold, the neuron fires a spike and resets. But the value is “leaky,” in the sense of having a hole in a bucket. As time passes without input, every neuron’s value falls. It’s a little like forgetting. Biologically, not every neuron is active at all times. You can contrast this against the permanently saturated wreath of connections in a recurrent neural net.
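Here is a minimal sketch of a single leaky integrate-and-fire neuron, working in discrete time steps. The leaky value the paragraph above describes is called `potential` here; the leak rate, threshold, and input currents are arbitrary numbers picked for the example, not taken from any real chip or model.

```python
def simulate_lif(input_current, leak=0.9, threshold=1.0, reset=0.0):
    """Return a spike train (a list of 0s and 1s) for a list of input currents."""
    potential = reset
    spikes = []
    for current in input_current:
        # Leak: keep only a fraction of the old value. Integrate: add new input.
        potential = potential * leak + current
        if potential >= threshold:
            spikes.append(1)      # fire
            potential = reset     # and start over
        else:
            spikes.append(0)
    return spikes

# Steady input: the bucket fills faster than it leaks, so the neuron fires regularly.
steady = simulate_lif([0.3] * 10)
# steady == [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]

# Sparse input: each nudge leaks away before the next one arrives.
sparse = simulate_lif([0.3, 0.0, 0.0, 0.0, 0.0, 0.3, 0.0, 0.0, 0.0, 0.0])
# sparse == [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

Both runs use the same neuron. The only difference is timing, which is the point: in a spiking net, when the input arrives matters as much as how much of it there is.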

Spiking nets are not so good at gradient descent, nor optimization problems of that kind. However, they’re great at modeling biological functions. As spiking nets become more complex, they can encode more information within a series of spikes. This allows a much closer computational pass over the diverse functions of the nervous system. We’ve already used spiking neural nets to simulate the nervous systems of C. elegans roundworms and Drosophila fruit flies. Now, researchers are attempting to simulate a human cortical column in real-time.

Another possible direction for spiking neural network research is into additional levels of abstraction. Researchers are working on creating a spiking net wherein each individual neuron contains its own tiny neural network.

Generative Adversarial Neural Networks (GANs)

One type of neural net with rising popularity is the generative adversarial neural network, or GAN. GANs are another evolution of artificial intelligence, frequently used to alter or generate images. The “adversarial” part means that these neural nets are built to compete with themselves.

Just as Cerberus had three heads, within a GAN there are often two separate neural nets with their own intentions, one generative and one discriminative. The generative model produces a result, often an image, and tries to “fool” the discriminative model into accepting it, to see how close it can get to a desired output. If the discriminative side isn’t fooled, the result is discarded. The results of this trial, both the success of the generative side and the content it made, are filed away. Sometimes the learning is supervised, and sometimes it is not. But in either case, after each round of judgment, the GAN goes back to the drawing board and tries again. Together, both sides iterate toward success.
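The back-and-forth between the two sides can be sketched with a deliberately tiny toy. Instead of images, the “real” data here is just numbers clustered around 4.0. The generator is a two-parameter line and the discriminator is a single logistic unit. Everything in this sketch (the function name `train_toy_gan`, the data distribution, the learning rate) is an assumption made for illustration, and it is nowhere near a recipe for training a real GAN.

```python
import math
import random

def sigmoid(x):
    """Numerically safe logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def train_toy_gan(steps=2000, lr=0.01, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator: fake = a * noise + b
    w, c = 0.0, 0.0   # discriminator: D(x) = sigmoid(w * x + c), "how real is x?"
    for _ in range(steps):
        real = rng.gauss(4.0, 0.5)      # a sample of made-up "real" data
        noise = rng.gauss(0.0, 1.0)
        fake = a * noise + b            # the generator's attempt

        # Discriminator's turn: raise D(real), lower D(fake).
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        w += lr * ((1.0 - d_real) * real - d_fake * fake)
        c += lr * ((1.0 - d_real) - d_fake)

        # Generator's turn: adjust a and b so the updated D rates the fake higher.
        d_fake = sigmoid(w * fake + c)
        push = (1.0 - d_fake) * w       # direction in which D(fake) rises
        a += lr * push * noise
        b += lr * push
    return (a, b), (w, c)

(gen_a, gen_b), (disc_w, disc_c) = train_toy_gan()
```

Each loop iteration is one round of judgment: the discriminator sharpens its eye, then the generator adjusts its forgery. Results depend on the seed and settings, and even toy GANs can oscillate rather than settle, which hints at why the full-size versions are notoriously finicky to train.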

Deepfakes

GANs are capable of producing deepfakes: unique, photorealistic images of people who don’t exist in the real world. To do this, the neural net looks through many photos of real humans, gathering data on how we differ from one another, and on the ways in which we resemble one another. (In effect, this is brute-force phenotyping.) Then, once the GAN’s discriminative side is ready, the generative side can start trying to produce its own original work. One example is Nvidia’s StyleGAN, which can produce images of startling, deceptive realism. There’s even a derivative project that challenges viewers to identify whether a given StyleGAN picture of a person is real or fake.

The results of a GAN’s labor can be so realistic, in fact, that in 2019 the state of California (the home of both 2257 forms and Hollywood) instituted a law banning the use of technology like GANs to create deepfake pornography of a person without their consent. The state also outlawed the distribution of manipulated video of political candidates within two months of an election. DARPA is trying to keep abreast of this A/V arms race by instituting an entire division to study GANs, and find ways to defeat them.

Yikes…

While this all sounds very stressful, there are uses for GANs that don’t involve scraping the internet for public-facing profile photos. One application is particle physics; physicists require exquisite certainty in their measurements before they are willing to say they’ve found a new particle or explained a phenomenon.

Another place GANs excel is game theory. Presented with a list of rules and priorities, GANs can assess the likely choices of participants, and use that probability spread to predict the endgame. This type of neural net is also under study for use improving astronomical images, and predicting gravitational lensing.

Summer 2021 saw the release of OpenAI’s Codex, a generative neural net capable of improving its own software. The model can translate natural language to code. It can also generate snippets of intelligible code, after being fed a huge corpus of public code from GitHub. While Codex is a full-fledged neural net in its own right, it could also work as part of a much larger, hierarchical system. Codex’s behavior resembles the first faltering sparks of a lone neuron as it establishes its first synapses. It also shows the limitations of the technology. Neural networks can learn to correct their assumptions, but the reach of AI still exceeds its grasp. Integrating multiple diverse models is the path of the future.

Where Do Neural Networks Fail?

Neural networks are great at fulfilling specific and well-constrained requests, but they can be overeager. The great strength of computers is the speed at which they can perform repetitive operations. These rapid iterations also make it possible to over-train a neural net. When that happens, its dead reckoning goes totally awry. An over-trained AI can produce some remarkably strange images, and it’s not very useful for predictive purposes — for example, weather forecasting.

Ultimately, though, these sundry weaknesses are minor concerns compared to the problem of growth. To get more powerful a neural net has to get bigger, and therein lies the rub. Neural nets can’t scale infinitely. Their scaling efficiency is actually worse than a regular datacenter, because of the very thing that makes neural nets so capable. The central concept of a layered neural network, its layered depth and redundancy, demands an exponentially increasing amount of power. Thus far we’ve been using brute force to achieve our ends, which works — to a point.

This problem with power scaling is why Intel is using Loihi’s low power consumption as a primary selling point. Eventually, the combined challenges of power use and thermal dissipation will put a hard limit on our ability to just link up more of these chips to make bigger and more sophisticated AIs.

Final Thoughts

The difference between a neural network and an artificial intelligence is largely a matter of opinion. Is a neural net an artificial intelligence in and of itself? Or is an AI made of subordinate neural networks? The only difference is the level of abstraction at which the speaker chooses to make the distinction.

One thing everyone seems to agree on is that neural nets can’t do what they do without data. Big data. As edge computing and data science take off, a whole new realm of information opens to our analysis. There is a staggering amount of raw data produced every day. It is up to us to find creative and clever ways to use it.

Feature image by Mike MacKenzie (Flickr)

Source From Extremetech
Author: Jessica Hall