Neural Networks — Explained.

Christina Vadivelu
9 min read · Apr 19, 2021

Neurons.

I’m sure you’ve heard of this term.

If not, I think it’s safe to say you’ve been living under a rock. Neurons have been the talk of the town — especially with optogenetics being a super hot field right now.

Neurons are the fundamental units of the brain and the nervous system. They’re responsible for carrying information throughout the human body. Neurons use both electrical and chemical signals to help coordinate the necessary functions of life.

Okay — neurons sound pretty dope. But what’s this whole thing about neural networks?

I’m glad you asked 😉

Machine Learning — Surface-level explanation

In order for us to understand how neural networks work — we first need to understand Machine Learning.

It basically means what it says — it’s making our computer learn.

Let’s do a quick Google search for the definition of learn:

Google — the answer to all my questions 😍

Notice the word “experience”.

Hmm…interesting.

Let’s back it up a bit and start with Human Learning (or Classical Programming), and then move onto Machine Learning afterwards.

In classical programming, I would create a set of instructions/directions for the computer in order for it to solve a specific problem. In this case, I — the programmer — would have to understand all the aspects of the problem that I am trying to get the computer to solve, and also know exactly how to get to the solution.

For example, let’s say that I am trying to write a program that can distinguish a triangle from a square. One way I could get to the solution is by counting the number of corners. If the program detects 3 corners, it will classify the shape as a triangle. If it sees 4 corners, it will classify it as a square. We’ve gotten to the solution of this problem.
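Here’s a minimal sketch of that classical-programming approach. The function name and the idea of representing a shape as a list of corner coordinates are my own assumptions for illustration — the key point is that a human wrote the rule:

```python
# Classical programming: the rule (count the corners) is hand-written by a human.
# A shape is assumed to be given as a list of its corner coordinates.

def classify_shape(corners):
    """Classify a shape by counting its corners."""
    if len(corners) == 3:
        return "triangle"
    elif len(corners) == 4:
        return "square"
    else:
        return "unknown"

print(classify_shape([(0, 0), (1, 0), (0, 1)]))          # triangle
print(classify_shape([(0, 0), (1, 0), (1, 1), (0, 1)]))  # square
```

Notice that the program never “learns” anything — it only ever does what the if-statements say.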

But what about machine learning?

In a very abstract sense → Machine Learning = learning from examples

Machine learning is a branch of artificial intelligence (AI) that is focused on building applications that learn from data and improve their accuracy over time without being programmed to do so.

So essentially, machine learning is the development of computer programs that take data and automatically learn and improve from experience without being explicitly programmed.

If we took the same example I used from above — in machine learning — we would design a learning system where we would input a bunch of shapes and their class (triangle or square). The computer would then be capable of learning the properties used to differentiate between the two (the number of corners) on its own.

The goal is that once the machine has learnt all these properties, we can give it a random image — one that the computer has never seen before — of a new triangle/square, and the computer should be able to classify the shape correctly.
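To make the contrast concrete, here’s a toy “learning from examples” sketch — not a real machine learning library, just a hypothetical illustration. Instead of hand-coding the rule, the program infers the corner-count-to-label mapping from labeled shapes:

```python
# A toy learner: it infers which corner count maps to which label
# from (corners, label) training examples, instead of being told the rule.

def train(examples):
    """Learn a corner-count -> label rule from labeled examples."""
    rule = {}
    for corners, label in examples:
        rule[len(corners)] = label
    return rule

def predict(rule, corners):
    """Apply the learned rule to a shape the program may never have seen."""
    return rule.get(len(corners), "unknown")

training_data = [
    ([(0, 0), (2, 0), (1, 2)], "triangle"),
    ([(0, 0), (1, 0), (1, 1), (0, 1)], "square"),
]
rule = train(training_data)

# A brand-new triangle, never seen during training:
print(predict(rule, [(0, 0), (3, 0), (0, 4)]))  # triangle
```

Real machine learning models are vastly more sophisticated, but the shape of the workflow — train on labeled examples, then predict on unseen data — is the same.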

Neural Networks in the Brain

Neurons collect signals from others through dendrites. Each neuron has multiple dendrites. The neuron sends out spikes of electric activity through the axon, which can split into thousands of branches. At the end of each branch, there is a synapse that converts the activity from the axon into effects that inhibit or excite the activity of the contacted (target) neuron. When a neuron receives excitatory input that is large compared to its inhibitory input, it sends a spike of electric activity (an action potential) down its axon.

Dendrites are responsible for receiving chemical signals from the axon terminals of other neurons. They convert these signals into small electric impulses and transmit them in the direction of the cell body.

The idea behind a neural network (in programming) is to simulate a bunch of densely interconnected brain cells — but inside a computer, so that you can get the computer to learn things, recognize patterns, and essentially make decisions in the way a human brain would.

Neural networks in programming allow us, in some sense, to “copy” how the brain functions — but of course, in a much less complex way 🧠

Neural Networks in Programming

First — what’s a neuron?

In programming, a neuron is basically a function. This means that it takes some input, applies some logic, and then outputs the result.

Essentially, a neuron learns.

A neuron will receive one or more input signals. These input signals come either from the raw data set or from neurons positioned at a previous layer of the neural network.

A neuron takes a group of weighted inputs from each incoming synapse, applies an activation function to their sum, and passes the result on to all the neurons in the next layer.

What are weighted inputs?

They control the signal between two neurons. A weight decides how much influence the input will have on the output. They act as the parameter within a neural network that transforms input data within the network’s hidden layers. As an input enters the node, it gets multiplied by a weight value and the resulting output is either observed or passed to the next layer in the neural network.
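A single artificial neuron can be sketched in a few lines of Python. The sigmoid activation and the specific weight/bias values below are my own arbitrary choices for illustration — in a real network, the weights are what get learned:

```python
import math

def sigmoid(z):
    """A common activation function: squashes any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Multiply each input by its weight, sum them up, then activate."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Made-up numbers: two inputs, two weights, one bias.
output = neuron([0.5, 0.9], [0.8, -0.2], bias=0.1)
print(output)  # some value strictly between 0 and 1
```

The weights decide how much each input matters; training a network is essentially the process of nudging these weights until the outputs are useful.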

Let’s look at an example:

Let’s say I am trying to understand the relationship between the number of pages in a book, and the number of pages people actually read from that book.

I’m going to collect many examples of the number of pages in books — which will be x — and how many pages people actually read from those books — which will be y. I expect that there will be some relationship between them — which will be f.

I need to tell the program the kind of relationship I expect to see (positive correlation, negative correlation, straight line, etc.), and the machine will figure out the actual line it needs to draw.

The goal: given a new book, the machine can apply the relationship f it found and tell me how many pages I can expect people to actually read.
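If we assume the relationship is a straight line, the machine can find f with a least-squares fit. The data points below are made up purely for illustration:

```python
# Fit a straight line y = m*x + b by least squares, in pure Python.
# Made-up data: (pages in the book, pages people actually read).
data = [(100, 90), (200, 150), (300, 200), (400, 240), (500, 270)]

n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

# Slope = covariance(x, y) / variance(x); intercept follows from the means.
m = (sum((x - mean_x) * (y - mean_y) for x, y in data)
     / sum((x - mean_x) ** 2 for x, _ in data))
b = mean_y - m * mean_x

def f(pages):
    """The learned relationship: expected pages read for a book of this length."""
    return m * pages + b

print(f(350))  # an estimate for a 350-page book the model never saw
```

This is exactly the “learning from examples” idea: I supplied the form of the relationship (a line), and the machine found the particular line from the data.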

So now for the million dollar question — what’s a neural network?

You guessed it — a neural network is a network of functions (f).

The neural network has a bunch of different functions that are all connected. Credits.

How does it work?

A neural network receives inputs and passes them along to be processed by a bunch of different experts — which are just clusters of neurons that have been trained to handle that specific input.

After these inputs have been processed by one group of experts, their outputs are analyzed and passed to a new group of experts. The specific experts that are used depend on what the output was. Each outcome maps to a different group of experts.

Each expert behaves differently. When a new type of task is encountered, a new cluster of experts is trained specifically for that task.

An example might help you understand this a bit more:

Let’s say that a neural network is trained to watch a video of a basketball and predict its motion. The neural network learns to recognize “sphere”, “orange”, and “textured” as a basketball, and then sends its data to a cluster of expert neurons that handle “falling objects”.

The neural network treats the ball as a falling object and expects it to drop with a parabolic motion. The experts, after many examples of falling basketballs, can accurately predict the motion of the basketball.

Then the same network is given images of a baseball. Now the program will recognize “sphere” (like with the basketball) but will also recognize “white” and “stitched”. This is a new case — which calls for a new cluster of expert neurons. After some time, these new neurons learn to predict the motion of a baseball.

These experts independently rediscovered parabolic motion when they identified the baseball.

The parabolic motion of the baseball can then be compared to that of the basketball. If the neural network sees that the two clusters of experts behave the same way, then any instance where a baseball is identified can be fed to the same “falling objects” experts that the basketball uses.

Various types of Neural Networks

Now I know what you’re thinking — aren’t there different types of neural networks? I swear I’ve heard people say that they’re artificial.

You’re right! There are various types of neural networks you can create (CNNs, ANNs, RNNs, etc.) and each of them has different purposes, strengths, and weaknesses. Let’s take a look.

Artificial Neural Networks (ANNs)

ANNs use different layers of mathematical processing to make sense of the information they’re fed. ANNs usually consist of dozens to millions of artificial neurons (units) arranged in these layers.

ANNs are also known as Feed-Forward Neural Networks because inputs are processed only in the forward direction (unlike RNNs, which we’ll get to soon). In feed-forward networks, data is transferred from the input layer to the hidden layers and then finally to the output layer. There aren’t any cycles/loops of data transfer.
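The forward-only flow can be sketched in a few lines. Everything below — the layer sizes, tanh activation, and weight values — is a made-up toy example, not a real trained network:

```python
import math

def layer(inputs, weights, biases):
    """One fully connected layer: each output neuron activates the
    weighted sum of all its inputs plus a bias."""
    return [math.tanh(sum(x * w for x, w in zip(inputs, ws)) + b)
            for ws, b in zip(weights, biases)]

# Data flows strictly forward: input -> hidden -> output. No loops.
x = [0.5, -0.3]
hidden = layer(x, weights=[[0.1, 0.4], [-0.2, 0.3]], biases=[0.0, 0.1])
output = layer(hidden, weights=[[0.7, -0.5]], biases=[0.2])
print(output)  # a single output value between -1 and 1
```

Note how `output` is computed from `hidden`, which is computed from `x` — the data never flows backward, which is exactly what makes this a feed-forward network.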

Uses:

  • Predicting stock trends
  • Predicting diseases/cancer
  • Classifying/recognizing faces, etc.

Advantages:

  • Storing information on the entire network
  • Ability to work with incomplete knowledge
  • Having fault tolerance
  • Having a distributed memory

Disadvantages:

  • Hardware dependence
  • Unexplained behaviour of the network
  • Determination of proper network structure

Convolutional Neural Networks (CNNs)

CNNs use a variation of multilayer perceptrons and contain one or more convolutional layers that can either be entirely connected or pooled (gathering together small sets of data that are assumed to share the same value of a characteristic).

A CNN is a deep learning algorithm that can take an input image, assign learnable weights and biases to various aspects/objects in the image, and then differentiate one from another.

Fun fact: The architecture of CNNs is similar to the connectivity patterns of neurons in the human brain and was inspired by the organization of the visual cortex. Individual neurons respond to stimuli only in a restricted region of the receptive field (the region of visual space where changes in luminance influence the activity of a single neuron).

Uses:

  • Image classification
  • Facial recognition
  • Analyzing documents

Advantages:

  • Very high accuracy in image recognition problems
  • Automatically detects the important features without any human supervision
  • Weight sharing

Disadvantages:

  • CNNs do not encode the position and orientation of objects
  • Lack of ability to be spatially invariant to the input data
  • Lots of training data is required

Recurrent Neural Networks (RNNs)

With RNNs, outputs can be fed back in as inputs — meaning that information can flow in cycles rather than only forward. The output that I receive from the first step can be fed as an input for the next step. RNNs save the output of processing nodes and feed the result back into the model.

For example, if we were required to predict the next word of a sentence, we would need to know the previous words. This is where RNNs become useful, because they learn to predict the outcome of a layer by remembering the previous outputs.

Each node in the RNN model acts as a memory cell and continues the computation and implementation of operations. If the RNN’s prediction is incorrect, the system self-corrects and continues working towards the correct prediction during backpropagation (the messenger telling the network whether or not it made a mistake in its prediction).
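The core recurrence can be sketched as a single step that mixes the current input with the previous hidden state — the network’s “memory”. The weights below are arbitrary made-up numbers, not a trained model:

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: the new hidden state depends on both the
    current input x and the previous hidden state h_prev."""
    return math.tanh(w_x * x + w_h * h_prev + b)

# Feed a short sequence through; each step "remembers" what came before
# because the previous hidden state is fed back in.
h = 0.0
for x in [1.0, 0.5, -0.5]:
    h = rnn_step(x, h, w_x=0.6, w_h=0.9, b=0.0)
print(h)  # the final hidden state summarizes the whole sequence
```

This feedback loop is exactly what a feed-forward network lacks, and it’s why RNNs suit sequence problems like predicting the next word in a sentence.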

Uses:

  • Text-to-speech conversions

Advantages:

  • An RNN remembers information through time, which makes it useful in time-series prediction because it can take previous inputs into account. A variant called Long Short-Term Memory (LSTM) is designed specifically to remember over long spans of time.
  • Recurrent neural networks are even used with convolutional layers to extend the effective pixel neighborhood.

Disadvantages:

  • Gradient vanishing and exploding problems
  • Training an RNN is a very difficult task
  • It cannot process very long sequences if using tanh or ReLU as an activation function

Credits go to this article.

And that’s it 🎉

I hope you have a good background context as to what a neural network is, how it works, and the different types that you can create.

There is so much more in the world of neural networks and much depth to these topics that I didn’t cover in this article.

I encourage you to take a look at these articles if you want a deeper explanation:

History of Neural Networks

A Guide to Deep Learning and Neural Networks

As always, if you have any questions/want to connect — feel free to!

Until next time ✌🏽

LinkedIn

Medium
