ABSTRACT
The research work in this dissertation presents a new perspective on obtaining solutions of initial value problems using Artificial Neural Networks (ANN). A neural network based model for the solution of Ordinary Differential Equations (ODEs) provides a number of advantages over standard numerical methods. First, the neural network based solution is differentiable and is in closed analytic form, whereas most other techniques offer a discretized solution or a solution with limited differentiability. Second, the neural network based method for solving a differential equation provides a solution with very good generalization properties. In our novel approach, we consider first, second and third order homogeneous and non-homogeneous linear ordinary differential equations, and first order nonlinear ODEs. In the homogeneous case, we assume a solution in exponential form and compute a polynomial approximation using the SPSS statistical package; the unknown coefficients of this approximation become the weights from the input layer to the hidden layer of the associated neural network trial solution. To obtain the weights from the hidden layer to the output layer, we form algebraic equations incorporating the default sign of the differential equation and apply the Gaussian Radial Basis Function (GRBF) approximation model. The weights obtained in this manner need not be adjusted. We then develop a neural network algorithm using MathCAD 14 software, which enables us to adjust the intrinsic biases slightly. For first, second and third order non-homogeneous ODEs, we use the forcing function with the GRBF model to compute the weights from the hidden layer to the output layer. The operational neural network model is redefined to incorporate the nonlinearity seen in nonlinear differential equations. We compare exact results with the neural network results for our example ODE problems and find them to be in good agreement; they also compare favourably with existing neural network methods of solution. The major advantage of our method is that it reduces considerably the computational tasks involved in weight updating, while maintaining satisfactory accuracy.
TABLE OF CONTENTS
Title Page
Certification
Declaration
Dedication
Acknowledgement
Table of Contents
List of Tables
List of Figures
Abstract
CHAPTER 1: INTRODUCTION
1.1 Definition of a Neural Network
1.2 Statement of the Problem
1.3 Purpose of the Study
1.4 Aim and Objectives
1.5 Significance of the Study
1.6 Justification of the Study
1.7 Scope of the Study
1.8 Definition of Terms
1.9 Acronyms
CHAPTER 2: REVIEW OF RELATED LITERATURE
CHAPTER 3: MATERIALS AND METHODS
3.1 Artificial Neural Network
3.1.1 Architecture
3.1.2 Training feed forward neural network
3.2 Mathematical Model of Artificial Neural Network
3.3 Activation Function
3.3.1 Linear activation function
3.3.2 Sign activation function
3.3.3 Sigmoid activation function
3.3.4 Step activation function
3.4 Function Approximation
3.5 General Formulation for Differential Equations
3.6 Neural Network Training
3.7 Method of Solving First Order Ordinary Differential Equations
3.8 Computation of the Gradient
3.9 Regression Based Learning
3.9.1 Linear regression: A simple learning algorithm
3.9.2 A neural network view of linear regression
3.9.3 Least squares estimation of the parameters
CHAPTER 4: RESULTS AND DISCUSSION
4.1 First and Second Order Homogeneous Ordinary Differential Equations
4.2 First and Second Order Non-Homogeneous Ordinary Differential Equations
4.3 Third Order Homogeneous and Non-Homogeneous ODE
4.4 First and Second Order Linear ODE with Variable Coefficients
4.5 Nonlinear Ordinary Differential Equations (The Riccati Form of ODE)
4.6 Solving Nth Order Linear Ordinary Differential Equations
4.7 Simulation
4.8 Discussion
CHAPTER 5: SUMMARY, CONCLUSION AND RECOMMENDATIONS
5.1 Summary
5.2 Conclusion
5.3 Recommendations
5.4 Contribution to Knowledge
References
CHAPTER 1
INTRODUCTION
1.1 DEFINITION OF A NEURAL NETWORK
A neural network is fundamentally a mathematical model whose structure consists of a series of inter-connected processing elements whose operation resembles that of human neurons. These processing elements are also known as units or nodes. The ability of the network to process information is embedded in the connection strengths, simply called weights, which adapt when the network is exposed to a set of training patterns (Graupe, 2007).
The human brain consists of billions of nerve cells or neurons, as shown in Figure 1.1a. Neurons communicate through electrical signals, which are short-lived impulses in the electrical potential of the cell membrane. Neuron-to-neuron inter-connections are mediated by electrochemical junctions called synapses, which are located on branches of the cell known as dendrites. Each neuron receives a large number of connections from other neurons, so a multitude of incoming signals constantly arrives at the cell body. There the signals are summed together, and if the resulting signal exceeds some threshold, the neuron generates an impulse in response. This response is transmitted to other neurons through the axon, which is a branching fibre (Gurney, 1997). See Figures 1.1a and 1.1b.
Neural network methods can solve both ordinary and partial differential equations. They rely on the function approximation capability of feed-forward neural networks, which yields a solution written in analytic form. This form employs a feed-forward neural network as the basic approximation element (Principe et al., 1997).
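As a minimal illustration of this approximation element, the sketch below (our own Python illustration with assumed names and sizes, not code from this work) shows how a one-hidden-layer feed-forward network produces a closed-form, everywhere-differentiable function of its input:

import numpy as np

def sigmoid(z):
    # Logistic activation: smooth and differentiable everywhere.
    return 1.0 / (1.0 + np.exp(-z))

def network_output(x, w, b, v):
    # One-hidden-layer feed-forward network with a single input:
    # N(x) = sum_j v[j] * sigmoid(w[j] * x + b[j])
    return np.dot(v, sigmoid(w * x + b))

# Example with 5 hidden units and arbitrary illustrative parameters.
rng = np.random.default_rng(0)
w, b, v = rng.normal(size=5), rng.normal(size=5), rng.normal(size=5)
print(network_output(0.5, w, b, v))

Because N(x) is a finite sum of smooth functions, the resulting solution can be differentiated analytically at any point of the domain.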
Training of the neural network can be done by any optimization technique, which in turn requires the computation of the derivative of the error with respect to the network parameters, by a regression based model, or by basis function approximation. In any of these methods, a neural network solution of the given differential equation is assumed and designated a trial solution, written as a sum of two parts, as proposed by Lagaris et al. (1997). The first part of the trial solution satisfies the conditions prescribed at the initial point or boundary and contains none of the parameters that need adjustment. The other part contains the adjustable parameters, involves the feed-forward neural network, and is constructed in a way that does not affect the prescribed conditions. Through this construction the trial solution satisfies the initial or boundary conditions, and the network is trained to satisfy the differential equation itself.
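For a first order initial value problem y'(x) = f(x, y) with y(x_0) = A, this two-part construction takes the following typical form (a sketch of the Lagaris-type trial solution in our notation; the exact form used later in this work may differ):

\[ y_t(x) = A + (x - x_0)\, N(x, \vec{p}) \]

Here N(x, p) is the feed-forward network output with adjustable parameters p. The first term satisfies the initial condition for every choice of p, and the factor (x - x_0) forces the second term to vanish at x_0, so training can concentrate solely on satisfying the differential equation.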
Figure 1.1a: Biological Neuron (Carlos G., Online)
Figure 1.1b: An Artificial Neuron (Yadav et al., 2015)
It is this architecture in Figure 1.1b, and this style of processing, that we hope to incorporate in the neural network solution of differential equations.
1.2 STATEMENT OF THE PROBLEM
In this research, we propose a new method of solving ordinary differential equations (ODEs) with initial conditions through Artificial Neural Network (ANN) based models. The conventional way of solving differential equations with an artificial neural network involves updating all the parameters, weights and biases, during training. This is necessary because the network cannot otherwise predict a solution with an acceptable minimum error. To reduce the error, the error function is minimized, and minimizing the error function demands finding its gradient. This gradient involves the computation of multivariate partial derivatives of the error function with respect to all the parameters, weights and biases, and the independent variable. This is quite involved, as we shall demonstrate later for a first order differential equation, and it is even more difficult when solving second or higher order ODEs, where one needs the second or higher order derivatives of the error function. This research work involves systematically computing the weights such that no updating is required, thereby eliminating the herculean task of finding the partial derivatives of the error function.
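To make this burden concrete, consider a first order ODE y' = f(x, y) with trial solution y_t(x, p) evaluated at sample points x_1, ..., x_m. A standard formulation of the error function and its gradient (our sketch of the conventional approach, not the exact expressions derived later in this work) is:

\[ E(\vec{p}) = \sum_{i=1}^{m} \left[ \frac{dy_t(x_i, \vec{p})}{dx} - f\big(x_i, y_t(x_i, \vec{p})\big) \right]^2 \]

\[ \frac{\partial E}{\partial p_k} = 2 \sum_{i=1}^{m} \left[ \frac{dy_t}{dx} - f \right] \left( \frac{\partial^2 y_t}{\partial x \, \partial p_k} - \frac{\partial f}{\partial y} \frac{\partial y_t}{\partial p_k} \right) \]

Every weight and bias p_k contributes a mixed partial derivative of this kind, and for second or higher order ODEs the mixed derivatives are of correspondingly higher order; this is the computation the proposed method avoids.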
1.3 PURPOSE OF THE STUDY
The main purpose of embarking on this research is to explore an approach that reduces the herculean task involved in weight updating during neural network training, where the parameters are adjusted by minimizing the error function, which in turn involves multivariate partial derivatives with respect to all the parameters and the independent variable.
1.4 AIM AND OBJECTIVES
Aim: The aim of this work is to solve both linear and nonlinear ordinary differential equations using an Artificial Neural Network (ANN) model, by implementing the new approach which this study proposes. We shall achieve the aim through the following objectives:
Objectives: We shall systematically
(i) compute the weights from the input layer to the hidden layer using a regression based model;
(ii) compute the weights from the hidden layer to the output layer using a Radial Basis Function (RBF) model (a sketch of this one-step computation follows this list);
(iii) slightly adjust the biases using a Mathematical Computer Aided Design (MathCAD) 14 software algorithm to achieve the desired accuracy;
(iv) develop a neural network that will incorporate the nonlinearity found in such ODEs as the Riccati type;
(v) suggest a way of tackling nth order ODEs;
(vi) compare our results with analytical results and some other neural network results;
(vii) simulate our results to show how they agree with other solutions.
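As a minimal sketch of objective (ii), the following Python illustration shows how the hidden-to-output weights of a Gaussian RBF model can be obtained in a single step by solving a linear system, with no iterative updating (the target function, centres and width here are illustrative assumptions, not the values used in this work):

import numpy as np

def grbf_matrix(x, centers, sigma):
    # Gaussian radial basis functions: phi[i, j] = exp(-(x_i - c_j)^2 / (2 sigma^2))
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * sigma ** 2))

x = np.linspace(0.0, 1.0, 5)        # sample points
g = np.exp(-x)                      # illustrative target values, e.g. a forcing function
centers, sigma = x.copy(), 0.5      # one centre per sample point (an assumption)

Phi = grbf_matrix(x, centers, sigma)
w = np.linalg.solve(Phi, g)         # hidden-to-output weights in one step
print(np.allclose(Phi @ w, g))      # interpolation check: prints True

Because the Gaussian kernel matrix is positive definite for distinct centres, the system has a unique solution, which is why weights obtained this way need no subsequent adjustment.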
1.5 SIGNIFICANCE OF THE STUDY
A neural network based model for solving differential equations provides the following advantages over standard numerical methods:
a. The neural network based solution of a differential equation is differentiable and is in closed analytic form that can be applied in any further calculation. On the other hand, most other methods, such as Euler, Runge-Kutta and finite difference, give a discrete solution or a solution with limited differentiability.
b. The neural network based method for solving a differential equation makes available a solution with very good generalization properties.
c. Computational complexity does not increase rapidly in the neural network method when the number of points to be sampled is increased, whereas in the other standard numerical methods computational complexity increases rapidly as the number of sampling points in the interval grows. Most other approximation methods are iterative in nature, with the step size fixed before the computation begins. ANN offers some relief from these repeated iterations: after the ANN has converged, we may use it as a black box to obtain numerical results at any randomly picked points in the domain.
d. The method is general and can be applied to systems defined on either orthogonal box boundaries or on irregular, arbitrarily shaped boundaries.
e. Models based on neural networks offer an opportunity to handle difficult differential equation problems arising in many science and engineering applications.
f. The method can be implemented on parallel architectures (Yadav et al., 2015).
1.6 JUSTIFICATION OF THE STUDY
The new approach we are proposing in this research will eliminate the computation of the partial derivatives of the error function, thereby reducing the task involved in using a neural network to solve differential equations.
1.7 SCOPE OF THE STUDY
This study covers first, second and third order linear ODEs and first order nonlinear ODEs with constant and variable coefficients. It is also extended to nth order linear ODEs, all with initial conditions. It does not include ODEs with boundary conditions, other nonlinear ODEs containing the product of the dependent variable and its derivative, or partial differential equations.
1.8 DEFINITION OF TERMS
Nodes: are computational units which receive inputs and process them into outputs.
Synapses: are connections between neurons. They determine the information flow between nodes.
Weights: are the respective signaling strengths. The ability of the network to process information is stored in the connection strengths, simply called weights.
Neurons: are the primary signaling units of the central nervous system; each neuron is a distinct cell whose several processes arise from its cell body. A neuron is the basic processor or processing element in a neural network. Each neuron receives one or more inputs over its connections and produces only one output.
Architecture: is the pattern of connections between the neurons, which can be a multilayer feed-forward neural network architecture (Tawfiq & Oraibi, 2013). When a neural network is layered, the neurons are arranged in the form of layers. There are a minimum of two layers: an input layer and an output layer. The layers between the input layer and the output layer, if they exist, are referred to as hidden layers, and their computation nodes are referred to as hidden neurons or hidden units. Extra neurons at the hidden layers raise the network's ability to extract higher-order statistics from (input) data (Alaa, 2010).
Training: is the process of setting the weights and biases of the network for the desired output.
Regression: is a least-squares curve that fits a particular data set.
Goodness of Fit (R²): is a term used in regression analysis to describe how well given data fit the regression model.
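For reference, R² is commonly computed as

\[ R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} \]

where the ŷ_i are the fitted values and ȳ is the mean of the observed data; values close to 1 indicate a close fit.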
Neural Network: is an interconnection of processing elements which resemble human neurons.
Artificial Neural Network: is a simplified mathematical model of the human brain, also known as an information processing system.
Activation function: is a threshold or transfer function (a non-linear operator) which keeps the cell's output between certain limits, as is the case in the biological neuron.
Axon: conducts electric signals down its length.
Bias: is a parameter which helps to speed up convergence. The addition of biases increases the flexibility of the model to fit the given data. Bias determines whether a neuron is activated: the output of an activation function ought to be propagated forward through the network, and the bias term determines whether or not this happens. The absence of bias hinders this forward propagation, leading to undesirable outcomes.
1.9 ACRONYMS
ANN – Artificial Neural Network
BVP – Boundary Value Problem
CPROP – Constrained Backpropagation
FFNN – Feed Forward Neural Network
GRBF – Gaussian Radial Basis Function
IVP – Initial Value Problem
MathCAD – Mathematical Computer Aided Design
MLP – Multi Layer Perceptron
MSE – Mean Squared Error
NN – Neural Network
ODE – Ordinary Differential Equation
PDE – Partial Differential Equation
PDP – Parallel Distributed Processing
PE – Processing Elements
RBA – Regression Based Algorithm
RBF – Radial Basis Function
RBFNN – Radial Basis Function Neural Network
SPSS – Statistical Package for Social Sciences