ABSTRACT
In this research work we propose a feed-forward neural network for solving constrained optimization problems with inequality and equality constraints. We employ penalty functions that transform the constrained problem into a single unconstrained problem. The constraints are placed into the objective function via a penalty parameter in such a way that any violation of the constraints is penalized. The penalty function so constructed is in fact an energy function for the neural network. Assuming differentiability, a local minimum of the penalty function is found by using a dynamic gradient scheme, which provides a system of differential equations for the input neurons corresponding to the variables of the given optimization problem. These equations can then be solved to give trajectories that converge to the optimal solution of the constrained optimization problem. We found that our approach is simpler and faster, and avoids the rigorous steps and assumptions required by other methods. Mathcad 14 was used in executing the program.
TABLE OF CONTENTS
Title page
Declaration
Certification
Dedication
Acknowledgement
Table of Contents
List of Figures
Abstract
CHAPTER 1: INTRODUCTION
1.1 Background of the Study
1.2 Definition (Activation Function)
1.3 Mathematical Analysis of the Perceptron
1.4 Training Rule
1.5 Multi-Layer Perceptron
1.6 Statement of the Problem
1.7 Motivation for the Study
1.8 Aim and Objectives of the Study
1.9 Significance of the Study
1.10 Limitations of the Study
1.11 Definition of Terms
CHAPTER 2: LITERATURE REVIEW
2.1 Review of Related Literature
2.2 Applications of ANN
CHAPTER 3: METHODOLOGY
3.1 Problems Involving the Kuhn-Tucker Conditions
3.2 Illustrative Examples
3.3 Constraint Qualification
3.4 Why We Use Neural Networks
3.5 Penalty Function Methods
3.6 Optimization with Artificial Neural Networks
CHAPTER 4: RESULTS AND DISCUSSION
4.1 Demonstrating the Application of Our Method with Examples
4.2 Simulation of the ODE of Problem 1 and Graphical Profile
4.3 Simulation of the ODE of Problem 2 and Graphical Profile
4.4 Simulation of the ODE of Problem 3 and Graphical Profile
CHAPTER 5: CONCLUSION AND RECOMMENDATIONS
5.1 Summary
5.2 Conclusion
5.3 Recommendation
References
LIST OF FIGURES
1.1 Mathematical structure of ANN
1.2 The perceptron
1.3 Network training
1.4 Structure of the multilayered perceptron network with a single hidden layer
CHAPTER 1
INTRODUCTION
1.1 BACKGROUND OF THE STUDY
The process of minimizing or maximizing a function is called optimization or, as in some texts, mathematical programming. Optimization is about finding the best possible course of action in a given set of circumstances. Most real-life problems appear as non-linear multivariate problems when expressed mathematically. Sometimes these problems carry conditions that their decision variables must satisfy; in this case we say the problem is constrained, otherwise it is unconstrained. Optimization involves choices informed by the study of the situations and the parameters affecting those situations, in order either to minimize cost/effort or to maximize benefit. It could be a cost minimization function, a profit maximization function, a transportation problem or an assignment problem. According to Avriel (2003), the function to be optimized is called an objective function, a loss function (minimization), an indirect utility function (minimization), a utility function (maximization), a fitness function (maximization), or in certain fields, an energy function or energy functional. According to Chong et al., optimization is central to any problem involving decision making, whether in engineering or in economics. The task of decision making entails choosing between various alternatives. This choice is governed by our desire to make the "best" decision. The measure of goodness of the alternatives is described by an objective function or performance index. Optimization theory and methods, such as the penalty function method, the Lagrange multiplier method, the augmented Lagrange multiplier method for inequality constraints, quadratic programming and the gradient projection method for equality and inequality constraints, deal with selecting the best alternative in the sense of the given objective function.
Optimization problems can be linear or non-linear, convex or concave, constrained or unconstrained, and so on. Whatever the case, in this work we are concerned with solving some non-linear constrained optimization problems using an artificial neural network approach.
Artificial neural network models are based on the neural structure of the brain. The brain learns from experience, and so do artificial neural networks. Previous research has shown that artificial neural networks are suitable for pattern recognition and pattern classification tasks due to their non-linear, nonparametric, adaptive-learning properties. Artificial neural networks are widely used in various branches of engineering and science, and their unique ability to approximate complex non-linear mappings makes them a useful tool in quantitative analysis.
Neural networks adopt a different approach to problem solving from conventional solution methods (Smith, 1990). Conventional methods use an algorithmic approach, i.e. the computer follows a set of instructions in order to solve a problem. Unless the specific steps that the computer needs to follow are known, the computer cannot solve the problem. This restricts the problem-solving capability of conventional methods to problems that we already understand and know how to solve. Neural networks process information in a similar way to the human brain. The network is composed of a large number of highly interconnected processing elements (neurons) working in parallel to solve a specific problem. Neural networks learn by example (Shriver, 1988). They cannot be programmed to perform a specific task. The examples must be selected carefully, otherwise useful time is wasted or, even worse, the network may function incorrectly. The disadvantage is that, because the network finds out how to solve the problem by itself, its operation can be unpredictable. Conventional methods, on the other hand, use a cognitive approach to problem solving: the way the problem is to be solved must be known and stated in small, unambiguous instructions. These instructions are then converted to a high-level language program and then into machine code that the computer can understand. Such machines are totally predictable; if anything goes wrong, it is due to a software or hardware fault. Neural networks and conventional algorithmic methods are not in competition but complement each other. There are tasks that are more suited to an algorithmic approach, such as arithmetic operations, and tasks that are more suited to neural networks. Moreover, a large number of tasks require systems that use a combination of the two approaches (normally a conventional method is used to supervise the neural network) in order to perform at maximum efficiency.
The true power and advantage of neural networks lies in their ability to represent both linear and non-linear relationships and in their ability to learn these relationships directly from the data being modelled. Traditional linear models are simply inadequate when it comes to modelling data that contains non-linear characteristics. In this work, one model of neural network is selected from among the main network architectures used in engineering. The basis of the model is the neuron structure shown in Figure 1.1. These neurons act like parallel processing units (Stefan et al., 1980). An artificial neuron is a unit that performs a simple mathematical operation on its inputs and imitates the function of biological neurons and their unique process of learning. The ANN attempts to recreate a computational mirror of the biological neural network, although the two are not truly comparable, since the number and complexity of the neurons in a biological neural network are many times greater than in an artificial neural network. An ANN comprises a network of artificial neurons known as nodes. These nodes are connected to each other, and each connection is assigned a value representing its strength: a high value indicates a strong connection. Within each node a transfer function (activation function) is built in. There are three types of neurons in an ANN: input nodes, hidden nodes and output nodes.
The input nodes take in information, presented as activation values: each node is given a number, and the higher the number, the greater the activation. The information is then passed through the network according to the strength of the connections. The activation flows through the network, through the hidden layers, until it reaches the output nodes, which then reflect the input in a meaningful way to the outside world. The difference between the predicted value and the actual value is propagated backwards, being apportioned to the weights of each node according to the amount of the error for which that node is responsible.
Fig. 1.1 Mathematical structure of ANN
From Fig. 1.1, the net input to the neuron is the weighted sum

$v = w_1 x_1 + w_2 x_2 + \dots + w_n x_n = \sum_{i=1}^{n} w_i x_i,$

and the output of the neuron is

$y = f(v),$

where f is the activation function.
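To make this concrete, the following minimal Python sketch computes the output of the single neuron in Fig. 1.1 using a binary step activation (defined formally in Section 1.2); the input values and weights are illustrative assumptions only:

```python
import numpy as np

def step(v):
    """Binary threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1.0 if v > 0 else 0.0

x = np.array([0.5, -1.0, 2.0])   # illustrative input signals x_i
w = np.array([0.8, 0.2, 0.4])    # illustrative connection weights w_i

v = np.dot(w, x)   # net input: weighted sum of the inputs
y = step(v)        # neuron output y = f(v)
print(v, y)        # v = 0.4 - 0.2 + 0.8 = 1.0, so the neuron fires: y = 1.0
```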
1.2 DEFINITION (ACTIVATION FUNCTION)
An activation function is a function that performs a mathematical operation on a neuron's net input, translating the input signal into an output signal.
The most common activation functions are described below. They are:
(a) Threshold or step function:
A threshold activation function is a hard-limiting function and can be of either a binary type or a bipolar type, as shown below.
Binary threshold type
The output of a binary threshold function is

$f(v) = \begin{cases} 1 & \text{if } v \geq 0 \\ 0 & \text{if } v < 0 \end{cases}$

i.e. 1 if the weighted sum of the inputs is positive and 0 if it is negative.
Bipolar threshold type
The output of a bipolar threshold function is

$f(v) = \begin{cases} 1 & \text{if } v \geq 0 \\ -1 & \text{if } v < 0 \end{cases}$

i.e. 1 if the weighted sum of the inputs is positive and -1 if it is negative.
(b) Piecewise linear function
This is also called a saturating linear function and can have either a binary or bipolar range for the saturation limits of the output. The mathematical model for a symmetric (bipolar) saturation function is

$f(v) = \begin{cases} 1 & \text{if } v \geq 1 \\ v & \text{if } -1 < v < 1 \\ -1 & \text{if } v \leq -1 \end{cases}$

i.e. a sloping function that produces -1 for a weighted sum of inputs below -1, +1 for a weighted sum above +1, and an output proportional to the input for weighted sums between -1 and +1.
(c) Sigmoid function
The non-linear, curved, S-shaped function is called the sigmoid function. It is the most commonly used type of activation function in constructing neural networks. It is mathematically well behaved, being differentiable and strictly increasing. The function can be written in the form

$f(v) = \dfrac{1}{1 + e^{-v}},$

which approaches 0 for large negative input values and 1 for large positive input values.
(d) Gaussian function
These functions are bell-shaped curves that are continuous. The function is of the form

$f(v) = e^{-v^2 / 2\sigma^2},$

where $\sigma$ determines the width of the bell.
Example
Suppose a neuron has four inputs $x_1, x_2, x_3, x_4$ with estimated weights $w_1, w_2, w_3, w_4$. The output $I$ of the network, prior to the activation function stage, is the weighted sum

$I = \sum_{i=1}^{4} w_i x_i.$

If this weighted sum is positive, then with a binary threshold activation function the output of the neuron is $y = 1$.
Note: all the functions f above are designed to produce values in a bounded range: between 0 and 1 for the binary forms, and between -1 and 1 for the bipolar forms.
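To make the definitions concrete, here is a minimal Python sketch of the four activation functions described above (the function names and test values are our own illustrative choices):

```python
import numpy as np

def binary_threshold(v):
    """(a) Binary step: 1 if v >= 0, else 0."""
    return np.where(v >= 0, 1.0, 0.0)

def bipolar_threshold(v):
    """(a) Bipolar step: 1 if v >= 0, else -1."""
    return np.where(v >= 0, 1.0, -1.0)

def piecewise_linear(v):
    """(b) Symmetric saturating linear: clips v to [-1, 1]."""
    return np.clip(v, -1.0, 1.0)

def sigmoid(v):
    """(c) Logistic sigmoid: smooth, strictly increasing, range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-v))

def gaussian(v, sigma=1.0):
    """(d) Bell-shaped Gaussian: continuous, peak value 1 at v = 0."""
    return np.exp(-v**2 / (2.0 * sigma**2))

v = np.linspace(-3, 3, 7)   # sample net-input values
for f in (binary_threshold, bipolar_threshold, piecewise_linear, sigmoid, gaussian):
    print(f.__name__, np.round(f(v), 3))
```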
1.3 MATHEMATICAL ANALYSIS OF THE PERCEPTRON
The Perceptron unit is simply an artificial neuron structure with a step function as its activation function. A single-layer perceptron is an arrangement of one input layer of neurons feeding forward to one output layer of neurons. The single-layer feed-forward network consists of a single layer of weights, the inputs being connected directly to the outputs. Synaptic links carrying weights connect every input to every output, and a network connected this way is called a feed-forward network. The weighted sum of the inputs is calculated in each neuron node. If the value is above zero, the neuron fires and takes the activated value 1; otherwise it takes the deactivated value -1.
Fig. 1.2 The perceptron
For this network the output z is given by

$z = f\left( \sum_{i} w_i x_i \right),$

where f is the step activation function.
Learning by Gradient Descent Error Minimization
The Perceptron learning rule is an algorithm that adjusts the network weights to minimize the difference between the actual outputs and the target outputs. We can quantify this difference by defining the sum squared error function, summed over all output units i and all training patterns m:

$E = \frac{1}{2} \sum_{m} \sum_{i} \left( t_i^{(m)} - z_i^{(m)} \right)^2,$

where $t_i^{(m)}$ is the target output and $z_i^{(m)}$ the actual output of unit i on pattern m.
1.4 TRAINING RULE
All neural networks must be trained, and a common training algorithm is depicted in the flowchart below (Fig. 1.3).
Figure 1.3 Network training
The general aim of network learning is to minimize this error by adjusting the weights $w_{mn}$. Typically we make a series of small adjustments $w_{mn} \leftarrow w_{mn} + \Delta w_{mn}$ until the error E is 'small enough'. Here $\Delta w_{mn}$ is the weight update of the link connecting the mth and nth neurons of two neighbouring layers. We can determine in which direction to change the weights by looking at the gradient of E with respect to each weight $w_{mn}$. The gradient descent update equation (with positive learning rate $\eta$) is then given by

$\Delta w_{mn} = -\eta \, \frac{\partial E}{\partial w_{mn}},$

which can be applied iteratively to minimize the error.
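A minimal Python sketch of this training loop follows. The perceptron of Fig. 1.2 uses a hard step, whose derivative is zero almost everywhere, so to keep E differentiable we substitute the sigmoid of Section 1.2; the toy OR data, bias term and learning rate are our own illustrative choices:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

# Toy training set (hypothetical): 4 patterns, 2 inputs, target = logical OR.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0, 1, 1, 1], dtype=float)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=2)   # initial weights
b = 0.0                              # bias term
eta = 0.5                            # positive learning rate (eta)

for epoch in range(2000):
    z = sigmoid(X @ w + b)            # actual outputs z for all patterns
    E = 0.5 * np.sum((T - z) ** 2)    # sum squared error over all patterns
    # dE/dw_j = -sum_m (t - z) * z * (1 - z) * x_j, so the update
    # Delta w = -eta * dE/dw becomes:
    delta = (T - z) * z * (1 - z)
    w += eta * (X.T @ delta)
    b += eta * np.sum(delta)

print(np.round(z, 3), "final error:", round(E, 5))
```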
1.5 MULTI-LAYER PERCEPTRON
The most common neural network model is the Multilayer Perceptron (Shriver, 1988; Salchenberger). This type of neural network is known as a supervised network because it requires a desired output in order to learn. The goal of this type of network is to create a model that correctly maps the input to the output using historical data, so that the model can then be used to produce the output when the desired output is unknown. A multilayer perceptron has the same structure as a single-layer perceptron but with one or more hidden layers, which carry out intermediate computations before passing the input on to the output layer. The input layer neurons are linked to the hidden layer neurons, and the weights on these links are referred to as input-hidden layer weights. The hidden layer neurons are likewise linked to the output layer neurons, and the corresponding weights are referred to as hidden-output layer weights. The computational units of the hidden layer are known as hidden neurons.
The inputs are fed into the input layer and multiplied by interconnection weights as they are passed from the input layer to the hidden layer. Within the hidden layer they are summed and then processed by a non-linear function (usually the hyperbolic tangent). If there is more than one hidden layer then, as the processed data leaves the first hidden layer, it is again multiplied by interconnection weights, then summed and processed by the second hidden layer, and so on. Finally, the data is multiplied by interconnection weights and processed one last time within the output layer to produce the neural network output; this forward computation is sketched after Fig. 1.4 below. To perform any task, a set of examples of an input-output mapping is needed to train the neural network. Thus, the training sample data have to be fairly large, so as to contain all the required information, and must include a wide variety of data from different experimental conditions and process parameters.
Figure 1.4 Structure of the multilayered perceptron network with a single hidden layer.
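As an illustration of the forward computation just described, here is a minimal Python sketch of a multilayer perceptron with one hidden layer using the hyperbolic tangent; the layer sizes, random weights and input pattern are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative layer sizes: 3 inputs, 4 hidden neurons, 1 output.
W_ih = rng.normal(size=(4, 3))   # input-hidden layer weights
W_ho = rng.normal(size=(1, 4))   # hidden-output layer weights

def mlp_forward(x):
    """Forward pass: weight, sum and squash in the hidden layer, then output."""
    h = np.tanh(W_ih @ x)        # hidden layer: weighted sum + tanh
    return W_ho @ h              # output layer: weighted sum

x = np.array([0.2, -0.5, 1.0])   # an illustrative input pattern
print(mlp_forward(x))
```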
1.6 STATEMENT OF THE PROBLEM
Many constrained optimization problems have been solved using the Kuhn-Tucker condition method and Lagrange multiplier methods. Most of these methods still give cause for concern because of their rigorous steps and assumptions. To solve constrained non-linear optimization problems using artificial neural networks, we employ penalty functions that transform the constrained problem into a single unconstrained problem. The constraints are placed into the objective function via a penalty parameter in such a way that any violation of the constraints is penalized. The penalty function so constructed is in fact an energy function for the neural network. Assuming differentiability, a local minimum of the penalty function is found by using a dynamic gradient scheme, which provides a system of differential equations for the input neurons corresponding to the variables of the given optimization problem.
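As a sketch of this scheme (the thesis computations were carried out in Mathcad 14; the test problem, penalty parameter and step size below are our own illustrative assumptions), consider minimizing $f(x) = (x_1 - 2)^2 + (x_2 - 1)^2$ subject to $x_1 + x_2 = 1$ and $x_1 - x_2 \leq 0$, whose constrained minimum is at $(0.5, 0.5)$:

```python
import numpy as np

# Illustrative problem: minimize f(x) = (x1-2)^2 + (x2-1)^2
# subject to h(x) = x1 + x2 - 1 = 0   (equality)
# and        g(x) = x1 - x2 <= 0      (inequality).
def grad_f(x): return np.array([2*(x[0] - 2), 2*(x[1] - 1)])
def h(x):      return x[0] + x[1] - 1
def g(x):      return x[0] - x[1]

mu = 100.0   # penalty parameter (illustrative)

def grad_E(x):
    """Gradient of the penalty (energy) function
    E(x) = f(x) + mu*(h(x)^2 + max(0, g(x))^2)."""
    ge = 2*mu*h(x)*np.array([1.0, 1.0])               # equality penalty term
    gi = 2*mu*max(0.0, g(x))*np.array([1.0, -1.0])    # inequality penalty term
    return grad_f(x) + ge + gi

# Dynamic gradient scheme: dx/dt = -grad E(x), integrated by Euler's method.
x = np.array([0.0, 0.0])   # initial state of the input neurons
dt = 1e-3                  # integration step size (illustrative)
for _ in range(20000):
    x = x - dt * grad_E(x)

print(np.round(x, 3))   # settles near (0.5, 0.5), the constrained minimum
```

The trajectory converges close to $(0.5, 0.5)$; the small residual offset is the usual penalty-method bias, which shrinks as the penalty parameter grows.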
1.7 MOTIVATION FOR THE STUDY
I am motivated to carry out this research work on ANN optimization because, compared with traditional numerical methods for constrained optimization, the neural network approach has several advantages in real-time applications.
1.8 AIM AND OBJECTIVES OF THE STUDY
The major aim of this work is to study and analyze the application of neural networks to constrained optimization problems using the penalty function approach. Our specific objectives are to:
i) Apply Artificial Neural Networks (ANN) in non-linear optimization theory.
ii) Show the basics of employing penalty functions that transform the constrained problem into a single unconstrained problem.
iii) Solve some non-linear constrained problems by applying artificial neural networks to the penalty function method.
1.9 SIGNIFICANCE OF THE STUDY
Neural networks possess the remarkable ability to derive meaning from complicated or imprecise data, and can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. A trained neural network can be thought of as an expert in the category of information it has been given to analyze. This expert can then be used to provide projections given new situations of interest.
1.10 LIMITATIONS OF THE STUDY
At present there exist quite a number of different architectures for ANNs. In this research work we restricted our study to feed-forward neural networks for solving non-linear optimization problems subject to equality and inequality constraints.
1.11 DEFINITION OF TERMS
Minimum Point: Given a function $f$, a point $x^*$ is said to be a minimum point of $f$ if $f(x^*) \leq f(x)$ for all $x$ in the domain of $f$.
Maximum Point: A point $x^*$ is said to be a maximum point of a function $f$ if $f(x^*) \geq f(x)$ for all $x$ in the domain of $f$.
Convex Function: A function $f$ defined on a convex set $S$ is convex if and only if for all $x_1, x_2 \in S$ and all $\lambda \in [0, 1]$ we have $f(\lambda x_1 + (1 - \lambda) x_2) \leq \lambda f(x_1) + (1 - \lambda) f(x_2)$.
Constraint Function: This is a function which describes the conditions that the decision variables are subjected to. It can be an equation or an inequality.
Linear Optimization Problem: An optimization problem is said to be linear if the objective function is linear and the decision variables are subject to linear constraints.
Nonlinear Optimization Problem: An optimization problem is said to be non-linear if either the objective function or a constraint function is non-linear.
Constrained Optimization Problem: An optimization problem is said to be constrained if its decision variables are subject to some constraints or conditions. It is given as

minimize $f(x)$ subject to $g_i(x) \leq 0$, $i = 1, \dots, m$, and $h_j(x) = 0$, $j = 1, \dots, p$.

Positive Definiteness: A matrix $H$ is said to be positive definite if for every non-zero vector $x$ we have $x^T H x > 0$.
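For instance, one practical way to check positive definiteness of a symmetric matrix (a minimal NumPy sketch; the matrix is an illustrative example) is to verify that all of its eigenvalues are positive:

```python
import numpy as np

H = np.array([[2.0, -1.0],
              [-1.0, 2.0]])   # an illustrative symmetric matrix

# A symmetric H is positive definite iff all its eigenvalues are > 0,
# which is equivalent to x^T H x > 0 for every non-zero vector x.
print(np.all(np.linalg.eigvalsh(H) > 0))   # True: eigenvalues are 1 and 3
```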