ABSTRACT
In this dissertation, a new family of distributions called the Exponentiated Gumbel-G (EGu-G) family was developed using the T-X approach proposed by Alzaatreh et al. (2013a). The probability density function (PDF) of the exponentiated Gumbel distribution was used as the generator, while the logit of the cumulative distribution function (CDF) of any continuous random variable served as the transformation function. The PDF, CDF, survival function (SF) and hazard rate function (HRF) of the new family were explicitly defined. Various properties of the proposed family were investigated. The PDF of the new family was expressed as an infinite linear combination of exponentiated-G densities of the baseline distribution. A bivariate extension of the proposed family was derived, and estimation of the parameters of the family by the maximum likelihood estimation (MLE) method was discussed. Taking the baseline distribution as the exponential, power, Lomax and Weibull distributions, we obtained the Exponentiated Gumbel Exponential (EGuE), Exponentiated Gumbel Power (EGuP), Exponentiated Gumbel Lomax (EGuL) and Exponentiated Gumbel Weibull (EGuW) distributions, respectively, as members of the EGu-G family. The HRF shapes obtainable from these members include increasing, decreasing, bathtub and inverted bathtub. The properties of EGuL and EGuW were studied. The effect of the shape parameters on the shapes of these two members was investigated using quantile-based coefficients of skewness and kurtosis. A simulation study was carried out on the MLEs of the parameters of the EGuL and EGuW distributions to ascertain their stability. The potential of the EGu-G family was illustrated through applications of EGuL and EGuW to four different datasets.
TABLE OF CONTENTS
Title Page
Declaration
Certification
Dedication
Acknowledgement
Table of Contents
List of Tables
List of Figures
Abstract
CHAPTER 1: INTRODUCTION
1.1 Background of the Study
1.2 Statement of the Problem
1.3 Rationale for Study
1.4 Aim and Objectives
1.5 Scope of the Study
1.6 Definition of Terms
CHAPTER 2: LITERATURE REVIEW
2.1 Introduction
2.2 Method of Generating Families of Distributions
CHAPTER 3: EXPONENTIATED GUMBEL (EGu-G) FAMILY OF DISTRIBUTIONS AND ITS PROPERTIES
3.1 The Exponentiated Gumbel (EGu-G) Family
3.2 Mathematical Properties
3.2.1 Shapes of the PDF and HRF of EGu-G family
3.2.2 Quantile function of EGu-G family
3.2.3 Useful expansions
3.2.4 Representation of CDF and PDF of EGu-G
3.2.5 Moments
3.2.6 Incomplete moments
3.2.7 Probability weighted moment (PWM)
3.2.8 Mean deviation
3.2.9 Moment of residual life function (MRL)
3.2.10 Entropy
3.3 Order Statistics
3.4 Bivariate Extension
3.5 Estimation
3.6 Conclusion
CHAPTER 4: SPECIAL MEMBERS OF THE EGu-G FAMILY OF DISTRIBUTIONS
4.1 Exponentiated Gumbel Exponential Distribution (EGuE)
4.2 Exponentiated Gumbel Power Distribution (EGuP)
4.3 Exponentiated Gumbel Lomax Distribution (EGuL)
4.3.1 Shapes of PDF and HRF of EGuL
4.3.2 A mixture representation of PDF and CDF of EGuL
4.3.3 Quantile function
4.3.4 Moment
4.3.5 Inequality curves of EGuL distribution
4.3.6 Probability weighted moment (PWM)
4.3.7 Moment of residual life function (MRL)
4.3.8 Entropy
4.3.9 Order statistics
4.3.10 Maximum likelihood estimates
4.3.11 Monte Carlo simulation of MLE for EGuL
4.3.12 Applications of EGuL
4.3.13 Conclusion
4.4 Exponentiated Gumbel Weibull (EGuW) Distribution
4.4.1 The Model
4.4.2 Shapes of PDF and HRF of EGuW distribution
4.4.3 Quantile function of EGuW distribution
4.4.4 Expansion of PDF and CDF of EGuW
4.4.5 Ordinary and incomplete moments of EGuW distribution
4.4.6 Inequality measures
4.4.7 Probability weighted moments (PWM)
4.4.8 Entropy
4.4.9 Order statistics
4.4.10 ML estimates of EGuW parameters
4.4.11 Monte Carlo simulations of MLE for EGuW
4.4.12 Application of EGuW to real data sets
4.4.13 Conclusion
CHAPTER 5: SUMMARY AND CONCLUSION
5.1 Summary
5.2 Conclusion
References
LIST OF TABLES
4.1 The Skewness and Kurtosis of EGuL Distribution for b = 1
4.2 Result of Monte Carlo Simulation of EGuL
4.3 MLEs of EGuL parameters and other competing models for dataset 1 (standard errors in parentheses)
4.4 Cramer-von Mises and Anderson-Darling statistics for dataset 1
4.5 MLEs of EGuL parameters and other competing models for dataset 2 (standard errors in parentheses)
4.6 Cramer-von Mises and Anderson-Darling statistics for dataset 2
4.7 Skewness and Kurtosis of EGuW distribution for b = 1
4.8 Result of Monte Carlo Simulation of EGuW
4.9 MLEs of EGuW parameters and other competing models for dataset 3 (standard errors in parentheses)
4.10 Cramer-von Mises and Anderson-Darling statistics for dataset 3
4.11 MLEs of EGuW parameters and other competing models for dataset 4 (standard errors in parentheses)
4.12 Cramer-von Mises and Anderson-Darling statistics for dataset 4
LIST OF FIGURES
4.1 Plots of PDF of EGuE distribution for selected parameter values
4.2 Plots of CDF of EGuE distribution for selected parameter values
4.3 Plots of HRF of EGuE distribution for selected parameter values
4.4 Plots of PDF of EGuP distribution for selected parameter values
4.5 Plots of CDF of EGuP distribution for selected parameter values
4.6 Plots of HRF of EGuP distribution for selected parameter values
4.7 Plots of PDF of EGuL distribution for selected parameter values
4.8 Plots of CDF of EGuL distribution for selected parameter values
4.9 Plots of HRF of EGuL distribution for selected parameter values
4.10 Plots of estimated PDF of EGuL distribution and other competing models based on dataset 1
4.11 Plots of estimated CDF of EGuL distribution and other competing models based on dataset 1
4.12 Plots of estimated PDF of EGuL distribution and other competing models based on dataset 2
4.13 Plots of estimated CDF of EGuL distribution and other competing models based on dataset 2
4.14 Plots of CDF of EGuW distribution for selected parameter values
4.15 Plots of PDF of EGuW distribution for selected parameter values
4.16 Plots of HRF of EGuW distribution for selected parameter values
4.17 Plots of estimated PDF of EGuW distribution and other competing models based on dataset 3
4.18 Plots of estimated CDF of EGuW distribution and other competing models based on dataset 3
4.19 Plots of estimated PDF of EGuW distribution and other competing models based on dataset 4
4.20 Plots of estimated CDF of EGuW distribution and other competing models based on dataset 4
CHAPTER 1
INTRODUCTION
1.1 BACKGROUND OF THE STUDY
The normal distribution has been at the center of most practical statistical studies and developments in probability distribution theory for many years, and most findings in statistics are therefore reported under normality assumptions. However, the increasing collection, tabulation and publication of data in various fields in the late 19th century revealed that the normal distribution was not sufficient for describing phenomena in real-world situations (Kotz and Vicari, 2005). This has led to the development of many theoretical distributions that accommodate the asymmetry observed in some data sets.
Theoretical distributions in statistics can be either discrete or continuous. Our interest in this research is in continuous theoretical distributions. Notable examples include, but are not limited to, the exponential, normal, lognormal, Weibull, Lomax, Frechet, beta, gamma, Rayleigh, Burr III, X and XII, Lindley, uniform, Gumbel, logistic, Pareto, Kumaraswamy, Student-t, chi-square, power, Topp-Leone, kappa, Cauchy and Birnbaum-Saunders distributions.
These theoretical distributions have many applications in both theory and practice, but some of them have obvious limitations that restrict their use. For instance, the exponential distribution, which is a very important distribution in reliability modeling, has the memoryless property and a constant failure rate. It is difficult to find a real-life process whose failure rate is constant. This has greatly limited the applicability of the exponential distribution, as most real-life processes have a failure rate that is increasing, decreasing, bathtub, unimodal or modified unimodal shaped (Almaliki, 2014). Similarly, the Weibull distribution is an important distribution in lifetime modeling, but its hazard rate function is monotone. Certain lifetime data (for instance human mortality, machine life cycles, and data from biological and medical studies) require nonmonotonic hazard rate shapes (Almaliki, 2014). Hence the Weibull distribution is suitable only for data whose hazard rate is monotonic. In fact, a monotonic hazard rate is a feature common to many popular lifetime models.
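The two limitations described above can be checked numerically. The following is a minimal sketch, not taken from the dissertation: the function names and the parameter values (rate 0.8, Weibull shapes 2.0 and 0.5) are illustrative assumptions. It evaluates the hazard rate h(t) = f(t)/S(t) of the exponential distribution, which stays constant, and the closed-form hazard rate of the Weibull distribution, which only ever increases or decreases.

```python
import math


def exponential_hazard(t, rate):
    """Hazard h(t) = f(t) / S(t) for an exponential distribution."""
    pdf = rate * math.exp(-rate * t)
    survival = math.exp(-rate * t)
    return pdf / survival


def weibull_hazard(t, shape, scale):
    """Hazard h(t) = (shape / scale) * (t / scale)**(shape - 1) for a Weibull distribution."""
    return (shape / scale) * (t / scale) ** (shape - 1)


# Illustrative time points and parameter values (assumptions, not from the source).
times = [0.5, 1.0, 2.0, 4.0]
exp_h = [exponential_hazard(t, rate=0.8) for t in times]            # constant in t
weib_inc = [weibull_hazard(t, shape=2.0, scale=1.0) for t in times]  # increasing in t
weib_dec = [weibull_hazard(t, shape=0.5, scale=1.0) for t in times]  # decreasing in t

for t, e, wi, wd in zip(times, exp_h, weib_inc, weib_dec):
    print(f"t={t:4.1f}  exponential={e:.4f}  weibull(k=2)={wi:.4f}  weibull(k=0.5)={wd:.4f}")
```

Neither model can produce a bathtub or inverted bathtub hazard, which is the gap that generalized families aim to fill.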
To accommodate this reality in statistical analysis, many methods of generating univariate distributions with various hazard rate shapes have been developed. Lee et al. (2013) classified the methods of generating families of univariate distributions broadly into those developed prior to 1980 and those developed from 1980 to date. Post-1980 methods involve adding parameter(s) to a distribution or combining two distributions. Prominent among them are the method of addition of parameter(s), the beta-generated method, the transformed-transformer method and the composite method. A detailed discussion of these methods is given in the next chapter.
When a classical distribution is generalized, extra parameters from the generator (another probability distribution) are added to it to induce skewness in the generated distribution, and the properties of the generalized distribution depend largely on the generator. A generator is usually preferred on the basis of its flexibility or tractability. Choosing a generator whose cumulative distribution function (CDF) is tractable is of theoretical importance because the properties of the generated distribution are then easier to study. Furthermore, simulating a random sample from the generated distribution is straightforward when the generator is tractable. The beta family proposed by Eugene et al. (2002) is an example of a family whose generator is not tractable, because the beta CDF is an incomplete beta function ratio with no closed form in general, while the Kumaraswamy family (Jones, 2009; Cordeiro and de Castro, 2011) has a tractable generator.
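To illustrate why a tractable generator makes simulation easy, the sketch below draws random values from a Kumaraswamy distribution by inverse-transform sampling. It relies on the standard Kumaraswamy CDF, F(x) = 1 - (1 - x^a)^b on (0, 1), whose inverse is available in closed form. The function names, the parameter values and the seed are assumptions made for illustration; this is not the dissertation's own code, which was written in R.

```python
import random


def kumaraswamy_cdf(x, a, b):
    """CDF of the Kumaraswamy(a, b) distribution on (0, 1)."""
    return 1.0 - (1.0 - x ** a) ** b


def kumaraswamy_quantile(u, a, b):
    """Closed-form inverse of the Kumaraswamy CDF."""
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)


def sample_kumaraswamy(n, a, b, seed=0):
    """Inverse-transform sampling: push uniform draws through the quantile function."""
    rng = random.Random(seed)
    return [kumaraswamy_quantile(rng.random(), a, b) for _ in range(n)]


# Illustrative parameter values (assumptions, not from the source).
draws = sample_kumaraswamy(5, a=2.0, b=3.0)
print(draws)
```

The same approach applied to a beta generator would need a numerical inversion of the incomplete beta function at every draw, which is the practical cost of an intractable CDF.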
Apart from the beta and Kumaraswamy families, other notable families of distributions are the transmuted, generalized transmuted, Marshall-Olkin, McDonald-G, Weibull-G, Weibull-X, beta Marshall-Olkin, Kumaraswamy Marshall-Olkin, Gumbel-X, Gamma-X, logistic-X, T-Normal, T-Weibull, Lindley, Power Lindley and exponentiated Weibull families of distributions.
The generators of most families of distributions, whether listed above or not, have support on the interval between 0 and 1 or on the positive real line. Very few families have generators with support on the whole real line. Secondly, the generators of these families have at most two of the three possible types of parameter (shape, scale and location). These two observations form the basis of the choice of generator used in this research. We considered the exponentiated Gumbel distribution, which has support on the real line and has shape, scale and location parameters.
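The role of the generator's support can be made explicit by writing out the T-X construction of Alzaatreh et al. (2013a) with the logit transformation named in the abstract. The block below is a sketch under assumed notation: the symbols r and R (PDF and CDF of the generator T), g and G (PDF and CDF of the baseline variable X) and W (the transformation) are labels chosen here, and the dissertation's own notation may differ. Because the logit maps (0, 1) onto the whole real line, the generator must have support on the real line, which is the case for the exponentiated Gumbel distribution.

```latex
% T-X construction with a logit transformation (notation assumed for illustration).
% r, R : PDF and CDF of the generator T, with support on (-\infty, \infty)
% g, G : PDF and CDF of the baseline random variable X
\[
  W\bigl(G(x)\bigr) = \log\!\left(\frac{G(x)}{1 - G(x)}\right) \in (-\infty, \infty),
\]
\[
  F(x) = \int_{-\infty}^{W(G(x))} r(t)\,dt
       = R\!\left(\log\frac{G(x)}{1 - G(x)}\right),
\]
\[
  f(x) = \frac{g(x)}{G(x)\,\bigl[1 - G(x)\bigr]}\;
         r\!\left(\log\frac{G(x)}{1 - G(x)}\right).
\]
```

The density follows from differentiating F(x) with the chain rule, since the derivative of the logit of G(x) with respect to x is g(x) / {G(x)[1 - G(x)]}.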
1.2 STATEMENT OF PROBLEM
Probability distributions have been used over time to model the random behavior of many processes, and many classical distributions have served this end. As the theory developed, researchers showed that classical distributions are often unable to model these processes effectively. This has spurred the need to modify, generalize and extend these classical distributions. To modify a distribution, parameters are added to it from the family used in the modification. Many classical distributions with support on the positive real line have been used as generators for families of distributions. In this work, an attempt is made to extend the work of Al-Aqtash et al. (2015) by using the exponentiated Gumbel distribution, which has an extra (shape) parameter, as the generator in the T-X system of distributions.
1.3 RATIONALE FOR STUDY
The quality of the results obtained in the statistical analysis of data depends heavily on the assumed probability model or distribution. Because of this, considerable effort has been expended in the development of large classes of probability distributions and their extensions. However, there are still many instances where real data do not follow any of the classical or standard probability models. Thus there is a need to develop new models that can handle such data and possibly serve as alternatives to existing models.
1.4 AIM AND OBJECTIVES
The aim of this study is to propose and study a new family of probability distributions called the exponentiated Gumbel-G (EGu-G) family of distributions. The following are the objectives of the study:
I. To define the probability density function (PDF) and CDF of the new family.
II. To study the general properties of the new family.
III. To propose an estimation procedure for the new family of distributions.
IV. To study at least two special members of the new family of distributions.
V. To ascertain the stability of the estimates of the proposed family through a simulation study.
VI. To apply the generated distributions to real-life data and compare them with other distributions that have the same baseline distribution.
1.5 SCOPE OF THE STUDY
This study is concerned with the generation of the exponentiated Gumbel family of distributions using the T-X framework. The properties of the new family, such as the shapes of the PDF, CDF, survival function and hazard function, will be investigated. Other properties such as the quantile function, moments, inequality measures, entropy, order statistics, a bivariate extension and the maximum likelihood estimates (MLEs) of the parameters of the new family will be derived. Two special members of this family will be studied, and the stability of the MLEs of their parameters will be ascertained using a simulation study. The real-life data sets used for illustration were extracted from refereed journals. R statistical software is used for all plots and computations.
1.6 DEFINITION OF TERMS
The definitions of terms used in this study are given in this section.
Definition 1.6.1
Asymmetry: A distribution is said to be asymmetric if its density curve is not a mirror image of itself about any central point.
Definition 1.6.2
Baseline Distribution: This is the same as the parent distribution. It is an existing distribution that is generalized, extended or modified.
Definition 1.6.3
Bathtub Shape: This is a term used to describe a curve that has the shape of a bathtub. A curve with a bathtub shape has an initial decreasing stage, followed by a roughly constant stage and finally an increasing stage.
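Definition 1.6.4
Failure rate: This is the frequency at which a system fails. In lifetime modeling it is the instantaneous rate of failure at a given time among units that have survived up to that time, and it is synonymous with the hazard rate function.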
Definition 1.6.5
Inverted bathtub shape: This is a curve that is characterized by three parts in the following order: increasing, constant and decreasing. Such a curve usually has one mode.
Definition 1.6.6
Kurtosis: Kurtosis is a measure of the tail heaviness (or lightness) of a curve relative to a normal distribution. The curve of a distribution can be leptokurtic, platykurtic or mesokurtic.
Definition 1.6.7
Location parameter: This is a parameter whose change in value shifts the curve to the left or to the right without changing its shape. A location parameter is usually subtracted from the random variable in the PDF of a distribution.
Definition 1.6.8
Outlier: This is an observation that is far detached from the remaining observations in a data set.
Definition 1.6.9
Scale parameter: A scale parameter is a parameter whose change in value stretches or compresses the curve of a PDF without changing its basic shape. It divides the random variable in the density function. A parameter that instead multiplies the random variable in the density function is referred to as a rate parameter.
Definition 1.6.10
Shape parameter: This is a parameter that determines the shape of the probability distribution. A change in its value brings a change in the shape of the density curve. It usually appears as the power of a random variable or may stand alone in the density function.
Definition 1.6.11
Skewness: This is a measure of the asymmetry of a curve.
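Definition 1.6.12
Symmetric: A distribution is said to be symmetric if its density curve is a mirror image of itself about some central point. For a unimodal symmetric distribution with a finite mean, the mode, median and mean are all equal.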
Definition 1.6.13
Tractable: Tractable means easy to work with. In the context of this research, a distribution is most tractable when the CDF and PDF have simple analytic expressions (Ramos et al., 2015).
Definition 1.6.14
Support: This is the set of values of the random variable on which the probability density function is positive.
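Skewness and kurtosis (Definitions 1.6.6 and 1.6.11) can be computed from quantiles alone, which is useful when the moments of a distribution are not available in closed form. The sketch below uses Bowley's skewness and Moors' kurtosis, two widely used quantile-based coefficients. That these are the specific measures used in the dissertation is an assumption, and the function names and the example quantile functions are illustrative only.

```python
import math


def bowley_skewness(quantile):
    """Bowley's coefficient: (Q3 + Q1 - 2*Q2) / (Q3 - Q1), from the quartiles."""
    q1, q2, q3 = quantile(0.25), quantile(0.5), quantile(0.75)
    return (q3 + q1 - 2.0 * q2) / (q3 - q1)


def moors_kurtosis(quantile):
    """Moors' coefficient: (E7 - E5 + E3 - E1) / (E6 - E2), from the octiles."""
    e = [quantile(k / 8.0) for k in range(1, 8)]  # e[0]..e[6] hold E1..E7
    return ((e[6] - e[4]) + (e[2] - e[0])) / (e[5] - e[1])


def exponential_quantile(u, rate=1.0):
    """Quantile function of the exponential distribution (right-skewed)."""
    return -math.log(1.0 - u) / rate


def logistic_quantile(u):
    """Quantile function of the standard logistic distribution (symmetric)."""
    return math.log(u / (1.0 - u))


skew_exp = bowley_skewness(exponential_quantile)
skew_logistic = bowley_skewness(logistic_quantile)
kurt_exp = moors_kurtosis(exponential_quantile)
print(skew_exp, skew_logistic, kurt_exp)
```

Both coefficients depend only on the quantile function, so they exist even for heavy-tailed distributions whose third and fourth moments are infinite, and they are unaffected by changes in location and scale.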