ABSTRACT
In this study, a new family of distributions, called the Gumbel Marshall Olkin-G (GMO-G) family, was developed using the Transformed-Transformer (T-X) method of generating families of distributions. The probability density function (pdf) of the Gumbel distribution was used as the generator, while the log-logit of the Marshall-Olkin family of distributions was the transformation function. The pdf, cumulative distribution function (cdf), survival function (sf) and hazard rate function (hrf) of the new family were defined, and the properties of the proposed family were derived and studied. The pdf of the new family was expressed as an infinite linear combination of exponentiated-G densities of the baseline distribution. A bivariate extension of the proposed family was derived, and estimation of the parameters of the family by the Maximum Likelihood Estimation (MLE) method was discussed. Three sub-models of the new family, namely the Gumbel Marshall Olkin Exponential (GMO-E), Gumbel Marshall Olkin Normal (GMO-N) and Gumbel Marshall Olkin Weibull (GMO-W) distributions, were derived. Plots of the pdf and hrf of GMO-E and GMO-N were presented, and the properties of GMO-W were studied. The hrf shapes obtainable from these members of the family include increasing, decreasing, constant, right-skewed, left-skewed, bathtub-shaped, reversed bathtub-shaped and reversed J-shaped. The pdf shapes include increasing, decreasing, constant, unimodal, bimodal, symmetric, right-skewed and left-skewed. A simulation study was carried out on the MLEs of the parameters of the GMO-W distribution to ascertain their stability. The potential of the GMO-G family was illustrated by fitting GMO-W to three different data sets; the goodness-of-fit statistics showed that GMO-W provided a better fit than the competing distributions.
TABLE OF CONTENTS
Cover Page
Title Page
Declaration
Dedication
Certification
Acknowledgements
Table of Contents
List of Tables
List of Figures
Abstract
CHAPTER 1 INTRODUCTION
1.1 Background of Study
1.2 Statement of the Problem
1.3 Justification of the Study
1.4 Aim and Objectives
1.5 Scope of the Study
1.6 Definition of Terms
CHAPTER 2 LITERATURE REVIEW
2.1 Introduction
2.1.1 Gumbel distribution
2.2 System of distributions
2.3 Addition of parameter(s)
CHAPTER 3 METHODOLOGY
3.1 The Gumbel Marshall Olkin Family (GMO-G) of Distributions and its properties
3.1.1 The Gumbel Marshall Olkin Family
3.2 Linear representation
3.2.1 Linear representation of the cumulative distribution function of GMO-G
3.2.2 Linear representation of GMO-G family density function
3.3 Statistical Properties of GMO-G
3.3.1 Shapes of the pdf and hrf of GMO-G family
3.3.2 Quantile function of GMO-G family
3.3.3 Median
3.3.4 Mode
3.3.5 Ordinary moments
3.3.6 Incomplete moment
3.3.7 Moment generating function
3.3.8 Probability weighted moments
3.4 Order Statistics
3.5 Entropy
3.6 Parameter Estimation of GMO-G family
3.7 Bivariate Extension of GMO-G family
3.7.1 Conditional density function of GMO-G family
3.8 Conclusion
CHAPTER 4 RESULTS AND DISCUSSION
4.1 Special members of GMO-G family of distributions
4.1.1 Gumbel Marshall Olkin-Exponential (GMO-E) distribution
4.1.1.1 The Model
4.1.1.2 Hazard rate function of GMO-E
4.1.1.3 Illustrative plots of pdf and hrf of GMO-E
4.2 Gumbel Marshall Olkin-Normal distribution (GMO-N)
4.2.1 The Model
4.2.2 Hazard rate function of GMO-N
4.2.3 Illustrative plots of pdf and hrf of GMO-N
4.3 Gumbel Marshall Olkin-Weibull distribution (GMO-W)
4.3.1 The Model of GMO-W
4.3.2 Statistical properties of GMO-W
4.3.2.1 Quantile function
4.3.2.2 Median
4.3.2.3 Moments
4.3.2.4 Moment generating function of GMO-W
4.3.2.5 The Mode of GMO-W
4.3.3 Hazard rate function of GMO-W
4.3.4 Mean residual life function
4.3.5 Entropy of GMO-W
4.3.6 Order statistics of GMO-W
4.3.7 Parameter estimation of GMO-W
4.3.7.1 Simulation study of MLE for GMO-W
4.3.8 Applications of GMO-W to real data sets
4.4 Conclusion
CHAPTER 5 SUMMARY, CONCLUSION, AND RECOMMENDATIONS
5.1 Summary
5.2 Conclusion
5.3 Contribution to Knowledge
5.4 Recommendations
References
Appendices
LIST OF TABLES
4.1 Results of Simulation Study of GMO-W
4.2 Summary of Goodness of Fit Statistics for Data set 1, Data set 2, and Data set 3 of GMO-W
4.3 Estimates based on Maximum Likelihood and Standard Errors for Data set 1, Data set 2, and Data set 3 of GMO-W
LIST OF FIGURES
4.1 Plots of pdf of GMO-E Distribution for some selected Parameter values
4.2 Plots of hrf of GMO-E Distribution for some selected Parameter values
4.3 Plots of pdf of GMO-N Distribution for some selected Parameter values
4.4 Plots of hrf of GMO-N Distribution for some selected Parameter values
4.5 Plots of GMO-W Distribution pdf for different Parameter values
4.6 Plots of GMO-W Distribution hrf for different Parameter values
4.7 Estimated plots of pdf of GMO-W Distribution with other competing pdfs for data set
4.8 Estimated plots of cdf of GMO-W Distribution with other competing cdfs for data set 1
4.9 Plots of pdf of GMO-W Distribution with other competing pdfs for data set 2
4.10 Estimated plots of cdf of GMO-W Distribution with other competing cdfs for data set 2
4.11 Estimated plots of pdf of GMO-W Distribution with other competing pdfs for data set
4.12 Estimated plots of cdf of GMO-W Distribution with other competing cdfs for data set
CHAPTER 1
INTRODUCTION
1.1 BACKGROUND OF STUDY
Probability distributions are relevant in modelling real-life phenomena, and the preference for any distribution is based on its adequate fit and flexibility (Oguntude, 2017). In this era of “big data”, the demand for data analysis has been growing steadily. In many practical areas, the classical distributions do not provide an adequate fit in data modelling (Ahmad et al., 2019). This development has necessitated extended versions of the existing distributions in the literature, to increase their flexibility and enhance their capability to model real-life situations. For instance, the Exponential distribution is limited to modelling lifetime data with a constant hazard function; the Rayleigh distribution has an increasing hazard function only; and the Weibull distribution cannot model a non-monotone failure rate function, notwithstanding its capacity to model increasing, decreasing and constant hazard functions. Also, the cumulative distribution function of the Gamma distribution has no closed form, which makes its mathematical properties difficult to express.
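The hazard-rate limitations listed above can be checked numerically. The sketch below uses the standard closed-form hazard functions of the three distributions; the function and parameter names (rate, sigma, shape, scale) are chosen here for illustration and do not come from the thesis.

```python
def exponential_hazard(x, rate):
    """Exponential hazard h(x) = rate, which does not depend on x."""
    return rate


def rayleigh_hazard(x, sigma):
    """Rayleigh hazard h(x) = x / sigma^2, strictly increasing in x."""
    return x / sigma ** 2


def weibull_hazard(x, shape, scale):
    """Weibull hazard h(x) = (shape/scale) * (x/scale)^(shape - 1).

    Decreasing for shape < 1, constant for shape = 1, increasing for
    shape > 1, but always monotone in x.
    """
    return (shape / scale) * (x / scale) ** (shape - 1)


# Evaluate the Weibull hazard on a small grid for three shape values.
xs = [0.5, 1.0, 2.0, 4.0]
for shape in (0.5, 1.0, 2.0):
    values = [round(weibull_hazard(x, shape, 1.0), 3) for x in xs]
    print(f"shape={shape}: {values}")
```

For every choice of shape, the printed Weibull values move in one direction only, which is why a bathtub-shaped failure rate is out of reach for these classical models.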
Owing to the limitations of classical probability distributions, significant advancements in probability distribution theory have been made through the introduction of new generalized families of distributions. Notable ones include the exponentiated generalized class of distributions (Cordeiro et al., 2013) and the Weibull-G family of probability distributions (Bourguignon et al., 2014). These authors sought to develop generalized distributions that are more robust and more flexible than the existing classical distributions. The generalized distributions have been found to provide a more adequate fit to data sets than many classical distributions. In particular, when a data set is heavily skewed, a generalized distribution tends to produce a better fit than the parent distribution. Hence, attention has shifted in favour of generalized distributions in recent years (Alshawarbeh, 2011).
The flexibility of a distribution can be increased by using an available generalized family of distributions, thereby adding extra shape parameter(s). The role of the additional shape parameter(s) is to vary the tail weight of the resulting compound distribution, thereby introducing skewness (Bourguignon et al., 2012 and Ahmed et al., 2019). Flexibility can also be increased by modifying an existing distribution. For example, two or more classical distributions can be combined, as in the case of the convolution, quotient or product of independent random variables. In addition, some distributions arise as the distributions of functions of continuous random variables; for instance, the composition of the Student t-distribution (Sun, 2011). Tractability and flexibility of a probability distribution ease mathematical computation and help provide the best fit across a variety of data sets, without resorting to a transformation of the data that might affect their originality (Oguntude, 2017).
Some of the methods proposed for generating families of distributions were summarized by Lee et al. (2013) as: the method of differential equations developed by Pearson (1895); the method of transformation (also known as translation) proposed by Johnson (1949); and the method of quantiles proposed by Hastings et al. (1947) and Tukey (1960).
However, since 1980 the methods of generating new families of distributions have shifted to adding parameters to an existing distribution or combining existing distributions. Notable developments of this kind are: the method of generating skew distributions; the beta-generated method; the method of adding parameters; the Transformed-Transformer method (T-X family); and the composite method (Lee et al., 2013). All of these methods, except the Transformed-Transformer method, are limited to generators whose support lies between 0 and 1.
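For orientation, the Transformed-Transformer construction is commonly written as below. The symbols r, R, F, W, a and b are notation assumed here for illustration; the thesis may use different symbols in its methodology chapter.

```latex
% T: random variable with pdf r(t) and cdf R(t), supported on [a, b]
% X: baseline random variable with cdf F(x)
% W: a nondecreasing, differentiable map from [0, 1] into [a, b],
%    with W(0) -> a and W(1) -> b
G(x) \;=\; \int_{a}^{W(F(x))} r(t)\,dt \;=\; R\bigl(W(F(x))\bigr),
\qquad
g(x) \;=\; r\bigl(W(F(x))\bigr)\,\frac{d}{dx}\,W\bigl(F(x)\bigr).

% Typical choices of W, matched to the support of T:
%   T on (0, \infty):        W(F) = -\log(1 - F)
%   T on (-\infty, \infty):  W(F) = \log\!\left(\frac{F}{1 - F}\right)
```

Because W can map the unit interval onto any interval, the generator T is not restricted to the support between 0 and 1, which is the distinction drawn in the paragraph above.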
Some of the notable families of distributions include: the beta family of distributions, the Kumaraswamy family, the transmuted family of distributions, the generalized transmuted family of distributions, the Marshall-Olkin family of distributions, the McDonald-G family of distributions, the Weibull-G family of distributions, the Weibull-X family of distributions, the beta Marshall-Olkin family of distributions, the Kumaraswamy Marshall-Olkin family of distributions, the Gumbel-X family, the Gamma-X family of distributions, the logistic-X family of distributions, the T-Normal family of distributions, the T-Weibull family of distributions, the Lindley family of distributions, the Power Lindley family of distributions and the exponentiated Weibull family of distributions (Alzaatreh et al., 2013).
According to Yousof et al. (2018), the study of a new generalized family of distributions is centred on the following objectives: to produce skewness for symmetrical models; to define special models with different shapes of hazard rate function; to construct heavy-tailed distributions for modelling various real data sets; to make the kurtosis more flexible compared to that of the baseline distribution; to generate distributions that are skewed, symmetric, J-shaped or reversed J-shaped; and to provide consistently better fits than other generalized distributions with the same underlying model.
This study is expected to develop a very flexible family of distributions using the Gumbel distribution as the generator and the Marshall-Olkin family of distributions as the transformer. The new family of distributions is expected to possess the extra parameter(s) embedded in the generator and the transformer used.
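As a rough illustration of this construction, the sketch below evaluates a Gumbel cdf at the log-odds (log-logit) of a Marshall-Olkin cdf built on an Exponential baseline. The parameterization (Gumbel location mu and scale sigma, Marshall-Olkin parameter alpha, Exponential rate lam) and all function names are assumptions made for illustration; the exact form adopted in the methodology chapter may differ.

```python
import math


def exponential_cdf(x, lam):
    """Illustrative baseline cdf F(x) = 1 - exp(-lam * x) for x > 0."""
    return 1.0 - math.exp(-lam * x)


def marshall_olkin_cdf(F, alpha):
    """Marshall-Olkin cdf built on a baseline cdf value F, with alpha > 0."""
    return F / (alpha + (1.0 - alpha) * F)


def gmo_cdf(x, mu, sigma, alpha, lam):
    """Assumed GMO-type cdf: Gumbel cdf at the log-odds of the Marshall-Olkin cdf."""
    G = marshall_olkin_cdf(exponential_cdf(x, lam), alpha)
    t = math.log(G / (1.0 - G))                    # log-logit: (0, 1) -> real line
    return math.exp(-math.exp(-(t - mu) / sigma))  # Gumbel cdf evaluated at t


def gmo_quantile(u, mu, sigma, alpha, lam):
    """Inverse of gmo_cdf, obtained by undoing each step in reverse order."""
    t = mu - sigma * math.log(-math.log(u))        # Gumbel quantile
    G = 1.0 / (1.0 + math.exp(-t))                 # inverse of the log-logit
    F = alpha * G / (1.0 - (1.0 - alpha) * G)      # invert the Marshall-Olkin map
    return -math.log(1.0 - F) / lam                # Exponential quantile


# Round trip: the cdf of the quantile should return the probability level.
x_median = gmo_quantile(0.5, mu=0.0, sigma=1.0, alpha=2.0, lam=1.0)
print(x_median, gmo_cdf(x_median, mu=0.0, sigma=1.0, alpha=2.0, lam=1.0))
```

Feeding uniform random numbers on (0, 1) into `gmo_quantile` would give a simple inverse-transform sampler under the same assumptions, which is one way a simulation study of this kind could draw its samples.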
1.2 STATEMENT OF THE PROBLEM
The limitation of classical distributions in providing approximate representations of samples encountered in statistical practice has led researchers to develop generalized families of distributions. The generalized families of distributions in the literature have been shown to provide a more adequate fit than classical probability distributions in modelling lifetime random processes. They are more flexible and can be used to model data sets with diverse shapes of hazard rate function and pdfs of different skewness. Distributions with a bathtub-shaped hazard rate function are relevant in reliability and survival data analysis.
In the literature, many distributions that have a bathtub-shaped hazard rate function do not have a bimodal pdf, while those that have a bimodal pdf do not have a bathtub-shaped hazard rate function. In this work, a flexible family of distributions is proposed whose sub-models have the capacity to model data sets with a bathtub-shaped hazard rate function and a bimodal pdf. The sub-models should be able to provide a more adequate fit than the existing baseline distributions, in particular extreme value distributions.
Moreover, the sub-models of the proposed family of distributions might have hazard rate functions of other shapes, and heavy kurtosis, to model more complex data sets, including data with outliers and pdfs of various skewness. We therefore intend to work on the “Gumbel Marshall Olkin family of distributions”, with the Gumbel distribution as the generator and the Marshall-Olkin family of distributions as the transformer in the T-X (Transformed-Transformer) method of generating distributions.
1.3 JUSTIFICATION OF THE STUDY
In this era of “big data”, there are a myriad of data sets that need to be analysed statistically with an appropriate probability model, but the classical probability distributions often fail to provide the best fit, so the desired result is not attained. The central limit theorem, by which the Normal distribution is used to approximate other distributions, sometimes does not give a good representation of the distribution of data in real-life situations. Past experience has shown that generalised distributions derived from existing baseline distributions provide a better fit than those baseline distributions in real-life data modelling. In view of this, studies have been directed towards the development of new families of probability distributions from existing baseline distributions that will be robust and flexible enough to handle practical real-life processes that are asymmetric in nature.
1.4 AIM AND OBJECTIVES OF THE STUDY
The main aim of this research is to develop a new family of continuous distributions called the Gumbel Marshall Olkin family of distributions.
The specific objectives considered in this study include:
1. To define the pdf and cdf of the new family of distributions.
2. To study the properties of the new family of distributions.
3. To derive a sub-model and illustrate two sub-models of the new family of distributions.
4. To ascertain the stability of the MLEs through simulation studies.
5. To compare the sub-models of the new family of distributions with other existing distributions that have the same baseline distributions, using real-life data sets.
1.5 SCOPE OF THE STUDY
This study is limited to the generation of the Gumbel Marshall Olkin family of distributions (GMO-G) using the T-X framework, together with its sub-models and statistical characteristics. Properties of the new family such as the shapes of the pdf, cdf, survival function and hazard rate function are investigated. Other properties, such as the quantile function, moments, entropy, order statistics and maximum likelihood estimates (MLEs) of the parameters of the new family, are derived. Two special members of this family, the Gumbel Marshall Olkin-Exponential and Gumbel Marshall Olkin-Normal distributions, are studied. In addition, one special member, the Gumbel Marshall Olkin-Weibull distribution, is studied extensively, and the stability of the MLEs of its parameters is ascertained using a simulation study. Real-life data sets are used to illustrate the potential of this sub-model of the proposed class of distributions. The R statistical software is used for all plots and computations, including cases where an analytical solution is not possible.
1.6 DEFINITION OF TERMS
Definition 1.6.1 Baseline Distribution: This refers to the reference distribution before modification, generalization or extension. It is the same as the parent distribution.
Definition 1.6.2 Bathtub Shape: This describes a curve with decreasing, constant and increasing parts, in that order. It has a constant or flat bottom and steep sides.
Definition 1.6.3 Inverted Bathtub Shape: This describes a curve with increasing, constant and decreasing parts, in that order. It is an upside-down bathtub shape and has one peak.
Definition 1.6.4 Symmetric: A distribution is symmetric if the left and right sides of its curve are mirror images about a line drawn at the centre. Its mean and median are equal, and they coincide with the mode when the distribution is unimodal, as in the bell curve.
Definition 1.6.5 Asymmetric: A distribution is asymmetric if the values of the variable occur at uneven frequencies on either side of the centre. The mean, median and mode generally differ, and the distribution curve tilts either to the left or to the right.
Definition 1.6.6 Skewness: This is a measure of the asymmetry of a probability model about its mean.
Definition 1.6.7 Failure Rate or Hazard Function: This is the instantaneous rate at which a component or system fails, given that it has survived up to that time. It is likened to the force of mortality and to risk.
Definition 1.6.8 Kurtosis: This is a measure of whether the data are heavy-tailed (positive excess kurtosis) or light-tailed (negative excess kurtosis) relative to a normal distribution. There are three types of kurtosis, namely leptokurtic, platykurtic and mesokurtic.
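The skewness and kurtosis definitions above can be made concrete with the usual moment-based sample estimates. The sketch below is a minimal illustration; the function names and the two small data sets are made up for the example and are not taken from the thesis.

```python
def central_moment(data, k):
    """k-th central moment of a sample (divides by n, not n - 1)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n


def sample_skewness(data):
    """Third central moment scaled by the 1.5 power of the variance."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Fourth central moment over squared variance, minus 3 (the normal value)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3.0


symmetric = [1, 2, 3, 4, 5]      # evenly spread about its mean
right_skewed = [1, 1, 1, 2, 10]  # one large value stretches the right tail

print(sample_skewness(symmetric), excess_kurtosis(symmetric))
print(sample_skewness(right_skewed), excess_kurtosis(right_skewed))
```

Positive excess kurtosis corresponds to the leptokurtic case, negative to platykurtic, and a value near zero to mesokurtic.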
Definition 1.6.9 Outlier: This is an observation that differs markedly from the other observations in a data set or group of data.
Definition 1.6.10 Tractable: This means having simple algebraic properties. A distribution is said to be tractable if its cdf and pdf can be expressed analytically in closed form.
Definition 1.6.11 Location Parameter: This is a scalar- or vector-valued parameter whose change shifts the distribution curve either to the right or to the left.
Definition 1.6.12 Scale Parameter: This is a parameter whose change in value shrinks or widens the curve of the pdf.
Definition 1.6.13 Shape Parameter: This is a parameter whose change in value can affect the entire shape of the probability distribution.
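The location and scale definitions can be summarized by the standard location-scale form below. The symbols f_0, mu and sigma are notation assumed here for illustration, with the Gumbel density given as a familiar example.

```latex
% f_0 is a fixed "standard" density; \mu shifts the curve left or right,
% and \sigma > 0 stretches or shrinks it.
f(x;\mu,\sigma) \;=\; \frac{1}{\sigma}\, f_{0}\!\left(\frac{x-\mu}{\sigma}\right),
\qquad -\infty < \mu < \infty,\ \ \sigma > 0.

% Example: the Gumbel (maximum) distribution has standard density
f_{0}(z) \;=\; \exp\!\bigl(-(z + e^{-z})\bigr), \qquad -\infty < z < \infty.
```

A shape parameter, by contrast, cannot be absorbed into a shift or rescaling of x; the Weibull shape parameter is a common example.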
Definition 1.6.14 Support Range: These are the limits or boundaries of the set of values over which the probability density function is defined.
Definition 1.6.15 Memoryless Property: This refers to the independence of the probability of a future event from the occurrence of past events. This means that one cannot use what happened previously to predict what will happen next.
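Stated formally for a lifetime variable, the memoryless property reads as follows. The symbols X, s, t and lambda are notation assumed here, and the Exponential distribution is used as the standard continuous example.

```latex
% Memoryless property for a nonnegative random variable X:
P(X > s + t \mid X > s) \;=\; P(X > t), \qquad s,\, t \ge 0.

% Check for the Exponential distribution with rate \lambda,
% whose survival function is P(X > x) = e^{-\lambda x}:
P(X > s + t \mid X > s)
  \;=\; \frac{e^{-\lambda (s + t)}}{e^{-\lambda s}}
  \;=\; e^{-\lambda t}
  \;=\; P(X > t).
```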
Definition 1.6.16 Generator: This refers to the distribution from which the new or generalized distribution is derived.