Abstract
Gaussian mixture models are applied in machine learning, specifically in unsupervised learning. Example applications include image segmentation and music classification.
This project shows how the Expectation Maximization (EM) Algorithm is derived and how it is used to soft-cluster data sets into distributions.
The EM Algorithm is used to estimate the parameters of a model quickly and stably, to fill in missing data in a sample, and to find the values of latent variables.
The Gaussian mixture model looks at the distributions underlying the data and groups together data points that belong to a similar distribution. This is done through soft clustering, whereby each point is assigned a probability of belonging to each distribution. Points lying between distributions are also clustered accurately, because the model shows the extent to which a data point falls in each distribution.
The Expectation Maximization Algorithm uses the observed data to obtain optimum values that can be used to generate the model parameters.
Limitations anticipated within this study include:
The Expectation Maximization Algorithm converges slowly, and it converges to a local optimum only.
It also requires both forward and backward probabilities, whereas numerical optimization requires only forward probabilities.
Table of Contents
CHAPTER ONE
GENERAL INTRODUCTION
1.1 Background Information
1.2 Problem Statement
1.3 Objectives of the Study
1.4 Research Method
1.5 Significance of the Study
1.6 Literature Review
CHAPTER TWO
GAUSSIAN MIXTURE MODELS
2.1 Gaussian Distribution (Parameter Estimation)
2.2 Gaussian Mixture Models (Parameter Estimation)
CHAPTER THREE
DERIVATION OF THE EXPECTATION MAXIMIZATION ALGORITHM
3.1 Log-Likelihood
3.2 Convergence in the Expectation Maximization (EM) Algorithm
CHAPTER FOUR
GENERALIZED EXPECTATION MAXIMIZATION
4.1 Likelihood for complete data
4.2 The Expectation Step
CHAPTER FIVE
EXPECTATION MAXIMIZATION (EM) IN GAUSSIAN MIXTURE MODELS
5.1 Expectation Maximization (EM) in Gaussian Mixture Models
5.2 Illustration of the EM Steps
CHAPTER SIX
DISCUSSIONS AND RECOMMENDATIONS
6.1 Discussion
6.2 Study Limitation
6.3 Recommendations
REFERENCES
CHAPTER ONE
GENERAL INTRODUCTION
1.1 Background Information
A mixture can be described as a probability distribution constructed by combining two or more distributions into a new distribution. It can be classified as discrete, finite, or continuous.
A mixture created by combining several Gaussian distributions is known as a Gaussian Mixture Model. It is based on the assumption that each data point is generated from a mixture of a finite number of Gaussian distributions with unknown parameters.
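In symbols, this description corresponds to the standard finite Gaussian mixture density. The notation below is assumed for illustration and is not taken from the project itself: K is the number of components, and \pi_k, \mu_k and \Sigma_k are the mixing weight, mean and covariance of component k.

```latex
p(x \mid \theta) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \qquad \sum_{k=1}^{K} \pi_k = 1,
\qquad \theta = \{\pi_k, \mu_k, \Sigma_k\}_{k=1}^{K}
```

The "unknown parameters" referred to above are the entries of \theta; which component generated a given data point is not observed.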
Gaussian mixture models are applied in machine learning, specifically in unsupervised learning. Example applications include image segmentation and music classification.
They take a clustering approach, in which unsupervised learning is used to find groups of points in a data set that share common characteristics. In the clustering analysis, the Expectation Maximization Algorithm is used to fill in missing data in a sample, to estimate model parameters quickly and stably, and to determine the values of latent variables.
Dempster, Laird and Rubin (1977) introduced the Expectation Maximization Algorithm to obtain the Maximum Likelihood Estimates from incomplete data. They presented a broadly applicable algorithm for computing maximum likelihood estimates from incomplete data at various levels of generality, and derived theory showing the monotone behaviour of the likelihood and the convergence of the algorithm.
1.2 Problem Statement
Most research works have applied the Expectation Maximization Algorithm to obtain Maximum Likelihood Estimates for missing or incomplete data sets in Gaussian mixture models. The EM approach, which is frequently used to estimate the model's parameters, relies mainly on estimating the incomplete data. However, it does not make use of any other data to lessen the uncertainty caused by the missing data.
This project aims to estimate the parameters of Gaussian mixture models using the Expectation Maximization Algorithm.
1.3 Objectives of the Study
The study’s main objective is to estimate the parameters for the Gaussian mixture model using the EM Algorithm.
The specific objectives are to:
I. Derive the Expectation Maximization Algorithm.
II. Estimate the parameters of Gaussian mixture models using the EM Algorithm.
III. Apply the EM Algorithm in the Gaussian Mixture Model.
1.4 Research Method
The method used to estimate parameters with the EM Algorithm is broken down below:
I. Derive the Expectation Maximization Algorithm.
II. The system is given a set of incomplete observed data, under the assumption that the observed data originate from a particular model.
III. The E-Step is applied, in which the values of the missing or incomplete data are estimated from the observed data. In short, the variables are updated.
IV. The M-Step is applied, in which the complete data obtained in the E-Step are used to update the values of the parameters. In short, the hypothesis is updated.
V. Check whether the values have converged. If so, stop; otherwise, repeat steps III and IV until convergence occurs.
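Steps II to V can be sketched in code for the simplest case, a one-dimensional Gaussian mixture. This is a minimal illustration under stated assumptions, not the project's implementation: the function names (`normal_pdf`, `log_likelihood`, `em_gmm_1d`), the synthetic data, the starting values, the variance floor and the stopping tolerance are all hypothetical choices made for this example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Observed-data log-likelihood of the mixture."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)))
        for x in data
    )


def em_gmm_1d(data, weights, means, variances, tol=1e-8, max_iter=500):
    """Fit a 1-D Gaussian mixture by EM, starting from the given initial guesses."""
    weights, means, variances = list(weights), list(means), list(variances)
    n, k = len(data), len(means)
    history = [log_likelihood(data, weights, means, variances)]
    for _ in range(max_iter):
        # E-Step: probability that each component generated each point.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-Step: re-estimate the parameters from the soft assignments.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            # Floor on the variance: an assumption of this sketch, to stop a
            # component collapsing onto a single point.
            variances[j] = max(var_j, 1e-6)
        history.append(log_likelihood(data, weights, means, variances))
        # Convergence check: stop once the log-likelihood no longer improves.
        if history[-1] - history[-2] < tol:
            break
    return weights, means, variances, history


# Synthetic data (hypothetical): 300 points near 0 and 200 points near 6.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(300)]
data += [random.gauss(6.0, 1.0) for _ in range(200)]

weights, means, variances, history = em_gmm_1d(
    data, weights=[0.5, 0.5], means=[1.0, 4.0], variances=[1.0, 1.0]
)
print("weights:", weights)
print("means:", means)
print("variances:", variances)
print("iterations:", len(history) - 1)
```

The `history` list records the log-likelihood after every iteration, which makes it easy to check that the value does not decrease from one iteration to the next. The result depends on the starting values, since EM is only guaranteed to reach a local optimum.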
1.5 Significance of the Study
The world is changing at a fast pace and embracing technology to make life easier. Machine learning is one core aspect that has been embraced by individuals, and largely by companies, to manage data and algorithms and to improve the accuracy of data analysis.
This study seeks to help readers gain a deeper understanding of EM Algorithms for Gaussian mixture models and to broaden their understanding of unsupervised learning, which is an area of machine learning.
1.6 Literature Review
Clustering is a method used to place data points that have similar characteristics into groups. It is a type of unsupervised learning and is divided into two types: soft clustering and hard clustering. In a probability model-based approach, the data are assumed to follow a mixture model of probability distributions, which is fitted using the Expectation Maximization Algorithm, as stated by Yang, Lai and Lin (2012).
Mixture models can be described as a combination of multiple distributions. They use probability as a tool to represent the presence of sub-populations within an overall population and the distribution of the observations.
A Gaussian mixture model is a type of mixture model, meaning that it uses probability to assign data points to a certain number of Gaussian distributions using soft clustering. The idea of Gaussian mixtures was popularized by Duda and Hart (1973).
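The difference between soft and hard clustering can be illustrated with a small sketch. It assumes a two-component one-dimensional mixture whose parameters are already known; the parameter values and the helper names `normal_pdf`, `soft_assign` and `hard_assign` are hypothetical and chosen only for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def soft_assign(x, weights, means, variances):
    """Probability that x belongs to each component, by Bayes' rule."""
    joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]


def hard_assign(x, weights, means, variances):
    """Index of the single most probable component for x."""
    probs = soft_assign(x, weights, means, variances)
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical mixture: two equally weighted unit-variance Gaussians.
weights = [0.5, 0.5]
means = [0.0, 4.0]
variances = [1.0, 1.0]

for x in (0.0, 2.0, 3.0):
    probs = soft_assign(x, weights, means, variances)
    label = hard_assign(x, weights, means, variances)
    print(f"x={x}: soft={[round(p, 3) for p in probs]}, hard={label}")
```

A point midway between the two means receives equal probability under both components, which soft clustering reports directly, whereas hard clustering has to commit it to a single group.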
The Expectation Maximization Algorithm for Gaussian mixtures performs maximum likelihood estimation with missing values. The procedure was introduced by Dempster, Laird and Rubin (1977). It is an iterative approach that cycles between two steps, estimating the missing values and optimizing the model, and the two steps are repeated until convergence occurs. It gives good estimates for missing variables, as will be seen in this paper.
The current values of the parameters are used to calculate weights, and the weighted joint log-likelihood is then maximized in each iteration. In short, each iteration maximizes an expectation, hence the name Expectation Maximization Algorithm.
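Written out for a Gaussian mixture, the weights and the weighted joint log-likelihood take the following standard form. The symbols are assumptions made for illustration rather than the project's own notation: x_1, ..., x_N are the observations, K is the number of components, \theta^{(t)} = \{\pi_k^{(t)}, \mu_k^{(t)}, \Sigma_k^{(t)}\} denotes the current parameter values, and \gamma_{ik} is the weight of component k for observation i.

```latex
% Weights computed from the current parameter values
\gamma_{ik} = \frac{\pi_k^{(t)} \, \mathcal{N}(x_i \mid \mu_k^{(t)}, \Sigma_k^{(t)})}
                   {\sum_{j=1}^{K} \pi_j^{(t)} \, \mathcal{N}(x_i \mid \mu_j^{(t)}, \Sigma_j^{(t)})}

% Weighted joint log-likelihood (the expectation that is maximized)
Q(\theta \mid \theta^{(t)}) = \sum_{i=1}^{N} \sum_{k=1}^{K} \gamma_{ik}
    \left[ \log \pi_k + \log \mathcal{N}(x_i \mid \mu_k, \Sigma_k) \right]

% Parameter update
\theta^{(t+1)} = \arg\max_{\theta} \, Q(\theta \mid \theta^{(t)})
```

The weights \gamma_{ik} sum to one over k for every observation, so each can be read as the probability that observation i came from component k under the current parameters.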
In the field of discrete choice modelling, EM algorithms have been used by Bhat (1997a) and Train (2008a,b).