STOCHASTIC SEARCH VARIABLE SELECTION DIFFUSE IN BAYESIAN VECTOR AUTOREGRESSIVE MODELS


ABSTRACT

The study proposed a Stochastic Search Variable Selection Diffuse (SSVS-Diffuse) method for selecting restrictions in Vector Autoregressive (VAR) models. This was done by eliciting a new class of Stochastic Search Variable Selection (SSVS) prior using a diffuse prior for the variance-covariance matrix, which allows for non-diagonal treatment of the variance-covariance matrix. The performance of the SSVS-Diffuse prior was evaluated using a Monte Carlo experiment with 50 replications, after deriving the posterior distribution, which has no closed-form solution. The study generated different sample sizes, namely T = 50, 100, 200 and 500, from two-variable, three-variable and four-variable VAR models with the VAR order set at one, VAR(1), two, VAR(2), three, VAR(3), and four, VAR(4), and these models were fitted. The VAR models were simulated from a multivariate normal distribution under two scenarios: when the variables were independent and when the variables were correlated at various levels of correlation (very high, high, moderate and low). The forecast performance under these scenarios was evaluated in two ways, depending on the type of forecast. For the point forecast, the Mean Square Forecast Error (MSFE) was used as the performance measure; for the density forecast the energy score, a multivariate performance measure, was used, since VAR models are multivariate models. The SSVS-Diffuse prior outperformed the existing Bayesian VAR and classical VAR models, namely classical VAR, Minnesota, SSVS-SSVS and SSVS-Wishart, in terms of density forecast with minimum energy scores. The study further applied SSVS-Diffuse using the posterior inclusion probability to determine the VAR coefficients that are important to include in the model. The optimal lags obtained using SSVS-Diffuse were compared to the optimal lags obtained using classical methods of selecting lag order, such as the Final Prediction Error (FPE), Akaike Information Criterion (AIC), Schwarz Information Criterion (SC), sequential modified LR test statistic (each test at the 5% level) (LR) and Hannan-Quinn Information Criterion (HQ). In all the cases considered, the posterior inclusion probability of SSVS-Diffuse correctly identified the optimal lags. The classical methods exhibited fluctuations, with SC and HQ failing in some of the cases considered. The study concludes by applying SSVS-Diffuse to real-life data, where SSVS-Diffuse outperformed the existing methods based on historical performance.



TABLE OF CONTENTS

Title page                                                                                                                    i                                           

Declaration                                                                                                                  ii

Certification                                                                                                                iii

Dedication                                                                                                                  iv

Acknowledgement                                                                                                      v

Table of Contents                                                                                                       vi

List of Tables                                                                                                              x

Abstract                                                                                                                      xi

CHAPTER 1: INTRODUCTION                                                                          1                           

1.1 Background of Study                                                                                           1

1.2  Statement of  the Problem                                                                                   5

1.3 Objectives of the Study                                                                                        6

1.4 Justification of Study                                                                                           6

1.5  Definitions of Terms                                                                                            7

1.5.1    Bayesian Terms                                                                                               7

1.5.2    Vectorization                                                                                                  8

1.5.3    Kronecker Product                                                                                       8

1.5.4    VAR coefficients                                                                                          8

1.5.5    Stochastic search variable selection                                                             9

1.5.6    Markov chain Monte Carlo                                                                         9

1.5.7    Distributions                                                                                               10

CHAPTER 2: LITERATURE REVIEW                                                             12

2.1   Introduction                                                                                                        12

2.2  Theoretical Framework                                                                                        12

2.3   Prior Elicitation in Bayesian VAR                                                                      14

2.3.1  Minnesota prior                                                                                                15

2.3.2  Diffuse prior                                                                                                     17

2.3.3 Natural conjugate prior                                                                                      18

2.3.4 Normal diffuse prior                                                                                          19

2.3.5 Extended natural conjugate prior                                                                      20

2.3.6 Independent normal wishart                                                                              21

2.4   Stochastic Search Variable Selection                                                                  26

2.5   Forecasting and Evaluation                                                                                34

2.6   Empirical Application                                                                                         39

2.6.1 Sequential testing procedure                                                                              40

2.6.2 Model selection for VAR models                                                                      41

CHAPTER 3  METHODOLOGY                                                                          45

3.1   Introduction                                                                                                        45

3.2   Likelihood Function                                                                                           45

3.3   Stochastic Search Variable Selection Diffuse Procedure                                  46

3.4   Gibbs Sampling for SSVS-Diffuse                                                                   51

3.4.1 Bayesian computation: Markov chain Monte Carlo diagnostics                      54

3.5   Design of the Monte Carlo Studies                                                                    56

3.5.1 Energy score                                                                                                      57

3.5.2 Mean square forecast error                                                                                 58

3.6   Determination of Optimal Lag Length                                                               58

CHAPTER 4  RESULTS AND DISCUSSION                                                     61

4.1   Introduction                                                                                                        61

4.2    Simulated Numerical Example                                                                          61

4.3    Application to the SSVS-Diffuse                                                                      92

CHAPTER 5 CONCLUSION AND RECOMMENDATIONS                          99

5.1    Summary                                                                                                            99

5.2    Contribution to Knowledge                                                                               100

5.3    Area of Further Research                                                                                   100

5.4     Conclusion                                                                                                        101

References                                                                                                                  102

Appendices                                                                                                                 105                                                                                                                                                                            



LIST OF TABLES

2.1: Summary of some existing priors in Bayesian VAR                                           33

4.1: MSFE across forecast horizons                                                                            62

4.2: Density forecast across various forecast horizons                                               63

4.3: MSFE for evaluating point forecast of different BVAR priors                          63

4.4: Energy scores for evaluating entire density for independent variables               65

4.5: Scores for evaluating density forecast with very high correlation                     66

4.6: Scores for evaluating density forecast with high correlation                             66

4.7: Scores for evaluating density forecast with moderate correlation                     67

4.8: Scores for evaluating density forecast with low correlation                              67

4.9: Predictive Mean for Data 1                                                                                 93

4.10: MSFE for Evaluating Point forecast for Data 1                                                94

4.11: Scores for Evaluating Density forecast for Data 1                                            94

4.12: Predictive Mean for Data 2                                                                               96

4.13: MSFE for Evaluating Point forecast for Data 2                                                97

4.14: Scores for Evaluating Density forecast for Data 2                                            97


CHAPTER 1

INTRODUCTION


1.1 BACKGROUND OF STUDY

Bayesian Statistics is a school of thought in the field of Statistics that uses the subjective view of probability to interpret uncertainty. Bayesian Statistics interprets probability based on personal belief; this is in contrast to the relative frequency interpretation, which is at the heart of classical Statistics. Bayesian statistical inference specifies how belief should be changed in the light of new information. Bayesian Statistics makes available a mathematical means of incorporating our individual belief into the evidence at hand in order to arrive at a new belief (the posterior). Bayesian Statistics was named after an English clergyman, Rev. Thomas Bayes (1702-1761), after the article "An Essay towards solving a Problem in the Doctrine of Chances" was published posthumously in his honour.

Bayesian methods provide a complete paradigm shift for both statistical inference and decision making under uncertainty. This is based on a subjective view of probability, which argues that our uncertainty about anything unknown can be expressed using the rules of probability. Bayesian Statistics treats the unknown parameter of the model that the researcher wishes to estimate as a random variable which has a probability distribution. This is against the view of classical Statistics, which sees the unknown parameter as a constant and the estimate of the unknown as a random variable. The probability statement about the unknown parameter is interpreted as a degree of belief. The belief about the unknown parameter is updated after seeing the data by using Bayes' theorem. Bayesian Statistics involves combining the past (the things which we know before seeing the data, i.e. the prior) with the present (the data generating process, i.e. the likelihood) to arrive at the future (the posterior).

The distinguishing factor between Bayesian methods and classical methods is the incorporation of prior information into the model. Prior elicitation is the process of formulating personal beliefs about the parameter of interest into a probability distribution (Garthwaite et al., 2005). It is pertinent that priors are elicited correctly: a good prior will perform better than no prior, and a bad prior will be worse than no prior. A major impediment to widespread use of the Bayesian paradigm has been the determination of the appropriate form of the prior distribution, which is often an arduous task. Typically, these prior distributions are specified based on information accumulated from past studies or from the opinions of subject-area experts. The prior refers to the subjective view or belief of the researcher about the parameter of interest, expressed in the form of a probability distribution. This belief can be based on information available to the researcher from previous knowledge or expert opinion; the prior in this case is referred to as an informative prior. The informative prior usually dominates the data. The researcher may not have much information about the parameter of interest, as a result of knowing little or even nothing about it. In this case, a non-informative prior distribution is used to represent "knowing little or ignorance", and the data dominate the prior. Unlike the informative and non-informative priors, which are specified before seeing the data, the researcher's subjective view can also be formed after seeing the data; this is known as an empirical prior.

The posterior distribution is the marriage of the sample information and the prior information; it is of basic interest in Bayesian Statistics as it updates the belief of the researcher after seeing the data. The posterior is proportional to the likelihood function (the data generating process) times the prior distribution. The posterior distribution is fundamental in Bayesian Statistics because it is the fulcrum of Bayesian statistical analysis: parameter estimation, hypothesis testing, model comparison, forecasting, etc.
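In symbols, for a parameter $\theta$ and data $y$, this is Bayes' theorem:

$$p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)} \propto p(y \mid \theta)\, p(\theta),$$

where $p(\theta)$ is the prior, $p(y \mid \theta)$ the likelihood and $p(\theta \mid y)$ the posterior; the denominator $p(y)$ does not depend on $\theta$, which is why the posterior is usually stated up to proportionality.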

Bayesian Statistics has continued to grow, as Bayesian methods are now applied in virtually every area of Statistics: Biostatistics, Econometrics, experimental design, sampling methods and techniques, time series analysis, etc. The application of Bayesian methods in research has increased tremendously, as the number of Bayesian journal articles, conference presentations, textbooks and statistical software packages has grown in the past few years. In general, work in Bayesian Statistics now focuses on the development of Bayesian counterparts to existing classical statistical methods, the application of Bayesian methods in data analysis, Bayesian computation, prior elicitation, etc.

Bayesian Statistics became more prominent in the past 50 years. The earlier challenge encountered in the field was that some of the posterior distributions obtained had no closed-form solution, and this led Bayesian Statistics to be jettisoned by most researchers. The computing revolution of the past 50 years has overcome this hurdle and has led to a blossoming of Bayesian methods in many fields (Koop, 2003). Bayesian data analysis is now accessible to scientists because of recent advances in computational algorithms, software, hardware, and textbooks. Indeed, whereas the 20th century was dominated by classical Statistics, the 21st century is becoming Bayesian, according to Kruschke (2011). Poirier (2006) reported that there has been upward growth of Bayesian methods in Statistics and Economics since 1970, which he attributed to an increase in Bayesian thinking among authors.

Bayesian Statistics has been widely applied in the area of Econometrics, leading to Bayesian Econometrics. Bayesian Econometrics has enjoyed huge popularity, with Bayesian methods applied throughout Econometrics (Zellner, 1971; Poirier, 1995; Koop, 2003; Lancaster, 2004; Geweke, 2005), to mention but a few.

There are several interdependent economic variables used in macroeconomic modeling. Sims (1980) developed the Vector Autoregressive (VAR) model as a means of modeling the interdependency between time series data.

VAR models are multiple time series models used to study the dynamic interrelationships between the series under consideration in order to carry out forecasting and structural analysis. VAR models have become the workhorse of macroeconomic forecasting (Karlsson, 2013). VAR models often perform better than other macroeconomic models, but they are still prone to several problems, such as over-parameterization. Classical VAR models require the estimation of many coefficients, and doing this without restrictions leads to more parameters than the available data can reliably estimate. According to Koop and Korobilis (2010), the number of parameters to be estimated is given by the formula $n(1 + np)$, where $n$ is the number of variables and $p$ is the optimal lag length. For instance, for a three-variable VAR model with an optimal lag length of four there will be a total of 39 parameters to be estimated, and for a five-variable VAR with an optimal lag length of four there will be a whopping 105 parameters. A 10-variable VAR with a lag length of 10 would require 1,010 parameters. This obviously makes VAR models over-parameterized. Over-parameterization makes estimates imprecise and has consequences for the precision of inference and the reliability of prediction. The estimates can be improved if the analyst has any information about the parameters beyond that contained in the sample. Bayesian estimation provides a convenient framework for incorporating prior information with as much weight as the analyst feels it merits (Hamilton, 1994).
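As a quick check of these counts, the following sketch (illustrative only, not part of the thesis) evaluates the formula $n(1+np)$ for the cases mentioned above:

def var_param_count(n: int, p: int) -> int:
    # Each of the n equations has 1 intercept and n*p lag coefficients.
    return n * (1 + n * p)

for n, p in [(3, 4), (5, 4), (10, 10)]:
    print(f"n={n}, p={p}: {var_param_count(n, p)} coefficients")
# n=3, p=4: 39 coefficients
# n=5, p=4: 105 coefficients
# n=10, p=10: 1010 coefficients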

Bayesian VAR (BVAR) offers a flexible way both to reduce the dimensionality of the parameter space and to incorporate additional information. A BVAR specification shrinks the dynamic parameters towards a specific representation of the data which reflects the researcher's prior beliefs, and in doing so deals with the over-parameterization problem.

The Bayesian VAR literature has seen the elicitation of various priors to shrink the parameters of VAR models and so avoid over-parameterization. The following priors give various levels of shrinkage: the Minnesota (Litterman) prior and its various modifications, the steady-state prior, hierarchical priors, and the stochastic search variable selection prior. The stochastic search variable selection prior is a shrinkage method that gives data-based restrictions on the parameters of the VAR model.


1.2 STATEMENT OF THE PROBLEM

The Stochastic Search Variable Selection (SSVS) priors, as applied in Bayesian VAR models, have been reported to give better forecasts than the existing Minnesota priors and their various modifications, as shown in studies carried out by George et al. (2008), Koop and Korobilis (2010), Korobilis (2013) and George et al. (2018), to mention a few. SSVS has better forecast performance, but it is faced with the challenge of the Normal-Wishart restriction on the variance-covariance matrix, $\Sigma$, which implies that every equation must have the same set of explanatory variables. For a researcher who wishes to place restrictions on various equations using different explanatory variables in order to avoid over-parameterization, this Normal-Wishart restriction needs to be dealt with (Koop and Korobilis, 2010). To overcome the Normal-Wishart restriction on the variance-covariance matrix, this study elicited the Stochastic Search Variable Selection Diffuse model using a diffuse prior for the variance-covariance matrix, which allows for non-diagonal treatment of the variance-covariance matrix.


1.3 OBJECTIVES OF THE STUDY

The global objective of this study is to propose a Stochastic Search Variable Selection Diffuse prior that overcomes over-parameterization in VAR models using restrictions on various equations involving different explanatory variables.

The specific objectives are as follows:

i. To use SSVS-Diffuse to address over-parameterization of the VAR model;

ii. To identify the optimal lag length using SSVS-Diffuse;

iii. To examine the performance of the elicited prior under posterior model inference when compared with existing Bayesian VAR methods;

iv. To demonstrate the application of the SSVS-Diffuse model for Bayesian VAR to real-life data.

1.4 JUSTIFICATION OF STUDY

In Vector Autoregressive models there are many parameters to estimate, and doing so without restricting some of them to zero leads to over-parameterization, which affects inference. There are two parameters of interest in VAR models: the VAR coefficients and the variance-covariance matrix. The study provides a restriction on the variance-covariance matrix using a diffuse prior, thus allowing different equations of the model to have different explanatory variables. Also, the study will provide an alternative method for finding the optimal lag length under the Bayesian paradigm using the posterior inclusion probability.


1.5 DEFINITION OF TERMS

In this section, some of the terms used in this study are defined.

1.5.1 Bayesian terms

Parameter: A parameter is an attribute or quantity that is calculated from the population; examples are the population mean, population variance, population standard deviation, etc.

Likelihood is the expression for the distribution of the data conditional on the parameter.

Prior is a probability statement about a parameter, expressed as the degree of one's belief about the parameter before observing the data. It is non-data information.

Diffuse (vague) prior is a prior that encompasses all reasonable beliefs; it can be constructed by using a uniform or flat distribution for the parameter of interest.
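For illustration, a Jeffreys-type diffuse prior widely used in the Bayesian VAR literature for the VAR coefficients $\alpha$ and the $n \times n$ variance-covariance matrix $\Sigma$ is

$$p(\alpha, \Sigma) \propto |\Sigma|^{-(n+1)/2},$$

which is flat in $\alpha$ and carries essentially no information about $\Sigma$.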

Natural conjugate prior is a prior from a family of density functions that, after multiplication with the likelihood, produces a posterior in the same family.

Informative prior is a prior based on the information available about a parameter of interest through expert opinion or a previous study. An informative prior dominates the data.

Non-informative prior is a prior formed on the basis of no information about the parameter of interest.

Hierarchical priors are priors specified as a series of priors, known as multistage priors. A hierarchical prior can be a two-stage prior, in the sense that a prior is placed on another prior. Hierarchical priors are more flexible than non-hierarchical priors, making the posterior distribution less sensitive to the main prior.

Hyperparameters are parameters which are themselves given a probabilistic specification in terms of further parameters; that is, the prior distribution of a parameter depends on one or more further parameters.

Posterior distribution is a function of the parameter given the data. It represents the belief about a parameter of interest after seeing the data. It is a combination of the prior and the likelihood.

Posterior model probability assesses the degree of support for a particular model, i.e. the weighted probability of a model.

Posterior inclusion probability is the probability statement that shows the importance of a particular VAR coefficient in the VAR model. In this study, the posterior inclusion probability is used to determine the number of important VAR coefficients in the Bayesian VAR.
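In practice (the standard Monte Carlo estimate, assuming a binary inclusion indicator $\gamma_j \in \{0, 1\}$ for coefficient $j$ is part of the sampler), the posterior inclusion probability is approximated by the share of retained Gibbs draws in which the coefficient is included:

$$\widehat{P}(\gamma_j = 1 \mid y) = \frac{1}{S} \sum_{s=1}^{S} \gamma_j^{(s)},$$

where $S$ is the number of retained draws.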

Bayes' theorem is a probability theorem linking the unconditional distribution of a parameter with the conditional distribution. It connects the likelihood function with the prior probability.


1.5.2 Vectorization: vectorization of a matrix is a linear transformation which converts a matrix to a column vector. For a matrix C with dimensions $m \times n$, the vectorization of C, denoted by vec(C), is obtained by stacking the columns of C on top of one another, resulting in a column vector of dimension $mn \times 1$.
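For example, for a $2 \times 2$ matrix,

$$C = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}, \qquad \operatorname{vec}(C) = \begin{pmatrix} c_{11} \\ c_{21} \\ c_{12} \\ c_{22} \end{pmatrix}.$$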

1.5.3 Kronecker product, $\otimes$: the operation on two matrices of arbitrary size resulting in a block matrix. If C is an $m \times n$ matrix and D is a $p \times q$ matrix, then the Kronecker product $C \otimes D$ is the $mp \times nq$ block matrix.
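Written out in blocks,

$$C \otimes D = \begin{pmatrix} c_{11} D & \cdots & c_{1n} D \\ \vdots & \ddots & \vdots \\ c_{m1} D & \cdots & c_{mn} D \end{pmatrix},$$

where each $c_{ij} D$ is itself a $p \times q$ matrix.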


1.5.4 VAR coefficients: numerical values that show the relationship between the current value of each variable and its own lagged values as well as the lagged values of the other variables.
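In the standard notation, an $n$-variable VAR($p$) collects these coefficients in the $n \times n$ matrices $A_1, \dots, A_p$ together with an intercept vector $c$:

$$y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \Sigma).$$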


1.5.5 Stochastic search variable selection (SSVS)

Variable selection deals with which subset of variables is to be included in model building. In the Bayesian paradigm, variable selection is literally parameter estimation, i.e. the task is to estimate the marginal posterior probability that a subset of variables should be in the model.

Stochastic search means that the model space is too large to assess in a deterministic manner, so a data-based restriction on the parameters is required.

SSVS is a method of restricting some of the coefficients in the model to be equal to zero in order to reduce the number of excess parameters.

Mixture distribution mixes or averages two distributions over a mixing distribution:

$$f(y) = \int f(y \mid x)\, \pi(x)\, dx \qquad (1.5.1)$$

Here $\pi(x)$ is the mixing distribution; $\pi(x)$ can be discrete or continuous (for a discrete mixing distribution, the integral is replaced by a sum).
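In the SSVS literature (e.g. George et al., 2008), this mixture idea appears as a two-component normal prior on each VAR coefficient $\alpha_j$, with a Bernoulli indicator $\gamma_j$ acting as the discrete mixing distribution:

$$\alpha_j \mid \gamma_j \sim (1 - \gamma_j)\, N(0, \tau_{0j}^2) + \gamma_j\, N(0, \tau_{1j}^2),$$

where $\tau_{0j}$ is chosen small, so that $\gamma_j = 0$ shrinks $\alpha_j$ towards zero, and $\tau_{1j}$ is chosen large, so that $\gamma_j = 1$ leaves $\alpha_j$ essentially unrestricted.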


1.5.6 Markov chain Monte Carlo (MCMC) is a posterior simulation algorithm that provides iterative procedures for approximately sampling from complicated posterior densities, giving up the assumption that successive draws are independent. It approximates the expectations of functions of interest with their sample averages.

Gibbs sampling is the strategy of sequentially drawing from the full conditional posterior distributions.
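A minimal sketch of the idea (illustrative only; the target and its conditionals are chosen for simplicity and are not the SSVS-Diffuse sampler of Chapter 3). For a bivariate standard normal with correlation $\rho$, each full conditional is itself normal, so Gibbs sampling alternates two univariate normal draws:

import numpy as np

rng = np.random.default_rng(0)
rho, n_draws = 0.8, 5000
x1, x2 = 0.0, 0.0
draws = np.empty((n_draws, 2))
for s in range(n_draws):
    x1 = rng.normal(rho * x2, np.sqrt(1 - rho**2))  # draw from p(x1 | x2)
    x2 = rng.normal(rho * x1, np.sqrt(1 - rho**2))  # draw from p(x2 | x1)
    draws[s] = x1, x2

print(np.corrcoef(draws[1000:].T)[0, 1])  # close to 0.8 after burn-in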

Metropolis-Hastings is an algorithm that draws samples from any probability distribution P(x), provided the value of a function f(x) proportional to the density of P can be computed. The sample values are produced iteratively, with the distribution of the next sample depending only on the current sample value. Specifically, at each iteration the algorithm picks a candidate for the next sample value based on the current sample value; then, with some probability, the candidate is either accepted or rejected.
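A minimal random-walk Metropolis-Hastings sketch under the same caveat (the target f, the proposal scale and the burn-in below are illustrative choices, not from the thesis):

import numpy as np

rng = np.random.default_rng(1)

def f(x):
    # Proportional to a standard normal density; the normalizing
    # constant is deliberately omitted, as MH does not require it.
    return np.exp(-0.5 * x * x)

x, draws = 0.0, []
for _ in range(10000):
    candidate = x + rng.normal(0.0, 1.0)       # symmetric proposal around x
    if rng.uniform() < f(candidate) / f(x):    # accept w.p. min(1, ratio)
        x = candidate
    draws.append(x)                            # if rejected, repeat current x

print(np.mean(draws[2000:]), np.std(draws[2000:]))  # approx. 0 and 1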


