A novel Goodness of fit test for Multilevel Proportional Odds model


Chapter 5

5. APPLICATION TO A REAL LIFE DATA SET

The purpose of this chapter is to show how the proposed goodness of fit test works with a real life data set. It provides a good understanding of the novel test and can also be used to identify the properties of the goodness of fit test. Since the test is developed for multilevel data, a data set with a multilevel structure was selected for this validation process. The method of arranging the data set according to the nature of the test, and the application of the novel goodness of fit test, are discussed in this chapter under the relevant subtopics.

5.1 Description of the data set

The data set selected for the study is relatively large, with 31,022 individuals in 2280 schools. It comprises the results of the GCSE examinations of students in 131 education authorities.

The following table describes the variables used in this example.

Table 5.1: Description of the variables

Variable | Description of the variable | Type of the variable
ID_LEA | ID for local education authorities | ID for the second level
ID_I | ID for individuals | ID for the first level
AGCSE | Average GCSE score of the individual, centered at the mean | Continuous variable
GENDER | Gender of the individual (Male = 0, Female = 1) | Categorical variable (Male as base category)
AGE_MONTHS | Age in months, centered at 222 months (18.5 years) | Continuous variable

According to the above table, the data set has a multilevel structure with two levels: the individuals (ID_I) form the first level and the Local Education Authorities (ID_LEA) form the second level.

The study is concerned with an ordinal categorical response, and hence the continuous variable GCSE is categorized into three categories based on percentiles. The categorization is explained in the following subtopic, Data Preparation. The variables GENDER and AGE_MONTHS were selected as the explanatory variables.

5.2 Data Preparation

To incorporate ordinal categorical data, the variable GCSE was coded according to percentile values. Three ordered categories of the response were considered in order to fit the multilevel proportional odds model. Those categories are constructed in the following manner.

As described above, the response variable is categorized into three categories based on percentile values. Hence the variable is categorized without bias and the accuracy of the categorization is improved.
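The percentile-based coding can be sketched as follows. This is only an illustration: the actual cut points used in the study are not shown here, so the tertile cut points (33.3rd and 66.7th percentiles), the function names and the sample scores are all assumptions.

```python
from bisect import bisect_right

def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of pre-sorted data."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)

def categorize(scores, cut_percentiles=(100 / 3, 200 / 3)):
    """Map continuous scores to ordered categories 1..K using percentile cut points.

    The default tertile cut points are an assumption, not the study's values.
    """
    ordered = sorted(scores)
    cuts = [percentile(ordered, q) for q in cut_percentiles]
    # bisect_right counts the cut points that are <= the score, so the
    # lowest category is 1 and the highest is len(cuts) + 1.
    return [bisect_right(cuts, s) + 1 for s in scores]

# Made-up mean-centered scores, for illustration only.
scores = [-1.8, -0.9, -0.4, -0.1, 0.0, 0.3, 0.7, 1.1, 1.9]
categories = categorize(scores)
print(categories)
```

Because the cut points come from the empirical distribution, each category receives roughly the same share of observations, which is presumably what the text means by categorizing without bias.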

It is also important to note that this data set consists of 131 clusters, and that the number of observations varies from cluster to cluster. However, the cluster sizes are large enough to apply the novel test, and all 131 clusters are used to fit the model. The clusters and their sizes are tabulated as follows.

Table 5.2: Description of clusters with respect to their cluster sizes

Cluster ID Cluster size Cluster ID Cluster size Cluster ID Cluster size Cluster ID Cluster size
1 41 7 75 13 112 19 220
2 144 8 10 14 33 20 181
3 50 9 96 15 357 21 182
4 22 10 91 16 77 22 67
5 128 11 35 17 91 23 246
6 35 12 78 18 224
24 113 54 225 84 314 114 395
25 155 55 171 85 64 115 510
26 95 56 65 86 519 116 931
27 173 57 149 87 65 117 74
28 111 58 153 88 423 118 969
29 51 59 289 89 102 119 916
30 233 60 328 90 190 120 376
31 151 61 104 91 127 121 408
32 155 62 333 92 109 122 329
33 121 63 470 93 217 123 251
34 703 64 232 94 85 124 661
35 197 65 69 95 185 125 520
36 201 66 274 96 150 126 265
37 57 67 131 97 670 127 321
38 201 68 55 98 53 128 432
39 138 69 119 99 96 129 762
40 149 70 200 100 432 130 391
41 18 71 235 101 203 131 442
42 232 72 103 102 75
43 147 73 94 103 466
44 313 74 28 104 68
45 323 75 67 105 278
46 256 76 119 106 64
47 121 77 115 107 704
48 317 78 95 108 469
49 160 79 220 109 802
50 53 80 71 110 241
51 49 81 108 111 364
52 207 82 490 112 592
53 92 83 152 113 791

Table 5.2 above lists the clusters and the size of each cluster. Altogether there are 31,022 observations among 131 clusters.

5.3 Model building

MLwiN version 2.19 was used to fit the model to the selected data set, in order to incorporate the multilevel nature of the data and to fit the multilevel proportional odds model. The theory behind the model and the model building process were discussed in the Theory and Methodology chapter. This section discusses the application of those methods. After initializing the model, a parameter estimation method must be specified; accordingly, the first-order PQL method was used in order to overcome the convergence problems that occurred in estimating the parameters.

5.3.1 Variable selection

When fitting the model, the variables that have a significant impact on the response variable should be identified. For this, the forward selection method was used with a 5% level of significance. To choose the important variables for the model, the Wald test statistic was used rather than the likelihood ratio test statistic. The reason is that, for discrete response multilevel models, the likelihood ratio test is not available in MLwiN. Therefore the Wald test statistic was calculated to test the significance of the individual coefficients in the model, and those values were compared with the chi-square distribution with one degree of freedom. If the Wald statistic for a covariate is significant, that covariate should be included in the model.

Following the above procedure, the best model was selected by starting from the smallest model, that is, the model with only the constant term, and adding variables one at a time. As the first step, the variable to be added to the model first must be selected. For this, each variable was initially added to the model with the constant term, and the most significant among them was identified based on the p-values.

The variable with the lowest p-value was then selected as the first variable to be added to the model. The outputs obtained are tabulated in the following subtopic.
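The forward selection loop described above can be sketched as follows. The model fitting itself was done in MLwiN and is not reproduced; here `p_value_fn` is a hypothetical callback that would return the Wald p-value of a candidate variable when added to the current model, and the variable names and p-values are made up for illustration.

```python
def forward_selection(candidates, p_value_fn, alpha=0.05):
    """Add variables one at a time, lowest p-value first, while p < alpha.

    p_value_fn(selected, candidate) is a hypothetical callback returning the
    Wald p-value of `candidate` when added to a model containing `selected`.
    """
    selected = []
    remaining = list(candidates)
    while remaining:
        # p-value of each remaining variable when added to the current model
        p_values = {v: p_value_fn(selected, v) for v in remaining}
        best = min(p_values, key=p_values.get)
        if p_values[best] >= alpha:
            break  # no remaining variable is significant at level alpha
        selected.append(best)
        remaining.remove(best)
    return selected

# Made-up p-values; in practice each call would refit the model.
toy_p_values = {"x1": 0.001, "x2": 0.03, "x3": 0.40}
chosen = forward_selection(toy_p_values, lambda selected, v: toy_p_values[v])
print(chosen)
```

In a real analysis the p-value of a candidate generally changes once other variables are in the model, which is why the callback receives the current selection.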

5.3.2 Parameter estimates under the forward selection procedure for the multilevel proportional odds model

As discussed under subtopic 5.3.1, the forward selection method is carried out in the following manner, and the resulting outputs are reported accordingly.

Stage 1

Each variable was added separately and the Wald statistic was calculated for each variable. The p-value obtained for the test statistic was then compared with the 5% (0.05) significance level to check the significance of the variable.

Table 5.3: Outputs obtained under stage 1

Variable | Category | Estimate (standard error) | Wald test statistic | p-value
GENDER | Female (Male as reference) | -0.650 (0.022) | 872.934 | 7.5118e-192*
AGE_MONTHS | - | -0.029 (0.003) | 93.444 | 4.1786e-022*

* significant at 5%
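As a check on the stage 1 figures, the single-coefficient Wald statistic can be computed as the squared ratio of the estimate to its standard error and compared with a chi-square distribution with one degree of freedom. The sketch below uses only the rounded values reported in Table 5.3; the function name is an assumption, and the p-value uses the identity that a chi-square variable with one degree of freedom is a squared standard normal.

```python
import math

def wald_test(estimate, std_error):
    """Return (Wald statistic, p-value) for H0: coefficient = 0.

    The statistic is (estimate / std_error)^2, referred to chi-square(1).
    For one degree of freedom, P(X > w) = erfc(sqrt(w / 2)).
    """
    w = (estimate / std_error) ** 2
    p = math.erfc(math.sqrt(w / 2.0))
    return w, p

w_gender, p_gender = wald_test(-0.650, 0.022)  # GENDER, from Table 5.3
w_age, p_age = wald_test(-0.029, 0.003)        # AGE_MONTHS, from Table 5.3
print(round(w_gender, 3), p_gender)
print(round(w_age, 3), p_age)
```

The rounded inputs give statistics of about 872.93 and 93.44, in line with the tabulated values, and both p-values are far below 0.05.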

According to the values obtained under stage 1, both variables are significant, since the p-values are less than 0.05. Of the two variables, GENDER is much more highly significant than AGE_MONTHS. Therefore GENDER was selected as the first variable to be added to the model containing only the constant term. Since the response of interest has three categories, the proportional odds model gives two logits, taking the third category as the base level. The random intercept multilevel proportional odds model fitted after adding GENDER to the constant-only model can be written as follows.

Here the quantity inside each logit is the cumulative probability for the corresponding category, and the two subscripts denote the observation index and the cluster ID respectively.
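For orientation, a generic random intercept proportional odds model with one covariate and three response categories is often written as below. The notation is an assumption rather than the chapter's own symbols: $\gamma^{(k)}_{ij}$ is the cumulative probability of category $k$ or lower for observation $i$ in cluster $j$, $\alpha_k$ are category-specific intercepts, $\beta$ is the common GENDER effect and $u_j$ is the cluster-level random intercept. Sign conventions for $\beta$ differ between software packages.

```latex
\gamma^{(k)}_{ij} = P(y_{ij} \le k), \qquad k = 1, 2, \qquad \gamma^{(3)}_{ij} = 1

\operatorname{logit}\bigl(\gamma^{(k)}_{ij}\bigr)
  = \log\frac{\gamma^{(k)}_{ij}}{1 - \gamma^{(k)}_{ij}}
  = \alpha_k + \beta\,\mathrm{GENDER}_{ij} + u_j,
  \qquad u_j \sim N(0, \sigma_u^2)
```

The two logits share the same $\beta$, which is the proportional odds assumption; only the intercepts differ between categories.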

Stage 2

As the next step, the AGE_MONTHS variable was added to the model fitted under stage 1, and its significance was checked in the presence of the explanatory variable already in the model. The values obtained under this stage are tabulated as follows.

Table 5.4: Outputs obtained under stage 2

Variable | Category | Estimate (standard error) | Wald test statistic | p-value
AGE_MONTHS | - | -0.029 (0.003) | 90.150 | 2.2077e-021*

* significant at 5%

After adding AGE_MONTHS to the model containing the GENDER variable, the covariate AGE_MONTHS is significant. Therefore the new model obtained under stage 2 can be formulated as follows.

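logit(cumulative probability, category 1) = ... 0.648 (0.022) GENDER ...

logit(cumulative probability, category 2) = ... 0.648 (0.022) GENDER ...

Here the quantity inside each logit is the cumulative probability for the corresponding category, and the two subscripts denote the observation index and the cluster ID respectively.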

Final main effects model

The final main effects model consists of both covariates selected by the second stage. Therefore the multilevel proportional odds model fitted using the forward selection procedure is,

logit(cumulative probability, category 1) = ... 0.648 (0.022) GENDER ...

logit(cumulative probability, category 2) = ... 0.648 (0.022) GENDER ...

[5.1]

The quantity modelled is the cumulative probability of the average GCSE score category of the student (observation unit) in the local education authority (cluster). Also, according to the principle of parsimony, the main effects model is preferred over the interaction model because it is simpler and easier to interpret.

5.4 Application of the novel goodness of fit test

This subtopic discusses the application of the proposed goodness of fit test for the multilevel proportional odds model. For that, the main effects model fitted to the selected data set in section 5.3 was used.

To apply the developed method, the predicted probabilities of the fitted model were calculated first, and then the predicted mean scores were calculated. The calculation of the predicted mean scores is described in the Theory and Methodology chapter. Based on the predicted mean scores, the observations of each cluster are then partitioned. The number of partitions used here is 10. This is not a fixed value, but according to Hosmer and Lemeshow (1980), 10 is the most commonly used number of groups. The 10 groups are such that the first group contains the smallest values of the predicted mean scores and the tenth group contains the largest values.
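The scoring and partitioning step can be sketched as follows. Two assumptions are made: the predicted mean score is taken to be the probability-weighted sum of the category numbers (the chapter's own definition is in the Theory and Methodology chapter), and the within-cluster split is a simple rank-based rule giving groups whose sizes differ by at most one. That rule is not the method of Abeysekara and Sooriyarachchi (2008), which is not reproduced here. All names and data are hypothetical.

```python
def predicted_mean_score(probs):
    """Assumed definition: sum of k * P(Y = k) over categories k = 1..K."""
    return sum(k * p for k, p in enumerate(probs, start=1))

def partition_cluster(scores, n_groups=10):
    """Label the observations of one cluster with groups 1..n_groups.

    Observations are ranked by score; group 1 holds the smallest scores and
    group n_groups the largest. Group sizes differ by at most one.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * n_groups // n + 1
    return labels

example_score = predicted_mean_score((0.2, 0.5, 0.3))

# A made-up cluster of 22 observations (the size of cluster 4 in Table 5.2).
cluster_scores = [1.0 + 0.05 * i for i in range(22)]
labels = partition_cluster(cluster_scores)
group_sizes = [labels.count(g) for g in range(1, 11)]
print(example_score, group_sizes)
```

With 22 observations the ten groups contain either two or three observations each, so every group is non-empty as long as a cluster has at least ten observations.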

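After partitioning the data, the goodness of fit test is conducted by creating nine indicator variables for each cluster. Writing $\hat{s}_{ij}$ for the predicted mean score of the $i$-th student in the $j$-th cluster, the indicator variable for region $g$ is

$$I_{gij} = \begin{cases} 1 & \text{if } \hat{s}_{ij} \text{ is in region } g \\ 0 & \text{otherwise} \end{cases}$$

where $g$ indexes the regions of the partition.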

Then, to assess the adequacy of the fitted model in equation 5.1, an alternative model is constructed from the 10 indicator variables. The model contains only 9 of them, since the first indicator variable was selected as the base category.
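The coding of the nine indicator variables can be illustrated with a small sketch. The function name and the choice of plain lists are assumptions; the only point taken from the text is that group 1 is the base category, so it is represented by all zeros.

```python
def indicator_row(group, n_groups=10):
    """Indicators for groups 2..n_groups; group 1 is the base category."""
    return [1 if group == g else 0 for g in range(2, n_groups + 1)]

# Indicator rows for an observation in group 1, group 2 and group 10.
rows = {g: indicator_row(g) for g in (1, 2, 10)}
for g, row in rows.items():
    print(g, row)
```

Each observation therefore contributes nine extra columns to the alternative model, at most one of which is non-zero.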

In the data set of the selected example, the number of observations within most clusters is not divisible by 10, which is a problem when creating the indicator variables. To overcome this problem, the method discussed by Abeysekara and Sooriyarachchi (2008) was used, where the indicator variables are defined as follows.

Here ... and ... is the indicator variable.

As the objective is to check the adequacy of the fitted model, the null and alternative hypotheses can be stated as,

H0: The fitted multilevel proportional odds model is adequate

H1: The fitted multilevel proportional odds model is not adequate

Given these hypotheses, to assess the adequacy of the fitted main effects model 5.1, the alternative model 5.2 is constructed by adding indicator variables as follows.

[ 5.2 ]

Here each added term is the indicator variable for a given group and a given observation within a cluster.

If the fitted model 5.1 is adequate, then

H0: The fitted multilevel proportional odds model is adequate for the data

is not rejected, which implies that the coefficients of all indicator variables are zero.

The hypothesis discussed above is tested using the MLwiN software. Accordingly, the joint Wald test statistic of the alternative model 5.2 was calculated in order to test the following hypotheses.

H0: All the coefficients of the indicator variables are equal to zero.

H1: At least one coefficient of an indicator variable is not equal to zero.

The output of the resulting joint Wald test is tabulated as follows.

Table 5.5: Output of the joint Wald test statistic

Joint Wald test statistic | Degrees of freedom | Chi-square value (tabulated at 5%) | p-value
11.365 | 9 | 16.9 | 0.25152

According to the output values in Table 5.5, the p-value of the joint Wald test statistic is greater than 0.05, so the null hypothesis is not rejected at the 5% level of significance. The same conclusion follows from the value of the Wald test statistic itself, which is less than the chi-square critical value at the 5% level of significance with 9 degrees of freedom (11.365 < 16.9). This result indicates that the indicator variables in the alternative model do not contribute significantly, and hence that the fitted main effects model 5.1 fits the data well.
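The figures in Table 5.5 can be checked with a short calculation of the chi-square upper-tail probability. The sketch below is a standard-library implementation for integer degrees of freedom, using the closed-form series for even and odd degrees of freedom; the function name is an assumption, and the value 16.919 is the usual tabulated 5% critical value for 9 degrees of freedom, which the table rounds to 16.9.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * S, where
    # S = sum_{r=1}^{(df-1)/2} x^(r - 1/2) / (1 * 3 * ... * (2r - 1))
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    tail = math.sqrt(2.0 / math.pi) * math.exp(-half) * total
    return math.erfc(math.sqrt(half)) + tail

p_joint = chi2_sf(11.365, 9)      # joint Wald statistic from Table 5.5
p_critical = chi2_sf(16.919, 9)   # should be close to 0.05
print(round(p_joint, 5), round(p_critical, 4))
```

The computed p-value is close to the reported 0.25152, well above 0.05, which is consistent with not rejecting the null hypothesis.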

Overall, the above suggests that, according to the novel goodness of fit test, the fitted multilevel proportional odds model is adequate for the selected example data set.

5.5 Summary

This chapter discussed the application of the proposed goodness of fit test to the multilevel proportional odds model. A multilevel proportional odds model was fitted to a real life data set, and the novel test was then applied in order to assess the adequacy of the fitted model. The results obtained through this application suggest that the novel goodness of fit test works with real life data as well.

The next chapter presents the general discussion of all the findings and the conclusions obtained throughout this research study. Finally, it presents the limitations and suggestions for further studies.
