ABSTRACT 
 
 
 
Title of dissertation: EFFECTS OF UNMODELED LATENT CLASSES ON 
MULTILEVEL GROWTH MIXTURE ESTIMATION IN 
VALUE-ADDED MODELING 
 
 
 Futoshi Yumoto, Doctor of Philosophy 2011 
 
 
Dissertation directed by: Professor Gregory R. Hancock 
    Professor Robert J. Mislevy 
    Department of Measurement, Statistics and Evaluation 
 
  
 Fairness is necessary to successful evaluation, whether the context is simple and 
concrete or complex and abstract.  Fair evaluation must begin with careful data 
collection, with clear operationalization of variables whose relationship(s) will represent 
the outcome(s) of interest.  In particular, what it is in the data that needs to be 
modeled, as well as the relationships of interest, must be specified before conducting any 
research; these two features will inform both study design and data collection. 
Heterogeneity is a key characteristic of data that can complicate the data collection 
design, and especially analysis and interpretation, interfering with or influencing the 
perception of the relationship(s) that the data will be used to investigate or evaluate. 
However, planning for, and planning to account for, heterogeneities in data are also 
critical to the research process, to support valid interpretation of results from any 
statistical analysis. The multilevel growth mixture model is a new analytic method 
specifically developed to accommodate heterogeneity so as to minimize the effect of 
variability on precision in estimation and to reduce bias that may arise in hierarchical 
data. This is particularly important in the Value Added Model context, where decisions 
and evaluations about teaching effectiveness are made, because estimates could be 
contaminated, biased, or simply less precise when data are modeled inappropriately. This 
research will investigate the effects of un-accounted for heterogeneity at level 1 on the 
precision of level-2 estimates in multilevel data utilizing the multilevel growth mixture 
model and multilevel linear growth model. 
  
  
 
 
 
 
 
 
EFFECTS OF UNMODELED LATENT CLASSES ON MULTILEVEL GROWTH 
MIXTURE ESTIMATION IN VALUE-ADDED MODELING  
 
 
 
by 
 
 
Futoshi Yumoto 
 
 
 
 
Dissertation submitted to the Faculty of the Graduate School of the 
University of Maryland, College Park in partial fulfillment 
of the requirements for the degree of 
Doctor of Philosophy 
2011 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Advisory Committee: 
 Professor Gregory R. Hancock, Co-Chair 
 Professor Robert J. Mislevy, Co-Chair 
 Professor George Macready 
 Professor Robert Lissitz 
 Professor Paul Hanges 
  
  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
© Copyright by 
 
Futoshi Yumoto 
 
2011 
 
 
 
 
 
 
 
  
 
Dedication 
 
This dissertation is dedicated, in part, to my mother Atsuko Yumoto and to my father 
Kazuaki Yumoto, who wholeheartedly supported me in every endeavor and whose faith 
and encouragement have continued to push me to learn and grow. I was unable to 
complete the work and this program while my mother was alive but I hope and believe 
the final result would have pleased her. This work is also dedicated, in part, to my brother 
Ryogo Yumoto, who sacrificed his opportunity to move to the United States so that I 
could do it. My life here has been rich and rewarding, and so I am profoundly grateful to 
him for making this experience, and essentially this research, possible. Finally, I want 
to dedicate this work to my wife, Rieko, and daughters, Sora and Lena, and to Lin and 
Toto. While the PhD dissertation is a significant intellectual achievement, what it really 
represents is the end of my attention being divided so completely between research and 
real life. 
 
 
 
 
  
 
Acknowledgments 
 
This research project would not have been completed without the support, and insistence, 
of Professors Gregory Hancock and Robert Mislevy at the University of Maryland, 
College Park; and it might never have come to be if Mark Stone had not introduced me to 
the world of psychometrics, way back in 2000. Gregory Anderson helped me to become a 
competent researcher with his unstinting generosity with respect to time, encouragement, 
and wit. Rochelle Tractenberg provided invaluable assistance with the writing and 
organizational aspects of the dissertation, as well as patience and sometimes even grant 
support of both me and the endeavor.  My dissertation committee was unbelievably 
accommodating throughout the process of proposing and completing this project. Finally, 
I am indebted to my wife Rieko, and my daughters Sora and Lena: you have brought 
balance to my life and shown me how important it is to PLAY.
  
 
Table of Contents 
Chapter 1: Introduction .... 1
  1.1 Heterogeneity and estimation .... 9
  1.2 Multivariate analytic methodological innovations for heterogeneity and precision .... 11
  1.3 Consideration of latent covariates and the general mixture model .... 14
  1.4 The general mixture model and latent growth curves in growth mixture models .... 18
  1.5 Multilevel growth mixture modeling supporting precise estimation and inference .... 19
Chapter 2: Literature Review .... 21
  2.1 Educational effects estimation with growth curve mixture models .... 22
  2.2 The problem: Estimation with growth curve modeling .... 23
    2.2.1 Algebraic representation of Multilevel Growth Mixture Model (MLGMM) .... 26
    2.2.2 Algebraic representation of Growth Mixture Model .... 30
    2.2.3 Algebraic representation of multilevel latent growth model .... 32
    2.2.4 Algebraic representation of latent growth model .... 33
    2.2.5 Algebraic representation of the multilevel model .... 34
    2.2.6 Contrasting multilevel (MLM) and latent growth (LGM) models .... 34
    2.2.7 Value Added Model as a multilevel model .... 35
    2.2.8 Effects of level 1 heterogeneity on the estimates of level 2 effects in MLGMM .... 36
  2.3 Substantively interpretable latent class structure in VAM .... 42
    2.3.1 Potential impact of latent classes .... 44
  2.4 Effects of un-accounted for heterogeneity at level 1 on the precision of level 2 estimates in multilevel data .... 44
Chapter 3: Methods .... 46
  3.1 Characteristics of the simulation .... 46
  3.2 Characteristics of individual (level 1) data .... 47
  3.3 Characteristics of cluster (level 2) data .... 48
    3.3.1 Sample size is determined by cluster size and cluster number .... 48
    3.3.2 Cluster type .... 49
    3.3.3 Mixture proportion .... 50
    3.3.4 Cluster effects .... 51
  3.4 Data simulation .... 53
  3.5 Model Fitting .... 55
  3.6 Analysis of results of model fitting .... 56
  3.7 Analysis .... 58
    3.7.1 Outcomes of Interest: Parameter Recovery .... 58
    3.7.2 Outcome of Interest: Classification error at quintile level .... 59
  3.8 Evaluation of simulation: Achievement of stated design aims .... 59
  3.9 Summary of Methods .... 60
Chapter 4: Pilot Study: testing simulation features and analysis plan .... 61
  4.1 Testing simulation features .... 61
  4.2 Preliminary results on the model identification and class identification .... 62
  4.3 Preliminary results on precision of estimates .... 65
  4.4 Preliminary results on classification accuracy .... 67
  4.5 Preliminary results on ANOVA over the simulation condition .... 69
  4.6 Determination of number of replications for the study .... 70
  4.7 Other Analysis Issues .... 71
  4.8 Pilot study summary .... 71
Chapter 5: Main Study Results .... 73
  5.1 Results on model identification .... 75
  5.2 Results on bias of estimates .... 81
    5.2.1 Cluster Effect Condition 1 (CE=1) .... 83
    5.2.2 Cluster Effect Condition 2 and 3 (CE2 and 3) .... 87
    5.2.3 Cluster Effect Condition 4 and 5 (CE=4 and CE=5) .... 90
  5.3 Precision of estimates .... 92
  5.4 Results on classification accuracy .... 93
Chapter 6: Discussion .... 100
  6.1 Model convergence of the multilevel growth mixture model .... 104
  6.2 Information criteria performance .... 104
  6.3 Evaluation of systematic biases across the simulation condition .... 107
    6.3.1 Cluster effect condition 1 (CE=1) .... 107
    6.3.2 Cluster Effect Conditions 2 and 3 (CE=2 and 3) .... 110
    6.3.3 Cluster Effect Condition 4 and 5 (CE=4 and 5) .... 113
    6.3.4 Precision of estimates .... 113
  6.4 Results on classification accuracy .... 114
  6.5 Limitations of the research .... 115
  6.6 Future directions .... 117
Appendix A .... 119
  
 
 
  
 
List of Tables 
 
Table 1 .... 47
Table 2 .... 48
Table 3 .... 51
Table 4 .... 53
Table 5 .... 63
Table 6 .... 64
Table 7 .... 66
Table 8 .... 67
Table 9 .... 69
Table 10 .... 71
Table 11 .... 75
Table 12 .... 77
Table 13 .... 78
Table 14 .... 79
Table 15 .... 80
Table 16 .... 81
Table 17 .... 94
Table 18 .... 95
Table 19 .... 96
Table 20 .... 97
Table 21 .... 98
Table 22 .... 99
  
  
 
List of Figures 
 
Figure 1. Graphical representation of growth profile .... 5
Figure 2. Graphical representation of simulation sample: Cluster size by cluster type .... 6
Figure 3. Graphical representation of simulation sample: mixture proportion .... 7
Figure 4. Graphical representation of simulation sample: Cluster effect .... 7
Figure 5. Hierarchy of family of latent growth models .... 26
Figure 6. Graphical representation of unconditional MLGMM .... 30
Figure 7. Graphical representation of linear GMM .... 31
Figure 8. Conceptual representation of value added model .... 35
Figure 9. Bias estimates for cluster effect 1 (CE1) and cluster size 20 (CS20) .... 84
Figure 10. Bias estimates for cluster effect 1 (CE1) and cluster size 40 (CS40) .... 85
Figure 11. Bias estimates for cluster effects 2 and 3 (CE2 and CE3) .... 88
Figure 12. Bias estimates for cluster effect 4 and 5 (CE4 and CE5) .... 91
       
 
Chapter 1: Introduction 
 Fairness is necessary to successful evaluation. Evaluation can be based on simple 
measurement of the weights of things for comparison, or based on measuring something 
as abstract as a single effect within a complex system, such as the effect of teachers upon 
the progress of students' performance. Fairness in evaluation requires that all subjects are 
measured without bias (i.e., the evaluation score reflects the true property of subjects) 
and precisely (i.e., repeated measures yield the same results). An evaluation process that 
favors one group and penalizes another when both are measured identically is unfair. 
Fairness is essential to both simple and complex 
evaluation. A simple evaluation example is to compare the average weights of students in 
two classrooms, derived from the weights of individuals measured on a pair of scales. A 
complex evaluation example, and the focus of this research, is the value-added model 
(VAM; Sanders & Rivers, 1996), which is an evaluation of teachers based on a statistical 
estimate of student performance gains that are attributed to the effect of the teacher. 
 Statistical models, irrespective of their complexity, are always simplifications 
of the data they represent: when summarizing the weights of students in classrooms, the 
mean value is a (very simple) model that collapses over the distribution (Miles & 
Shevlin, 2000), thereby masking features such as whether the distribution is multimodal, 
whether there are outliers, and so forth. For example, if one classroom has more males, 
and this is not incorporated into the estimate of the mean (doing so would naturally result 
in a more complex summary of the classroom's weight), then one room could appear to have a 
higher average weight when in fact the difference between the groups' weights reflects 
their sex composition; a model which does not take the sex of students into account 
produces bias. Similarly, 
when applying the VAM approach to estimating teacher effects on student performance, 
assumptions are made that might affect the estimates or their 
interpretation/interpretability. The goal of this research is to assess the impact on VAM 
analyses of the assumption that all students are equally affected by the teacher's effect 
within a classroom, that is, that improvement of student performance due to the teacher 
does not vary systematically across students in the classroom. If there is systematic (as 
opposed to random) variation in the teacher effect within a classroom, and this leads to 
incorrect estimates of the teacher's overall effect, then this VAM assumption is not 
supportable. That is, if unaccounted for in the VAM analysis, this heterogeneity (of 
teacher effect on students) could bias the estimates of teacher effects and result in an 
unfair evaluation procedure.  
 In the classroom weights example, the comparison will not be fair if it involves 
two scales and one scale always shows a higher weight than the other scale for any given 
student. One scale is biased in this example. The other scale might have higher variation 
in its measurement (i.e., the scale shows wider variation of values for the same subject). 
This scale has less precision or, equivalently, higher error in this case. It is relatively 
easy to control the issues of bias or precision on the scales in this example. The example 
can be made more complex by adding other factors such as the proportion of males and 
females in each classroom, as noted above. If the groups being compared differ in a 
systematic way, they are not strictly comparable. The evaluations of students and teachers 
pose similar challenges, in terms of isolating the effects of interest while taking all 
sources of bias into account. The importance of fairness is naturally far greater since 
decisions and even policies are made based on these estimates. The VAM approach has 
many advantages over simpler models, because it permits the inclusion of multiple 
sources of bias and heterogeneity that could affect VAM-derived estimates. 
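The distinction between a biased scale and an imprecise scale can be illustrated with a 
minimal sketch. The true weight, the 2 kg offset, the noise levels, and the function name 
`weigh` are all hypothetical values chosen for illustration; none of them comes from this 
study. 

```python
import random
import statistics


def weigh(true_weight, offset, noise_sd, rng):
    """One reading from a hypothetical scale: truth + constant offset + random noise."""
    return true_weight + offset + rng.gauss(0.0, noise_sd)


rng = random.Random(2011)   # fixed seed so the illustration is reproducible
true_weight = 50.0          # kg, illustrative value only

# Scale A is biased (always reads about 2 kg high) but precise.
# Scale B is unbiased but imprecise (large random error).
readings_a = [weigh(true_weight, 2.0, 0.1, rng) for _ in range(1000)]
readings_b = [weigh(true_weight, 0.0, 1.5, rng) for _ in range(1000)]

bias_a = statistics.mean(readings_a) - true_weight   # systematic error of scale A
bias_b = statistics.mean(readings_b) - true_weight   # systematic error of scale B
sd_a = statistics.stdev(readings_a)                  # spread (imprecision) of scale A
sd_b = statistics.stdev(readings_b)                  # spread (imprecision) of scale B

print(f"Scale A: bias={bias_a:.2f} kg, sd={sd_a:.2f} kg")
print(f"Scale B: bias={bias_b:.2f} kg, sd={sd_b:.2f} kg")
```

In this sketch, averaging many readings removes the random error of scale B but does 
nothing about the constant offset of scale A, which is why bias and precision have to be 
examined separately. 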
 The "value-added" model is commonly implemented in multi-level modeling 
frameworks so as to capture the contribution of higher-level effects such as teachers 
(level 2) and schools (level 3) on the student's achievement and/or improvement (Sanders 
& Rivers, 1996). As with the proportions of genders in the previous example, there are 
several factors, such as student ethnicity, socio-economic status, previous performance 
level, or classroom size to control so as to minimize the systematic bias and errors, 
deriving from the classroom or school, that can affect the estimation of a particular 
teacher's "value" added.  
A common assumption for this type of teacher evaluation method is that the 
teacher's effect is constant within a classroom. In other words, all students are assumed to 
have received the same contribution or benefit from the teacher, so the teacher's effect on 
students is homogeneous. This assumption may be unreasonable for a student with a 
minor and undiagnosed disability (e.g., minor learning disability), a student who has no 
interest in education, or a student who lives in such conditions that study cannot be a 
priority (e.g., lack of food or safety in life).  In fact, this assumption is unrealistic. There 
are students who are unmotivated, who have priorities other than focusing on their 
studies, or who are simply unable to understand the instruction. These students do not receive 
any benefit from the teacher no matter how effective or ineffective she or he is. These, 
and other, unanticipated or unknown sources of heterogeneity can contribute bias and/or 
imprecision to estimates. As discussed in Chapter 2, a recent study of persistently low 
performing (PLP) students (Lazarus et al., 2010) strongly indicates that there is a group 
of students who do not receive benefit from traditional classroom instruction. The 
measure of a teacher's effect is likely to be different when there are different proportions 
of different types of students in each class; the presence of a group or class of students 
within a classroom represents a systematic effect that should be accounted for in the 
model and estimate. It is not entirely fair for teachers to be held responsible for students' 
improvement if the majority of students are not interested in learning. Fairness in 
evaluation of the instructor effect on student performance or gains cannot be established 
in this case unless the effects from such students are either negligible or adequately 
controlled in the evaluation procedure; simply assuming that they are is insufficient. The 
primary focus of this research is to investigate the impact that ignoring the non-performing 
classroom group in the last example has on the evaluation of a teacher's effect on students' 
gains in test scores, focusing on the bias and precision of the estimated teacher's effect.  
 This study is a simulation motivated by the situation where teacher effect on 
student performance must be measured so as to evaluate the teacher's quality, 
performance, or achievements. In this situation, we assume that there are two types of 
students in the classroom: fast and slow growers (or students with fast or slow growth 
profiles) in terms of the skills they are being taught, represented by both gains on 
standardized test scores (slope) and initial achievement level (intercept). Figure 1 shows a 
graphical representation of students' growth profiles in each of these two groups (the 
actual slopes for particular students vary around these two lines).  Each classroom may 
have different proportions of students with each growth profile, and that is represented in 
this simulation in order to determine whether unknown, or unmodeled, heterogeneity in 
student type (based on proportion of students with each growth profile), which is 
inconsistent with the VAM assumption that all students get the same effect from a given 
teacher, affects VAM-based estimates.  An additional challenge for VAM-derived 
estimates is that some teachers may actually facilitate the transition of students from the 
slow growth group to the fast growth group. This would have a substantial impact on 
students but may not be reflected within the current value added evaluation context, at 
least in the short term.  This aspect of the VAM approach is beyond the scope of this 
study, but represents additional aspects of teacher effects that should be evaluated for the 
fairest estimates of their quality, performance or achievement. 
 Figure 1. Graphical representation of growth profile 
To illustrate the study design, the simulation features are consistent with, for example, a 
school with six 8th grade classrooms: classrooms A, B, C, D, E and F, each with 40 
students. Since there are six classrooms in this example, the cluster number (CN) is six 
(in the simulation design itself, CN is 30, 60, or 90). Figures 2 through 4 illustrate such a 
school, with classes A to F and cluster size (CS) of 40.  
 
 There are three levels of classroom: high achieving classes A and B, moderately 
achieving classes C and D, and low achieving classes E and F. These three levels are 
shown as the cluster type 1, 2, or 3 in Figure 2.  Each cluster type has an equal number of 
classes, two classes (of CN=6 in this example) per cluster type, as shown in Figure 2.  
[Figure 2 layout: columns are Cluster Type 1, 2, and 3; the vertical dimension is Cluster Size (CS). Each cluster type contains 1/3 of CN: Classes E & F (type 1), Classes C & D (type 2), and Classes A & B (type 3).]
 
Figure 2. Graphical representation of simulation sample: Cluster size by cluster type 
 With this example, imagine that the proportion of students in the two growth 
groups differs among the three types of classrooms, which determines the overall 
achievement level of each classroom. High achieving classes A and B have all 40 students 
in the fast growth group. Moderately achieving classes C and D have 75% (30/40) of 
students in the fast growth group and 25% (10/40) of students in the slow growth group. 
In low achieving classes E and F, each growth group makes up 50%, or 20 students. 
Figure 3 shows the different proportion of students in each growth profile, or the mixture 
proportion (MP), in each cluster type (cluster type 1 is low achieving because it is the 
mixture with the highest proportion of low growth students, cluster type 2 is moderately 
achieving because it contains a lower proportion of low growth students, and cluster 
type 3 is high achieving because it is the mixture with the highest proportion of high 
growth students).  
 
[Figure 3 layout: columns are Cluster Type 1, 2, and 3; the vertical dimension is Cluster Size (CS). Cluster types 1 and 2 contain both Low Growth and High Growth students; cluster type 3 contains High Growth students only.]
 
Figure 3. Graphical representation of simulation sample: mixture proportion 
We are positing for the sake of this study that only students in the fast growth group 
receive any benefit of instruction from teachers; that is, the teacher's effect is zero for 
students in the low intercept/growth group. Figure 4 shows the teacher's effect based on the 
growth profile and mixture proportion. The teacher's effect is the same as the cluster 
effect (CE) in this study.   
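[Figure 4 layout: columns are Cluster Type 1, 2, and 3; the vertical dimension is Cluster Size (CS). In cluster types 1 and 2, part of each cluster receives No Cluster Effect and the remainder receives the Cluster Effect; in cluster type 3, the whole cluster receives the Cluster Effect.]
 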
Figure 4. Graphical representation of simulation sample: Cluster effect 
 In this scenario, it is very difficult for teachers with low achieving classes to 
obtain a high value added score as compared to teachers with high achieving classes, even 
if they have added identical value compared with teachers in high achieving classes, 
because the expected average teacher effect is attenuated by the group of students who 
are not responsive to any instruction. In other words, teachers are penalized, in terms of 
the estimation of their effectiveness, by the kinds of students they have in the classroom 
under the assumption that all students receive the same benefit from the instruction. 
Fairness in teacher evaluation cannot be established without accounting for the growth 
profile (type) of students.  
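The attenuation described in this scenario reduces to simple arithmetic, sketched below. 
The teacher effect of 1.0 is a hypothetical value, and the function name 
`expected_class_effect` is introduced only for this illustration; the mixture proportions 
are those of the six-classroom example. 

```python
def expected_class_effect(teacher_effect, fast_proportion):
    """Class-average teacher effect when only fast-growth students benefit.

    Slow-growth students are assumed to receive an effect of zero, as posited
    in the six-classroom example.
    """
    return fast_proportion * teacher_effect + (1.0 - fast_proportion) * 0.0


TEACHER_EFFECT = 1.0  # hypothetical, identical for every teacher

# Proportion of fast-growth students per classroom type (from the example).
fast_proportion_by_class = {
    "A/B (high achieving)": 1.00,
    "C/D (moderately achieving)": 0.75,
    "E/F (low achieving)": 0.50,
}

averaged_effect = {
    label: expected_class_effect(TEACHER_EFFECT, proportion)
    for label, proportion in fast_proportion_by_class.items()
}

for label, effect in averaged_effect.items():
    print(f"{label}: class-average effect = {effect:.2f}")
```

Under these assumptions, a teacher of classes E and F appears half as effective as a 
teacher of classes A and B, although every teacher adds the same value to each responsive 
student. 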
 This study systematically investigates the bias in estimating teacher effects when 
student type is unmodeled, that is, under the VAM assumption that the students receive a 
homogeneous (randomly, not systematically, varying) effect from the teacher, by 
manipulating conditions identified in the example above, including cluster number (CN), 
which was fixed at 6 in Figures 2-4 but which varies as described below and more 
extensively in Chapters 2-4. The growth profiles are consistent throughout the study: high 
mean score or intercept and high growth rate or slope for the fast growth group, and 
low mean score and low growth rate for the slow growth group. The actual parameters 
are explained in Chapter 3. This study manipulates conditions used in the 8th grade school 
example above, including the class size, number of classes in a school, proportions of 
students in each growth group, and teachers with different effects. The study seeks to 
identify conditions, and/or interactions among conditions, which are plausible or 
empirically established, and which have the greatest potential to cause unfairness in 
evaluation. 
 There are four simulation conditions to manipulate, as illustrated in Figures 2 
through 4. 
• Cluster number (CN) is the number of clusters (e.g., classes) in the sample (e.g., 
school). 
• Cluster size (CS) is the number of students in a cluster (e.g., 40 students in a 
classroom). 
• Mixture proportion (MP) is the proportion of students in each growth profile 
within a cluster (e.g., 100%, 75%, or 50% of students in fast growth group in 
three levels of classes). 
• Cluster effect (CE) is a cluster's true effect (i.e., an individual teacher's effect) on 
the students in a cluster (e.g., classroom). 
This study systematically manipulated these four simulation conditions to investigate the 
bias and precision in an estimated cluster effect (or teacher's effect) when estimated with 
or without accounting for the heterogeneity (i.e., mixture proportion) in data. The goal of 
the study is to identify whether there are substantial, systematic biases in the cluster effect 
estimates in any simulation conditions that would make fair evaluation difficult, if not 
impossible. The simulation conditions of this study are still much simpler than the real 
world; as noted earlier, statistical models, irrespective of their complexity, are always 
simplifications of the data (or the systems) they represent, and so are simulation studies. 
However, the study was designed to determine whether the VAM approach, and specifically 
the assumption of a homogeneous, or randomly varying, teacher effect for all students, is 
reasonable. The simulation study is described completely in Chapter 3.  
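As a rough illustration of how the four manipulated conditions combine into a simulation design, the sketch below crosses a few levels of each factor. The levels shown are hypothetical placeholders, and the names `build_design`, `cluster_numbers`, `cluster_sizes`, `mixture_proportions`, and `cluster_effects` are introduced here only for illustration; the levels actually used are specified in Chapter 3.

```python
from itertools import product

# Hypothetical factor levels, chosen only for illustration; the levels
# actually used in the study are specified in Chapter 3.
cluster_numbers = [10, 20]              # CN: clusters (classes) per sample
cluster_sizes = [20, 40]                # CS: students per cluster
mixture_proportions = [1.0, 0.75, 0.5]  # MP: share of fast-growth students
cluster_effects = [-0.5, 0.0, 0.5]      # CE: true cluster (teacher) effect


def build_design():
    """Return every combination of the four manipulated conditions."""
    names = ("CN", "CS", "MP", "CE")
    grid = product(cluster_numbers, cluster_sizes,
                   mixture_proportions, cluster_effects)
    return [dict(zip(names, levels)) for levels in grid]


design = build_design()
print(len(design), "cells; first cell:", design[0])
```

A fully crossed grid is only one possible layout; a design in which MP or CE varies across clusters within the same sample would need a per-cluster specification instead.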
1.1 Heterogeneity and estimation 
 Heterogeneity of the student population or distribution is a key characteristic of 
data that can complicate evaluation design and especially, analysis and interpretation. 
Heterogeneity can be either random or systematic. Random variability is what makes 
estimation necessary; otherwise there would be a single value to summarize any effect or 
system. Systematic variability is the heterogeneity that makes estimation complex, 
because, as noted earlier, it leads to bias and imprecision in estimation if it is not included 
in the analytic model. In the context of VAM-derived teacher effects, the systematic 
variability, or heterogeneity among the students belonging to high and low growth groups 
as described in Figures 2-4, represents a clear violation of the assumption that student 
growth patterns are homogeneous (i.e., vary only randomly) within a classroom. 
Therefore, if student growth patterns vary systematically (i.e., are heterogeneous) within 
classrooms, the variability attributed to the unmodeled heterogeneity inflates variability 
higher up in the model (e.g., at level-2, classroom/teacher), causing mis-estimation and 
even mis-interpretation of the teacher's effect. When it is assumed that students are 
homogeneous, or that the teacher's effect on the students varies only randomly across 
students, then any actual heterogeneity represents unknown or uncontrolled sources of 
variability in the system, violating the VAM assumption. 
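To make this concern concrete, the following sketch simulates two classrooms whose teachers have identical true effects but whose students are drawn in different proportions from a fast and a slow growth group. All numeric values and names (`simulate_classroom`, `FAST_GAIN`, `SLOW_GAIN`) are assumptions made for illustration, and group membership is treated as known here, whereas in the study it is latent and must be inferred.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical values for illustration only.
FAST_GAIN, SLOW_GAIN = 10.0, 4.0   # mean yearly gain by growth group
NOISE_SD = 2.0
N_STUDENTS = 400                   # per classroom; large to keep noise small


def simulate_classroom(p_fast, teacher_effect):
    """Return (gains, is_fast) for one classroom."""
    is_fast = p_fast > rng.random(N_STUDENTS)
    base = np.where(is_fast, FAST_GAIN, SLOW_GAIN)
    gains = base + teacher_effect + rng.normal(0.0, NOISE_SD, N_STUDENTS)
    return gains, is_fast


# Both teachers have the same true effect (zero) but different mixtures.
gains_a, fast_a = simulate_classroom(p_fast=0.75, teacher_effect=0.0)
gains_b, fast_b = simulate_classroom(p_fast=0.50, teacher_effect=0.0)

# Naive contrast: ignores growth-group membership.
naive_diff = gains_a.mean() - gains_b.mean()

# Group-aware contrast: compare like with like, then average the two groups.
aware_diff = 0.5 * ((gains_a[fast_a].mean() - gains_b[fast_b].mean())
                    + (gains_a[~fast_a].mean() - gains_b[~fast_b].mean()))

print(f"naive contrast: {naive_diff:.2f}, group-aware contrast: {aware_diff:.2f}")
```

Under these assumed values the naive contrast is expected to be near 0.25 x (10 - 4) = 1.5 even though neither teacher has any effect, which is the kind of distortion that unmodeled mixture proportions can introduce.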
  It is crucial to identify the relationship(s) that any dataset will be used to 
investigate, and what it is in the data that needs to be modeled, before conducting any 
research; specification of these two features will inform both study design and data 
collection, shaping the research design and/or hypothesis. However, planning for, and 
planning to account for, heterogeneities in data are also critical to the research process, to 
support valid interpretation of results from any statistical analysis. This study sought to 
investigate the effects of unaccounted-for heterogeneity at the student level (i.e., level 1) on 
the precision of teachers' effect (i.e., level-2) estimates in multilevel data. 
The importance of modeling variability explicitly is reflected in both empirical 
studies (e.g., Clogg & Goodman, 1985; Goodman, 1974; Henry & Muthén, 2010; Jo, 
2002; Kreuter & Muthén, 2008; Lambert, 1992; Lazarsfeld & Henry, 1968; Muthén & 
Shedden, 1999; Nagin, 1999; Nagin & Land, 1993; Samuelsen, 2005) and 
methodological developments (e.g., Asparouhov & Muthén, 2008; Bartholomew & 
Knott, 1999; Bollen & Curran, 2006; Goodman, 1974; Lazarsfeld, 1950; McLachlan & 
Peel, 2000; Muthén, 2001; Nagin, 1993; Quandt, 1958; Quandt & Ramsey, 1972; 
Raudenbush & Bryk, 2002; Skrondal & Rabe-Hesketh, 2004; Titterington, Smith & 
Makov, 1985; Verbeke & Lesaffre, 1996; Vermunt & Van Dijk, 2001).  Methodological 
work has benefitted from, and expanded to accommodate and model, the influence of both 
observed (manifest) and unobserved (latent) variables on the estimation and interpretation 
objectives of multivariate statistical analysis (Loehlin, 1998). Software such as Mplus 
(Muthén & Muthén, Ver 6.1, 2010) and Latent Gold (Vermunt & Magidson, Ver 4.5, 
2010) has both developed and supported the capacity of investigators to consider 
and analyze manifest and latent contributors to heterogeneity in their data (e.g., 
Feldman, Masyn & Conger, 2009; Henry & Muthén, 2010; Jo, 2002; Kreuter, Yan, & 
Tourangeau, 2008; Marsh et al., 2009; Preacher, Zyphur, & Zhang, 2010; Schaeffer et al., 
2006; Van Horn et al., 2009). As is explicated in Chapter 2, estimates from statistical 
models can vary depending on whether manifest and/or latent variables are modeled (see, 
e.g., Hancock & Lawrence, 2006; Muthén & Asparouhov, 2009), and particularly on 
whether these are modeled appropriately or not (e.g., Chen et al., 2010; Palardy & 
Vermunt, 2010). Since the estimates can vary in relation to these features, so, too, can the 
inferences based on those estimates. 
1.2 Multivariate analytic methodological innovations for heterogeneity and precision 
 Multivariate methods have been developing and evolving with increasing, and 
increasingly sophisticated, mechanisms for modeling both the heterogeneity in data and 
the actual relationships under study. A recent development is the multi-level model, 
which has become increasingly widespread in educational research. In 1972, Lindley and 
Smith published the first multi-level model in the Journal of the Royal Statistical Society 
(Lindley & Smith, 1972); it was developed to accommodate heterogeneity in the 
individual that detracted from the precise estimation of the group-level parameter of 
interest. Random effects can include variation in group-level parameters (e.g., group-level 
mean/intercept), or the degree of individual mean deviation from the overall group-level 
mean. The terms multilevel or hierarchical describe the situation where sets of 
observations are treated as levels, hierarchically nested within other sets or levels, such as 
students nested within schools (Nezlek & Zyzniewski, 1998). Verbeke and 
Molenberghs (2000) describe the specification of random effects as the second of a 
two-stage modeling method ("general linear mixed modeling", Ch. 3); considering their 
"stages" as levels corresponds to a multi-level model. The random effect represents an 
additional level of analysis, so that regression coefficients become random variables; with 
observations nested within, for example, individuals (for whom a single constant 
regression coefficient would be estimated). The multilevel model is generally used to 
account for the interdependence of individuals within the same group and model the 
effects of both individual-level and group-level variation (i.e., heterogeneity) on an 
outcome simultaneously (Pollack, 1998). 
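For reference, a minimal two-level random-intercept model of the kind described above can be written as follows. The notation (students i in classrooms j, a single level-1 predictor x) is a generic textbook form assumed here for illustration, not the specification used later in this study.

```latex
% Level 1: student i in classroom j
y_{ij} = \beta_{0j} + \beta_{1} x_{ij} + r_{ij}, \qquad r_{ij} \sim N(0, \sigma^{2})

% Level 2: classroom j
\beta_{0j} = \gamma_{00} + u_{0j}, \qquad u_{0j} \sim N(0, \tau_{00})

% Share of outcome variance attributable to classrooms (intraclass correlation)
\rho = \frac{\tau_{00}}{\tau_{00} + \sigma^{2}}
```

The intraclass correlation summarizes the interdependence of individuals within the same group: the larger it is, the more a single-level analysis understates the uncertainty in group-level effects.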
 Burstein, Linn, and Capell (1978) utilized multi-level data analysis to 
accommodate the presence of heterogeneity in regression estimators across classrooms 
within a single sample. Other investigators have focused on the bias introduced into 
estimation when within-group correlations are, or are not, explicitly accounted for within 
modeling (see Kreft & de Leeuw, 1998). The treatment of data as explicitly hierarchical, 
with observations at one level (e.g., at the individual level) nested within other levels 
(e.g., the group or class level) depends critically on how the levels and hierarchy are 
described and defined (see Kreft et al., 1995), and this is true with random effects, mixed 
effects, or multi-level models. Verbeke and Molenberghs (2000) considered observations, 
repeated over time, to be nested within individuals. Explicit modeling of the hierarchies 
in data is sometimes called hierarchical linear modeling (Raudenbush & Bryk, 2002). To 
maintain generality, we refer to this type of model as a multi-level model (MLM). 
Considering scores nested within students nested within schools makes the 
estimated student level (level 1) effects of pre- on post-test scores more precise and less 
biased (Raudenbush & Bryk, 2002). That is, by planning for and accommodating the 
heterogeneity arising from specific features in the data, the effect of variability on the 
estimates can be minimized. For example, if students within a classroom are more 
homogeneous (random variation is lower) than the overall student population, accounting 
for clustering of data (e.g., modeling students as if they are nested within a classroom) 
reduces overall error. In this example, accommodating the lower level of variability 
within this classroom improves precision and reduces bias in estimates based on this 
classroom by reducing error. Muthén and Asparouhov (2009) investigated the source of 
heterogeneity in multilevel data by treating a mathematics achievement score as level 1 
(student level), and student scores as nested within a school (level 2). Their two-level 
regression (actually, a mixture regression) model showed heterogeneous residual variance 
varying across level-1 covariates (p. 649), indicating the presence of additional, 
unaccounted-for variability at level 1 that goes beyond the multi-level modeling approach 
to accommodating variability. In fact, Muthén and Asparouhov (2009) found that the 
effects of level-1 covariates were different for estimates of both level-1 and level-2 
effects on the dependent variable when their additional variability was modeled at level 
1; thus, the explicit inclusion of covariates at level 1 had a significant impact on their 
results and interpretation. This finding, further described in Chapter 2, underscores the 
importance of the careful investigation of the sources of heterogeneity in multilevel 
analysis, beyond the simple nesting effects of student and school in the "conventional" 
two-level model. Muthén and Asparouhov (2009) concluded that the conventional 
two-level model, with effects estimated separately for student and for school, could not 
effectively eliminate the effects of level 1 heterogeneity on the estimation of level 2 
effects. This implies that simply using a multi-level modeling approach may be 
insufficient for valid modeling and interpretation of results, because variability at level 1, 
if unaccounted for, could affect estimates of level 2 effects. Thus, estimating student 
scores via a multilevel regression will be less precise and will be biased, if this 
conventional two-level model is used but unmodeled student-level covariates are actually 
contributing to the heterogeneity underlying within-school correlations among student 
scores. This is described more fully in Chapter 2. 
1.3 Consideration of latent covariates and the general mixture model 
As noted, Muthén and Asparouhov (2009) found that the conventional multilevel model 
was insufficient to yield precise estimates of level 2 effects; their solution was to utilize 
latent covariates, inferred from the data, because none of the manifest covariates had any 
explanatory power.  In fact, the latent covariate used to account for the variability at 
level 1 in Muthén and 
Asparouhov's (2009) analysis represents a latent class. Magidson and Vermunt (2004) 
describe a latent class as some factor causing "...some of the parameters of a postulated 
statistical model [to] differ across unobserved subgroups" (p. 175), where the categories 
(subgroups) of this unobserved or latent categorical variable make up the levels of the 
latent class (LC). Therefore, an LC is a subgroup indicator, like a covariate, but it is latent 
and must be inferred from data. An example of a latent class model, first noted by 
Lazarsfeld (1950), includes the classification of applicants into subgroups (e.g., 
acceptance and rejection groups for uniformed services recruits) that were estimated 
from a set of dichotomous responses on a questionnaire (see also Lazarsfeld & Henry, 
1968). That is, the classification of applicants was not directly observed; rather, the latent 
(unobserved) classes into which the applicants were sorted were inferred from their 
dichotomous responses. 
An example of the development and growing support of the capacity of 
investigators to consider and analyze both manifest and latent contributors to 
heterogeneity is a family of methods called "mixture models" (Muthén, 2002).  Verbeke 
and Molenberghs (2000) refer to a "mixture" as a regression that includes both random 
and fixed effects. However, in the more general context (as described in Muthén, 2002), 
mixture models are a type of statistical method used to conduct an analysis while 
simultaneously examining whether there is more than one sub-population (e.g., at least two 
subgroups with different distributions) in the data (e.g., Muthén, 2002; Magidson & 
Vermunt, 2004).  
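The idea of searching for sub-populations while estimating their parameters can be sketched with the simplest case, a two-component univariate normal mixture fitted by the EM algorithm. This is a generic textbook routine written for illustration (the function name `fit_two_class_mixture` and the simulated data are assumptions), and it is not the estimator implemented in Mplus or Latent Gold.

```python
import numpy as np


def fit_two_class_mixture(y, n_iter=200):
    """EM for a two-component univariate normal mixture."""
    y = np.asarray(y, dtype=float)
    # Crude starting values: split the sample at its median.
    median = np.median(y)
    mu = np.array([y[median >= y].mean(), y[y > median].mean()])
    sd = np.array([y.std(), y.std()])
    p = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior class probabilities for every observation.
        z = (y[:, None] - mu) / sd
        dens = p * np.exp(-0.5 * z ** 2) / (sd * np.sqrt(2.0 * np.pi))
        post = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update proportions, means, and standard deviations.
        nk = post.sum(axis=0)
        p = nk / len(y)
        mu = (post * y[:, None]).sum(axis=0) / nk
        sd = np.sqrt((post * (y[:, None] - mu) ** 2).sum(axis=0) / nk)
    return p, mu, sd


rng = np.random.default_rng(1)
# Hypothetical data: 60% "slow" gains (mean 4) and 40% "fast" gains (mean 10).
y = np.concatenate([rng.normal(4.0, 1.0, 600), rng.normal(10.0, 1.0, 400)])
p, mu, sd = fit_two_class_mixture(y)
print("proportions:", p.round(2), "means:", mu.round(2))
```

Class labels are never supplied to the routine; the proportions and class-specific means are recovered from the shape of the pooled distribution alone, which is the sense in which the classes are latent.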
Mixture models (in this more general sense) have been applied in research 
domains as diverse as organization (Lazarsfeld, 1950), education (Dayton, 1991), and 
medicine (Rindskopf & Rindskopf, 1990). In each case, some analytic method (e.g., 
linear regression) is the objective, but subpopulations in the data may warrant different 
regression features.  
The most general mixture model can be defined as analysis that includes the 
search for latent subpopulations while simultaneously estimating statistical models 
including several causal effects, which is beyond straightforward multiple regression. For 
example, multilevel, structural equation, growth, and the combination of these types of 
modeling approaches fall under "general mixture models" (see Bartholomew, 1987; 
Muthén, 1989; Muthén, 2001; Skrondal & Rabe-Hesketh, 2004; Vermunt & Magidson, 
2002). Latent class analysis and finite mixture modeling (McLachlan & Peel, 2000) are 
technically subsumed within mixture modeling, as they are very specific types of mixture 
models. This most general formulation of mixture models, which we refer to as "the 
general mixture model approach", comprises models ranging from simple estimation of a 
latent class or a finite mixture, through less complex models with simultaneous latent 
class or finite mixture evaluation, to highly complex modeling such as latent growth plus 
latent class/finite mixture combinations. 
The general mixture model approach has the potential to completely reshape how 
educational research is done.  For example, with general mixture models one can 
identify differential patterns of growth in a group of students while simultaneously 
identifying the subgroups within the sample for which targeted interventions (e.g., 
different types of instruction) can be tailored. In educational research, manifest variables 
such as socio-economic status (SES, low/high) are often important covariates, but these 
should not be confused with latent class (LC) variables. Less general models (e.g., latent 
class or finite mixture models) cannot serve these purposes because the primary focus of 
the less general models is to identify the latent class from the set of observed categorical 
or continuous variables, instead of permitting the identification of such classes from 
estimates derived from other simultaneous analyses (see, e.g., Goodman, 1974; 
Hagenaars & McCutcheon, 2002; Muthén, 2000; Nagin, 1999).  
It is important to note that latent class analysis is a valid and useful analytic 
method in research where the identification of latent classes is the primary focus. In the 
present context, however, the latent classes represent a complicating feature of the 
estimation (of teacher effect), introduced with the intention of reducing bias, and are not 
an end in themselves. 
Exemplifying this potential, Muthén and Asparouhov (2009) used a multilevel 
mixture model, instead of the conventional multilevel model, where subgroups of 
students were identified within the latent variable "student type" with levels "fast learner" 
or "slow learner". This student level (level 1) latent class variable (LC) accounted for the 
heterogeneity in level 1 residual variance that was unaccounted for by observed 
covariates or the conventional multilevel model; the mixture model that included this LC 
also identified effects which were estimated at the school level (level 2), ultimately 
changing the estimated effects of covariates at both student and school levels, and leading 
to different interpretations of parameter estimates than was supported by the conventional 
two-level model. They also tested for the presence of a LC at the school level and found 
that, although such a level 2 LC could be identified, it had a very limited impact upon the 
estimation or interpretation of other parameters.  Muthén and Asparouhov's (2009) 
example showed the importance of thorough investigation of heterogeneity in variance at 
each level and in particular, that the conventional multi-level model will not always 
suffice to limit bias and optimize precision of estimates.  
1.4 The general mixture model and latent growth curves in growth mixture models 
Just as hierarchies in data led to multivariate methodological developments such 
as the multi-level model, individual effects in intercepts and slopes of repeated measures 
datasets led to the development of the latent growth curve model (or growth/growth curve 
model, e.g., see Preacher, Wichman, MacCallum, & Briggs, 2008). The purpose of 
growth models is to model change over time with particular emphasis on the variability in 
starting points (i.e., intercepts) and/or change over time (i.e., growth/slope). A latent 
growth curve mixture model or growth mixture model (GMM) is an extension of the latent 
growth curve model (LGM). The idea behind GMM is to permit further examination, 
and estimation, of the heterogeneity of growth trajectories that may be explained by 
latent classes. For example, there may be groups of students with distinctive growth 
trajectories that cannot be explained well by one set of slopes, intercepts, and their 
correlations. As noted earlier, accounting for heterogeneities in data is critical to support 
valid interpretation of results from statistical analysis. The inclusion of slopes and 
intercepts (growth curve modeling) as multiple levels (multi-level modeling), plus 
identification of important covariates such as student type (mixture modeling) are united 
in the estimation underlying the growth mixture model.  
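A rough sense of why a single set of intercepts and slopes can be inadequate is given by the sketch below, which simulates repeated scores for two trajectory classes and then fits a separate ordinary least squares line to each student. This two-stage approach is a simplification assumed for illustration and is not a latent growth model; the class parameters and the name `simulate_scores` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
times = np.array([0.0, 1.0, 2.0, 3.0])   # four measurement occasions

# Hypothetical class-specific growth parameters: (mean intercept, mean slope).
CLASS_PARAMS = {"slow": (40.0, 2.0), "fast": (50.0, 6.0)}


def simulate_scores(label, n):
    """Scores for n students: rows are students, columns are occasions."""
    b0, b1 = CLASS_PARAMS[label]
    intercepts = b0 + rng.normal(0.0, 2.0, n)
    slopes = b1 + rng.normal(0.0, 0.5, n)
    noise = rng.normal(0.0, 1.0, (n, len(times)))
    return intercepts[:, None] + slopes[:, None] * times + noise


scores = np.vstack([simulate_scores("slow", 150), simulate_scores("fast", 150)])

# Per-student OLS line; np.polyfit fits every column of a 2-D y at once and
# returns coefficients from the highest power down (slope, then intercept).
slope_hat, intercept_hat = np.polyfit(times, scores.T, deg=1)

print("mean slope, slow class:", slope_hat[:150].mean().round(2))
print("mean slope, fast class:", slope_hat[150:].mean().round(2))
print("pooled mean slope:     ", slope_hat.mean().round(2))
```

The pooled mean slope falls between the two class means and describes neither group well, which is the situation a growth mixture model is intended to detect and accommodate.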
The multilevel extension of GMM was recently introduced and has been applied 
in education (e.g., Muthén & Asparouhov, 2009; Palardy & Vermunt, 2010). The Muthén 
and Asparouhov (2009) example outlined above can be generalized to other educational 
outcomes like the evaluation of teacher effectiveness, which would typically be 
estimated using a value added model (VAM; McCaffrey et al., 2003; Sanders & Rivers, 
1996). As will be explained in Chapter 2, the VAM is actually a special case of the 
GMM, suggesting that growth/growth mixture modeling is a natural tool for estimating 
the development of student capabilities over time, as well as other effects (e.g., teacher 
and school) that could be, and may need to be shown to be, contributing to students' 
growth. 
1.5 Multilevel growth mixture modeling supporting precise estimation and inference 
The heterogeneity in data that obscures, or diminishes the precision of estimates 
of, parameters or relationships of interest can either be ignored (leading to 
imprecise/biased and possibly incorrect estimates) or modeled explicitly (also possibly 
leading to incorrect estimates if the modeling is not appropriate). Analytical 
developments have included finite mixture/latent class, multilevel, growth curve, 
mixture, and multilevel growth mixture modeling approaches, as described above. Each 
of these developments addresses previously unaccounted-for heterogeneity in data and 
precision in estimates. Similarly, the goals of this research address the potential impact of 
unaccounted-for heterogeneity at level 1 (e.g., student level) in the level-2 estimates (e.g., 
teacher), testing whether the inclusion of this type of heterogeneity merits further 
consideration for a group-level statistical evaluation procedure, including teacher or 
school evaluation with VAM. 
This research investigated model feature effects on the precision of individual 
parameter level estimates at level 2 of a multilevel growth mixture model. The goals of 
this study were to:  (1) investigate the bias and precision of level-2 parameter estimates in 
the multilevel model when level-1 effects are modeled incorrectly; and (2) evaluate the 
effectiveness of information criteria in identifying the true number of latent classes in MLGMM. 
These were accomplished via a simulation study, described more fully in Chapter 
3. Representing an educational study context, this simulation focused on the 
precision/bias and variability of estimation of level 2 (i.e., teacher-level) effects on 
student performance (level 1), that is, within a VAM framework. The research also 
investigated the issue of latent class identification/misidentification, which has the 
potential to cause serious estimation bias at more than one level (Chen et al., 2010). 
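Class enumeration with information criteria can be sketched as follows. The text does not name specific criteria at this point, so AIC and BIC are used here as two familiar examples; the log-likelihoods, parameter counts, and sample size are hypothetical, and the choice of sample size entering BIC in multilevel data is itself an assumption.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: -2 logL + 2p."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: -2 logL + p ln(n)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


# Hypothetical fits of 1-, 2-, and 3-class models: (log-likelihood, parameters).
fits = {1: (-5230.4, 8), 2: (-5105.9, 12), 3: (-5101.2, 16)}
n_obs = 800  # assumed number of level-1 units

best_by_aic = min(fits, key=lambda k: aic(*fits[k]))
best_by_bic = min(fits, key=lambda k: bic(*fits[k], n_obs))
print("AIC selects", best_by_aic, "classes; BIC selects", best_by_bic, "classes")
```

With these made-up values the two criteria disagree, because BIC penalizes the four extra parameters of the three-class model more heavily than AIC does; disagreements of this kind are why the performance of each criterion needs to be checked against a known true model.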
The dissertation is organized as follows: The different models, their comparisons, 
contrasts and implications for data and assumptions are described more fully in Chapter 
2. Chapter 3 presents the methodology that was used to complete this study. Chapter 4 
presents the results from a pilot study supporting the proposed simulation methodology. 
Results of the study are presented fully in Chapter 5 followed by the discussion of the 
research in Chapter 6. 

Chapter 2: Literature Review  
Heterogeneity is a characteristic of data that can complicate research design, 
analysis and interpretation. As outlined in Chapter 1, the multilevel growth mixture 
model (MLGMM; Asparouhov & Muthén, 2008; Muthén, 2004) is a new analytic 
method specifically developed to accommodate heterogeneity so as to minimize the 
effect of variability on precision in estimation and to reduce bias that may arise in 
hierarchical data. This is particularly important in the VAM context, where decisions 
and evaluations about teaching effectiveness are made, because estimates could be 
contaminated, biased, or simply less precise when modeled inappropriately. Therefore, 
the research questions for this proposed work are:  
1) Are the level-2 parameter estimates in the multilevel model affected (in terms of 
bias and precision) by incorrectly modeled level 1 effects? 
2) What information criteria can be used to identify the true number of latent 
classes in MLGMM? 
To answer these research questions, the following objectives were set for the 
simulation study:  
1) to contrast the estimation (precision and bias) of level 2 effects from a two-level 
growth model with heterogeneity at level 1 un-modeled (incorrectly specified) and a 
two-level growth mixture model with heterogeneity at level 1 modeled (correctly specified) in 
order to investigate the systematic biases associated with inappropriate model 
specifications in MLGMM and VAM;  
2) to investigate the effectiveness of information criteria to identify true models; 
3) to examine the accuracy of model identification in MLGMM.  
This chapter outlines the motivation and algebraic foundations for the research 
questions and study design. 
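The first objective above turns on how bias and precision are summarized across simulation replications. A minimal sketch of such a summary is given below; the function name `summarize_recovery` and the five estimates are hypothetical, and the study's own outcome measures are defined in Chapter 3.

```python
import numpy as np


def summarize_recovery(estimates, true_value):
    """Bias, empirical SD, and RMSE of replicated estimates of one parameter."""
    est = np.asarray(estimates, dtype=float)
    bias = est.mean() - true_value                       # systematic error
    emp_sd = est.std(ddof=1)                             # precision
    rmse = np.sqrt(np.mean((est - true_value) ** 2))     # overall accuracy
    return {"bias": bias, "sd": emp_sd, "rmse": rmse}


# Hypothetical estimates of one cluster effect over five replications.
summary = summarize_recovery([0.52, 0.65, 0.58, 0.71, 0.54], true_value=0.50)
print({name: round(value, 3) for name, value in summary.items()})
```

Reporting bias and empirical standard deviation separately keeps systematic error distinct from imprecision, while RMSE combines the two into one measure of accuracy.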
2.1 Educational effects estimation with growth curve mixture models 
Growth curve, mixture, and multi-level models are all very important in 
educational research (e.g., Boscardin et al., 2008; Muthén et al., 2003) and their 
combination, the growth curve mixture model, has also been promising (Muthén & 
Asparouhov, 2009; Palardy & Vermunt, 2010). Growth mixture models have also been 
applied in different fields including preventive intervention (e.g., Muthén et al., 2002), 
criminology (e.g., Kreuter & Muthén, 2008; Schaeffer et al., 2006), epidemiology (e.g., 
Croudace et al., 2003), and substance abuse (e.g., Boscardin et al., 2008). A common 
theme in this body of research is to identify latent classes from growth trajectories that 
are both substantively and statistically distinct. Thus, the results become more 
informative in that specific strategies (e.g., interventions) can be formulated for each 
level of the so-identified latent class. For example, Muthén et al. (2002) report that the 
effects of drug treatments were found not to differ statistically for the experimental group 
as compared to the placebo group in a placebo-controlled clinical trial; this failure to 
achieve significance was later determined to have been due to non-compliance (i.e., study 
subjects did not follow directions or take drugs as prescribed). Without the mixture 
approach, this analysis (a conventional multilevel model) would have led to the 
conclusion that the drug under study did not work better than placebo. However, when 
this non-compliance was included as a latent class (was/was not compliant, inferred from 
data) variable in their model, then the drug effects (relative to placebo) were identified 
(and estimated) in the compliant subgroup of the active arm, and additional design 
features were uncovered for future clinical trials (i.e., to specifically encourage 
compliance in the trial participants). Another benefit of the mixture analysis in this 
example was the improved precision of the estimated drug effect, derived from the 
compliant group versus the placebo group.  
Similarly, in an educational context, students may be classified into meaningful 
groups based on differences in growth trajectories over time. Identifying patterns of 
growth specific to different groups has the potential to inform the development of 
strategies supporting effective instruction for each group of students, whether academic 
(e.g., alternative instruction), behavioral/psychological (e.g., behavioral intervention), or 
social (e.g., individual counseling).  
In the context of VAM to estimate teacher effectiveness, some group of students 
(i.e., in one subgroup of the student-level latent class variable) might not receive any 
contribution from teachers, similar to the situation for the non-compliant group in the 
clinical trial example above; this subgroup of students may bias the estimates of teacher 
effect, whereas the teacher effect might be more precisely estimated among students who 
do benefit from the teacher (similar to the compliers in the clinical trial example). This 
study investigates a situation similar to this example, where there is a differential level-2 
effect on level-1 depending on the class (i.e., latent class) to which a subject in level-1 
belongs. 
2.2 The problem: Estimation with growth curve modeling 
As described in Muthén and Asparouhov (2009), researchers could reach a 
different conclusion if the level-1 latent class variable was ignored when it actually has a 
significant impact on estimation of level-2 effects on the dependent variable.  This 
sophisticated approach to the analysis of educational data is compelling, as evidenced 
by Muthén and Asparouhov's (2009) example; but the analytic complexity entails other 
challenges to be considered in order to support valid inferences. Namely, Palardy and 
Vermunt (2010) raised two important issues to consider in multilevel growth mixture 
models.  First, one must be extremely cautious with the use of covariates to identify latent 
classes, since covariates can change the distribution of random effects from which latent 
classes are identified. Second, the choice of whether to include random effects at a higher 
level, in addition to those at the lower level, could affect interpretation and results, 
because "the latent class and random effect compete to explain the same variability in the 
growth trajectory" (p. 555, emphasis added). This second point of Palardy and Vermunt 
(2010) is consistent with Muthén and Asparouhov's (2009) findings. Thus, there are 
many modeling "tricks" that could be brought to bear when heterogeneity is complex 
and/or unknown; but as noted above, analytic complexities bring their own challenges. 
Thus, the method has great potential, but this must be balanced against these two 
particular challenges. Therefore, to estimate the impact of ignoring these challenges, this 
simulation study evaluated the precision of level-2 estimates by introducing the level-1 
heterogeneity in the form of differential growth trajectories among level-1 subjects within 
a series of GMMs fit to simulated datasets (described in Ch 3). Including heterogeneity in 
the growth trajectories will show whether the identification of latent trajectories in 
MLGMM is really an important challenge. Estimation with growth curve modeling 
improves precision of estimates and accommodates the realistic condition of the nesting 
of observations within a classroom, but without examining the possibility of 
heterogeneity among the growth curves, the true potential of the method is unknown. 
As described in Chapter 1, linear growth or growth curve modeling is a statistical 
method that was relatively recently developed. It has become increasingly important in 
educational research in the past decade or so. For example, the U.S. Department of 
Education initiated the pilot study, Growth Model Pilot, in 2005 to address the 
potential/perceived unfairness in Adequate Yearly Progress (AYP) evaluation (No Child 
Left Behind Act of 2001, sec 6161). The introduction of governmental initiatives such as 
the Growth Model Pilot study (US Department of Education, 2005) and Race to the Top (US 
Department of Education, 2010), which place a strong emphasis on estimating (and 
thereby facilitating improvement in) the effectiveness of teachers, reinforces the need for 
unbiased and precise estimation of performance over time for both students (scores 
nested within students) and teachers (students nested within teachers). Thus, the methods 
introduced in Chapter 1 will become increasingly important in educational policy and 
decision making. As noted earlier, Palardy and Vermunt (2010) concluded that covariates 
may arbitrarily separate variability based on manifest classes in covariates such as 
ethnicity or socio-economic status; Muthén and Asparouhov (2009) noted that failures to 
accommodate covariates in growth mixture models can also adversely impact the 
precision and bias in growth model estimates. Therefore, although growth curve (and 
related) modeling methods exist and have been used in educational contexts including 
teacher evaluation, for these methods to be both useful and used appropriately, the 
impacts of latent classes within the growth curve modeling framework should be better 
understood, as was done in this study. 
 As was suggested in Chapter 1, the modeling methods explored in this study are 
closely related. In fact, growth mixture modeling (GMM; Muthén 2001; Muthén 2004) is 
a mixture extension of the latent growth curve or latent growth model (LGM), and 
MLGMM is a multilevel extension of GMM. Therefore, MLGMM is described in this 
section to provide background to the simulation structure and to show how GMM and 
LGM are special cases of MLGMM. Figure 5 shows the hierarchy of the family of latent 
growth models. VAM is a special case of the multi-level growth mixture model 
(MLGMM); it is equivalent to a MLGMM where there is only one class, i.e., there is no 
mixture because everyone is assumed to be in the same class. When MLGMM is used 
instead of VAM, because it does include latent class estimation, it does not assume that 
all students are in the same class.  
 
Figure 5. Hierarchy of family of latent growth models 
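The sense in which a one-class model is a special case of a mixture can be shown with a generic finite mixture density. The symbols below (class proportions p_k, class-specific densities f_k, and parameter vectors theta_k) are generic notation assumed for illustration and are not the notation of the formulation that follows.

```latex
% Generic finite mixture density for the outcome vector of student i in cluster j
f(\mathbf{y}_{ij}) = \sum_{k=1}^{K} p_k \, f_k(\mathbf{y}_{ij} \mid \boldsymbol{\theta}_k),
\qquad \sum_{k=1}^{K} p_k = 1 .

% One-class special case (K = 1, so p_1 = 1): the mixture disappears
f(\mathbf{y}_{ij}) = f_1(\mathbf{y}_{ij} \mid \boldsymbol{\theta}_1) .
```

With K = 1 every student is described by the same growth parameters, which corresponds to the homogeneity assumption attributed to VAM above.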
2.2.1 Algebraic representation of Multilevel Growth Mixture Model (MLGMM) 
The formulation of MLGMM has two parts: the within-group (i.e., level-1) and 
between-group (i.e., level-2) models. This formulation includes both within-group level 
and between-group level latent class variables. This study focused on the estimates from 
the between-level slope, but the entire formulation is presented below for context. A 
cluster represents the unit of the between-group or level-2 identifier in this paper and is 
used interchangeably with a group.   
Level-1 
• Within-group level measurement model 
$$Y_{tij} = \pi_{0ij} + \pi_{1ij}\, a_{1tij} + e_{tij}, \qquad e_{tij} \sim N(0, \sigma^2) \tag{1}$$
where $\pi_{0ij}$ is an intercept for individual i in cluster j, $\pi_{1ij}$ is a slope for 
individual i in cluster j, $a_{1tij}$ is a covariate at time t for individual i in cluster j, 
and $e_{tij}$ is an error at time t for individual i in cluster j. 
 
• Within-group level structural model for the intercepts and slopes (Level-1) 
o Intercepts 
$$\pi_{0ij} = \beta_{00j} + \sum_{k=1}^{K} \mu_{0jk}\, c_{kij} + \sum_{m=1}^{M} \beta_{0mj}\, X_{mij} + r_{0ij} \tag{2}$$
o Slopes 
$$\pi_{1ij} = \beta_{10j} + \sum_{k=1}^{K} \mu_{1jk}\, c_{kij} + \sum_{m=1}^{M} \beta_{1mj}\, X_{mij} + r_{1ij} \tag{3}$$
$$\mathbf{r}_{ij} \sim N(\mathbf{0}, \boldsymbol{\Sigma}_{r}) \tag{4}$$
• Model for subjects' latent class memberships, given their covariates 
$$\operatorname{logit}[P(c_{kij} = 1)] = \alpha_{0k} + \sum_{m=1}^{M} \alpha_{mk}\, X_{mij} \tag{5}$$
Level-2 
 
• Between-group level model  
o Intercepts 
$$\beta_{00j} = \gamma_{000} + \sum_{l=1}^{L} \delta_{0l}\, d_{lj} + \sum_{n=1}^{N} \gamma_{00n}\, W_{nj} + u_{0j} \tag{6}$$
o Slopes 
$$\beta_{10j} = \gamma_{100} + \sum_{l=1}^{L} \delta_{1l}\, d_{lj} + \sum_{n=1}^{N} \gamma_{10n}\, W_{nj} + u_{1j} \tag{7}$$
$$\mathbf{u}_{j} \sim N(\mathbf{0}, \boldsymbol{\Sigma}_{u}) \tag{8}$$
• Model for the between-group latent class variable and class membership 
$$\operatorname{logit}[P(d_{lj} = 1)] = \omega_{0l} + \sum_{n=1}^{N} \omega_{nl}\, W_{nj} \tag{9}$$
where  
 t: time point 
 i: individual 
 j: group/cluster 
 $a_{1tij}$: individual-level, time-related variable 
 $X_{ij}$: within-group level covariate  
 $W_{j}$: between-group/cluster level covariate 
 Equations 1 through 5 show the within-group level models and Equations 6 
through 9 show the between-group level models. In Equation 1, $Y_{tij}$ is the observed 
individual outcome at time/occasion t for individual i within a group/cluster j (e.g., 
school), $\pi_{0ij}$ is the expected value of Y for this individual when t = 0, $\pi_{1ij}$ is the 
expected slope/growth on the outcome for this individual, $a_{1tij}$ measures the 
time/occasions for this individual, and $e_{tij}$ is the residual/error associated with this 
model for this individual. It is possible to include more time/occasion variables to model 
other growth effects (e.g., a quadratic effect) in addition to the linear growth effect shown 
here. Equations 2 through 4 show the within-group level structural (repeated measures) 
model for intercepts and slopes, and Equation 5 shows the model for subjects' latent class 
memberships, given their covariates. Within-class intercepts and slopes are expressed 
with three factors: M covariates $X_{mij}$, K latent classes $c_{kij}$, and random effects 
$r_{0ij}$ and $r_{1ij}$. $c_{kij}$ is equal to one when individual i in cluster j belongs to latent 
class k and zero otherwise, where k = 1, 2, 3, ..., K and K is the total number of 
within-group latent classes. $\mu_{0jk}$ and $\mu_{1jk}$ are the mean intercept and slope 
values for within-group class k. Equation 5 represents a multinomial logistic regression 
describing the likelihood of membership in each level of the latent class variable, given 
the predictors, where k = 1 is the reference class level. 
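As a minimal illustration of the multinomial logistic model in Equation 5, the sketch below converts class-specific logits into membership probabilities, with class k = 1 as the reference class (its logit is fixed at zero). The function name, the coefficient values, and the covariate value are hypothetical and were chosen only for illustration; they are not estimates from this study.

```python
import math

def class_probabilities(intercepts, slopes, x):
    """Return membership probabilities for classes 1..K under a multinomial logit.

    intercepts[i] and slopes[i] hold hypothetical coefficients for class i + 2;
    class 1 is the reference class, so its logit is fixed at 0.
    x is the list of covariate values for one individual.
    """
    logits = [0.0]  # reference class
    for a0, a in zip(intercepts, slopes):
        logits.append(a0 + sum(am * xm for am, xm in zip(a, x)))
    denom = sum(math.exp(v) for v in logits)
    return [math.exp(v) / denom for v in logits]

# Two classes, one covariate (all values are illustrative assumptions).
probs = class_probabilities(intercepts=[-0.5], slopes=[[0.8]], x=[1.0])
print(probs)
```

With two classes the expression reduces to an ordinary binary logistic regression for membership in the non-reference class.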
 Between-group level Equations 6 through 9 are almost identical to the within-level 
Equations 2 through 5. Within-group heterogeneity in intercepts and slopes is regressed 
on three factors: between-group covariates, $W_{nj}$; the between-group latent class 
variable, $d_{lj}$; and random effects ($u_{0j}$ and $u_{1j}$), where d is the between-group 
latent class variable and L is the total number of between-group latent classes 
(l = 1, 2, 3, ..., L). $d_{lj}$ is one when cluster j belongs to latent class l and zero 
otherwise. $\delta_{0l}$ and $\delta_{1l}$ are the mean intercept and slope values for level l of 
the between-group latent class variable. Equation 9 represents a multinomial logistic 
regression describing the likelihood of class membership associated with predictors, 
where l = 1 is the reference class level. The 
errors/residuals in each of the within-level measurement model, within-level 
structural/repeated measures models, and between-group models are all assumed to be 
normal, independent across levels (e.g., between level-1 and level-2), and uncorrelated 
with the covariates. There are three levels of equations for MLGMM (i.e., two levels for 
within-group and one level for between-group). The term cluster is defined as the 
grouping unit at level-2. 
 Figure 6 is a graphical representation of the unconditional MLGMM based on the 
Muthén and Muthén (1993-2010) representation for the Mplus software. 
 
Figure 6. Graphical representation of unconditional MLGMM 
2.2.2 Algebraic representation of Growth Mixture Model 
The growth mixture model (GMM) is a special case of MLGMM where no 
between-group models are included. Formulation of GMM models is achieved by 
dropping the group/cluster notation j from MLGMM Equations 1 through 4 above, 
resulting in the following specifications: 
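• Individual level measurement model  
$$Y_{ti} = \pi_{0i} + \pi_{1i}\, a_{1ti} + e_{ti}, \qquad e_{ti} \sim N(0, \sigma^2) \tag{10}$$
• Individual level structural model for the: 
o Intercepts 
$$\pi_{0i} = \beta_{00} + \sum_{k=1}^{K} \mu_{0k}\, c_{ki} + \sum_{m=1}^{M} \beta_{0m}\, X_{mi} + r_{0i} \tag{11}$$
o Slopes 
$$\pi_{1i} = \beta_{10} + \sum_{k=1}^{K} \mu_{1k}\, c_{ki} + \sum_{m=1}^{M} \beta_{1m}\, X_{mi} + r_{1i} \tag{12}$$
$$\mathbf{r}_{i} \sim N(\mathbf{0}, \boldsymbol{\Sigma}_{r}) \tag{13}$$
• Model for the latent class variables 
$$\operatorname{logit}[P(c_{ki} = 1)] = \alpha_{0k} + \sum_{m=1}^{M} \alpha_{mk}\, X_{mi} \tag{14}$$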
 
Figure 7. Graphical representation of linear GMM 
It is clear from Equations 11 through 14 that the cluster, j, does not appear in any 
model, reflecting the assumption that individual growth factors are sufficient to estimate 
the effects of interest in the data. Figure 7 is a graphic representation of GMM, which is 
the same as the within-subject part of Figure 6 showing MLGMM. 
2.2.3 Algebraic representation of multilevel latent growth model 
A multilevel latent growth model (MLLGM) is a non-mixture case (i.e., without a latent 
class variable) of MLGMM. Therefore the formulation of MLLGM takes the MLGMM 
formulation and excludes both the within-level latent class variable, $c_{kij}$, and the 
between-level latent class variable, $d_{lj}$, from Equations 1 through 4 and 6 through 8, 
so they become: 
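• Within-group level measurement model 
$$Y_{tij} = \pi_{0ij} + \pi_{1ij}\, a_{1tij} + e_{tij}, \qquad e_{tij} \sim N(0, \sigma^2) \tag{15}$$
• Within-group level structural model for the intercepts and slopes 
o Intercepts 
$$\pi_{0ij} = \beta_{00j} + \sum_{m=1}^{M} \beta_{0mj}\, X_{mij} + r_{0ij} \tag{16}$$
o Slopes 
$$\pi_{1ij} = \beta_{10j} + \sum_{m=1}^{M} \beta_{1mj}\, X_{mij} + r_{1ij} \tag{17}$$
$$\mathbf{r}_{ij} \sim N(\mathbf{0}, \boldsymbol{\Sigma}_{r}) \tag{18}$$
• Between-group level model 
o Intercepts 
$$\beta_{00j} = \gamma_{000} + \sum_{n=1}^{N} \gamma_{00n}\, W_{nj} + u_{0j} \tag{19}$$
o Slopes 
$$\beta_{10j} = \gamma_{100} + \sum_{n=1}^{N} \gamma_{10n}\, W_{nj} + u_{1j} \tag{20}$$
$$\mathbf{u}_{j} \sim N(\mathbf{0}, \boldsymbol{\Sigma}_{u}) \tag{21}$$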
 The formulation of MLLGM is identical to an MLGMM in which the latent class 
variable has just one level (and everyone falls into this single level). The graphical 
representation of MLLGM (not shown) is also very similar to that of MLGMM; it is 
obtained by simply excluding the latent class variables C and D from Figure 6, along with 
any connections from or to these latent class variables (as in Equations 16 through 20). 
2.2.4 Algebraic representation of latent growth model 
The simplest form of MLGMM is the latent growth model (LGM) where there are 
neither latent class variables nor group/cluster information included in the model. LGM is 
expressed with the following four equations: 
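• Within-group level measurement model 
$$Y_{ti} = \pi_{0i} + \pi_{1i}\, a_{1ti} + e_{ti}, \qquad e_{ti} \sim N(0, \sigma^2) \tag{22}$$
• Within-group level structural model for the intercepts and slopes 
o Intercepts 
$$\pi_{0i} = \beta_{00} + \sum_{m=1}^{M} \beta_{0m}\, X_{mi} + r_{0i} \tag{23}$$
o Slopes 
$$\pi_{1i} = \beta_{10} + \sum_{m=1}^{M} \beta_{1m}\, X_{mi} + r_{1i} \tag{24}$$
$$\mathbf{r}_{i} \sim N(\mathbf{0}, \boldsymbol{\Sigma}_{r}) \tag{25}$$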
In LGM, two growth factors, representing intercept and slope, completely capture 
individual growth trajectories. 
2.2.5 Algebraic representation of the multilevel model 
The formulation of a longitudinal multilevel model (MLM) is similar to LGM; 
Equations 26 through 28 below are almost identical to Equations 22-24 for LGM. The 
formulation for a two-level unconditional (i.e., without covariates or explanatory 
variables) MLM is: 
• Level-1  
$$Y_{ti} = \pi_{0i} + \pi_{1i}\, T_{ti} + e_{ti}, \qquad e_{ti} \sim N(0, \sigma^2) \tag{26}$$
• Level-2 
$$\pi_{0i} = \beta_{00} + \beta_{01}\, X_{i} + r_{0i} \tag{27}$$
$$\pi_{1i} = \beta_{10} + \beta_{11}\, X_{i} + r_{1i} \tag{28}$$
where Y is a response variable, T is a time variable, t is a time or measurement occasion, i 
is an individual, and X is a time-invariant covariate.  
2.2.6 Contrasting multilevel (MLM) and latent growth (LGM) models 
The selection of one model from the family of latent growth models falling within 
the MLGMM classification might be dictated by the quality of the data (e.g., missing 
individual data or a sample size that is insufficient for the model complexity; see 
Muthén, 2004, 2006) and/or by the research questions under study. In LGM, time or 
measurement occasions are fixed, with values that must be pre-specified, while in 
MLM, time is a variable that can take any values representing a time, visit, or occasion. 
Therefore, LGM and MLM will have identical specifications and equivalent estimates 
when time or measurement occasions are fixed (e.g., t = 0, 1, 2, 3 in Equations 15 and 17). 
2.2.7 Value-Added Model as a multilevel model 
The value-added model (VAM) is a special case of multilevel model wherein the 
data have the specific hierarchical structure with students nested within a classroom and 
classrooms nested within a school.  
 
Figure 8. Conceptual representation of value added model 
Figure 8 shows a conceptual representation of VAM. The difference between a student's 
achievement as predicted by the model (red dotted line) and the actual 
achievement (black solid line) is the "value added" by the external factor (e.g., school 
and teacher, shown as the blue line) to be estimated.  The term "value-added" represents the 
emphasis on estimating the contribution of higher-level effects such as teachers (level 2) 
and schools (level 3) to the student's achievement and/or improvement (Sanders & 
Rivers, 1996).  
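The idea of value added as the gap between predicted and actual achievement can be sketched with a deliberately simple calculation. The example below averages the residual (actual minus predicted score) over the students assigned to each teacher. It is only a conceptual illustration under assumed data: the function name, teacher labels, and scores are hypothetical, and this plain residual average is not the estimator used by operational VAM systems or by the models in this study.

```python
from collections import defaultdict

def value_added_by_teacher(records):
    """Average (actual - predicted) score per teacher.

    records: iterable of (teacher_id, predicted, actual) tuples (hypothetical data).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for teacher, predicted, actual in records:
        sums[teacher] += actual - predicted
        counts[teacher] += 1
    return {teacher: sums[teacher] / counts[teacher] for teacher in sums}

# Four illustrative students nested within two teachers.
records = [
    ("A", 50.0, 54.0),
    ("A", 60.0, 62.0),
    ("B", 55.0, 53.0),
    ("B", 45.0, 45.0),
]
value_added = value_added_by_teacher(records)
print(value_added)
```

A positive average indicates that students scored above their predicted level, and a negative average indicates the opposite.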
       
 The simplest case of VAM is an unconditional two-level MLM (Doran & 
Lockwood, 2006; Singer, 1999), very similar to Equations 26 through 28 if the terms for 
predictors X are removed. Value-added modeling is currently being used in states such as 
Tennessee (Tennessee Value-Added Assessment System, TVAAS; Sanders & Rivers, 
1996) and North Carolina (Education Value-Added Assessment System, EVAAS; SAS 
Inc., 2010). These models are far more complex than the unconditional VAM and are 
intended, and are being used, for high-stakes evaluation, potentially influencing 
the employment status of teachers (see Springer et al., 2010, for current use of VAM for 
teacher evaluation).  
The primary focus of evaluation of performance using VAM is to identify and 
quantify the contributions of higher-level variables such as teachers, schools, and districts 
to the observed growth at level 1 (e.g., change in student scores over time); this is similar 
to the general objectives of the conventional MLM with a focus on the second- and 
higher-level estimates. The main difference between VAM and the 2-level unconditional MLM 
lies in both the addition of another level of hierarchy (e.g., schools within district) and 
the estimation of the effects of covariates associated with the additional level (e.g., level-3) 
on the level-1 outcomes. 
2.2.8 Effects of level-1 heterogeneity on the estimates of level-2 effects in MLGMM 
Muthén and Asparouhov (2009) applied the multilevel mixture model to simulated data 
and real data to demonstrate the use and utility of mixture modeling in educational 
contexts, progressively adding complexity to their models. 
First, they showed that a simple regression model could not fully explain group 
differences in math achievement between males and females because of the underlying 
heterogeneity in scores arising from a latent class variable with two levels (i.e., low and 
high achievers).  
The second example in Muthén and Asparouhov (2009) was a conventional MLM 
with a single (manifest) level-1 predictor, student-level socio-economic status. Muthén 
and Asparouhov (2009) showed the different conclusions derived from regression with 
and without consideration of the latent class, effectively comparing results from a 
conventional multilevel model against those of a multilevel mixture model. They 
showed that the effect of student-level covariates can affect the 
interpretation of the results from a conventional multilevel regression, since the 
student-level latent class variable interacted with the school-level predictor. They also showed 
that in the presence of a substantive latent class variable at the student level, especially 
when the class levels interact with the covariate, interpretation of results will depend on 
the latent class membership at level 1 and the value of the school-level covariate (i.e., at 
level 2). 
Finally, Muthén and Asparouhov (2009) used actual data to fit three multilevel 
mixture models that varied in complexity. There were both within-level and 
between-level predictors (i.e., covariates) in all three models. A "plausible" null model, the 
simplest one fit to the data, was an unconditional MLM. Three mixture models were fit to 
the data, each with latent class variables. One of these included both within-group and 
between-group latent class variables; the other two included only a within-group latent 
class variable. Of these two, the more complex model allowed both the intercept and the 
slope for the covariates to differ across classes, whereas the simpler model allowed only 
the intercept to differ. 
       
The fit of these models to the data was evaluated using a variety of indices 
designed to capture how well the models represented the relationships and variability in 
the data. The main fit statistic used to compare these models was the Bayesian Information 
Criterion (BIC; Schwarz, 1978), which represents the information in the data that is lost 
with each model's respective formulation (see Anderson, 2008). The fit of each model 
was also compared against that of a model without a latent class variable (a 
plausible "null" alternative model).  
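The BIC-based comparison described above can be sketched as follows. This is a minimal illustration using the standard definition BIC = -2 log L + p ln(n), where log L is the maximized log-likelihood, p is the number of free parameters, and n is the sample size. The log-likelihoods, parameter counts, and sample size below are hypothetical values, not results from Muthén and Asparouhov (2009) or from this study.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: smaller values indicate better fit."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Hypothetical (log-likelihood, number of free parameters) for candidate models.
candidates = {
    "1 class": (-5210.4, 9),
    "2 classes": (-5105.7, 12),
    "3 classes": (-5101.9, 15),
}
n_obs = 1200  # assumed sample size

bic_values = {name: bic(ll, p, n_obs) for name, (ll, p) in candidates.items()}
best = min(bic_values, key=bic_values.get)

for name, value in bic_values.items():
    print(f"{name}: BIC = {value:.1f}")
print("Lowest BIC:", best)
```

In this made-up example the third class improves the log-likelihood only slightly, so the penalty term outweighs the gain and the two-class model has the lowest BIC.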
Muthén and Asparouhov (2009) identified three specific impacts on estimates and 
inferences from the three mixture models, as compared to the conventional MLM: 
1. The degree of precision in level-2 estimates from the MLM was limited by the 
failure to capture the level-1 latent class variable.  
2. Estimates of level-2 effects were inflated in the conventional MLM compared 
to those from the three mixture models. 
3. The effects of predictors were significantly different between the conventional 
and mixture models; these effects were attenuated in all mixture models as compared to 
the conventional MLM estimates. This supports the importance of modeling the level-1 
heterogeneity with latent classes in order to avoid reaching the wrong conclusion by 
inflating the effect of covariates. 
Based on their exploration of the simple regression, conventional MLM, and 
MLGMM models and their respective fits to the data, in addition to the differing results 
and inferences supported under each analysis, the authors stated that "level-1 
heterogeneity in the form of latent classes is mistaken for level 2 heterogeneity in the 
form of the random effects that are used in conventional two-level regression analysis" 
(p. 655).  
For each of the three mixture models, Muthén and Asparouhov (2009) explored 
varying numbers of levels for the latent class variable representing the "mixture," and 
used BIC to select the version whose number of class levels was most consistent with 
the data (i.e., the number of latent class levels associated with the lowest BIC value for 
that particular model specification). They reported that all three 
mixture models fit the data significantly better than the model without a latent class 
variable, but the differences in fit among the three alternative mixture models were less 
pronounced. In fact, all three mixture models yielded substantively interpretable results, 
with a single number of latent class levels identified by BIC for each. Thus, in this case, 
fit and interpretability of classes (i.e., yielding classes that could be assigned 
substantively interpretable labels) did not identify a single "best" model. Therefore, in 
addition to demonstrating the utility and incremental improvements in interpretability that 
MLGMM brings to such analyses, Muthén and Asparouhov (2009) also underscored a new 
challenge that can arise from the application of this technique, namely, that fit and 
interpretability, which usually drive model selection, may not clearly differentiate 
reasonable alternative models derived under MLGMM. This is an additional consideration 
for the adoption of models that could be used in teacher evaluation and for decisions or 
policies made on the basis of such evaluations.  
       
Muthén and Asparouhov (2009) identified the importance of accounting for 
heterogeneity attributable to a latent class variable in the context of the MLM framework; 
this partly informed the design of this simulation study. Their use of information criteria 
for model selection was also integrated into this study, as described in Chapter 3. 
Specifically, this study included BIC, as they did, but also an assortment of other 
information criteria, in order to further understand BIC's specific functionality under 
MLGMM. 
2.2.9 Latent class variable identification in MLGMM 
Palardy and Vermunt (2010) and Chen, Kwok, Luo, and Willson (2010) 
investigated the issues of latent class variable identification and the precision of 
classification of individuals into the latent class variable's levels in MLGMM. The issue 
of latent class variable identification in MLGMM has been studied by several 
investigators (Muthén & Asparouhov, 2009; Palardy & Vermunt, 2010). Palardy and 
Vermunt (2010) reported that manifest covariates have the potential to change the 
distribution of random effects by which latent class variables are identified, thereby 
affecting identification of substantively interpretable latent class variables. Palardy and 
Vermunt (2010) recommended that manifest covariates be identified a priori, on 
substantive grounds, and that they not be derived via exploratory analysis of the data 
under study in mixture modeling.  
The other major issue identified by Palardy and Vermunt (2010) is that the random 
effects for the intercepts and slopes of growth trajectories that are estimated in a growth 
modeling framework will interact with the identification of latent class variables because 
they all compete to explain the same variability in the student-level data. Consistent with 
other studies (e.g., Bauer & Curran, 2004; Lubke & Neale, 2006), Palardy and Vermunt 
(2010) found that fixing random effects (i.e., fixing the error term on a group level 
equation for intercepts and/or slopes to zero) is likely to cause over-extraction of latent 
class variable levels. So, although the results from Muthén and Asparouhov (2009) 
underscored the importance of mixtures for this model type, namely the latent class 
variable at level 1 and its influence on estimates of, and inferences about, level-2 variable 
effects, Palardy and Vermunt (2010) identified pronounced challenges to the use of 
mixtures in the MLGMM context. This underscores the earlier point that increasingly 
complex models can serve important purposes for improving precision and decreasing 
bias in estimation and in decisions based on these estimates, but the more complex models 
often lead to other problems or issues. 
A different but related challenge is the effect of using mixtures, but not multilevel 
approaches, in growth modeling. Chen et al. (2010) investigated the effect of ignoring the 
nested structure on identifying the latent classes at level-1 in MLGMM. That is, they 
focused on the effects of erroneously treating hierarchical data as if there were no 
hierarchy (i.e., running a GMM instead of an MLGMM). Chen et al. (2010) found that 
ignoring the nested structure had relatively minor effects on latent class variable 
identification, in that a given mis-specified GMM correctly identified the latent class 
variable and class level membership for individuals in 80% to 90% of simulation 
conditions. By comparison, the correctly specified model, MLGMM, correctly recovered 
the class levels in 87% to 92% of the MLGMM conditions, so the GMM results were not 
off by much.  
       
However, Chen et al. (2010) found that the intraclass correlation of group, 
magnitude of within-class variance, and latent class mixture proportions each had a 
substantial effect on latent class level identification when the model was mis-specified 
(i.e., when data were analyzed with GMM), but not with the MLGMM. Other 
effects of ignoring the nested structure inherent in data were reported to be: 1) less 
precise fixed-effect estimates with greater standard errors; 2) overestimated variance 
estimates for effects of lower-level variables; and 3) less accurate standard error estimates 
for all parameter estimates. In general, incorrectly ignoring the nesting structure was 
determined to have less of an impact on the fixed-effect estimates than on random-effect 
estimates, but this particular type of misspecification (i.e., ignoring the nested structure in 
data) led to bias and imprecision that would have important implications for VAM 
applications. As the random effects are the estimates of interest in VAM applications, a 
failure to capture the nesting of the data could adversely impact policy and other 
high-stakes decisions (as was alluded to by Ballou, 2002). In sum, incorrectly ignoring the 
nesting structure was determined to have limited impact upon the identification of latent 
classes but to have an impact on the bias and precision of parameter estimates. 
2.3 Substantively interpretable latent class structure in VAM 
Two of the papers described above, Muthén and Asparouhov (2009) and Palardy 
and Vermunt (2010), have several important implications for VAM in terms of the 
correct analytic approach (MLGMM). A latent class variable, representing student 
performance and development, has been identified by two independent studies 
(Chudowsky et al., 2007; Lazarus et al., 2010). Both studies identified a subgroup of 
students that persistently performs at the lowest level. Students are known to be 
heterogeneous in their performance and their development (Chudowsky, Chudowsky, & 
Kober, 2007; Lockwood & McCaffrey, 2007; Lazarus et al., 2010), but they may also fall 
into more predictable (latent) classes that can complicate estimation with growth curve 
modeling, particularly if this predictable source of variability is ignored (e.g., as 
reported by Muthén and Asparouhov, 2009). Students who chronically perform at a low 
level over time have been characterized as "persistently low performing" students (PLP; 
Chudowsky et al., 2007; Lazarus et al., 2010). These students start off, and remain, at a 
low performance level over time, and are often distinct from students who start off at a 
higher level and remain at that level over time and from those who start higher or lower 
and exhibit change over time.  
In their study of student types, Lazarus et al. (2010) identified two groups of low 
performing students, low performing (LP) and persistently low performing (PLP) (see 
also Chudowsky et al., 2007).  LP students were defined to be those who scored at the 
10th percentile or lower on the statewide standardized test in one of the past three years. 
Persistently low performing (PLP) students were those who scored at the 10th percentile 
or below on the statewide standardized test for all three years. Those students identified 
as PLP were not performing so badly overall that they were eligible to take the alternate 
form of assessment (i.e., a test for students who are in the special education program), but 
their performance suggests that the regular achievement tests are simply too difficult for 
them.  Lazarus et al. (2010) discovered two demographic (manifest) variables that tended to 
characterize the PLP student type: they were more likely to be minorities, and more likely 
to be receiving free or reduced lunch (a proxy variable for low socio-economic status). 
Although these trends were observed for the manifest demographic variables, neither was 
statistically significantly predictive of belonging to the PLP student type. As Palardy and 
Vermunt (2010) suggested, predictor variables or covariates should not be included for 
exploration of latent class variables in MLGMM due to the potential interaction between 
them obscuring the identification of latent classes. Together with the PLP results of 
Lazarus et al. (2010), indicating that manifest covariates are not sufficient, or sufficiently 
explanatory, the results and recommendations by Palardy and Vermunt (2010) suggest 
that a latent class variable, based on slopes and intercepts, may be a more efficient and 
effective method of identifying students in this class. 
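The manifest (score-based) LP and PLP definitions described above can be written as a simple rule, sketched below. The function name and the example percentile ranks are hypothetical. The sketch also assumes that "one of the past three years" means at least one, but not all, of the three years; this reading is an assumption, not a detail confirmed by the cited studies.

```python
def classify_student(percentile_ranks, cutoff=10.0):
    """Classify a student from statewide percentile ranks in three years.

    Returns "PLP" if every year is at or below the cutoff, "LP" if at least
    one (but not every) year is at or below the cutoff, and "neither" otherwise.
    """
    low_years = sum(1 for rank in percentile_ranks if rank <= cutoff)
    if low_years == len(percentile_ranks):
        return "PLP"
    if low_years >= 1:
        return "LP"
    return "neither"

# Illustrative percentile ranks for three hypothetical students.
print(classify_student([8, 9, 10]))
print(classify_student([8, 35, 40]))
print(classify_student([45, 50, 60]))
```

A rule of this kind depends only on observed scores, in contrast to a latent class variable defined by intercepts and slopes.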
2.3.1 Potential impact of latent classes  
Palardy and Vermunt (2010) and Chen et al. (2010) are recent studies showing the 
impacts of inappropriate modeling of latent class variables (Palardy & Vermunt, 2010) or 
of the nested data structure (Chen et al., 2010) on the estimates of individual and group 
effects (i.e., slopes and intercepts) as well as their predictors. As stated before, growth 
curve (and related) modeling methods have considerable potential for educational research 
as well as for decision making and policies that are based on evaluations, but for these 
methods to be both useful and used appropriately, the impacts of latent class variables 
and hierarchical data within the growth curve modeling framework need to be fully 
investigated, particularly at the level of individual estimates (i.e., a parameter for each 
case) rather than at the effect level (e.g., overall group effect). 
2.4 Effects of un-accounted for heterogeneity at level 1 on the precision of level-2 
estimates in multilevel data  
This study built on the results of these three key studies (Chen et al., 2010; 
Muthén & Asparouhov, 2009; Palardy & Vermunt, 2010), and incorporated the PLP 
student type (Chudowsky et al., 2007; Lazarus et al., 2010) so as to estimate, and 
understand the magnitude of, bias in estimates at each stratum of the analysis. As 
described above, there are two issues in the identification of latent class variables, namely 
the assignment of individuals to levels of these variables and the appropriate estimation 
of effects of interest in MLGMM: 
1) Covariates affect the identification of latent class variables; 
2) Nested structure has limited impact upon the identification of latent class 
variables but can influence estimation and interpretation of random effects. 
Coupled with the potential importance of the MLGMM for education research and 
decision-making, and particularly the salience of the latent class variable described by 
Muthén and Asparouhov (2009) and the substantively important class of PLP students 
identified by Lazarus et al. (2010) and Chudowsky et al. (2007) in their analyses, this 
recent body of work motivated this effort to quantify these effects in the simulation study 
described in Chapter 3. However, identification of class membership at level 1 has not 
been shown to be influenced by these factors, nor is it often a consideration for decision 
making or VAM interpretability; therefore, assignment of individuals (at level 1) to levels 
of these variables was not pursued in this study. 
       
Chapter 3: Methods 
 Chapter 2 provided the background supporting the objectives of this study, which 
were to: (1) investigate the effect of unaccounted-for heterogeneity in growth at level 1 
on level-2 effects by comparing the level-2 effect estimates derived from a conventional 
MLM and from a multilevel growth mixture model (MLGMM); (2) examine the stability 
of level-2 effect estimates in MLGMM models; and (3) estimate the likelihood of class 
misidentification at level 1 in MLGMM and its consequences for level-2 estimates and 
their interpretation. To meet these research objectives, two research questions were 
investigated: 
1) Are the level-2 parameter estimates in the multilevel model affected (in terms 
of bias, and precision) by incorrectly-modeled level 1 effects? 
2) What information criteria can be used to identify the true number of latent 
class variable levels in the MLGMM context? 
This chapter describes the simulation study used to answer these questions. 
3.1 Characteristics of the simulation 
Table 1 shows the details of the simulation. Each combination of the four 
simulation conditions in Table 1 represents a longitudinal model from which 100 datasets 
were sampled. Each condition has three time points (t = 0, 1, 2). The simulation conditions 
combine to give a total of 120 models (2 × 3 × 4 × 5), and with 100 "samples" or 
replications from each, this yielded 12,000 datasets, which were built using SAS. The pilot 
study (see Chapter 4) results led to the increase in the number of replications to 100. The 
120 models were built according to the combination of characteristics representing each 
cell in Table 1. 
       
Table 1 
 
Simulation Manipulated Conditions 
 
Condition            Number of Levels 
Cluster Size         2 
Cluster Number       3 
Mixture Proportion   4 
Cluster Effect       5 
Total                120 
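The fully crossed design summarized in Table 1 can be enumerated as in the sketch below, which confirms the count of 120 models and 12,000 datasets. The cluster size and cluster number values are those given in Section 3.3.1; the labels for the mixture proportion and cluster effect levels are placeholders rather than the actual values used in the simulation.

```python
from itertools import product

cluster_sizes = [20, 40]                             # 2 levels (Section 3.3.1)
cluster_numbers = [30, 60, 90]                       # 3 levels (Section 3.3.1)
mixture_proportions = ["MP1", "MP2", "MP3", "MP4"]   # 4 levels (placeholder labels)
cluster_effects = ["CE1", "CE2", "CE3", "CE4", "CE5"]  # 5 levels (placeholder labels)

# Every combination of the four manipulated conditions is one simulation model.
design = list(product(cluster_sizes, cluster_numbers,
                      mixture_proportions, cluster_effects))

n_replications = 100
n_datasets = len(design) * n_replications

print(len(design), "models")
print(n_datasets, "datasets")
```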
 
3.2 Characteristics of individual (level 1) data 
Two different growth trajectories in individuals (i.e., at level 1) represented the 
level-1 heterogeneity in this simulation. The growth profiles of these two groups followed 
Chen et al. (2010), which in turn was based on Nylund et al. (2007): one with a 
steeper slope and one with a shallower slope (see Figure 1). Table 2 shows the parameter 
settings that were used to represent the growth profiles of the two class levels ("fast 
growing" and "slow growing") that were included within every model. These profiles 
were held constant across the simulation conditions by fixing the parameters listed in the 
leftmost column of Table 2 to the respective values shown in the Fast and Slow columns.  
  
       
Table 2 
 
Setting of Growth Parameters in order to obtain the two latent classes for every sample 
 
                   Growth profile 
Parameters         Fast    Slow 
Intercept mean     2.5     1 
Slope mean         0.6     0.1 
   
The fast growing individuals have intercepts (i.e., initial level) varying around 2.5 
and slopes (i.e., growth rate) varying around 0.5, while the slow growing individuals have 
intercepts varying around 1.00 and growth rates varying around zero.  These settings 
were selected because they create a clear separation of the two groups, and the parameter 
settings characterizing the slow growing group correspond to the PLP students (Lazarus 
et al., 2010) described in Chapter 2. A simulation study identifying two growth profiles 
utilizing GMM, with N=2400 (i.e., 1200 for each growth profile) and 100 replications 
found the average correct identification of two growth profiles at 91% with the minimum 
of 89% and the maximum of 93%. 
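To make the two growth profiles concrete, the sketch below generates linear trajectories at three time points (t = 0, 1, 2) for 1,200 individuals per profile, using the intercept and slope means listed in Table 2. The datasets in this study were built in SAS; this Python version is only an illustration. The standard deviations of the intercepts, slopes, and residuals, the random seed, and all function and variable names are assumptions made for the example.

```python
import random
import statistics

# (intercept mean, slope mean) for each growth profile, taken from Table 2.
CLASS_MEANS = {"fast": (2.5, 0.6), "slow": (1.0, 0.1)}
TIME_POINTS = (0, 1, 2)

def simulate_individual(profile, rng, sd_intercept=0.5, sd_slope=0.2, sd_error=0.5):
    """Draw one individual's scores; the three SD values are assumed, not from the study."""
    mean_intercept, mean_slope = CLASS_MEANS[profile]
    intercept = rng.gauss(mean_intercept, sd_intercept)
    slope = rng.gauss(mean_slope, sd_slope)
    return [intercept + slope * t + rng.gauss(0.0, sd_error) for t in TIME_POINTS]

rng = random.Random(2011)  # fixed seed so the example is reproducible
data = {
    profile: [simulate_individual(profile, rng) for _ in range(1200)]
    for profile in CLASS_MEANS
}

# Average score at each time point, by growth profile.
mean_by_time = {
    profile: [statistics.fmean(row[t] for row in rows) for t in range(len(TIME_POINTS))]
    for profile, rows in data.items()
}
for profile, means in mean_by_time.items():
    print(profile, [round(m, 2) for m in means])
```

The gap between the two profile means widens over time because the fast profile has the larger slope mean.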
3.3 Characteristics of cluster (level-2) data 
Four attributes define the characteristics of level-2 data, representing clusters in 
the hierarchy: 1) the cluster number; 2) the cluster size; 3) Cluster Types and the mixture 
proportion of individuals within a Cluster Type; and 4) the cluster effect as shown in 
Figures 2 through 4. Each attribute, and its role in the simulation, is described below. 
3.3.1 Sample size is determined by cluster size and cluster number 
The sample sizes that were used in this simulation are determined by the cluster 
number and cluster size (i.e., the level-2 characteristics), as illustrated in Figure 2. For 
this simulation, three levels of cluster number (CN) were chosen to represent small, medium, 
and large (CN = 30, 60, and 90, respectively) districts from which clusters might be drawn in 
VAM contexts. Chen et al. (2010) used CN = 30, 50, and 80; however, because the three Cluster 
Types (see Section 3.3.2, below) need to contain equal numbers of clusters (i.e., a Cluster 
Type size of 10, 20, or 30), cluster numbers divisible by three were used here. Keeping the 
Cluster Types equal in size controls for the potential effects of unequal Cluster Type 
sizes on the estimation of the value-added effect (i.e., the cluster effect). 
The cluster size (CS) is the number of observations, or individuals, within a 
cluster. For this simulation, CS was also based on the design used by Chen et al. (2010), 
namely, values of 20 and 40. Sanders and Rivers (1996) used a cluster size of 20 and 
Wright, Horn, and Sanders (1997) used a cluster size of 25 in their respective simulation 
studies of VAM; in both cases, they argued that these cluster sizes represented the average 
classroom size in the U.S. (at that time). This study included a cluster size of 40 because 
some schools and districts tend to have larger class sizes. 
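
The relationship between cluster number, cluster size, and total sample size can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the SAS code used in the study; the function and variable names are hypothetical.

```python
# Design values taken from Section 3.3.1; all names here are illustrative.
CLUSTER_NUMBERS = (30, 60, 90)   # CN: small, medium, large
CLUSTER_SIZES = (20, 40)         # CS: individuals per cluster
N_CLUSTER_TYPES = 3              # three equal-sized Cluster Types


def sample_size(cn: int, cs: int) -> int:
    """Total number of individuals in one simulated sample."""
    return cn * cs


def clusters_per_type(cn: int, n_types: int = N_CLUSTER_TYPES) -> int:
    """Clusters in each Cluster Type; CN must split evenly across the types."""
    if cn % n_types != 0:
        raise ValueError("cluster number must be divisible by the number of Cluster Types")
    return cn // n_types


for cn in CLUSTER_NUMBERS:
    for cs in CLUSTER_SIZES:
        print(f"CN={cn} CS={cs} N={sample_size(cn, cs)} "
              f"clusters per Cluster Type={clusters_per_type(cn)}")
```

Under this sketch the smallest condition (CN = 30, CS = 20) has 600 individuals and the largest (CN = 90, CS = 40) has 3,600; a cluster number such as 50 is rejected because it cannot be divided evenly among three Cluster Types.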
3.3.2 Cluster type  
 In a VAM context, if the level-2 data are conceptualized as representing the 
between-group level model (e.g., Equations 19-21 in Chapter 2), then the cluster type can be 
thought of as schools having different proportions of student types (i.e., fast and slow 
growth). In their study, Chen et al. (2010) only included clusters with the same proportions 
of fast and slow growth students in every cluster (i.e., mixture proportion conditions 3 
and 4 of this study). This simulation included three Cluster Types (see Table 3). As noted 
in Chapter 1 and the previous section, when each Cluster Type accounts for one-third of the 
clusters, it permits the evaluation of potential biases in the cluster effect estimates across 
Cluster Types having different proportions of students in the two growth classes. Further, 
in the evaluation of a school effect in a VAM context, it is unrealistic to expect all 
schools to have the same proportion of students in these two growth classes. Thus, unlike 
the Chen et al. (2010) study, this simulation included three equal-sized Cluster Types, 
with different mixture proportions (described below) within each Cluster Type. 
3.3.3 Mixture proportion 
The mixture proportion characterizes the prevalence of membership in each level of the 
latent class variable. As laid out above, the growth profiles (fast/slow growing) 
represent the two levels of the latent class variable. The mixture proportion dictates what 
proportion of the sample belongs to each of these levels (fast, slow). This study uses four 
patterns of mixture proportion. Two patterns involve mixture proportions that differ by 
Cluster Type; that is, each Cluster Type makes up one-third of the clusters, but the 
mixture proportions of fast and slow growers vary across Cluster Types (see Figure 3). The 
other two patterns, taken from the Chen et al. (2010) study, hold the mixture proportion 
constant across Cluster Types. Table 3 shows the mixture proportions for these four 
simulation conditions. In this study, the data were generated by creating populations 
representing the two classes (growth profiles) and then sampling, as dictated by the 
condition's mixture proportion, the appropriate number of observations from each of these 
classes. 
  
       
 
Table 3 
 
Definition of mixture proportion by Cluster Type  
 
Level-2 features                                   Growth (latent class, level-1 feature) 
Mixture Proportion Pattern (MP)   Cluster Type     Fast (%)   Slow (%) 
1                                 1                50         50 
                                  2                75         25 
                                  3                100        0 
2                                 1                25         75 
                                  2                50         50 
                                  3                75         25 
3                                 1, 2, 3          50         50 
4                                 1, 2, 3          75         25 
 
Mixture proportion conditions 1 and 2 investigate the influence of differential mixture 
proportion among cluster types, where condition 1 has less variability in proportion and 
condition 2 has more variability. Mixture proportion conditions 3 and 4 have consistent 
mixture proportions among cluster types. Simulation studies conducted by Muthén and 
Asparouhov (2009) and Chen et al. (2010) used settings similar to conditions 3 and 4. 
These two conditions assess the effect of mixture proportion across the other simulation 
conditions. 
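
The mixture proportion patterns in Table 3 can be read as a lookup from (pattern, Cluster Type) to the percentage of fast growers. The following Python sketch illustrates one way to turn that lookup into class memberships for a single cluster. It assumes that the proportions are applied as exact counts within each cluster, which the text does not state explicitly; the names are hypothetical and this is not the study's SAS code.

```python
import random

# Percent of fast growers by Mixture Proportion pattern and Cluster Type (Table 3).
FAST_PERCENT = {
    1: {1: 50, 2: 75, 3: 100},
    2: {1: 25, 2: 50, 3: 75},
    3: {1: 50, 2: 50, 3: 50},
    4: {1: 75, 2: 75, 3: 75},
}


def assign_classes(mp, cluster_type, cluster_size, rng):
    """Return shuffled class indicators (1 = fast, 0 = slow) for one cluster.

    ASSUMPTION: the mixture proportion is met exactly within every cluster.
    """
    n_fast = round(cluster_size * FAST_PERCENT[mp][cluster_type] / 100)
    classes = [1] * n_fast + [0] * (cluster_size - n_fast)
    rng.shuffle(classes)
    return classes


rng = random.Random(2011)
for cluster_type in (1, 2, 3):
    classes = assign_classes(mp=1, cluster_type=cluster_type, cluster_size=20, rng=rng)
    print(f"MP1, Cluster Type {cluster_type}: "
          f"{sum(classes)} fast, {len(classes) - sum(classes)} slow")
```

With a cluster size of 20 under MP1, this yields 10, 15, and 20 fast growers in Cluster Types 1, 2, and 3, respectively.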
3.3.4 Cluster effects 
The fourth feature of the level-2 data is the cluster, or cluster-level, effect. In the 
context of VAM analysis, the cluster-level effect represents the value-added effect, 
$\gamma_{11}$, shown in Equation 36. The cluster effects are the parameters of interest in 
this study. As can be inferred from Figure 8, and from Equations 19-21, the cluster effect 
is only defined (estimable) for individuals in the fast growth group in this simulation, 
because the slow growth group has a near-zero slope (see Table 2). That is, since VAM seeks 
to estimate the impact of higher-level variables on the development or change in the 
first-level variable (i.e., at the individual level), if there is no change, no value-added 
effect can be estimated.   
The cluster effect varied based on the makeup of the Cluster Type, even when 
Cluster Types are of equal size (i.e., 33% of the clusters per Cluster Type), as in this 
simulation. Table 4 shows the five cluster effect patterns used in the simulation, in the 
third column (Cluster effect parameters) of the table. The same number of individuals was 
assigned to each cluster effect value, in proportion to the cluster size (CS) and the 
number of cluster effect values (e.g., five values for the first cluster effect condition). 
As Table 4 shows, this study included five patterns of level-2 effects, three as 
fixed effects and two as random effects. This permitted the systematic investigation of the 
parameter recovery of cluster-level effects in the various simulation conditions. These 
effects, which represent the value-added effects, are the parameters of interest and are 
critical to addressing the research questions. 
 
 
  
       
 
Table 4 
 
Cluster effects defined by pattern of Cluster Type 
 
Cluster Effect Pattern (CE)   Cluster Type   Cluster effect parameters 
1                             1              (-1, -0.5, 0, 0.5, 1) 
                              2              (-1, -0.5, 0, 0.5, 1) 
                              3              (-1, -0.5, 0, 0.5, 1) 
2                             1              (-1, -0.5) 
                              2              0 
                              3              (0.5, 1) 
3                             1              (0.5, 1) 
                              2              0 
                              3              (-1, -0.5) 
4                             1, 2, 3        $\gamma_{11} \sim N(0, 0.5)$ 
5                             1, 2, 3        $\gamma_{11} \sim N(0, 1.0)$ 
 
Cluster effect condition 1 (CE1) has the same five parameter values across the three 
Cluster Types. This condition is specifically designed to evaluate the influence of 
differential mixture proportions (i.e., the MP1 and MP2 conditions) among Cluster Types in 
terms of the direction of biases (i.e., positive or negative) and the precision of 
estimates. Cluster effect conditions 2 (CE2) and 3 (CE3) also investigate the Cluster Type 
level bias and precision of parameter estimates between Cluster Types 1 and 3 (i.e., the 
fixed parameters are reversed between Cluster Types 1 and 3). These conditions were 
included to systematically investigate the extent of positive and negative bias in the 
parameter estimates. Random effects with different variances were generated for cluster 
effect conditions 4 and 5 (small variation for CE4 and large variation for CE5).  
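
The assignment of cluster effects described above can be sketched as follows. The fixed values come from Table 4; everything else is an assumption made for illustration: the function name is hypothetical, the second argument of N(0, .) is treated as a variance, and fixed values are spread evenly over the clusters of a Cluster Type.

```python
import random

# Fixed cluster effect values by Cluster Effect pattern and Cluster Type (Table 4).
FIXED_EFFECTS = {
    1: {1: (-1, -0.5, 0, 0.5, 1), 2: (-1, -0.5, 0, 0.5, 1), 3: (-1, -0.5, 0, 0.5, 1)},
    2: {1: (-1, -0.5), 2: (0,), 3: (0.5, 1)},
    3: {1: (0.5, 1), 2: (0,), 3: (-1, -0.5)},
}
# ASSUMPTION: the second argument of N(0, .) in Table 4 is a variance.
RANDOM_EFFECT_VARIANCE = {4: 0.5, 5: 1.0}


def cluster_effects(ce, cluster_type, n_clusters, rng):
    """Return one cluster effect per cluster for a given pattern and Cluster Type."""
    if ce in FIXED_EFFECTS:
        values = FIXED_EFFECTS[ce][cluster_type]
        if n_clusters % len(values) != 0:
            raise ValueError("clusters cannot be split evenly across effect values")
        per_value = n_clusters // len(values)
        # Equal numbers of clusters receive each fixed value.
        return [value for value in values for _ in range(per_value)]
    sd = RANDOM_EFFECT_VARIANCE[ce] ** 0.5
    return [rng.gauss(0.0, sd) for _ in range(n_clusters)]


rng = random.Random(2011)
print(cluster_effects(ce=1, cluster_type=1, n_clusters=10, rng=rng))
print(cluster_effects(ce=2, cluster_type=3, n_clusters=10, rng=rng))
```

With 10 clusters per Cluster Type, CE1 places two clusters at each of the five values, while CE2 places five clusters at each of the two values assigned to Cluster Types 1 and 3.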
3.4 Data simulation 
All data were generated in SAS (9.2, SAS Institute Inc., Cary, NC). Data generation was 
based on a 2-class MLGMM, varying parameters for each simulation condition as outlined 
in Tables 1 through 4. Based on the background given in Chapter 2, Equations 29-37 show 
how the data for this simulation were generated for the models with the characteristics 
outlined in Table 2 over three time points (t = 0, 1, 2). 
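Level 1 
\[ Y_{tij} = \pi_{0ij} + \pi_{1ij} a_{1tij} + e_{tij}, \quad e_{tij} \sim N(0, 1) \tag{29} \]
\[ \pi_{0ij} = \beta_{00j} + \beta_{01j} Class_{ij} + r_{0ij} \tag{30} \]
\[ \pi_{1ij} = \beta_{10j} + \beta_{11j} Class_{ij} + r_{1ij} \tag{31} \]
\[ \text{where } \begin{pmatrix} r_{0ij} \\ r_{1ij} \end{pmatrix} \sim MVN\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \tau_{\pi 00} & \tau_{\pi 01} \\ \tau_{\pi 10} & \tau_{\pi 11} \end{pmatrix} \right) \tag{32} \]
Level 2 
\[ \beta_{00j} = \gamma_{00} + u_{0j} \tag{33} \]
\[ \beta_{01j} = \gamma_{01} \tag{34} \]
\[ \beta_{10j} = \gamma_{10} \tag{35} \]
\[ \beta_{11j} = \gamma_{11} \tag{36} \]
\[ \text{where } u_{0j} \sim N(0, \tau_{\beta 000}) \tag{37} \]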
Based on the growth profiles shown in Table 2, in all simulation conditions the 
Level-1 residual variance (i.e., the variance of $e_{tij}$) is fixed at one. Equation 32 
specifies the magnitude of within-class variation (i.e., the variance-covariance of 
intercepts and slopes), which follows the Chen et al. (2010) specification of a medium 
magnitude or "low separation" condition (after Tofighi & Enders, 2008): 
$\tau_{\pi 00} = 0.20$, $\tau_{\pi 10} = \tau_{\pi 01} = 0.05$, and $\tau_{\pi 11} = 0.05$. 
The four group-level (Level 2) growth parameters (Equations 33-36) are set to the 
following values: $\gamma_{00} = 1.0$, $\gamma_{01} = 1.5$, $\gamma_{10} = 0.1$, and 
$\gamma_{11} = 0.5$, where Class in Equations 30 and 31 is a dichotomous indicator 
(1 = High Growth, 0 = Low Growth), so that the growth parameters $\beta_{11j}$ are 
correctly represented. An intraclass correlation (ICC) of 0.10, representing the magnitude 
of the intercept random effect (error/variability) relative to the total variability, 
translates to a parameter setting of $\tau_{\beta 000} = 0.133$ (Chen et al., 2010). 
The fixed cluster effects are manipulated according to the specification in Table 4. 
Equal numbers of fixed parameters (e.g., -1, -0.5, 0, 0.5, or 1) are assigned to individuals 
within each Cluster Type. 
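
As a check on the cluster-level variance, the ICC can be written as the share of intercept-level variability that lies between clusters. The decomposition below is an assumption rather than a formula quoted from Chen et al. (2010): the denominator is taken to be the sum of the cluster-level intercept variance (0.133), the within-class intercept variance (0.20), and the Level-1 residual variance $\sigma^2_e$ (1.0). Under that reading the stated values reproduce the target ICC.

```latex
\[
ICC = \frac{\tau_{\beta 000}}{\tau_{\beta 000} + \tau_{\pi 00} + \sigma^2_e}
    = \frac{0.133}{0.133 + 0.20 + 1.0}
    = \frac{0.133}{1.333}
    \approx 0.10
\]
```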
To contextualize these features within Table 1, in the simulation condition with 
cluster number (CN) = 30, cluster size (CS) = 20, mixture proportion condition 1, and 
cluster effect condition 1, there are 10 clusters for each Cluster Type, with the proportion 
of individuals in the fast growth class set to 50%, 75%, and 100% for Cluster Types 1, 2, 
and 3, respectively (see Table 3). The cluster effects are set to (-1, -0.5, 0, 0.5, and 1) 
for all Cluster Types, with two clusters for each cluster effect value (i.e., the number 
of clusters per Cluster Type (10) divided by the number of cluster effect values (5); 
10/5 = 2). 
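
The generating model in Equations 29-37 can be sketched for a single cluster as shown below. This is an illustrative Python reading of the model, not the SAS program used in the study. The parameter values are those given in this section; the names are hypothetical, and the way the cluster effect enters the model (added to the slope of fast growers only) is an assumption based on the description in Section 3.3.4.

```python
import math
import random

# Parameter values from Section 3.4; all names are illustrative.
GAMMA_00, GAMMA_01 = 1.0, 1.5               # slow-class intercept, fast-class intercept shift
GAMMA_10, GAMMA_11 = 0.1, 0.5               # slow-class slope, fast-class slope shift
TAU_00, TAU_01, TAU_11 = 0.20, 0.05, 0.05   # within-class (co)variances (Eq. 32)
TAU_CLUSTER = 0.133                         # variance of the cluster intercept effect (Eq. 37)
TIME_SCORES = (0, 1, 2)


def draw_growth_residuals(rng):
    """Draw (r0, r1) from the bivariate normal in Eq. 32 via a 2x2 Cholesky factor."""
    l11 = math.sqrt(TAU_00)
    l21 = TAU_01 / l11
    l22 = math.sqrt(TAU_11 - l21 ** 2)
    z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
    return l11 * z0, l21 * z0 + l22 * z1


def simulate_cluster(classes, cluster_effect, rng):
    """Simulate three repeated measures for every individual in one cluster."""
    u0 = rng.gauss(0, math.sqrt(TAU_CLUSTER))  # cluster intercept effect (Eq. 33)
    rows = []
    for cls in classes:                        # cls: 1 = fast growth, 0 = slow growth
        r0, r1 = draw_growth_residuals(rng)
        intercept = GAMMA_00 + u0 + GAMMA_01 * cls + r0            # Eq. 30
        # ASSUMPTION: the cluster effect shifts the slope of fast growers only.
        slope = GAMMA_10 + (GAMMA_11 + cluster_effect) * cls + r1  # Eq. 31
        rows.append([intercept + slope * t + rng.gauss(0, 1)       # Eq. 29
                     for t in TIME_SCORES])
    return rows


rng = random.Random(2011)
data = simulate_cluster(classes=[1] * 10 + [0] * 10, cluster_effect=0.5, rng=rng)
print(len(data), "individuals with", len(data[0]), "time points each")
```

With these values the expected trajectory of a slow grower starts at 1.0 and rises by 0.1 per time point, and that of a fast grower starts at 2.5 and rises by 0.6 per time point plus the cluster effect, which matches the profile means in Table 2 when the cluster effect is zero.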
3.5 Model Fitting 
 The preceding section describes how the data were generated for the model 
features that were studied in this project. For each of the 100 data sets generated under 
each of the 120 models, four models were fit in the way outlined by Chen et al. (2010): 
1) a mis-specified model (i.e., MLLGM, latent class unmodeled); 2-3) two incorrect mixture 
models (i.e., MLGMM with 1 and 3 latent classes); and 4) the correct mixture model (i.e., 
MLGMM with 2 latent classes). These models were used to evaluate the effect of an unmodeled 
latent class (i.e., heterogeneity at the student level) and latent class identification 
issues. MPlus 6.1 (Muthén & Muthén, 2010) was used to fit these four models to each of the 
12,000 samples (120 models × 100 data sets) generated as shown in Table 1. The growth 
profiles in Table 2 were used as the starting values for the corresponding parameters in 
the mixture models.  
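
The size of the design can be verified by enumerating the conditions. The sketch below assumes that the 120 models arise from fully crossing the three cluster numbers, two cluster sizes, four mixture proportion patterns (Table 3), and five cluster effect patterns (Table 4); Table 1 is the authoritative listing, and the model labels used here are illustrative only.

```python
from itertools import product

CLUSTER_NUMBERS = (30, 60, 90)
CLUSTER_SIZES = (20, 40)
MIXTURE_PATTERNS = (1, 2, 3, 4)             # Table 3
CLUSTER_EFFECT_PATTERNS = (1, 2, 3, 4, 5)   # Table 4
REPLICATIONS = 100
# Illustrative labels for the four fitted models described in Section 3.5.
FITTED_MODELS = ("MLLGM", "MLGMM 1-class", "MLGMM 2-class", "MLGMM 3-class")

# ASSUMPTION: the design is a full crossing of the four factors.
conditions = list(product(CLUSTER_NUMBERS, CLUSTER_SIZES,
                          MIXTURE_PATTERNS, CLUSTER_EFFECT_PATTERNS))
n_samples = len(conditions) * REPLICATIONS
n_model_fits = n_samples * len(FITTED_MODELS)

print(len(conditions), "conditions;", n_samples, "samples;", n_model_fits, "model fits")
```

Under this crossing there are 120 conditions and 12,000 samples, so fitting four models to every sample amounts to 48,000 model fits.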
3.6 Analysis of results of model fitting 
The identification of the presence of, and levels in, a latent class variable in 
mixture models should be based on more than one statistical index (Bauer & Curran, 
2004; Nylund et al., 2007; Palardy & Vermunt, 2010). This study utilized six indices, all 
of them information criteria, to identify the model whose number of latent class variables, 
and their levels, is most representative of the data (according to the information in the 
data represented by the model) (see Anderson, 2008). It is important for the model selection 
criteria to be robust since, as outlined in the foregoing study design elements, 
mis-specified models were fit to the data. The six information criteria that were used are:  
• Akaike information criterion (AIC; Akaike, 1987) 
• Modified Akaike information criterion (AIC3; Bozdogan, 1993) 
• Second-order bias-corrected AIC (AICc; McQuarrie & Tsai, 1998; after 
Akaike, 1987) 
• Bayesian information criterion (BIC; Schwarz, 1978) 
• Bayesian information criterion with the cluster number as the sample size 
adjustment factor (BICB; Palardy & Vermunt, 2010) 
• Sample-size adjusted BIC (SABIC; Sclove, 1987) 
All information criteria are defined as a function of the log-likelihood of the model; they 
differ in terms of the penalty each imposes depending on the number of parameters 
estimated and/or sample size. Lower values of any information criterion indicate that the 
model for which it was computed fits the data better than do models with higher criterion 
values. Equation 38 is AIC (Akaike, 1987), on which many of the modern information 
criteria are based (see Anderson, 2008); Equations 38-43 define each information 
criterion that was used in this study: 
\[ AIC = -2 \log LL + 2P \tag{38} \]
\[ AIC3 = -2 \log LL + 3P \tag{39} \]
\[ AICc = -2 \log LL + 2P \left[ \frac{N}{N - P - 1} \right] \tag{40} \]
\[ BIC = -2 \log LL + P \log(N) \tag{41} \]
\[ BICB = -2 \log LL + P \log(N_{cluster}) \tag{42} \]
\[ SABIC = -2 \log LL + P \log\left( \frac{N + 2}{24} \right) \tag{43} \]
where $P$ is the number of estimated parameters, $N$ is the sample size, and $N_{cluster}$ 
is the number of clusters. 
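
Equations 38-43 translate directly into code. The Python sketch below computes all six criteria from a model's log-likelihood and selects the model with the lowest value of a chosen criterion. The function names and the log-likelihood values in the usage example are hypothetical and serve only to show the calculation; they are not results from this study.

```python
import math


def information_criteria(log_lik, n_params, n, n_clusters):
    """Compute the six criteria of Equations 38-43 for one fitted model."""
    deviance = -2.0 * log_lik
    return {
        "AIC": deviance + 2 * n_params,
        "AIC3": deviance + 3 * n_params,
        "AICc": deviance + 2 * n_params * n / (n - n_params - 1),
        "BIC": deviance + n_params * math.log(n),
        "BICB": deviance + n_params * math.log(n_clusters),
        "SABIC": deviance + n_params * math.log((n + 2) / 24),
    }


def best_model(fits, criterion, n, n_clusters):
    """Return the label of the model with the lowest value of one criterion.

    fits maps a model label to a (log-likelihood, number of parameters) pair.
    """
    scores = {
        label: information_criteria(log_lik, n_params, n, n_clusters)[criterion]
        for label, (log_lik, n_params) in fits.items()
    }
    return min(scores, key=scores.get)


# Hypothetical log-likelihoods and parameter counts, for illustration only.
fits = {"1-class": (-2950.0, 8), "2-class": (-2900.0, 13), "3-class": (-2895.0, 18)}
for name in ("AIC", "AIC3", "AICc", "BIC", "BICB", "SABIC"):
    print(name, best_model(fits, name, n=600, n_clusters=30))
```

Because BICB uses the number of clusters rather than the number of individuals inside the logarithm, its penalty per parameter is smaller than that of BIC whenever clusters contain more than one individual.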
 Muthén and Asparouhov (2009) and Nylund et al. (2007) reported BIC to be one 
of the most effective information criteria to determine the correct number of latent classes 
with GMMs. By contrast, Palardy and Vermunt (2010) found BICB to be more effective 
than BIC and AIC3 to be more effective than AIC. Anderson (2008) recommends against 
using BIC for multimodel selection exercises (see also Burnham & Anderson, 2002) but 
its performance has been shown to be quite reliable and robust when used in simulations 
involving the types of models that were built and tested in this simulation, specifically 
because the correct model is known to be among those in the model space (see Anderson, 
2008). 
 Results of model fitting with MPlus 6.1 (Muthén & Muthén, 2010) were 
summarized as the percentage of the 100 fitted samples in which each index identified 
a given model (of the four fitted) as the best model. This summary is shown in Table 5 in 
Chapter 4, where pilot results supporting this research are described.  
3.7 Analysis 
Parameter estimates from the MPlus analyses were processed in SAS 9.2 (SAS 
Institute, 2009-2010). MPlus provides the estimates of individual growth parameters 
(level 1), cluster-level effects (i.e., intercepts and slopes; level 2), and fit information 
(overall model). Group-level effects for the MLGMM were derived from the fast-growth 
latent class. A SAS program then converted the group-level effects (i.e., $\gamma_{11}$) to 
quintile ranks, computed the bias using Equation 44 and the variance of the group-level 
effect, and constructed a 90% confidence interval (90% CI) over the replications for that 
model. 
\[ B(\hat{\theta}) = \theta_{est} - \theta_{true} \tag{44} \]
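
The bias summary over replications can be sketched as follows. This Python example is illustrative rather than a reproduction of the SAS program: the names are hypothetical, and the interval is computed as a nearest-rank percentile interval across replications, which is an assumption because the text does not specify how the interval was constructed.

```python
import math
import statistics


def bias(estimate, true_value):
    """Equation 44: estimated value minus true value."""
    return estimate - true_value


def summarize_bias(estimates, true_value, level=0.90):
    """Mean, variance, and a percentile interval of the bias over replications.

    ASSUMPTION: the interval is a nearest-rank percentile interval.
    """
    biases = sorted(bias(e, true_value) for e in estimates)
    n = len(biases)
    tail = (1.0 - level) / 2.0

    def percentile(p):
        rank = max(1, min(n, math.ceil(p * n)))
        return biases[rank - 1]

    return {
        "mean": statistics.fmean(biases),
        "variance": statistics.variance(biases),
        "lower": percentile(tail),
        "upper": percentile(1.0 - tail),
    }


# Hypothetical estimates of a cluster effect whose true value is 1.0.
print(summarize_bias([0.9, 1.0, 1.1, 1.2, 0.8], true_value=1.0))
```

Passing `level=0.95` gives the 2.5% and 97.5% bounds instead of the 5% and 95% bounds.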
3.7.1 Outcomes of Interest: Parameter Recovery 
This section describes the methods used to summarize bias, variance, and the 90% CI 
for the mean bias estimate over the replications for the true (i.e., two-class MLGMM) and 
mis-specified (i.e., MLLGM) models. The purpose of this analysis was to investigate 
systematic trends in biases, not to identify "significant effects" of simulation conditions 
(i.e., Cluster Type, cluster size, and model type), which were evaluated as described in 
Section 3.8. Pilot work, described in the next chapter, provides an example of how 
parameter recovery was summarized, shown in Tables 7 and 8. For the main study, 
visual representations of the information described by summaries like those in Tables 7 
and 8 were constructed in order to facilitate the interpretation of any overall trends in 
bias that emerged from the simulations and the model fitting. 
       
 
 
3.7.2 Outcome of Interest: Classification error at quintile level 
Sanders and Rivers (1996) evaluated the stability of estimated teacher effects at 
the quintile level. This study also utilizes the quintile-level evaluation in order 
to compare the estimation accuracy of the incorrectly specified model (i.e., 
MLLGM) and the correctly specified model (i.e., 2-class MLGMM). The classification rate 
at the quintile level, comparing true (i.e., quintile rank based on the simulation criteria) 
and estimated (i.e., rank estimated in the model that is identified) values, was summarized 
using weighted Kappa, which penalizes disagreements more heavily the further the 
classification falls from the diagonal (perfect agreement). This is shown using the pilot 
data in Chapter 4, Table 9. 
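
The quintile-level comparison can be sketched as follows. This Python example ranks true and estimated cluster effects into quintiles and computes Cohen's weighted kappa. Two choices here are assumptions, since the text does not specify them: linear disagreement weights are used, and tied values are assigned to quintiles by their order in the data. The function names are hypothetical.

```python
def quintile_ranks(values):
    """Assign each value a quintile rank from 1 (lowest) to 5 (highest).

    ASSUMPTION: ties are broken by the original order of the values.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = position * 5 // len(values) + 1
    return ranks


def weighted_kappa(true_ranks, est_ranks, n_categories=5):
    """Cohen's weighted kappa with linear disagreement weights |i - j|."""
    n = len(true_ranks)
    k = n_categories
    observed = [[0.0] * k for _ in range(k)]
    for t, e in zip(true_ranks, est_ranks):
        observed[t - 1][e - 1] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    # Weighted observed and chance-expected disagreement.
    obs_dis = sum(abs(i - j) * observed[i][j] for i in range(k) for j in range(k))
    exp_dis = sum(abs(i - j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1.0 - obs_dis / exp_dis


# Hypothetical true and estimated cluster effects for ten clusters.
true_effects = [-1.0, -0.9, -0.5, -0.4, 0.0, 0.1, 0.5, 0.6, 1.0, 1.1]
est_effects = [-0.7, -0.8, -0.1, -0.5, 0.1, 0.0, 0.3, 0.9, 0.8, 1.3]
print(weighted_kappa(quintile_ranks(true_effects), quintile_ranks(est_effects)))
```

Perfect agreement gives a kappa of 1, and a misclassification by one quintile is penalized half as much as a misclassification by two quintiles under the linear weights assumed here.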
3.8  Evaluation of simulation: Achievement of stated design aims 
Analyses of variance (ANOVAs) were conducted in the pilot study described in 
Chapter 4 in order to examine the four design factors that had also been studied by Chen 
et al. (2010), namely cluster number, cluster size, mixture proportion, and cluster effect, 
together with their interactions. The present study investigated the significance of various 
factors that may influence the biases in parameter estimates among simulation conditions 
and their interactions, and it also explored the functionality of BIC, relative to other 
information criteria, in the MLGMM context, thereby integrating and refining results from 
Muthén and Asparouhov (2009) and Palardy and Vermunt (2010). Including the ANOVA 
established whether and how the results from this study are comparable to those of Chen 
et al. (2010). The present study was therefore contextualized with prior work and poised 
to build on it. 
       
 
3.9 Summary of Methods 
This chapter described the design features of this simulation study, which investigates the 
effect of unmodeled heterogeneity at the individual level on the precision of estimation at 
higher levels in an MLGMM framework representing a generic VAM-type analysis. The 
focus of this study was not to examine the effect of design factors at a global level (e.g., 
significance with ANOVA) but rather to identify the potential patterns of estimation 
bias and imprecision at the level of the group parameter estimates (e.g., level-2 parameter 
estimation) that arise when mis-specified mixture modeling is used. One of the purposes 
of this study was to determine whether the precision of estimates from MLGMM warrants 
further investigation in real data, particularly in the context of teacher evaluation with 
VAM. The pilot study illustrating these methods and testing the fidelity of the code to the 
design features is described in Chapter 4. 
 
  
       
 
Chapter 4: Pilot Study: testing simulation features and analysis plan 
A pilot study was performed to test the accuracy of programs designed for the 
simulation study and to identify potential flaws in the simulation design including the 
presence and extent of any model convergence problems.  
4.1 Testing simulation features 
There were three main steps in this pilot study: 1) data generation, 2) estimation, 
and 3) analysis. SAS macro programs were written for each step: one to generate the data; 
one to run MPlus for the estimation of models; one to process the results of the MPlus 
model estimation; and one to generate the analysis. An additional program was written to 
check the fidelity of the simulation code by comparing the simulated data against the 
simulation specification for all conditions. An error identification program was also 
written to examine all MPlus output to identify convergence issues. This program created 
a list of simulation conditions resulting in convergence issues, and also automatically 
reran analyses whenever a convergence issue was encountered. All programs were 
written to automate these processes and were controlled by an Excel specification file 
listing the simulation conditions to be run and analyzed. The Excel file roughly 
approximated Table 1. All simulation conditions were tested with at least five replications 
until no modifications were indicated; a small pilot with 40 replications was then run.  
The purpose of these tests, run with just five replications, was to verify that all 
code (including data generation, estimation, error identification, and analysis) was 
working properly. These tests led to code modifications. When no further modifications 
were deemed necessary, the code was run on 40 replications, and the pilot results 
described below are based on these 40 replications of each model. 
       
 
The following sections summarize the preliminary results for mixture proportion 
condition 1 (MP = 1), including all conditions of the three other factors: three cluster 
numbers, two cluster sizes, and five cluster effects. The results were summarized over the 
40 replications. This pilot study represented 1/5 of the entire study and took roughly 30 
hours. 
4.2 Preliminary results on the model identification and class identification 
 Convergence issues were anticipated for the simulation conditions with the 
smaller cluster size (i.e., 20), with mixture proportion condition 3 (MP = 3), and with the 
more variable random cluster effect condition (i.e., $\gamma_{11} \sim N(0, 1.0)$). Table 5 
summarizes the model identification rate for the six information criteria described in 
Chapter 3. The preliminary results clearly show that BIC does not perform well for most 
conditions, which is surprising given its applicability to simulation studies (i.e., it 
works only when the true model is known to be among those in question) and its excellent 
performance in other work (e.g., Muthén & Asparouhov, 2009). Table 6 shows the class 
identification rate (i.e., fast or slow growth) for the mixture proportion condition MP = 1. 
Models with four and five latent classes were also fit for the model identification study. 
However, a large proportion of these models had estimation problems, including negative 
variances in the parameter estimates, zero cases in one or more latent classes, or failure 
to converge. Only 15% of the four-class models (i.e., 6 out of 40 replications) and none of 
the five-class models converged without such problems; therefore, only two- and three-class 
models were used in the main study described in Chapter 5. 
 
  
       
 
Table 5 
 
Pilot Study Results: Percent of Correct Model Identification by Six Information Criteria 
for Specified Simulation Conditions: MP1 
 
Simulation Condition                         Information Criteria 
Mixture     Cluster  Cluster  Cluster 
Proportion  Effect   Number   Size      AIC    AIC3   AICc   BIC    BICB   SABIC 
1           1        30       20        32.5   22.5   27.5   0      12.5   17.5 
                              40        45     27.5   47.5   0      22.5   15 
                     60       20        62.5   60     65     0      22.5   22.5 
                              40        60     65     62.5   0      52.5   35 
                     90       20        62.5   77.5   62.5   0      62.5   67.5 
                              40        67.5   87.5   67.5   17.5   87.5   72.5 
            2        30       20        25     10     22.5   0      5      10 
                              40        60     17.5   60     0      15     10 
                     60       20        50     37.5   55     0      25     27.5 
                              40        60     77.5   60     2.5    47.5   25 
                     90       20        62.5   75     67.5   2.5    37.5   47.5 
                              40        70     87.5   70     7.5    77.5   52.5 
            3        30       20        17.5   7.5    17.5   0      0      5 
                              40        57.5   45     55     0      37.5   22.5 
                     60       20        52.5   47.5   52.5   0      15     17.5 
                              40        67.5   80     70     2.5    60     32.5 
                     90       20        50     75     50     0      35     42.5 
                              40        67.5   87.5   67.5   7.5    70     57.5 
            4        30       20        30     12.5   25     0      7.5    12.5 
                              40        37.5   25     35     0      12.5   7.5 
                     60       20        52.5   50     55     0      15     25 
                              40        72.5   87.5   77.5   10     50     37.5 
                     90       20        72.5   75     72.5   5      42.5   47.5 
                              40        72.5   92.5   75     2.5    82.5   72.5 
            5        30       20        52.5   27.5   45     0      17.5   20 
                              40        62.5   55     62.5   0      47.5   37.5 
                     60       20        45     50     50     2.5    50     50 
                              40        57.5   90     60     32.5   95     85 
                     90       20        60     92.5   62.5   7.5    75     77.5 
                              40        70     87.5   70     47.5   90     87.5 
 
 
 
 
       
 
 
 
Table 6 
 
Pilot Study Results: Recovery Rate of Latent Class for Specified Simulation Conditions: MP1 
 
Simulation Condition 
Mixture     Cluster  Cluster  Cluster 
Proportion  Effect   Number   Size      Class Identification Rate (%) 
1           1        30       20        81.04 
                              40        81.27 
                     60       20        81 
                              40        82.6 
                     90       20        81.42 
                              40        83.44 
            2        30       20        79.53 
                              40        80.83 
                     60       20        82.96 
                              40        80.86 
                     90       20        81.5 
                              40        82.01 
            3        30       20        78.8 
                              40        82.79 
                     60       20        82.8 
                              40        82.11 
                     90       20        82.44 
                              40        83.22 
            4        30       20        79.08 
                              40        77.03 
                     60       20        79.95 
                              40        82.03 
                     90       20        80.97 
                              40        82.53 
            5        30       20        76.83 
                              40        82.93 
                     60       20        82.59 
                              40        82.16 
                     90       20        83.57 
                              40        83.74 
 
       
 
 The average latent class recovery rate was around 80%, and a larger cluster size 
seemed to moderately increase the recovery rate. There were no convergence issues for the 
MLLGM and the two-class MLGMM, but there were a few convergence issues for the 
mis-specified three-class MLGMM when the cluster size was small (i.e., CS = 20). It was 
found that MLGMM improved the estimates of level-2 parameters even without strong 
identification of individual latent classes. Therefore, the latent class recovery study was 
not included in the final analysis. 
4.3 Preliminary results on precision of estimates  
 Tables 7 and 8 show the mean bias, the standard deviation of bias, and the 
corresponding 95% CIs for each true cluster effect (i.e., -1, -0.5, 0, 0.5, and 1) for all 
cluster sizes and numbers in the MP = 1 (Table 3) and CE = 1 (Table 4) conditions. The biases 
were smaller, in some cases considerably smaller, for MLGMM than for MLLGM; the variability 
of bias was slightly larger for MLGMM, which could be due to the mis-identification of 
individuals (i.e., fast and slow growth). In addition, the estimation parameters coded into 
the MPlus programs used here were still preliminary, which might have contributed to the 
inaccuracy in estimates.  
Overall, these pilot results suggest that MLGMM could be a useful method to reduce 
the bias in the estimates of VAM effects in this simulation condition.  
 
 
 
 
 
 
 
 
 
 
       
 
Table 7 
 
Bias, Error, CI of group estimates: Mixture Proportion 1 (MP1) and Cluster Effect 1 
(CE1) for the true model 
Columns, left to right: Cluster Number; Cluster Size; Cluster Effect; then, for each of 
Cluster Types 1, 2, and 3 in turn, Bias (Mean, 2.5%, 97.5%) followed by Error (Mean, 2.5%, 
97.5%). 
30 20 -1 -0.34 -0.69 -0.06 0.2 0.01 0.29 -0.22 -0.58 0.05 0.18 0.01 0.25 -0.12 -0.43 0.17 0.18 0.01 0.26
   -0.5 -0.13 -0.39 0.17 0.19 0 0.25 -0.12 -0.48 0.17 0.19 0 0.32 -0.1 -0.36 0.13 0.17 0 0.3
   0 0.08 -0.08 0.25 0.38 0.22 0.49 0.02 -0.21 0.29 0.36 0.22 0.44 -0.03 -0.23 0.18 0.15 0.01 0.2
   0.5 0.24 -0.04 0.55 0.22 0 0.39 0.1 -0.23 0.37 0.19 0 0.34 0 -0.25 0.24 0.16 0 0.33
   1 0.42 0.05 0.96 0.23 0.01 0.25 0.24 -0.05 0.72 0.2 0.01 0.25 0.03 -0.19 0.28 0.17 0 0.25
  40 -1 -0.28 -0.52 0.02 0.17 0 0.23 -0.19 -0.49 0.05 0.15 0 0.23 -0.08 -0.24 0.12 0.13 0 0.2
   -0.5 -0.1 -0.25 0.13 0.14 0.01 0.21 -0.09 -0.28 0.12 0.13 0 0.15 -0.07 -0.26 0.13 0.13 0.01 0.23
   0 0.09 -0.04 0.2 0.39 0.27 0.49 0.02 -0.23 0.2 0.37 0.2 0.45 -0.04 -0.2 0.11 0.12 0.01 0.22
   0.5 0.24 -0.12 0.46 0.17 0 0.26 0.07 -0.09 0.35 0.14 0 0.23 -0.01 -0.29 0.21 0.15 0 0.29
   1 0.39 0.03 0.81 0.19 0 0.23 0.11 -0.07 0.4 0.15 0.01 0.26 0 -0.17 0.17 0.11 0 0.19
 60 20 -1 -0.31 -0.63 -0.11 0.2 0.03 0.25 -0.22 -0.67 -0.01 0.21 0.07 0.3 -0.15 -0.38 0 0.17 0.03 0.26
   -0.5 -0.14 -0.31 0.06 0.19 0.04 0.27 -0.1 -0.29 0.06 0.17 0.04 0.27 -0.12 -0.23 0.06 0.16 0.04 0.22
   0 0.09 -0.03 0.19 0.38 0.2 0.45 0 -0.13 0.18 0.36 0.24 0.43 -0.04 -0.21 0.14 0.17 0.06 0.26
   0.5 0.21 0.03 0.53 0.21 0.06 0.31 0.08 -0.07 0.28 0.18 0.05 0.27 0.02 -0.14 0.25 0.18 0.05 0.25
    1 0.44 0.16 0.91 0.24 0.05 0.32 0.19 -0.05 0.58 0.2 0.06 0.29 0.06 -0.09 0.28 0.17 0.03 0.25
  40 -1 -0.29 -0.63 -0.14 0.15 0.02 0.2 -0.18 -0.52 -0.01 0.15 0.02 0.18 -0.11 -0.28 0.05 0.13 0.01 0.21
   -0.5 -0.09 -0.32 0.05 0.15 0.03 0.22 -0.09 -0.25 0.03 0.14 0.03 0.21 -0.07 -0.29 0.13 0.14 0.04 0.19
   0 0.1 -0.01 0.18 0.39 0.2 0.46 0.02 -0.08 0.1 0.37 0.27 0.43 -0.04 -0.2 0.11 0.12 0.02 0.22
   0.5 0.24 0.09 0.56 0.15 0.04 0.2 0.08 -0.03 0.33 0.14 0.02 0.23 -0.02 -0.15 0.1 0.12 0.03 0.17
     1 0.38 0.13 0.88 0.18 0.03 0.25 0.11 0 0.43 0.14 0.01 0.2 -0.01 -0.13 0.13 0.14 0.03 0.24
 90 20 -1 -0.32 -0.62 -0.14 0.2 0.1 0.24 -0.2 -0.51 -0.04 0.2 0.06 0.27 -0.13 -0.32 0.02 0.17 0.06 0.23
   -0.5 -0.12 -0.31 0.09 0.21 0.07 0.29 -0.1 -0.27 0.04 0.19 0.07 0.27 -0.11 -0.27 0.06 0.18 0.09 0.27
   0 0.09 0 0.2 0.39 0.23 0.47 0.02 -0.12 0.16 0.37 0.27 0.44 -0.03 -0.13 0.12 0.17 0.08 0.24
   0.5 0.22 0.01 0.43 0.21 0.08 0.34 0.09 -0.05 0.37 0.2 0.09 0.27 0.02 -0.08 0.13 0.16 0.07 0.27
    1 0.41 0.21 0.85 0.24 0.09 0.34 0.19 0.04 0.46 0.2 0.06 0.3 0.04 -0.15 0.15 0.17 0.03 0.25
  40 -1 -0.28 -0.4 -0.09 0.15 0.04 0.22 -0.15 -0.29 -0.03 0.14 0.05 0.19 -0.09 -0.2 0 0.13 0.05 0.19
   -0.5 -0.1 -0.2 0.03 0.13 0.05 0.18 -0.08 -0.21 0.07 0.13 0.05 0.19 -0.07 -0.17 0 0.12 0.04 0.19
   0 0.1 0.03 0.15 0.41 0.34 0.45 0.03 -0.03 0.1 0.38 0.27 0.43 -0.06 -0.18 0.05 0.12 0.05 0.18
   0.5 0.22 0.05 0.36 0.16 0.07 0.21 0.06 -0.07 0.17 0.13 0.05 0.2 -0.02 -0.13 0.08 0.13 0.06 0.19
     1 0.38 0.19 0.57 0.18 0.05 0.24 0.13 0.01 0.22 0.14 0.06 0.16 -0.01 -0.12 0.1 0.12 0.06 0.17
  
       
 
Table 8 
 
Bias, Error, CI of Group Estimates: Mixture Proportion 1 (MP1) and Cluster Effect 1 
(CE1) for Misspecified Model 
Columns, left to right: Cluster Number; Cluster Size; Cluster Effect; then, for each of 
Cluster Types 1, 2, and 3 in turn, Bias (Mean, 2.5%, 97.5%) followed by Error (Mean, 2.5%, 
97.5%). 
30 20 -1 -0.46 -0.78 -0.23 0.16 0 0.29 -0.32 -0.56 0.02 0.14 0 0.25 -0.15 -0.42 0.19 0.17 0 0.24
   -0.5 -0.19 -0.42 0.01 0.15 0.01 0.24 -0.17 -0.54 0.07 0.18 0.01 0.31 -0.13 -0.41 0.08 0.16 0 0.24
   0 0.08 -0.05 0.3 0.32 0.22 0.41 0.01 -0.14 0.22 0.35 0.21 0.51 -0.05 -0.26 0.18 0.13 0 0.21
   0.5 0.32 0.06 0.56 0.16 0 0.35 0.15 -0.12 0.39 0.16 0.01 0.23 -0.02 -0.24 0.19 0.15 0.01 0.27
   1 0.58 0.26 0.84 0.16 0 0.19 0.33 0.05 0.54 0.16 0.01 0.23 0.02 -0.32 0.26 0.17 0 0.25
  40 -1 -0.41 -0.57 -0.16 0.13 0.01 0.16 -0.31 -0.48 -0.17 0.11 0 0.21 -0.12 -0.3 0.04 0.11 0.01 0.2
   -0.5 -0.16 -0.4 0.04 0.14 0.01 0.2 -0.15 -0.29 0.05 0.12 0 0.17 -0.1 -0.26 0.1 0.12 0 0.24
   0 0.1 -0.02 0.21 0.31 0.23 0.38 0.01 -0.09 0.14 0.33 0.17 0.45 -0.06 -0.29 0.12 0.12 0 0.18
   0.5 0.33 0.14 0.55 0.13 0.01 0.21 0.14 -0.07 0.34 0.12 0.01 0.2 -0.04 -0.25 0.15 0.14 0 0.26
   1 0.6 0.31 0.84 0.14 0 0.2 0.25 0.11 0.4 0.11 0 0.21 -0.03 -0.17 0.15 0.11 0 0.24
 60 20 -1 -0.45 -0.78 -0.27 0.16 0.04 0.22 -0.32 -0.68 -0.16 0.17 0.04 0.26 -0.18 -0.46 0.01 0.16 0.03 0.24
   -0.5 -0.2 -0.49 -0.05 0.17 0.03 0.28 -0.15 -0.36 0.04 0.15 0.03 0.25 -0.13 -0.4 0.05 0.16 0.03 0.24
   0 0.1 -0.07 0.19 0.32 0.22 0.4 0.01 -0.2 0.14 0.34 0.2 0.43 -0.06 -0.33 0.07 0.17 0.04 0.27
   0.5 0.31 0.06 0.48 0.16 0.04 0.23 0.16 -0.17 0.32 0.17 0.07 0.27 0 -0.26 0.19 0.16 0.04 0.24
    1 0.63 0.34 0.81 0.16 0.03 0.24 0.33 0.07 0.52 0.15 0.04 0.25 0.06 -0.22 0.2 0.17 0.04 0.25
  40 -1 -0.43 -0.58 -0.29 0.12 0.03 0.2 -0.29 -0.42 -0.1 0.12 0.02 0.17 -0.16 -0.29 0.06 0.11 0.02 0.19
   -0.5 -0.16 -0.31 0.02 0.13 0.02 0.2 -0.15 -0.26 0.02 0.12 0.02 0.2 -0.12 -0.25 0.12 0.13 0.03 0.19
   0 0.1 0.04 0.24 0.32 0.22 0.36 0.01 -0.06 0.11 0.33 0.21 0.49 -0.09 -0.2 0.09 0.11 0.02 0.16
   0.5 0.33 0.2 0.47 0.11 0.02 0.17 0.14 -0.02 0.4 0.13 0.02 0.23 -0.06 -0.16 0.15 0.11 0.03 0.15
     1 0.57 0.41 0.73 0.12 0.03 0.17 0.25 0.13 0.4 0.1 0.02 0.15 -0.04 -0.14 0.19 0.13 0.04 0.22
 90 20 -1 -0.47 -0.58 -0.08 0.17 0.06 0.24 -0.32 -0.45 -0.01 0.17 0.04 0.24 -0.17 -0.3 0.14 0.15 0.07 0.2
   -0.5 -0.2 -0.37 0.24 0.17 0.08 0.22 -0.15 -0.29 0.07 0.15 0.07 0.25 -0.14 -0.27 0.25 0.17 0.05 0.23
   0 0.1 0 0.32 0.33 0.26 0.42 0.02 -0.09 0.2 0.35 0.23 0.47 -0.05 -0.16 0.14 0.15 0.04 0.23
   0.5 0.33 0.24 0.73 0.16 0.1 0.26 0.16 0.02 0.52 0.17 0.03 0.27 0 -0.11 0.34 0.16 0.06 0.44
    1 0.61 0.45 0.76 0.15 0.04 0.24 0.32 0.18 0.89 0.16 0.07 0.24 0.02 -0.14 0.41 0.16 0.04 0.22
  40 -1 -0.42 -0.55 -0.21 0.12 0.06 0.16 -0.27 -0.38 0.01 0.13 0.05 0.19 -0.14 -0.23 0.16 0.12 0.05 0.16
   -0.5 -0.17 -0.28 0.08 0.12 0.04 0.16 -0.14 -0.25 0.06 0.12 0.04 0.16 -0.12 -0.2 0.17 0.11 0.04 0.16
   0 0.11 0.06 0.22 0.32 0.25 0.37 0.03 -0.03 0.14 0.34 0.22 0.53 -0.1 -0.2 0.13 0.12 0.05 0.17
   0.5 0.33 0.24 0.53 0.12 0.05 0.16 0.13 0.03 0.42 0.11 0.04 0.2 -0.06 -0.17 0.21 0.12 0.05 0.18
     1 0.59 0.5 0.82 0.12 0.06 0.19 0.28 0.17 0.67 0.12 0.04 0.15 -0.04 -0.15 0.22 0.12 0.05 0.16
  
4.4 Preliminary results on classification accuracy 
 Table 9 shows the classification accuracy of cluster effects (i.e., value-added
effects) ranked by quintile level. For the true model (i.e., two-class MLGMM), weighted
kappa ranged from .61 to .83 with an average of .73. For the MLLGM, weighted kappa
ranged from .54 to .71 with an average of .64. The MLGMM therefore had higher
classification accuracy than the MLLGM in this simulation.
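As an illustration of the agreement statistic reported here, the following minimal sketch ranks cluster effects into quintiles and computes a weighted kappa between the true and estimated ranks. The linear weighting scheme, the function names, and the toy values are assumptions made for illustration; this section does not state which weights were used, and the values are not taken from the study.

```python
def quintile_ranks(values):
    """Assign each value a quintile from 0 (lowest fifth) to 4 (highest fifth)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for position, index in enumerate(order):
        ranks[index] = (position * 5) // n
    return ranks


def weighted_kappa(a, b, n_cat=5):
    """Linearly weighted Cohen's kappa for two ratings coded 0..n_cat-1."""
    n = len(a)
    observed = [[0.0] * n_cat for _ in range(n_cat)]
    for i, j in zip(a, b):
        observed[i][j] += 1.0 / n
    row = [sum(observed[i]) for i in range(n_cat)]
    col = [sum(observed[i][j] for i in range(n_cat)) for j in range(n_cat)]
    num = den = 0.0
    for i in range(n_cat):
        for j in range(n_cat):
            w = abs(i - j) / (n_cat - 1)  # linear disagreement weight (assumed)
            num += w * observed[i][j]     # observed weighted disagreement
            den += w * row[i] * col[j]    # disagreement expected by chance
    return 1.0 - num / den


# Toy values for illustration only (not data from the study)
true_effects = [-1.0, -0.5, 0.0, 0.5, 1.0, -0.9, -0.4, 0.1, 0.6, 1.1]
estimates = [-0.4, -0.2, 0.1, 0.3, 0.6, -0.5, -0.1, 0.0, 0.4, 0.7]
kappa = weighted_kappa(quintile_ranks(true_effects), quintile_ranks(estimates))
```

In this toy case the estimates are shrunken toward zero but preserve the ordering of the true effects, so the quintile ranks agree even though the estimates themselves are biased.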
  
       
Table 9
Rate of Misclassification at the Quintile Level

Simulation Condition                   Kappa
Mixture     Cluster  Cluster  Cluster
Proportion  Effect   Number   Size     MLGMM  MLLGM
1           1        30       20       0.76   0.67
1           1        30       40       0.82   0.71
1           1        60       20       0.77   0.67
1           1        60       40       0.82   0.71
1           1        90       20       0.77   0.67
1           1        90       40       0.83   0.70
1           2        30       20       0.71   0.66
1           2        30       40       0.77   0.68
1           2        60       20       0.73   0.66
1           2        60       40       0.76   0.68
1           2        90       20       0.73   0.66
1           2        90       40       0.76   0.68
1           3        30       20       0.67   0.61
1           3        30       40       0.75   0.65
1           3        60       20       0.70   0.62
1           3        60       40       0.75   0.65
1           3        90       20       0.70   0.61
1           3        90       40       0.75   0.65
1           4        30       20       0.61   0.54
1           4        30       40       0.65   0.57
1           4        60       20       0.63   0.55
1           4        60       40       0.69   0.59
1           4        90       20       0.62   0.54
1           4        90       40       0.69   0.58
1           5        30       20       0.67   0.60
1           5        30       40       0.73   0.63
1           5        60       20       0.73   0.64
1           5        60       40       0.75   0.66
1           5        90       20       0.74   0.64
1           5        90       40       0.75   0.65
 
4.5 Preliminary results on ANOVA over the simulation condition 
 There were not enough simulation conditions to conduct the full ANOVA as was
done for the main study. However, the preliminary results indicated that the cluster effect,
cluster size, and all two-way interaction effects had significant effects (p<0.01) on
the mean bias estimates. The cluster number did not appear to have a significant effect.
The ANOVA effects were used to generate plots that facilitate the interpretation of
biases in estimates between the true and mis-specified models.
4.6 Determination of number of replications for the study 
The number of replications to be used in the main study was determined by
examining the precision of the estimates described in section 4.3 over different numbers of
replications: 20, 40, 80, 100, 200, and 400. Table 10 summarizes the 90%
confidence intervals and the standard deviations of the bias estimates from cluster effect
condition 1 and mixture proportion condition 1. The variation of the standard deviations and
the range of the confidence intervals stabilized around 100 replications, indicating that 100
replications would be sufficient for the main study.
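The comparison across replication counts can be sketched as follows. This is a minimal illustration, assuming that the interval bounds are taken as the 5th and 95th percentiles of the bias values under a nearest-rank rule; the synthetic draws are placeholders and are not the pilot data.

```python
import random
import statistics


def percentile(sorted_values, p):
    """Nearest-rank percentile (an assumed rule; other definitions exist)."""
    k = round(p * (len(sorted_values) - 1))
    k = max(0, min(len(sorted_values) - 1, k))
    return sorted_values[k]


def summarize(bias_values):
    """Return (5th percentile, 95th percentile, SD) of a list of bias values."""
    ordered = sorted(bias_values)
    return (percentile(ordered, 0.05),
            percentile(ordered, 0.95),
            statistics.stdev(ordered))


# Synthetic placeholder values only; these are NOT results from the study.
random.seed(1)
synthetic_bias = [random.gauss(0.0, 1.0) for _ in range(400)]

# Summaries over the first n replications, for the replication counts examined
summaries = {n: summarize(synthetic_bias[:n]) for n in (20, 40, 80, 100, 200, 400)}
```

Stability would then be judged by how little the three summaries change as n increases.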
  
       
Table 10

Precision of Bias Estimates Over a Different Number of Replications

                              Number of Replications
Cluster Type  Bias Estimates       20     40     80    100    200    400
1             5%                -0.36  -0.43  -0.37  -0.44  -0.43  -0.43
1             90%                4.01    2.2   2.53   2.17   2.33   2.35
1             SD                 1.25   0.81   1.23   1.04   1.11   1.15
2             5%                -0.89  -0.76  -0.65  -0.65  -0.65  -0.65
2             90%                3.57   2.13    2.2   2.28   2.15   2.25
2             SD                 1.31   0.85   1.29    1.1   1.14   1.17
3             5%                -1.47  -1.93  -1.64  -1.79  -1.66   -1.7
3             90%                3.28   0.82   1.26   1.06   1.08   1.16
3             SD                  1.5    0.8   1.36   1.15   1.13   1.19
4.7 Other Analysis Issues 
 Convergence and analysis issues were identified by examining the Mplus output.
The first step was to check whether the output files included models for which estimates had
not been generated, which indicates non-convergence of that model. The second step was
to read in the estimates to identify 1) negative variances and 2) latent classes with zero
cases, both of which represent non-informative convergence of that model. New data were
generated to replace data that had resulted in one or more of these indicators of
convergence problems, and this process was continued until 100 successful replications
for each simulation condition were completed.
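The screening-and-replacement procedure described above can be sketched as a simple loop. This is a hedged illustration only: the dictionary layout, the function names, and the stand-in `fake_generate_and_fit` are hypothetical, and the actual study parsed Mplus output files rather than Python objects.

```python
import random


def has_convergence_problem(result):
    """Flag the indicators described above for one fitted replication."""
    if result is None:                               # no estimates were produced
        return True
    if any(v < 0 for v in result["variances"]):      # negative variance estimate
        return True
    if any(n == 0 for n in result["class_counts"]):  # latent class with zero cases
        return True
    return False


def collect_replications(generate_and_fit, target=100, max_attempts=10000):
    """Keep drawing new datasets until `target` usable replications exist."""
    kept, attempts = [], 0
    while len(kept) < target and attempts < max_attempts:
        attempts += 1
        result = generate_and_fit()
        if not has_convergence_problem(result):
            kept.append(result)
    return kept, attempts


_rng = random.Random(0)


def fake_generate_and_fit():
    """Hypothetical stand-in for generating data and reading Mplus estimates."""
    roll = _rng.random()
    if roll < 0.02:
        return None
    if roll < 0.04:
        return {"variances": [-0.1, 0.3], "class_counts": [50, 50]}
    return {"variances": [0.2, 0.3], "class_counts": [60, 40]}


kept, attempts = collect_replications(fake_generate_and_fit)
```

The `max_attempts` guard is an added safety measure in this sketch so that the loop cannot run indefinitely if a condition rarely converges.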
4.8 Pilot study summary 
 This small pilot study demonstrated that the automated procedures for generating,
manipulating, and analyzing data according to the simulation characteristics outlined in
Chapter 3 (Table 1) worked, and that the fidelity of the simulated data to the simulation design
was high. Convergence issues were encountered only in the contexts in which they were
anticipated (i.e., fitting the over-specified 3-class MLGMM with the small cluster size
of 20), and in no other context. It is important to note that there was no convergence issue
for the model of interest, the two-class MLGMM, in which the parameters of interest were
estimated. Therefore, the impact of the minor convergence issues with the 3-class MLGMM
was deemed to be minimal. Each of the summary features described in Chapter 3
functioned in these pilot data to yield interpretable outcomes, and no issues were
encountered that were not both expected and easily addressed. In summary, the pilot
results supported the likelihood that the main study would be completed as planned and
that the impacts on effects and estimates would be representative of the simulation design
outlined in Chapter 3.
       
Chapter 5: Main Study Results 
 The main study commenced once the pilot study concluded. As outlined in the
previous chapter, the code was found to provide high fidelity to the simulation objectives;
the specific changes to the design were to generate 100 replications of each model and to
utilize four (reduced from five) mixture proportion conditions. Therefore, there were 120
simulation conditions for the main study, with 100 trials (or samples) per condition,
yielding 12,000 datasets (i.e., 120 simulation conditions × 100 replications). These datasets were
generated and analyzed, as outlined in Chapter 3, with three MLGMMs, that is, with 1,
2, or 3 latent classes (n.b., the 1-latent class MLGMM is equivalent to the MLLGM). The
bias estimates (i.e., the mean of bias over 100 replications) were computed using
Equation 44, and 90% confidence intervals were constructed using the variance of these
100 cluster level estimates as the measure of error, for the mis-specified model (i.e.,
MLLGM or 1-class MLGMM) and for the true model (i.e., 2-class MLGMM).
ANOVA was performed to compare bias in cluster level effect estimates between the
mis-specified model and the true model across the simulation conditions of interest. This
study took approximately 580 hours of continuous computation time and, with the
exceptions noted above, was carried out exactly as described in Chapters 3 and 4, using
the code that had been written for the pilot study in Chapter 4.
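The size of the main-study design can be checked with a short enumeration. The factor levels below are those that appear in the tables of this chapter (five cluster effects, four mixture proportions, three cluster numbers, two cluster sizes); the variable names are illustrative.

```python
from itertools import product

cluster_effects = [1, 2, 3, 4, 5]
mixture_proportions = [1, 2, 3, 4]
cluster_numbers = [30, 60, 90]
cluster_sizes = [20, 40]
replications = 100
models_fitted = 3  # 1-, 2-, and 3-class MLGMM

# Every combination of the four design factors is one simulation condition
conditions = list(product(cluster_effects, mixture_proportions,
                          cluster_numbers, cluster_sizes))
n_datasets = len(conditions) * replications
n_model_fits = n_datasets * models_fitted
```

Here `len(conditions)` corresponds to the 120 simulation conditions, `n_datasets` to the 12,000 generated datasets, and `n_model_fits` to the total number of estimations across the three models.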
Failure of models to converge can be an issue in mixture models (i.e., the 2- or 3-class
MLGMMs in this simulation), particularly when more latent classes are extracted
than the true model has, as indicated in the pilot study. However, in the main study, no
convergence issues occurred for MLGMMs with 1 or 2 latent classes. Table 11
summarizes the convergence problems that were encountered over the three sets of
12,000 replications (i.e., 36,000 total).
 There were 32 convergence issues out of a total of 12,000 estimations with the
MLGMM with 3 latent classes, and these occurred only for cluster effect conditions 4
and 5; the majority (22 out of 32) of the errors occurred when the cluster size was 20. These
convergence issues were addressed as described in Chapter 4, and so the results that
follow describe model fits and parameter estimates from the 3-class MLGMM fit to all
12,000 replications.
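The totals quoted above can be reproduced by tallying the rows of Table 11. In the sketch below, the rows are transcribed from the table under the assumption that a blank cell repeats the value from the row above; the variable names are illustrative.

```python
# (mixture proportion, cluster effect, cluster number, cluster size, errors),
# transcribed from Table 11; blank cells are read as repeating the value above.
table_11 = [
    (1, 4, 30, 20, 2), (1, 4, 30, 40, 1), (1, 4, 60, 20, 4),
    (2, 4, 60, 20, 1), (2, 4, 60, 40, 1), (2, 4, 90, 40, 1),
    (2, 5, 30, 40, 1), (2, 5, 60, 20, 1),
    (3, 4, 30, 40, 1), (3, 4, 60, 40, 1),
    (3, 5, 30, 20, 2), (3, 5, 30, 40, 1), (3, 5, 60, 20, 2), (3, 5, 90, 20, 2),
    (4, 4, 30, 20, 1), (4, 4, 30, 40, 1), (4, 4, 60, 20, 1), (4, 4, 60, 40, 2),
    (4, 5, 30, 20, 5), (4, 5, 60, 20, 1),
]

total_errors = sum(row[4] for row in table_11)
errors_small_clusters = sum(row[4] for row in table_11 if row[3] == 20)
effects_with_errors = sorted({row[1] for row in table_11})
```

Under this reading, the tallies agree with the 32 total issues, the 22 issues at cluster size 20, and the restriction to cluster effect conditions 4 and 5 reported in the text.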
 
  
       
Table 11

Convergence Rate by Simulation Conditions

Mixture     Cluster  Cluster  Cluster
Proportion  Effect   Number   Size     Error
1           4        30       20       2
1           4        30       40       1
1           4        60       20       4
2           4        60       20       1
2           4        60       40       1
2           4        90       40       1
2           5        30       40       1
2           5        60       20       1
3           4        30       40       1
3           4        60       40       1
3           5        30       20       2
3           5        30       40       1
3           5        60       20       2
3           5        90       20       2
4           4        30       20       1
4           4        30       40       1
4           4        60       20       1
4           4        60       40       2
4           5        30       20       5
4           5        60       20       1
     
5.1 Results on model identification  
 The first step of the analysis was to identify the number of latent classes extracted 
from the model. The true number of latent classes was always two. Therefore, model 
identification in this study was summarized as the rate of correctly selecting the two-class 
MLGMM across all model fits, utilizing the six information criteria described in Chapter 
3. The recovery of individual-level latent class membership was deemed in the pilot
study not to be important for the research questions, so these data were not captured for
any replication and did not contribute to the estimation of correct model identification.
       
 Tables 12 to 16 summarize the model identification rates derived from the six
information criteria described in Chapter 3, ordered by cluster effect (CE1 through CE5).
Each of these five tables includes all mixture proportion conditions (MP1 to MP4), cluster
numbers (CN=30, 60, and 90), and cluster sizes (CS=20 and 40).  The rate of successful
model identification was computed as the percentage of the 100 replications in each set of
conditions in which an information criterion (correctly) selected the two-class MLGMM;
higher values indicate better identification.
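To make the selection step concrete, the sketch below computes several information criteria from a log-likelihood and picks the class count with the smallest value for each criterion. The formulas are the standard textbook forms and are an assumption here; the definitions in Chapter 3 take precedence if they differ (for example, in which sample size is used). BICB is omitted because its formula is not restated in this section, and the log-likelihoods and parameter counts are hypothetical.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Standard forms of five criteria (assumed; see Chapter 3 for the study's own)."""
    deviance = -2.0 * log_likelihood
    aic = deviance + 2 * n_params
    return {
        "AIC": aic,
        "AIC3": deviance + 3 * n_params,
        "AICc": aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1),
        "BIC": deviance + n_params * math.log(n_obs),
        "SABIC": deviance + n_params * math.log((n_obs + 2) / 24),
    }


def selected_classes(fits, n_obs):
    """For each criterion, return the number of classes with the smallest value."""
    table = {k: information_criteria(ll, p, n_obs) for k, (ll, p) in fits.items()}
    names = list(next(iter(table.values())).keys())
    return {name: min(table, key=lambda k: table[k][name]) for name in names}


# Hypothetical (log-likelihood, number of parameters) for 1-, 2-, and 3-class fits
fits = {1: (-5230.0, 10), 2: (-5180.0, 15), 3: (-5176.0, 20)}
choice = selected_classes(fits, n_obs=600)
```

Repeating this selection over the 100 replications of a condition and counting how often a criterion chooses two classes gives an identification rate of the kind tabulated below.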
 AIC and AICc performed very similarly in almost all conditions, except that AICc 
performed slightly better when the cluster size was smallest (i.e., CS=20). The model 
identification rates of AIC and AICc were relatively consistent across simulation 
conditions, but they performed best where both the cluster number and cluster size were
small (i.e., CN=30 and CS=20 or 40). However, the AIC and AICc identification rates 
were moderate, 35-65% in all conditions.  
 AIC3, BICB, and SABIC had similar identification rates in all conditions. AIC3
consistently had higher rates for mixture proportion conditions 1 and 2 (i.e., different
mixture proportions among cluster types) and when the sample size was smaller (i.e.,
CN × CS < 1200).  BICB and SABIC performed particularly well when the
mixture proportion was consistent among cluster types (MP3 and MP4) and the sample size
was large (i.e., CN × CS > 1800). SABIC performed slightly worse than BICB on mixture
proportion conditions 1 and 2, and with the large cluster size (CS=40), but this difference
between SABIC and BICB was very small with the small cluster size (CS=20).
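The sample-size thresholds mentioned above can be related to the design cells with a short calculation. This sketch simply multiplies the cluster numbers and cluster sizes used in the study; the variable names are illustrative.

```python
from itertools import product

# Total sample size for each (cluster number, cluster size) combination
sample_sizes = {(cn, cs): cn * cs for cn, cs in product([30, 60, 90], [20, 40])}

smaller = [key for key, n in sample_sizes.items() if n < 1200]  # CN x CS < 1200
larger = [key for key, n in sample_sizes.items() if n > 1800]   # CN x CS > 1800
```

Cells with a total of exactly 1,200 or 1,800 cases fall into neither group under these strict inequalities.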
  
       
Table 12

Recovery Rate of Latent Class for Specified Simulation Conditions: CE1

Simulation Condition                   Information Criteria
Cluster  Mixture     Cluster  Cluster
Effect   Proportion  Number   Size        AIC  AIC3  AICc   BIC  BICB SABIC
1        1           30       20           35    31    34    10    25    28
1        1           30       40           44    36    45     8    27    21
1        1           60       20           46    56    48     7    32    36
1        1           60       40           62    59    62     9    42    33
1        1           90       20           56    68    60    11    47    49
1        1           90       40           63    89    64    32    86    79
1        2           30       20           42    26    45    12    19    24
1        2           30       40           41    26    41     9    22    16
1        2           60       20           49    43    48     6    26    29
1        2           60       40           68    64    67    19    49    36
1        2           90       20           51    67    52     6    42    42
1        2           90       40           54    75    54    11    69    54
1        3           30       20           38    42    39     7    38    39
1        3           30       40           67    74    67    16    75    64
1        3           60       20           57    80    59    13    66    68
1        3           60       40           67    90    71    65    96    96
1        3           90       20           65    84    68    43    93    93
1        3           90       40           72    87    73    92    93    94
1        4           30       20           53    56    59     8    52    53
1        4           30       40           61    86    66    42    88    89
1        4           60       20           64    85    67    48    90    91
1        4           60       40           69    85    72    92    89    89
1        4           90       20           57    78    58    74    91    91
1        4           90       40           63    81    63    93    89    92
       
Table 13

Recovery Rate of Latent Class for Specified Simulation Conditions: CE2

Simulation Condition                   Information Criteria
Cluster  Mixture     Cluster  Cluster
Effect   Proportion  Number   Size        AIC  AIC3  AICc   BIC  BICB SABIC
2        1           30       20           34    26    36    10    22    23
2        1           30       40           42    34    41    12    26    15
2        1           60       20           46    46    47    15    27    33
2        1           60       40           60    58    61    16    46    35
2        1           90       20           63    74    66    14    47    50
2        1           90       40           61    80    62    10    76    60
2        2           30       20           38    27    38    10    26    27
2        2           30       40           46    51    49    11    44    35
2        2           60       20           60    62    62    24    46    49
2        2           60       40           50    72    51    15    71    65
2        2           90       20           53    65    54    11    45    49
2        2           90       40           45    73    46    39    86    82
2        3           30       20           47    44    48     5    33    38
2        3           30       40           66    73    68    10    74    67
2        3           60       20           71    83    71    14    72    74
2        3           60       40           63    86    63    61    92    92
2        3           90       20           57    77    59    32    78    80
2        3           90       40           60    86    60    93    93    93
2        4           30       20           57    59    59     8    51    56
2        4           30       40           67    85    69    43    91    92
2        4           60       20           65    86    69    38    87    86
2        4           60       40           51    75    52    87    89    90
2        4           90       20           68    91    73    73    94    94
2        4           90       40           70    85    71    95    92    93
       
Table 14

Recovery Rate of Latent Class for Specified Simulation Conditions: CE3

Simulation Condition                   Information Criteria
Cluster  Mixture     Cluster  Cluster
Effect   Proportion  Number   Size        AIC  AIC3  AICc   BIC  BICB SABIC
3        1           30       20           46    19    48     9    17    17
3        1           30       40           42    23    41     4    21    13
3        1           60       20           47    43    48    10    28    29
3        1           60       40           60    69    60     7    51    42
3        1           90       20           59    75    59    13    44    46
3        1           90       40           62    88    63    13    79    71
3        2           30       20           51    42    55    20    38    39
3        2           30       40           53    57    54    25    54    48
3        2           60       20           57    63    57    22    51    54
3        2           60       40           54    62    54    41    65    65
3        2           90       20           57    72    58    29    55    57
3        2           90       40           57    71    58    60    73    73
3        3           30       20           48    42    49    12    34    38
3        3           30       40           65    75    68     5    72    60
3        3           60       20           59    74    64     7    60    63
3        3           60       40           59    83    61    62    90    91
3        3           90       20           67    84    67    33    90    90
3        3           90       40           60    80    60    88    91    91
3        4           30       20           59    47    60     6    38    44
3        4           30       40           64    84    68    31    82    85
3        4           60       20           66    81    72    29    81    84
3        4           60       40           59    83    59    94    94    95
3        4           90       20           66    84    66    76    93    93
3        4           90       40           68    86    70    92    91    91
       
Table 15

Recovery Rate of Latent Class for Specified Simulation Conditions: CE4

Simulation Condition                   Information Criteria
Cluster  Mixture     Cluster  Cluster
Effect   Proportion  Number   Size        AIC  AIC3  AICc   BIC  BICB SABIC
4        1           30       20           31    14    30     3    10    12
4        1           30       40           35    16    35     2    13    10
4        1           60       20           44    38    43     4    22    22
4        1           60       40           45    57    47     3    39    31
4        1           90       20           41    53    42     1    33    34
4        1           90       40           54    65    54     5    60    49
4        2           30       20           39    26    37    11    23    25
4        2           30       40           40    26    39     4    22    14
4        2           60       20           54    40    53    10    21    25
4        2           60       40           57    61    57     5    41    34
4        2           90       20           42    58    42     9    30    33
4        2           90       40           53    70    53     8    62    54
4        3           30       20           46    34    46    10    29    31
4        3           30       40           50    63    52     4    54    45
4        3           60       20           65    71    65    11    62    65
4        3           60       40           59    87    63    60    90    91
4        3           90       20           52    81    53    30    80    82
4        3           90       40           45    65    46    78    82    82
4        4           30       20           56    54    57     9    49    53
4        4           30       40           59    79    60    34    78    78
4        4           60       20           53    74    59    24    70    70
4        4           60       40           63    82    65    85    87    87
4        4           90       20           46    70    47    52    77    77
4        4           90       40           60    79    60    87    87    87
       
Table 16

Recovery Rate of Latent Class for Specified Simulation Conditions: CE5

Simulation Condition                   Information Criteria
Cluster  Mixture     Cluster  Cluster
Effect   Proportion  Number   Size        AIC  AIC3  AICc   BIC  BICB SABIC
5        1           30       20           27    13    27     0    12    12
5        1           30       40           40    45    42     3    39    31
5        1           60       20           41    53    42     5    40    42
5        1           60       40           50    67    50    16    61    51
5        1           90       20           40    56    42    11    51    55
5        1           90       40           32    52    32    25    56    54
5        2           30       20           17    12    18     2     8    10
5        2           30       40           23    19    22     3    18     9
5        2           60       20           28    35    31     5    18    19
5        2           60       40           23    33    25     0    25    16
5        2           90       20           33    48    35     5    35    40
5        2           90       40           21    43    22     7    36    32
5        3           30       20           43    36    45     9    31    33
5        3           30       40           53    75    56     6    76    65
5        3           60       20           50    66    54    11    57    59
5        3           60       40           52    81    53    55    89    90
5        3           90       20           56    81    58    29    83    84
5        3           90       40           44    74    46    84    83    83
5        4           30       20           45    61    49    16    62    62
5        4           30       40           52    77    54    72    80    80
5        4           60       20           53    79    54    62    84    82
5        4           60       40           34    59    35    66    65    65
5        4           90       20           54    71    55    76    77    77
5        4           90       40           37    50    37    60    60    60
 
5.2 Results on bias of estimates  
As described in Chapter 3, the bias of the cluster effect estimates, computed as the
difference between the fixed effect (i.e., simulation value) and the cluster effect
estimates, from the mis-specified model (i.e., 1-class MLGMM) and from the true model
(i.e., 2-class MLGMM), was examined over the simulation conditions. These results,
fully presented in Appendix A, are summarized here and discussed in Chapter 6.
Appendix A shows the mean, 90% CIs for the mean, and the standard deviations of bias
as estimated in all simulation conditions. Tables in Appendix A are ordered first by the
cluster effect, then by the mixture proportion, and finally by model (true, mis-specified).
The order of the tables is as follows:
• True Model (2-latent class MLGMM)
    o Appendix A.1 to A.4 for cluster effect condition 1 and mixture proportions 1 through 4
    o Appendix A.5 to A.8 for cluster effect condition 2 and mixture proportions 1 through 4
    o Appendix A.9 to A.12 for cluster effect condition 3 and mixture proportions 1 through 4
    o Appendix A.13 for cluster effect condition 4
    o Appendix A.14 for cluster effect condition 5
• Mis-specified Model (1-latent class MLGMM)
    o Appendix A.15 to A.18 for cluster effect condition 1 and mixture proportions 1 through 4
    o Appendix A.19 to A.22 for cluster effect condition 2 and mixture proportions 1 through 4
    o Appendix A.23 to A.26 for cluster effect condition 3 and mixture proportions 1 through 4
    o Appendix A.27 for cluster effect condition 4
    o Appendix A.28 for cluster effect condition 5
The results are discussed below by cluster effect condition and are
summarized in figures that capture the salient features of, and trends in, the estimates in order
to address the research questions. The term "true effect" (TE) is used in this section to
represent the individual fixed parameters within each cluster effect condition (e.g., -1, -0.5, 0, 0.5,
and 1 for cluster effect condition 1). Bias in the cluster estimates from each of the 100
replications of the mis-specified and the true models was summarized (mean, standard
deviation), representing the results for each of the 120 conditions of this study. These results
were analyzed by a 5×3×4×3×2 (i.e., TE×CT×MP×CN×CS) ANOVA, and the relevant
results from the ANOVAs that answer the research questions are described below. Focusing
on the relevant simulation conditions reduced the number of figures needed while providing
sufficient information to answer the research questions.
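The per-condition summary of bias described above (mean, standard deviation, and a 90% interval for the mean) can be sketched as follows. Equation 44 is not reproduced in this section, so the details here are assumptions: the direction of subtraction (estimate minus true value) is chosen so that positive values mean overestimation, as stated for Figures 9 and 10, and the interval uses a normal-theory multiplier of 1.645. The estimates in the example are hypothetical.

```python
import math
import statistics


def bias_summary(estimates, true_effect, z=1.645):
    """Mean bias, SD of bias, and a normal-theory 90% interval for the mean bias."""
    # Assumed sign convention: positive bias = overestimation of the true effect
    bias = [est - true_effect for est in estimates]
    mean_bias = statistics.mean(bias)
    sd_bias = statistics.stdev(bias)
    half_width = z * sd_bias / math.sqrt(len(bias))
    return mean_bias, sd_bias, (mean_bias - half_width, mean_bias + half_width)


# Hypothetical estimates of a true effect of 1.0 from five replications
mean_bias, sd_bias, ci = bias_summary([0.55, 0.62, 0.60, 0.58, 0.65], true_effect=1.0)
```

In the study, the same summary would be formed over the 100 replications of each condition rather than over five values.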
5.2.1 Cluster Effect Condition 1 (CE=1) 
 Cluster effect condition 1 (CE1) was included to evaluate the effect of differential
mixture proportions, among three cluster types, on the cluster effect estimates from the
mis-specified and the true models.  CE1 has the same cluster effects (i.e., -1, -0.5, 0, 0.5,
and 1) in each of the three cluster types. The cluster types were defined based on the mixture
proportion. Therefore, differences in bias were not expected among the four mixture
proportion conditions (and this was not tested). Instead, the term of interest is the 3-way
interaction among the three cluster types (CT), four mixture proportions (MP), and five
true effects within cluster effect condition 1 (TE); the interaction term was included in ANOVA models
that were run both with and without cluster size (CS) and cluster number (CN) included. Both
the 3-way CT×MP×TE and 4-way CS×CT×MP×TE interaction terms were significant at
the p<.0001 level, but the 4-way CN×CT×MP×TE term was not significant (p=0.97).
These results are summarized in Figures 9 and 10 (see Appendix A for full results).
 Figures 9 and 10 each include 15 plots (five TE conditions × three cluster
types), with CS=20 in Figure 9 and CS=40 in Figure 10. Each plot has two lines
representing the estimates from the true model (Model T, a line with squares) and the
mis-specified model (Model M, a line with circles) for the four MP conditions. Each row of
plots corresponds to a cluster type (1st row is cluster type 1, 2nd row is cluster type 2,
and 3rd row is cluster type 3). Each column of plots corresponds to one of the five TEs
(i.e., -1, -0.5, 0, 0.5, and 1).
       
 Figure 9. Bias estimates for cluster effect 1 (CE1) and cluster size 20 (CS20) 
 [Figure 9: 15 panels. Rows: Cluster Type 1, 2, 3 (CS: 20). Columns: TE: -1.0, -.5, 0, .5, 1.0. x-axis: MP (1 to 4). y-axis: Bias (-1.0 to 1.0). Lines: Model M, Model T.]
 Figure 10. Bias estimates for cluster effect 1 (CE1) and cluster size 40 (CS40) 
 [Figure 10: 15 panels. Rows: Cluster Type 1, 2, 3 (CS: 40). Columns: TE: -1.0, -.5, 0, .5, 1.0. x-axis: MP (1 to 4). y-axis: Bias (-1.0 to 1.0). Lines: Model M, Model T.]
 Positive bias represents the overestimation of TEs and negative bias represents
underestimation of TEs. Figures 9 and 10 show that the bias from the true model was
consistently closer to zero than that of the mis-specified model for the same data
in all conditions (i.e., the true model yielded more accurate estimation of parameters in
terms of recovery of TE values). The effect of cluster size was minimal (compare
Figures 9 and 10), although the magnitude of the differences in bias between the true and
mis-specified models was greater when the cluster size was larger (i.e., CS=40; see
Appendix A for details).
 Figures 9 and 10 also show that the patterns of bias over the four mixture
proportions in each cluster type were consistent when cluster effects were non-zero;
namely, positive bias was observed with positive cluster effects (i.e., TE=0.5 and 1) and
negative bias was observed with negative effects (i.e., TE=-0.5 and -1). The magnitude of bias
was greater for MP2 than for MP1, which differed in the amount of variation in mixture
proportion. These findings were consistent for the true and mis-specified models, although
the true model yielded much less bias, and the difference between the bias in estimates
derived from the true and mis-specified models increased as the variability of mixture
proportion increased (i.e., from MP1 to MP2).  The trend in bias was reversed for the
mis-specified model between MP1 and MP2 on cluster types 1 and 2 when the cluster
effect was zero, because there were more cases with the cluster effect of zero for these
conditions due to the cases in the non-growth group. All bias was positive (i.e., all TEs
were overestimated) when the cluster effect was zero.
 The comparison of MP3 and MP4 showed that the magnitude of bias decreased as the
proportion of cases in the fast growth group increased, except for the mis-specified model
when the cluster effect was zero. The difference in bias between the true and
mis-specified models was greater as the proportion of the fast group decreased for all
non-zero cluster effects.
 There was virtually no difference in bias between the true and mis-specified 
models on cluster type 3 of MP1. All cases were from the fast growth group in this
condition, suggesting that cluster effect estimates were not influenced by the differential
mixture proportions in other cluster types. 
5.2.2   Cluster Effect Conditions 2 and 3 (CE2 and CE3)
 Cluster effect conditions 2 and 3 were designed to evaluate the potential for
systematic bias in the cluster effect estimates that could arise from the same cluster
effects in the presence of different mixture proportions. CE2 and CE3 had the same
overall cluster effects (i.e., -1, -0.5, 0, 0.5, and 1), but the assignment of effects was
reversed between cluster types 1 and 3, while cluster type 2 had zero effects in both CE2
and CE3. The cluster types were again based on the mixture proportions. Therefore, no
differences in bias between CE2 and CE3 were expected, and none were found, for the mixture
proportion conditions MP3 and MP4 (see Figure 11).
 The term of interest is the interaction among the two cluster effect conditions
(CE), four mixture proportions (MP), and five true effects (TE), with or without
cluster size (CS) and cluster number (CN).  The ANOVA determined whether these interaction
terms affected the amount of bias in estimating the five cluster effects by the
true and mis-specified models. This 2×4×5×3×2 ANOVA found the three-way
CE×MP×TE term significant at p<.0001, but the four-way CN×CE×MP×TE (p>.92) and
CS×CE×MP×TE (p>.49) terms were not significant. Figure 11 summarizes these results
(see Appendix A for full results).
 Figure 11 includes a total of eight plots (two cluster effect conditions × four
mixture proportions) with two lines representing the estimates from the true model 
(Model T, a line with squares) and the mis-specified model (Model M, a line with 
circles). 
       
   
  
Figure 11. Bias estimates for cluster effects 2 and 3 (CE2 and CE3) 
 The x-axis of each figure represents five TEs. Each row of plots represents a 
cluster effect condition (CE2 and CE3). Cluster type was not included in the figure 
because the true effect (TE) is a direct indicator of cluster types.  
 [Figure 11: 8 panels. Rows: Cluster Effect 2, Cluster Effect 3. Columns: MP: 1, 2, 3, 4. x-axis: TE (-1, -0.5, 0, 0.5, 1). y-axis: Bias (-0.8 to 1.0). Lines: Model M, Model T.]
 A comparison of bias in estimates derived from CE2 vs. CE3 on MP1 and MP2
magnified the effect of cluster types (i.e., differential mixture proportion) that was
observed in Figure 11. Cluster type 1 was indicated by TE = -1 and TE = -0.5 for CE2,
and by TE = 1 and TE = 0.5 for CE3.  Cluster type 2 had zero TE in both CE conditions, and
cluster type 3 had the reverse TEs compared to cluster type 1. The MP1 condition
represented lower variation in mixture proportion than MP2, and the magnitude of the
difference in bias between the true and mis-specified models was greater in the MP2
condition than in the MP1 condition. The negative bias derived from TE -1 and
-0.5 was greater for the condition with MP1 and CE3, where more cases were in the fast
growth group. The positive bias derived from TE 0.5 and 1 on MP2 was much more
pronounced for CE3 (only 25% of cases in the fast growth group), whereas all cases were
in the fast growth group in CE2, which had minimal bias in estimates from both the true
and mis-specified models. The trends in bias under the MP2 condition were similar to
those under MP1, but the magnitude of bias was greater for MP2. The magnitudes of both
positive and negative bias increased as the variation in the mixture proportion increased.
The most pronounced effect was observed in the MP2 and CE3 conditions, where the
positive bias with TE 0.5 and 1 was the highest (only 25% of cases were in the fast
growth group).  The overall variation in mixture proportions increased the magnitude of
bias, especially on the non-zero positive TEs.
 Similar effects of mixture proportion on bias were observed for MP3 and MP4,
where MP3 had a greater magnitude of bias. There was no difference between CE2 and
CE3 on MP3 and MP4, as expected, because both MP3 and MP4 had constant mixture
proportions across the three cluster types. The variation of mixture proportion between
the fast and slow growth cases was greater on MP3 (50 fast/50 slow) than on MP4 (75
fast/25 slow). The trend of bias was symmetric for the negative and positive TEs,
centered around zero bias on TE=0 from both the true and mis-specified models on MP3,
whereas the magnitude of negative bias was greater on MP4 for the true model. That is,
positive bias was attenuated when a greater proportion of the cases were in the fast
growth group.
5.2.3   Cluster Effect Conditions 4 and 5 (CE=4 and CE=5)
Cluster effect conditions 4 and 5 were designed to assess the impact of variability
in the cluster effects on the bias of the cluster effect estimates.
The cluster effects in these two conditions were randomly generated from a normal
distribution with a mean of 0 and a variance of 0.5 for CE4 and 1.0 for CE5.
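Generating random cluster effects of this kind can be sketched as follows. This is an illustration only, not the study's generation code: the function name, the seed, and the choice of 90 clusters are assumptions. One practical point is that `random.gauss` takes a standard deviation, so the stated variances must be converted with a square root.

```python
import math
import random


def draw_cluster_effects(n_clusters, variance, seed=0):
    """Draw cluster effects from N(0, variance); gauss() expects the SD."""
    rng = random.Random(seed)
    sd = math.sqrt(variance)
    return [rng.gauss(0.0, sd) for _ in range(n_clusters)]


ce4_effects = draw_cluster_effects(90, variance=0.5)  # CE4: variance 0.5
ce5_effects = draw_cluster_effects(90, variance=1.0)  # CE5: variance 1.0
```

Because both calls use the same seed in this sketch, the CE5 draws are the CE4 draws rescaled by the ratio of the standard deviations; distinct seeds would be used to obtain independent draws.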
 The term of interest in these analyses was the interaction among the cluster
types (CT), mixture proportions (MP), and cluster effect conditions (CE), with or without
cluster size (CS) and cluster number (CN). The question is whether these interaction
terms are associated with the amount of bias in estimating random cluster effects under
the true and mis-specified models. The ANOVA found the two-way MP×CT term
significant (p<.0001), but the other terms were not (all p>0.3), including that of CE.
Figure 12 is the only figure that includes simulation conditions that were not
statistically significant (i.e., CE), because for this particular analysis these conditions directly
addressed a research question (see Appendix A for full results). These results are discussed in
Chapter 6.
       
 Figure 12. Bias estimates for cluster effect 4 and 5 (CE4 and CE5) 
 Figure 12 includes a total of 12 plots (four mixture proportions × three
cluster types), with the cluster effect condition (CE4 or CE5) on the x-axis and two lines
in each plot representing the estimates from the
true model (Model T, a line with squares) and the mis-specified model (Model M, a line
with circles). The four plots in each row represent the mixture proportion conditions. Each
row of plots represents a cluster type (CT=1, 2, or 3).
 [Figure 12: 12 panels. Rows: Cluster Type 1, 2, 3. Columns: MP: 1, 2, 3, 4. x-axis: CE (4, 5). y-axis: Bias (-0.15 to 0.15). Lines: Model M, Model T.]
 The ANOVA results indicated that the variation in true cluster effects (i.e., N(0,1) 
and N(0,0.5) random cluster effects) did not have a significant impact upon the bias of 
estimates, with only a slight increase in the magnitude of bias on CE5 (i.e., the effects 
with a higher variance) on MP1 and MP2.   
 Bias in estimates was minimal in the MP3 and MP4 conditions for both CE4 and 
CE5, indicating that the variability of the cluster effects had limited, if any, impact when 
the mixture proportion was constant. The variability of the cluster effect estimates was 
also small on MP1 and MP2 conditions (Figure 11). The magnitude of bias was very 
similar between CE4 and CE5 for the mis-specified model, while for the true model, the 
magnitude of bias was greater for CE5 than for CE4. 
5.3  Precision of estimates  
 The mis-specified model condition consistently resulted in precision that was 
equal to or greater than that of the true model, as expected. The difference was smallest 
when the most cases were in the fast growth group (i.e., MP1 and cluster type 3), and was 
largest where the fewest cases were in the fast growth group (i.e., MP2 and cluster type 
1). For both the mis-specified and true models, the precision of estimates increased as the 
effective sample size increased. As was described in Chapters 2 and 3, the mis-specified 
model always utilized 100% of the sample for the estimation of parameters, whereas the 
effective sample size was dependent on the mixture proportion for the true model. When 
combined with the previous set of results about bias, this simulation has shown that the 
mis-specified model consistently yields greater bias, with higher precision for those 
biased estimates, as compared to the true model. 
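The link between mixture proportion and precision can be illustrated with a deliberately simplified calculation. The sketch assumes that the standard error of an estimate is proportional to one over the square root of the effective sample size, and that the effective sample size for a class is the total sample multiplied by that class's mixture proportion; both are simplifying assumptions for illustration, not the estimation machinery used in the study.

```python
import math


def relative_standard_error(total_n, class_proportion):
    """SE of a class-based estimate relative to one using the full sample,
    assuming SE is proportional to 1 / sqrt(effective n)."""
    effective_n = total_n * class_proportion
    return math.sqrt(total_n / effective_n)


# 25% of cases in the class of interest versus all cases in that class
se_ratio_quarter = relative_standard_error(1200, 0.25)
se_ratio_all = relative_standard_error(1200, 1.0)
```

Under these assumptions, the loss of precision relative to the full sample is largest when few cases belong to the class and vanishes when all cases do, which is the direction of the pattern described above.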
       
5.4 Results on classification accuracy  
 Tables 17-21 show the classification accuracy of MLGMMs with five cluster 
effects ranked by quintile level. The cluster number did not have significant effects on 
classification accuracy as indicated by kappa, and kappa values tended to increase as 
cluster size increased, for both the true and mis-specified models. The true model 
performed much better on CE1 and moderately better on CE3, whereas the mis-specified 
model performed significantly better than the true model on CE4 and CE5 conditions. 
The CE2 condition led to virtually identical classification accuracy by both the true and 
mis-specified models. 
       
 
Table 17 
 
Rate of Misclassification at the Quintile Level: Cluster Effect 1 (CE1) 
 
Simulation Condition                                              Kappa (95% CI) 
Cluster Effect  Mixture Proportion  Cluster Number  Cluster Size  MLGMM  MLLGM 
1 1 30 20 0.78 (0.76 ,0.80)* 0.64 (0.62 ,0.67) 
   40 0.80 (0.78 ,0.82)* 0.63 (0.61 ,0.66) 
  60 20 0.77 (0.75 ,0.79)* 0.65 (0.63 ,0.67) 
   40 0.81 (0.79 ,0.82)* 0.64 (0.61 ,0.66) 
  90 20 0.77 (0.75 ,0.79)* 0.63 (0.61 ,0.66) 
     40 0.80 (0.78 ,0.82)* 0.64 (0.61 ,0.66) 
 2 30 20 0.68 (0.66 ,0.70)* 0.53 (0.50 ,0.56) 
   40 0.70 (0.68 ,0.72)* 0.56 (0.53 ,0.58) 
  60 20 0.68 (0.66 ,0.70)* 0.53 (0.51 ,0.56) 
   40 0.70 (0.68 ,0.72)* 0.55 (0.52 ,0.57) 
  90 20 0.65 (0.63 ,0.68)* 0.54 (0.51 ,0.56) 
     40 0.70 (0.68 ,0.72)* 0.54 (0.52 ,0.57) 
 3 30 20 0.70 (0.68 ,0.72)* 0.58 (0.56 ,0.61) 
   40 0.72 (0.70 ,0.74)* 0.61 (0.58 ,0.63) 
  60 20 0.70 (0.67 ,0.72)* 0.57 (0.54 ,0.59) 
   40 0.72 (0.70 ,0.74)* 0.61 (0.58 ,0.63) 
  90 20 0.68 (0.66 ,0.70)* 0.58 (0.55 ,0.60) 
     40 0.72 (0.70 ,0.74)* 0.60 (0.58 ,0.63) 
 4 30 20 0.75 (0.73 ,0.77)* 0.61 (0.59 ,0.64) 
   40 0.75 (0.73 ,0.77)* 0.62 (0.59 ,0.64) 
  60 20 0.75 (0.73 ,0.77)* 0.62 (0.59 ,0.64) 
   40 0.75 (0.73 ,0.77)* 0.62 (0.60 ,0.65) 
  90 20 0.73 (0.71 ,0.75)* 0.61 (0.58 ,0.63) 
      40 0.74 (0.72 ,0.76)* 0.62 (0.60 ,0.65) 
* denotes that kappa for the marked model is significantly higher at the p < 0.05 level 
 
       
 
Table 18 
 
Rate of Misclassification at the Quintile Level: Cluster Effect 2 (CE2) 
 
Simulation Condition                                              Kappa (95% CI) 
Cluster Effect  Mixture Proportion  Cluster Number  Cluster Size  MLGMM  MLLGM 
2 1 30 20 0.85 (0.82 ,0.87) 0.89 (0.87 ,0.91) 
   40 0.95 (0.93 ,0.96) 0.93 (0.91 ,0.95) 
  60 20 0.85 (0.82 ,0.87) 0.87 (0.85 ,0.90) 
   40 0.93 (0.91 ,0.95) 0.94 (0.92 ,0.95) 
  90 20 0.85 (0.83 ,0.87) 0.90 (0.88 ,0.92)* 
     40 0.94 (0.92 ,0.95) 0.95 (0.94 ,0.97) 
 2 30 20 0.73 (0.70 ,0.77) 0.76 (0.73 ,0.80) 
   40 0.85 (0.83 ,0.87) 0.85 (0.83 ,0.88) 
  60 20 0.75 (0.71 ,0.78) 0.75 (0.72 ,0.78) 
   40 0.88 (0.86 ,0.90) 0.85 (0.82 ,0.88) 
  90 20 0.75 (0.71 ,0.78) 0.76 (0.73 ,0.79) 
     40 0.85 (0.82 ,0.87) 0.86 (0.83 ,0.88) 
 3 30 20 0.77 (0.74 ,0.80) 0.75 (0.72 ,0.78) 
   40 0.89 (0.86 ,0.91) 0.86 (0.83 ,0.88) 
  60 20 0.80 (0.77 ,0.83) 0.75 (0.72 ,0.78) 
   40 0.87 (0.84 ,0.89) 0.86 (0.83 ,0.88) 
  90 20 0.76 (0.73 ,0.79) 0.77 (0.74 ,0.80) 
     40 0.87 (0.85 ,0.89) 0.84 (0.82 ,0.87) 
 4 30 20 0.85 (0.83 ,0.88) 0.86 (0.84 ,0.89) 
   40 0.93 (0.91 ,0.95) 0.94 (0.93 ,0.96) 
  60 20 0.86 (0.83 ,0.88) 0.86 (0.84 ,0.88) 
   40 0.93 (0.92 ,0.95) 0.91 (0.89 ,0.93) 
  90 20 0.87 (0.85 ,0.90) 0.83 (0.81 ,0.86) 
      40 0.93 (0.91 ,0.94) 0.93 (0.91 ,0.94) 
* denotes that kappa for the marked model is significantly higher at the p < 0.05 level 
       
 
Table 19 
 
Rate of Misclassification at the Quintile Level: Cluster Effect 3 (CE3) 
 
Simulation Condition                                              Kappa (95% CI) 
Cluster Effect  Mixture Proportion  Cluster Number  Cluster Size  MLGMM  MLLGM 
3 1 30 20 0.82 (0.79 ,0.85) 0.82 (0.79 ,0.85) 
   40 0.92 (0.90 ,0.94) 0.89 (0.87 ,0.91) 
  60 20 0.84 (0.81 ,0.86) 0.81 (0.78 ,0.84) 
   40 0.92 (0.90 ,0.93)* 0.86 (0.84 ,0.89) 
  90 20 0.84 (0.82 ,0.87) 0.79 (0.76 ,0.82) 
     40 0.91 (0.89 ,0.93)* 0.85 (0.82 ,0.87) 
 2 30 20 0.71 (0.68 ,0.75)* 0.62 (0.58 ,0.66) 
   40 0.76 (0.73 ,0.80) 0.71 (0.68 ,0.75) 
  60 20 0.75 (0.72 ,0.79)* 0.62 (0.58 ,0.67) 
   40 0.78 (0.75 ,0.81) 0.72 (0.69 ,0.76) 
  90 20 0.69 (0.65 ,0.72) 0.64 (0.59 ,0.68) 
     40 0.80 (0.77 ,0.83)* 0.70 (0.66 ,0.74) 
 3 30 20 0.77 (0.74 ,0.80) 0.75 (0.72 ,0.78) 
   40 0.87 (0.85 ,0.89) 0.84 (0.82 ,0.87) 
  60 20 0.79 (0.76 ,0.82) 0.74 (0.70 ,0.77) 
   40 0.86 (0.84 ,0.89) 0.86 (0.83 ,0.88) 
  90 20 0.78 (0.75 ,0.81) 0.75 (0.71 ,0.78) 
     40 0.89 (0.86 ,0.91) 0.84 (0.82 ,0.87) 
 4 30 20 0.86 (0.83 ,0.88) 0.85 (0.83 ,0.88) 
   40 0.93 (0.92 ,0.95) 0.94 (0.92 ,0.95) 
  60 20 0.85 (0.83 ,0.87) 0.86 (0.83 ,0.88) 
   40 0.94 (0.92 ,0.95) 0.94 (0.92 ,0.96) 
  90 20 0.85 (0.82 ,0.87) 0.86 (0.84 ,0.89) 
      40 0.95 (0.93 ,0.96) 0.94 (0.92 ,0.95) 
* denotes that kappa for the marked model is significantly higher at the p < 0.05 level 
       
 
Table 20 
 
Rate of Misclassification at the Quintile Level: Cluster Effect 4 (CE4) 
 
Simulation Condition                                              Kappa (95% CI) 
Cluster Effect  Mixture Proportion  Cluster Number  Cluster Size  MLGMM  MLLGM 
4 1 30 20 0.47 (0.40 ,0.53) 0.66 (0.61 ,0.71)* 
   40 0.60 (0.55 ,0.66) 0.69 (0.65 ,0.74) 
  60 20 0.47 (0.41 ,0.54) 0.69 (0.64 ,0.74)* 
   40 0.57 (0.51 ,0.63) 0.71 (0.67 ,0.76)* 
  90 20 0.49 (0.42 ,0.55) 0.67 (0.62 ,0.72)* 
     40 0.61 (0.55 ,0.67) 0.74 (0.69 ,0.78)* 
 2 30 20 0.44 (0.37 ,0.51) 0.52 (0.46 ,0.58) 
   40 0.50 (0.43 ,0.56) 0.59 (0.53 ,0.65) 
  60 20 0.49 (0.42 ,0.55) 0.50 (0.44 ,0.57) 
   40 0.52 (0.46 ,0.59) 0.55 (0.49 ,0.60) 
  90 20 0.42 (0.35 ,0.49) 0.56 (0.50 ,0.62)* 
     40 0.53 (0.46 ,0.59) 0.64 (0.59 ,0.69) 
 3 30 20 0.51 (0.44 ,0.58) 0.52 (0.46 ,0.58) 
   40 0.53 (0.47 ,0.60) 0.63 (0.57 ,0.68) 
  60 20 0.52 (0.46 ,0.58) 0.63 (0.57 ,0.68) 
   40 0.69 (0.64 ,0.74) 0.68 (0.63 ,0.73) 
  90 20 0.53 (0.46 ,0.59) 0.54 (0.47 ,0.60) 
     40 0.61 (0.56 ,0.66) 0.72 (0.67 ,0.77)* 
 4 30 20 0.59 (0.53 ,0.65) 0.70 (0.65 ,0.75)* 
   40 0.68 (0.63 ,0.74) 0.79 (0.75 ,0.83)* 
  60 20 0.57 (0.51 ,0.63) 0.68 (0.63 ,0.73)* 
   40 0.66 (0.61 ,0.71) 0.75 (0.71 ,0.80) 
  90 20 0.56 (0.50 ,0.63) 0.74 (0.69 ,0.78)* 
      40 0.71 (0.66 ,0.76) 0.77 (0.72 ,0.81) 
* denotes that kappa for the marked model is significantly higher at the p < 0.05 level 
 
       
 
Table 21 
 
Rate of Misclassification at the Quintile Level: Cluster Effect 5 (CE5) 
 
Simulation Condition                                              Kappa (95% CI) 
Cluster Effect  Mixture Proportion  Cluster Number  Cluster Size  MLGMM  MLLGM 
5 1 30 20 0.60 (0.54 ,0.66) 0.77 (0.72 ,0.81)* 
   40 0.57 (0.51 ,0.63) 0.79 (0.75 ,0.83)* 
  60 20 0.67 (0.62 ,0.72) 0.78 (0.74 ,0.82)* 
   40 0.72 (0.67 ,0.77) 0.79 (0.75 ,0.83) 
  90 20 0.64 (0.58 ,0.69) 0.78 (0.74 ,0.82)* 
     40 0.59 (0.53 ,0.65) 0.80 (0.76 ,0.83)* 
 2 30 20 0.41 (0.34 ,0.48) 0.68 (0.63 ,0.73)* 
   40 0.41 (0.34 ,0.48) 0.70 (0.66 ,0.75)* 
  60 20 0.41 (0.34 ,0.48) 0.70 (0.66 ,0.75)* 
   40 0.48 (0.41 ,0.55) 0.74 (0.70 ,0.79)* 
  90 20 0.46 (0.39 ,0.52) 0.73 (0.68 ,0.77)* 
     40 0.42 (0.35 ,0.48) 0.75 (0.70 ,0.79)* 
 3 30 20 0.56 (0.50 ,0.62) 0.70 (0.66 ,0.75)* 
   40 0.74 (0.70 ,0.79) 0.81 (0.78 ,0.85) 
  60 20 0.58 (0.52 ,0.64) 0.78 (0.74 ,0.82)* 
   40 0.76 (0.72 ,0.81) 0.80 (0.77 ,0.84) 
  90 20 0.68 (0.63 ,0.73) 0.77 (0.73 ,0.81) 
     40 0.70 (0.65 ,0.75) 0.82 (0.78 ,0.86)* 
 4 30 20 0.68 (0.63 ,0.73) 0.82 (0.79 ,0.86)* 
   40 0.75 (0.71 ,0.80) 0.86 (0.83 ,0.89)* 
  60 20 0.75 (0.71 ,0.80) 0.80 (0.76 ,0.84) 
   40 0.66 (0.61 ,0.72) 0.88 (0.85 ,0.90)* 
  90 20 0.72 (0.67 ,0.77) 0.88 (0.84 ,0.91)* 
      40 0.58 (0.51 ,0.64) 0.89 (0.86 ,0.91)* 
* denotes that kappa for the marked model is significantly higher at the p < 0.05 level 
  
 Table 22 shows the average kappa obtained over the cluster effects. The average 
values reflect the finding above. The classification rate by the true model, but not by the 
mis-specified model, was affected by the random cluster effects (i.e., CE4 and 5).  
       
 
Table 22 
 
Average Kappa by Cluster Effect 
 
 Average Kappa 
Cluster Effect MLGMM MLLGM 
1 0.73 0.60 
2 0.85 0.85 
3 0.84 0.80 
4 0.55 0.65 
5 0.61 0.78 
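
 As a check on how the averages in Table 22 relate to the condition-level tables, the sketch below recomputes the CE1 row from the kappa point estimates in Table 17. It assumes that each average is a simple unweighted mean over the 24 simulation conditions; that assumption, and the variable names, are illustrative rather than taken from the analysis code.

```python
from statistics import fmean

# Kappa point estimates transcribed from Table 17 (CE1), in row order.
mlgmm_ce1 = [
    0.78, 0.80, 0.77, 0.81, 0.77, 0.80,  # mixture proportion 1
    0.68, 0.70, 0.68, 0.70, 0.65, 0.70,  # mixture proportion 2
    0.70, 0.72, 0.70, 0.72, 0.68, 0.72,  # mixture proportion 3
    0.75, 0.75, 0.75, 0.75, 0.73, 0.74,  # mixture proportion 4
]
mllgm_ce1 = [
    0.64, 0.63, 0.65, 0.64, 0.63, 0.64,
    0.53, 0.56, 0.53, 0.55, 0.54, 0.54,
    0.58, 0.61, 0.57, 0.61, 0.58, 0.60,
    0.61, 0.62, 0.62, 0.62, 0.61, 0.62,
]

avg_mlgmm = round(fmean(mlgmm_ce1), 2)
avg_mllgm = round(fmean(mllgm_ce1), 2)
print(avg_mlgmm, avg_mllgm)  # 0.73 0.6, matching the CE1 row of Table 22
```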
   
 The results on classification accuracy agreed with the evaluation of bias in the 
cluster effects: lower bias and higher precision led to higher classification 
accuracy. The true model showed its largest classification advantage in the CE1 condition, 
where its bias was minimal, but its classification rate suffered on CE4 and CE5, where its 
bias was higher, and its precision lower, than those of the mis-specified model. 
 
       
 
Chapter 6: Discussion 
 The pilot study described in Chapter 4 tested the code that was used in the main 
study, identifying convergence issues and checking the fidelity of the programs, and of the 
code coordinating them, used to simulate data, fit models, and analyze 36,000 individual 
runs of the three models (100 replications for each combination of 
conditions shown in Table 1). The results of these models were presented in Chapter 5. 
Model convergence was perfect for the 1- and 2-class MLGMMs (the 1-class MLGMM is 
equivalent to MLLGM), and convergence problems were very rare for the 
3-class MLGMM (32 issues in its 12,000 runs). This chapter therefore discusses how the results 
in Chapter 5 address the main research questions that motivated this study. 
 The goal of this study was to investigate the impact on the teacher's (or cluster) 
effect estimates that might arise from having different proportions of students in two 
growth groups within a single classroom, as described in the example in Chapter 1. 
Fairness in evaluation could not be established if there was systematic bias in estimates of 
any teacher's effect or effectiveness. This simulation study manipulated a variety of 
conditions in order to investigate the magnitude of bias in teachers' effect estimates 
resulting from heterogeneous student growth within a class. In particular, the research 
questions were:   
1) What information criteria can be used to identify the true number of latent 
class variable levels in the MLGMM context? 
2) Are the level-2 parameter estimates in the multilevel model affected (in terms 
of bias and precision) by incorrectly modeled level-1 effects? 
       
 
The brief summary of the findings presented in Chapter 5, and exemplified in Tables 11-22, 
Figures 9-12, and Appendix A, is that: 
a. AIC3, BICB, and SABIC performed well in identifying the MLGMM 
with the correct number of latent classes, although AIC and AICc were 
the only information criteria to perform well with smaller sample sizes. 
BIC performed poorly, contrary to previous research findings. BIC 
over-penalized models because the total sample size did not reflect the true 
sample size of the data.   
b. Model misspecification leads to systematic bias in level-2 
parameter estimates in multilevel models, especially when there is more 
variability in some classrooms (represented by mixture proportions). This 
bias is attenuated when the proportion of students belonging to the 
high-growth group is equal to, or greater than, that of the slow growth (e.g., 
PLP) group. However, when MLGMM is used instead of simple 
MLLGM for the level-2 parameter estimates, the bias is greatly reduced, 
loses all systematicity, and appears unaffected by any of the other 
features that were manipulated in the simulation.  
c. Bias in estimation of teacher effects was significantly reduced by 
accounting for the student level heterogeneity in most simulation 
conditions, except for a few conditions described later in this discussion 
(Section 6.3.3).   
d. Precision of the estimated teacher effects was affected 
systematically by each of the conditions under study in this project. 
Effects of the various conditions on precision tended to vary depending 
on the proportion of students in the fast growth group, for all sample 
sizes, underscoring a specific effect that unmodeled heterogeneity in the 
classroom can have on the estimation of teacher effects.   
Taken together, these results suggest that the evaluation of teachers, in terms of 
their effects/effectiveness, using VAM, can proceed fairly across a wide spectrum of 
contexts (and school, class, or district sizes), but only if bias can be controlled as 
discussed below. In fact, the potential for controlling bias is the most important feature of 
MLGMM, specifically because high levels of precision for biased estimates could lead to 
greater (misplaced) confidence in such incorrect estimates.  
More worrisome is the pattern of bias in the results for better (positive cluster 
effect estimates) and worse (negative cluster effect) teaching. Figures 9-12 show that if a 
cluster effect is positive, then the bias tends to be positive (overestimation), and that the 
greater the absolute value of this cluster effect, the greater the bias. Increasingly better 
teachers will appear even better due to the bias and overestimation of positive cluster 
(teacher) effects. Figures 9-12 also show that, if a cluster effect is negative, then the bias 
tends to be negative, representing overestimation of a negative effect of the teacher. 
Similar to the overestimation of positive cluster effects, the overestimation (bias) of 
negative effects also increases with the absolute value of the cluster effect estimate. Thus, 
increasingly worse teachers will appear even worse due to this bias and overestimation of 
negative teacher effects. 
 In the following sections, the results presented in Chapter 5 are discussed with 
respect to their contributions to these conclusions and the future steps suggested by the 
results and their implications for the effective and fair application of VAM, using 
MLGMM, for policy making and teacher evaluations. As described in Chapter 2, the PLP 
students identified by Lazarus et al. (2010) may well describe the slow growth group 
simulated in this study. If so, then it highlights the importance of accounting for the 
presence of PLP students in the estimation of teacher effects. However, the exclusion of a 
subgroup of students (i.e., PLP) from the estimation of a teacher's effect assumes a great 
deal about the "truth" of the statistical identification of such a class of students. Including 
some demographic indicators (observed variables), for example, those identified by 
Lazarus et al. (2010), in the model as covariates may help to reduce the impact of PLP 
students without assuming that the latent classes inferred from the data have identified 
this class correctly, although as Palardy and Vermunt (2009) indicated, including 
covariates in VAM analyses can make the identification of latent classes more difficult. 
As noted in the introduction, the collection and analysis of any data must be driven by 
clear statements of the assumptions being made and the sources of variability to be 
modeled; users of the VAM approach must justify whether observed variables (proxies 
for potentially important latent classes) are closer to the "truth" than the latent variables 
might be. 
       
 
6.1 Model convergence of the multilevel growth mixture model 
 The convergence rates of MLGMMs were much better than expected, with 
very few issues (32 of 12,000 model estimations, or about 0.27%). 
Convergence issues only occurred for the over-estimating condition (i.e., MLGMM with 
more latent classes than the true model). Combining multilevel structure with GMM did 
not seem to affect model convergence, as was shown in Table 10. Problems occurred 
most often when the cluster size was small (i.e., CS=20) and there were fewer cases in the high 
growth group than in the low growth group (i.e., MP2). The effect of cluster number was 
minimal and consistent across all conditions involving MLGMMs. However, due to 
modeling constraints for MLGMMs, the cluster size was held constant across all clusters, 
and while this inflated the rate of model convergence, equal cluster size is not realistic. 
Therefore, this work supports the combination of MLM and GMM approaches, but future 
work should proceed with more realistic data (with different sized clusters), which might 
have an impact on model convergence and interpretability. 
6.2 Information criteria performance 
 Six information criteria were used to identify the model that fit the data best 
among the 1-, 2-, and 3-class MLGMMs (recall that the 1-class MLGMM is MLLGM). 
BIC performed quite poorly, though not uniformly the worst, despite performing extremely well 
in previous research (e.g., Chen et al., 2010; Muthén & Asparouhov, 2009; Nylund, 
Asparouhov, & Muthén, 2007). Palardy and Vermunt (2010) proposed BICB, which 
utilizes the number of clusters instead of overall sample size (as BIC does) for the sample 
size adjustment factor. It is possible that BIC was hampered in the multilevel modeling 
conditions of this simulation because the sample size penalty factor was too severe (i.e., 
selecting the simpler model too often). BIC only worked well when the cluster effect was 
random (i.e., CE4 or 5) with a large sample size; with random cluster effects and small 
samples together, it performed poorly and inconsistently (see Tables 15 and 16). BICB 
and SABIC both worked well, especially when the cluster size (i.e., CS=40) and cluster 
number (i.e., CN=60 or 90) were larger. The sample size penalty adjustment of SABIC, 
as compared to the over-penalization of BIC, led to superior and more consistent 
performance of SABIC. However, BICB generally outperformed SABIC in almost all 
simulation settings, and especially with smaller sample sizes. Overall, the cluster number 
seems to have been a good proxy for the sample size in MLGMM; this is 
most obvious in Table 12. However, both BICB and SABIC still tended to over-penalize models in 
conditions with both smaller cluster size (i.e., 20) and cluster number (i.e., 30 or 60). 
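 The relative severity of the penalties discussed above can be seen in a brief sketch. The formulas are the commonly cited forms of each criterion, and BICB is written with the number of clusters in place of the total sample size, following the description given above; the function name, the argument names, and the log-likelihood value are hypothetical, and the study's software may have computed these quantities differently.

```python
import math

def information_criteria(log_lik, n_params, n_total, n_clusters):
    """Return the six criteria for one fitted model (smaller is better)."""
    deviance = -2.0 * log_lik
    aic = deviance + 2 * n_params
    return {
        "AIC": aic,
        "AICc": aic + 2 * n_params * (n_params + 1) / (n_total - n_params - 1),
        "AIC3": deviance + 3 * n_params,
        "BIC": deviance + n_params * math.log(n_total),
        "SABIC": deviance + n_params * math.log((n_total + 2) / 24),
        # Assumed form of BICB: cluster count replaces total sample size.
        "BICB": deviance + n_params * math.log(n_clusters),
    }

# Hypothetical model: 30 clusters of 20 cases, 10 free parameters.
crit = information_criteria(log_lik=-1000.0, n_params=10,
                            n_total=600, n_clusters=30)
for name, value in sorted(crit.items(), key=lambda kv: kv[1]):
    print(f"{name:6s} {value:8.2f}")
```

 With these inputs the per-parameter penalty is about 6.4 for BIC but about 3.4 for BICB and 3.2 for SABIC, which is consistent with BIC favoring the simpler model more often than the other criteria.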
 AIC and AICc performed similarly, correctly identifying models in conditions 
with smaller cluster numbers and sizes, whereas BIC did not function well in these 
conditions. The correct identification rates of AIC and AICc were not as good as those of BICB or 
SABIC as the overall sample size increased from 600 to 3,600, but all four of these 
criteria performed fairly consistently across all conditions. AIC3 had a model 
identification profile very similar to BICB and SABIC, with marginally better 
identification rates than these two criteria in conditions with smaller overall sample size, 
but significantly worse identification than AIC and AICc in these same conditions. An 
interesting observation was that AIC3 performed slightly worse than AIC and AICc in 
conditions with larger overall sample sizes, defined by cluster number and size (i.e., 
sample size = cluster number × cluster size). 
       
 
 The six information criteria each performed differently across conditions; none of 
them was consistently best (or worst). AIC and AICc performed well with smaller sample 
sizes, AIC3 performed well in mixture proportion conditions 1 and 2 with smaller 
sample sizes, and BICB and SABIC performed well in mixture proportion conditions 3 and 4 with 
larger sample sizes. The pattern of results for BIC was similar to those of BICB and 
SABIC, but at a lower level; therefore, BIC may have very limited use for model 
identification in MLGMM. 
 Within both the pilot and main studies, BIC did not perform well for most 
conditions, which is surprising given its applicability to simulation studies (i.e., it works 
only when the true model is known to be among those in question) and its excellent 
performance in other work (e.g., Muthén & Asparouhov, 2009). BIC has a tendency to 
prefer a simpler model and only performs well where the cluster size is large (i.e., 40) and 
the mixture proportion is constant (i.e., MP3 or MP4). 
 Based on these results, given the motivation for the conditions that were 
included in the simulation (i.e., emphasizing the accuracy of estimation of the teacher's 
effect or value-added after taking into account the growth profiles, clusters, and cluster sizes), 
the best model selection performance will be obtained by combining either AIC or AICc 
(which were indistinguishable in these results) with either AIC3 or BICB. AIC 
and AICc are recommended for cases involving smaller sample sizes, roughly fewer than 
1,200 total cases, and AIC3 and BICB are recommended in cases with a large sample size 
(1,200 cases or more). Contrary to previous findings, these findings suggest that BIC 
should not be used for MLGMM model identification.  
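 The recommendation above can be restated as a simple decision rule. The sketch below assumes equal cluster sizes, so that the total sample size is the cluster number multiplied by the cluster size, and treats 1,200 as a hard cutoff even though it is described only as approximate; the function name is hypothetical.

```python
def recommended_criteria(cluster_number, cluster_size, threshold=1200):
    """Return (total sample size, suggested criteria) under the rule above."""
    total_n = cluster_number * cluster_size  # assumes equal-sized clusters
    if total_n < threshold:
        return total_n, ("AIC", "AICc")
    return total_n, ("AIC3", "BICB")

# Smallest and largest designs in this simulation.
print(recommended_criteria(30, 20))  # (600, ('AIC', 'AICc'))
print(recommended_criteria(90, 40))  # (3600, ('AIC3', 'BICB'))
```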
       
 
6.3 Evaluation of systematic biases across the simulation conditions 
 This section is organized by cluster effect condition (CE1; CE2 and CE3; CE4 and CE5), as 
they were presented in Chapter 5. The purpose of this study was to identify potential 
systematic biases introduced by model misspecification, which should always be 
considered in any modeling enterprise and which may negatively influence the 
otherwise fair comparison of parameters. As noted in Chapter 1, the simulation was set 
up to address this question by determining whether level-2 parameter estimates in a 
multilevel model were or could be affected (in terms of bias and precision) by 
incorrectly modeled level-1 effects (i.e., by ignoring the heterogeneity in the student 
population). The simulation design employed a systematic manipulation of the cluster 
effects and the mixture proportion, described in Chapter 2, as one way to estimate these 
effects, and the question was addressed by examining the patterns of bias on 
the fixed cluster effects introduced through cluster effect conditions 1 through 3. 
6.3.1 Cluster effect condition 1 (CE=1) 
 Cluster effect condition 1 had the same five true effects (i.e., -1, -0.5, 0, 0.5, 1) 
across three cluster types in the mixture proportion conditions, and this was designed to 
investigate how the bias on each true effect behaved when mixture proportions vary (see 
Chapter 3). 
 As seen in Figures 9 and 10, model misspecification led to greater, and 
systematic, bias, as compared to conditions involving the true model. If a cluster effect is 
negative, then the bias tends to be negative, representing overestimation of a negative 
effect of the teacher. If a cluster effect is positive, then the bias tends to be positive 
(overestimation), and the greater the absolute value of this cluster effect, the greater 
the bias.  
In addition to varying with the sign of the true effect (TE), bias was also greater in 
magnitude for greater TE, whether positive or negative (i.e., bias increased in absolute 
value as TE moved away from zero). In conditions with increased overall variability in 
the mixture proportion (i.e., MP1 and 2), the magnitude of bias increased substantially. 
The within-sample variability defined by cluster type exhibited the same pattern, namely, 
that the magnitude of bias increased with mixture proportion variability. Greater bias was 
observed for conditions with greater variability in mixture proportion (MP2) as compared 
to conditions with less variability in this proportion (MP1). 
 In addition to these systematic effects of mixture proportion, true effect of cluster, 
or their combination on the magnitude and sign of the bias on the cluster effect estimates, 
there were also effects on bias coming from model misspecification. Significant positive 
bias was observed in conditions with zero TE when the model was mis-specified, but not 
for the true model in this condition. The most pronounced positive bias at zero TE 
occurred for MP1 and MP4 conditions, suggesting sensitivity to a higher proportion of 
fast growth cases within any cluster. This means that student heterogeneity contributes to 
the inflation of the cluster effect estimates. The only exception was at cluster type 3 on 
MP1, in which all of the cases were in the fast growth group (i.e., 1 class); bias in estimates 
was negligible for both mis-specified and true models in this condition.  
 These findings suggest that the potential for bias in estimating a teacher's effect 
(represented by cluster effect estimates) is greatest in the following conditions: 
1. Higher overall variation, between cluster types, in terms of mixture 
proportion (MP=2). 
2. Smaller proportion of cases in the fast growth group (cluster type 1 in MP2 
and 3). 
3. Higher overall variation in the mixture proportion (MP3). 
The effect of model misspecification on the bias in estimating cluster effect is 
similar; namely, the magnitude of bias with zero TE was influenced by the overall 
relative proportions of fast and slow growth groups, especially for the mis-specified 
model.  Unlike in the true model conditions, in the misspecification condition, all cases in 
the slow growth group had zero TEs and these mis-specified models included all cases in 
their estimation, whereas the true models attempted to separate the fast and slow growth 
groups during estimation.  
 The true model reduced bias much better than the mis-specified model, implying 
that the effects of model misspecification would be greatest in a school district having 
schools with a wide range of performances and/or classes within a school encompassing a 
wide performance range. For instance, the highest magnitude of bias would occur in a 
classroom with the smallest proportion of fast growth students within a school that also 
has a small proportion of fast growth students. This might represent a heterogeneous 
urban school district. All teachers, good or bad, would be most affected if evaluated in 
the context of schools with few fast growers, but bad teachers would be more negatively 
affected (i.e., further lowering the negative cluster effect estimates) than good teachers 
being evaluated in schools with many fast growers. This situation would likely occur 
within more heterogeneous urban school districts. The evaluation of teachers in more 
homogeneous suburban school districts, where the majority of students belong to the fast 
growth group and where most schools in the district maintain a high achievement level, 
would yield the least biased estimates of teacher effect, even with a mis-specified VAM. 
6.3.2 Cluster Effect Conditions 2 and 3 (CE=2 and 3) 
 Cluster effect conditions 2 and 3 were designed to evaluate the potential 
systematic biases comparing the same cluster effects in the different mixture proportions. 
CE2 and CE3 had the same overall cluster effects (i.e., -1, -0.5, 0, 0.5, and 1) but the 
assignment of effects was reversed between cluster types 1 and 3, while cluster type 2 
had zero effects on both CE conditions. The cluster types were defined based on the 
mixture proportion. Therefore, no differences in bias were observed comparing CE2 and 
CE3 for mixture proportion conditions MP3 and MP4.   
The positive TEs were more strongly associated with differential mixture 
proportions, in terms of bias resulting from conditions CE2 and CE3 with both MP1 and 
MP2. The positive bias was most pronounced when the fast growth group had the 
smallest proportion. The negative bias on the negative TEs was less pronounced when the 
overall variation in mixture proportion was low (MP1). The magnitude of this negative 
bias was fairly low in estimates derived from both true and mis-specified models. 
However, this was not observed when the variation in mixture proportion was high 
(MP2), as outlined in the preceding section.  
There was significant negative bias on the negative TEs regardless of the 
proportion of cases in the fast growth group (compare TE -1 and -0.5 in MP2 between 
CE2 and CE3 in Figure 11).  The inclusion of lower proportions of the fast growth group 
greatly increased positive bias on the positive TEs, especially for MP2. Comparison of 
MP3 and MP4 profiles in Figure 11 confirms that the lower proportion of cases in the fast 
growth group was the source of this bias. 
The pattern of bias on the cluster effect estimates across conditions was 
informative: 
1. Low overall variation in growth profile (MP1) conferred the least penalty on 
the cluster effect estimates (i.e., a low magnitude of decrease in the cluster 
effect estimates) when combined with negative TEs, regardless of the actual 
proportion of cases in the fast (or either) growth group (cluster type 1 & 3), 
but the bias on positive TEs was greatly increased (i.e., higher magnitude of 
parameter change than for negative TE) when the proportion of fast growth 
cases was low (MP1, CE3, and cluster type 1). 
2. High overall variation in growth group proportions (MP2) led to strong 
negative bias for negative TEs (i.e., a higher magnitude of decrease in the 
cluster effect estimates), compared to MP1. The positive bias on the cluster 
effect estimates for positive TEs on MP2 had the same pattern as MP1. 
3. Overall bias was greatest in MP2 and CE3 conditions, where the variation in 
the sample and among cluster types was the highest, exaggerating 
overestimation of a negative effect of poor teachers and overestimation of 
positive effect for good teachers. Greater variation in the mixture proportion 
increased bias. 
4. The potential for unfairness, if VAM without accounting for student-level 
heterogeneity is employed to estimate teacher effect, is very high due to the 
tendency for increasing student-level heterogeneity to lead to overestimation 
of positive effects for good teachers and overestimation of a negative effect of 
poor teachers. 
The true models yielded estimates that were less biased than those of the 
mis-specified models, but the magnitudes of bias were greater for CE2 and 3 conditions than 
were those from CE1 conditions. If the TEs from conditions similar to CE2 and CE3 
existed when a given teacher was being evaluated using VAM (by MLGMM), then there 
is a strong likelihood of significantly overestimating teachers' positive effects, while 
overestimating teachers' negative effects to a lesser degree. This could occur in a district with a 
wide range of students in terms of growth profiles, including low-starting, fast growth 
students in low performing schools and high-starting, fast growth students in high 
performing schools, or in a single school with these characteristics in the classrooms. 
These mixtures will create bias in evaluation that heavily favors, and also inflates the 
effects of, good performing teachers. They do much better, in a sense, at identifying poor 
performing teachers by overestimating the negative effects of poorer performing teachers. 
Differential bias that depends on the actual capability of teachers cannot provide fair 
evaluations. In cases where a class has fewer students in the fast growth group, the VAM 
approach will strongly favor teachers with a positive effect and will severely penalize 
those teachers with negative effects.  
 In addition to having significant implications for the fairness of decision-making 
and policy based on VAM results, these results can also affect the choices that teachers 
make: they might feel that schools with higher proportions of fast growing students are 
the only contexts in which they have a chance of being evaluated fairly. The issue of 
fairness, and its perception, in evaluation affects all parties in these decisions. 
       
 
6.3.3  Cluster Effect Conditions 4 and 5 (CE=4 and 5) 
Cluster effect conditions 4 and 5 were included to assess the impact of overall 
variability in the sample on the cluster effect estimates. As described in Chapter 3, the 
cluster effects were randomly generated from a normal distribution with a mean of 0 
and a condition-specific variance (i.e., 0.5 for CE4 and 1.0 for CE5).  
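 A minimal sketch of this kind of random cluster effect generation is shown below. It takes the stated values (0.5 and 1.0) to be variances, so the standard deviation passed to the generator is their square root; the function name, the seed, and the choice of 90 clusters are illustrative assumptions and do not reproduce the simulation code.

```python
import math
import random

def draw_cluster_effects(n_clusters, variance, seed=None):
    """Draw cluster effects from N(0, variance) using a seeded generator."""
    rng = random.Random(seed)
    sd = math.sqrt(variance)  # random.gauss expects a standard deviation
    return [rng.gauss(0.0, sd) for _ in range(n_clusters)]

ce4_effects = draw_cluster_effects(90, variance=0.5, seed=1)
ce5_effects = draw_cluster_effects(90, variance=1.0, seed=1)

# With a shared seed the CE5 draws are the CE4 draws scaled by sqrt(2),
# so the two conditions differ only in the spread of the true effects.
print(ce4_effects[:3])
print(ce5_effects[:3])
```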
 The bias in the cluster effect estimates was quite limited for CE4 and 5 
conditions, and this was observed for both the true and mis-specified models. The effects 
of MP, CS, and CN were also very limited in these conditions. By contrast, increased 
variance on the TE was the only manipulated feature that actually increased the bias 
derived from the true model in a meaningful way.  
Interestingly, the BIC criterion only functioned as expected for CE4 and 5 
conditions when the sample size was large (e.g., CN=90 and CS=40) and on MP3 or 4 
(i.e., constant mixture proportion). A constant mixture proportion was used in the 
reviewed research by Nylund et al. (2007), Muthén and Asparouhov (2009), and Chen et 
al. (2010). This might account for the strong performance of BIC reported in those 
analyses, and explain why BIC performed so poorly in virtually all of the analyses 
reported here (as discussed above in Section 6.2). 
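For readers less familiar with how BIC-based selection operates, the following is a minimal sketch under stated assumptions. It uses the standard definition BIC = -2 log L + p log n; the log-likelihoods and parameter counts are hypothetical placeholders rather than values from this study, and taking n as the total number of cases (CN x CS) is an assumption made only for illustration.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; smaller values indicate the preferred model."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical fit summaries: (log-likelihood, number of free parameters).
candidates = {
    "one-class": (-5230.4, 9),
    "two-class": (-5205.1, 14),
}

# Assumed sample size: 90 clusters of 40 cases each (CN=90, CS=40).
n_obs = 90 * 40

scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
selected = min(scores, key=scores.get)
print(scores, selected)
```

Because the penalty term grows with both the number of parameters and log n, a model with additional latent classes is selected only when its gain in log-likelihood outweighs that penalty.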
6.3.4 Precision of estimates 
As expected, the precision of estimates was greater whenever the number of cases 
in the condition was higher. This finding is important for the teacher evaluation example 
described in Chapter 1 because of the potential for this same higher precision to also arise 
when the estimate is biased. The estimates from the mis-specified models tended to be 
biased, substantially in some cases (see Figures 9-11), which only serves to compound 
the problems associated with the mis-specified model. Additionally, because the 
precision of mis-specified models tends to improve with larger sample sizes, just as 
that of true models does, undetected model misspecification will lend unwarranted 
confidence to the biased estimates.  
However, if the cluster effects (e.g., teacher's effect) within a cluster unit (e.g., 
school district) are similar to CE4 and CE5 (i.e., normally distributed around zero) with 
equal proportions of students in the growth profiles, then the mis-specified model 
performs as well as or better than the true model. As discussed in Section 6.5 below, 
the zero cluster effects assigned to the slow growth group could have reduced the bias in 
the cluster effect estimates, particularly for the mis-specified model. The samples in CE4 
and CE5 had greater numbers of cases with a cluster effect of zero, which acts to further 
reduce the variance of overall cluster effects in these conditions. This combination of 
variance "shrinkage" effects explains the reduction in bias. 
6.4 Results on classification accuracy 
 Classification accuracy at the quintile level was included as an alternative 
measure of bias and precision of cluster effects because it takes both bias and precision of 
cluster effect estimates into account (analyzed by kappa) to summarize model 
performance. This study found that the true and mis-specified models each performed 
better in specific simulation conditions; the true model outperformed the mis-specified 
model (in terms of bias and precision) in CE1 conditions, but the true model performed 
poorly in terms of bias on CE4 and 5 conditions, most likely due to the condition where 
the mean of TEs for the fast growth group and the effect of slow growth group were both 
zero. The classification method would be particularly useful when evaluating the 
effect of a certain criterion, such as the threshold or cutoff for being classified as 
proficient or non-proficient, which was beyond the scope of this research. 
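The quintile-level agreement described in this section can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: quintiles are formed by simple ranking, kappa is taken to be the unweighted Cohen's kappa, and the function names and the ten true and estimated effects are hypothetical.

```python
def quintile_labels(values):
    """Assign each value a quintile label 1..5 from its rank (ties broken by position)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * 5 // n + 1
    return labels


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equal-length sequences of category labels."""
    n = len(a)
    categories = set(a) | set(b)
    observed = sum(x == y for x, y in zip(a, b)) / n
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one category is used
    return (observed - expected) / (1.0 - expected)


# Hypothetical true and estimated cluster effects for ten clusters.
true_effects = [-1.0, -0.8, -0.5, -0.3, -0.1, 0.0, 0.2, 0.5, 0.7, 1.0]
estimates = [-1.3, -0.9, -0.2, -0.4, 0.3, 0.1, -0.05, 0.6, 1.1, 0.9]

kappa = cohen_kappa(quintile_labels(true_effects), quintile_labels(estimates))
print(kappa)
```

Because agreement is assessed on quintile membership rather than on the estimates themselves, both systematic bias and imprecision lower kappa only to the extent that they move a cluster across a quintile boundary.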
6.5 Limitations of the research 
 This study was designed to address specific questions as outlined earlier. 
Simulation projects require fixed characteristics, and as such, these led to several 
limitations. One such limitation is the use of only two growth profiles. This might be 
more realistic than assuming homogeneous growth within a cluster, but it is far more 
likely that there are more than just two growth profiles in any classroom or school. A 
related challenge was that no latent classes were included to represent the cluster level 
(e.g., between-level or teacher's level), where interactions between individuals and 
teachers are very likely. Further, some mixture proportions were unrealistic (i.e., MP3 
and MP4) because they represent homogeneous growth within clusters; these conditions 
were needed in order to contextualize these results with those published previously. The 
mixture proportions used for MP1 and MP2 might not reflect reality either, but they do 
represent the assumption that there is variation in these growth class proportions (i.e., 
proportions of students in each growth profile) within a given cluster, and that this 
variation is unlikely to be consistent across all clusters in a given modeling situation. The 
results do suggest that variation in those proportions has a significant impact on 
estimation and thereby, on decision-making that might be based on those teacher effect 
estimates. Future studies could explore whether a wider, more realistic, range of variation 
in growth class proportions yields a clearer picture of this impact and possible ways of 
addressing it in simulations.  
       
 The impact of higher proportions of slow growth group members, which had zero 
cluster effect, was especially apparent in simulation conditions CE4 and CE5 and was not 
expected. The use of positive mean true random effects (i.e., N(1,1)) or a negative 
non-zero mean true effect to represent the slow growth group could potentially alleviate the 
issue encountered for CE4 and CE5 (see Section 6.3.3), and could more clearly demonstrate 
the differences in inferences that are supported by the true and mis-specified model 
conditions. An option for realizing these features, while not causing the issues described, 
is to center the true effect of the fast growth group for CE1 through CE3 on a positive 
value (e.g., 1) instead of zero, which was the value used in this study.  In spite of these 
limitations, the conclusions outlined at the start of the chapter support a general argument 
about the impact of using MLGMM over MLLGM. Coupled with the results and lessons 
learned from the zero cluster effect characteristics, this research could be a useful guide 
for further investigations, as well as applications, utilizing MLGMM. 
 An issue of inferential robustness, when alternative models with similar model fit 
(i.e., information criteria select alternative models instead of the true model) lead to 
different interpretations or conclusions, has not been fully addressed in this dissertation. The 
simulation conditions of this study were designed to minimize the influence of 
uncontrolled effects to avoid this issue. However, in more complex real-life data, it is 
extremely important to carefully investigate alternative models in order to make a valid 
interpretation of results. 
Finally, although this study was designed to investigate the impact of unmodeled 
heterogeneity at the classroom level on the potential for fair VAM-derived teacher 
evaluations, the greatest challenge to fair decision-making that is based on teacher 
effects (or value-added effects by teachers) is not the actual values of these estimates, but 
rather the distinction between proficient and not-proficient teachers, a two-level 
classification. The simulation, and therefore the results, do not speak to that two-level 
situation, but the finding that teachers with more positive and more negative cluster 
effects will actually generate differentially biased estimates suggests that any 
proficient/non-proficient classification will require very careful attention to the 
"non-proficient" characterization. Further, the estimation of changes in teacher effects would 
be critical, because these results suggest that "improvement" in teacher effect would be 
more easily recognizable in better teachers and would be more difficult to recognize in 
those who may need, or indeed may be struggling, to improve the most. 
6.6 Future directions 
 This research was limited in scope, but it achieved the stated goal of providing 
evidence supporting the use, and interpretability, of MLGMM as a tool to control bias 
and improve fairness in the evaluation of teacher effects in value-added modeling 
contexts. In addition to different approaches to address the limitations outlined above, 
future work in this domain should test the effect of unequal cluster sizes in the estimation 
and identification of MLGMM. Unequal cluster sizes, together with a more realistic 
variety in the latent growth profiles, have not been studied and would represent a greater 
range of real-world conditions in which MLGMM should be tested. Introducing 
between-level latent classes to incorporate, or explore, interactions between the 
between-level (e.g., teachers) and the within-level (e.g., students) latent classes might be 
a useful project, depending on the type of evaluations that are of interest (and on the 
emphasis on potential sources of the value that is believed to have been added in 
decision-making). Studying the effects of different growth profile parameters, including 
the shape and rate of growth and the number of growth profiles, could also strengthen the 
estimation, applicability, and interpretability of cluster effects that are estimated with 
MLGMM. At some point, estimation of change in teacher effect will become a very 
important topic, possibly supporting the proficient/not-proficient classification based on 
VAM estimates, as long as the bias is controlled and is no longer differential depending 
on whether the teacher is stronger or weaker. In sum, this study supports the continued 
exploration of MLGMM for fair decision-making in educational contexts. 
  
  
       
Appendix A 
Estimation Results for All Simulation Conditions 
 
 
 
       
Appendix A.1: Bias and Error of group estimates: Mixture Proportion 1 (MP1) and 
Cluster Effect 1 (CE1) for the true model 
 
   Cluster Type 
   1 2 3 
   Bias Error Bias Error Bias Error
Cluster Number  Cluster Size  Cluster Effect  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean
30 20 -1 -0.29 -0.56 0.08 0.21 -0.19 -0.56 0.15 0.2 -0.12 -0.46 0.18 0.2 
  -0.5 -0.12 -0.56 0.18 0.23 -0.07 -0.42 0.24 0.19 -0.09 -0.42 0.19 0.19 
  0 0.14 -0.76 0.91 0.51 0.07 -0.84 1 0.54 -0.05 -0.37 0.25 0.19 
  0.5 0.15 -0.19 0.55 0.21 0.06 -0.22 0.34 0.18 -0.01 -0.28 0.25 0.18 
  1 0.28 -0.06 0.63 0.22 0.17 -0.17 0.5 0.21 0.05 -0.21 0.29 0.16 
 40 -1 -0.23 -0.48 -0.03 0.15 -0.17 -0.41 0.09 0.15 -0.13 -0.31 0.1 0.12 
  -0.5 -0.1 -0.34 0.14 0.14 -0.08 -0.36 0.16 0.15 -0.08 -0.32 0.14 0.14 
  0 -0.1 -0.9 0.89 0.56 0.15 -0.89 1.03 0.64 -0.04 -0.28 0.18 0.14 
  0.5 0.17 -0.1 0.36 0.14 0.02 -0.18 0.27 0.14 -0.01 -0.24 0.22 0.15 
  1 0.28 -0.01 0.63 0.19 0.11 -0.15 0.38 0.15 0.01 -0.25 0.27 0.14 
60 20 -1 -0.28 -0.64 0.01 0.22 -0.2 -0.5 0.1 0.19 -0.17 -0.51 0.13 0.19 
  -0.5 -0.13 -0.43 0.22 0.2 -0.14 -0.45 0.2 0.2 -0.09 -0.41 0.22 0.19 
  0 0.04 -0.92 0.9 0.55 0.2 -0.57 0.91 0.49 -0.03 -0.33 0.24 0.18 
  0.5 0.14 -0.2 0.49 0.22 0.04 -0.29 0.37 0.2 0 -0.33 0.29 0.2 
  1 0.25 -0.04 0.59 0.21 0.11 -0.23 0.43 0.19 0.04 -0.27 0.36 0.18 
 40 -1 -0.22 -0.48 0.09 0.16 -0.16 -0.44 0.07 0.14 -0.09 -0.34 0.14 0.14 
  -0.5 -0.1 -0.33 0.16 0.15 -0.08 -0.25 0.13 0.11 -0.1 -0.32 0.08 0.13 
  0 -0.12 -0.94 0.97 0.59 0.14 -0.9 0.93 0.58 -0.06 -0.27 0.13 0.13 
  0.5 0.15 -0.12 0.4 0.15 0.03 -0.23 0.3 0.15 -0.02 -0.22 0.16 0.12 
  1 0.25 -0.11 0.57 0.2 0.07 -0.19 0.33 0.15 0.01 -0.19 0.26 0.14 
90 20 -1 -0.26 -0.65 0.13 0.24 -0.17 -0.5 0.18 0.2 -0.14 -0.45 0.14 0.18 
  -0.5 -0.16 -0.59 0.2 0.23 -0.09 -0.34 0.23 0.18 -0.09 -0.4 0.14 0.18 
  0 0.1 -0.93 0.95 0.6 0.29 -0.54 1.02 0.51 -0.1 -0.37 0.2 0.2 
  0.5 0.16 -0.18 0.53 0.23 0.07 -0.21 0.37 0.18 0.02 -0.31 0.27 0.18 
  1 0.25 -0.14 0.59 0.23 0.11 -0.23 0.41 0.2 0.04 -0.25 0.29 0.18 
 40 -1 -0.27 -0.5 0 0.15 -0.15 -0.46 0.16 0.16 -0.11 -0.34 0.11 0.14 
  -0.5 -0.08 -0.36 0.18 0.16 -0.09 -0.29 0.18 0.14 -0.08 -0.28 0.1 0.13 
  0 -0.14 -0.88 0.74 0.52 0.08 -1.03 0.99 0.64 -0.06 -0.29 0.16 0.13 
  0.5 0.17 -0.12 0.42 0.17 0.01 -0.24 0.27 0.15 -0.03 -0.31 0.18 0.15 
  1 0.28 -0.01 0.56 0.19 0.08 -0.17 0.33 0.14 -0.02 -0.23 0.19 0.12 
       
Appendix A.2: Bias and Error of group estimates: Mixture Proportion 2 (MP2) and 
Cluster Effect 1 (CE1) for the true model 
 
   Cluster Type 
   1 2 3 
   Bias Error Bias Error Bias Error
Cluster Number  Cluster Size  Cluster Effect  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean
30 20 -1 -0.26 -0.65 0.13 0.24 -0.17 -0.5 0.18 0.2 -0.14 -0.45 0.14 0.18 
  -0.5 -0.16 -0.59 0.2 0.23 -0.09 -0.34 0.23 0.18 -0.09 -0.4 0.14 0.18 
  0 0.1 -0.93 0.95 0.6 0.29 -0.54 1.02 0.51 -0.1 -0.37 0.2 0.2 
  0.5 0.16 -0.18 0.53 0.23 0.07 -0.21 0.37 0.18 0.02 -0.31 0.27 0.18 
  1 0.25 -0.14 0.59 0.23 0.11 -0.23 0.41 0.2 0.04 -0.25 0.29 0.18 
 40 -1 -0.27 -0.5 0 0.15 -0.15 -0.46 0.16 0.16 -0.11 -0.34 0.11 0.14 
  -0.5 -0.08 -0.36 0.18 0.16 -0.09 -0.29 0.18 0.14 -0.08 -0.28 0.1 0.13 
  0 -0.14 -0.88 0.74 0.52 0.08 -1.03 0.99 0.64 -0.06 -0.29 0.16 0.13 
  0.5 0.17 -0.12 0.42 0.17 0.01 -0.24 0.27 0.15 -0.03 -0.31 0.18 0.15 
  1 0.28 -0.01 0.56 0.19 0.08 -0.17 0.33 0.14 -0.02 -0.23 0.19 0.12 
60 20 -1 -0.46 -0.83 -0.07 0.23 -0.31 -0.72 0.07 0.25 -0.22 -0.58 0.13 0.22 
  -0.5 -0.23 -0.6 0.12 0.22 -0.17 -0.55 0.2 0.23 -0.17 -0.42 0.1 0.16 
  0 0.12 -0.54 0.74 0.4 0.12 -0.91 0.94 0.5 0.18 -0.5 0.91 0.43 
  0.5 0.31 -0.13 0.71 0.24 0.1 -0.31 0.39 0.23 0.03 -0.27 0.33 0.19 
  1 0.53 0.03 0.93 0.28 0.29 -0.04 0.71 0.24 0.11 -0.19 0.46 0.2 
 40 -1 -0.36 -0.68 -0.01 0.2 -0.27 -0.54 0.01 0.16 -0.2 -0.47 0.11 0.17 
  -0.5 -0.17 -0.51 0.1 0.18 -0.13 -0.42 0.11 0.16 -0.14 -0.37 0.14 0.16 
  0 0.2 -0.59 0.94 0.46 0.19 -0.76 0.84 0.52 0.2 -0.58 1 0.5 
  0.5 0.28 -0.06 0.54 0.18 0.09 -0.23 0.38 0.2 -0.01 -0.26 0.24 0.14 
  1 0.5 0.05 0.91 0.25 0.21 -0.13 0.48 0.18 0.05 -0.19 0.28 0.16 
90 20 -1 -0.43 -0.79 0 0.24 -0.33 -0.65 0.22 0.24 -0.25 -0.58 0.03 0.2 
  -0.5 -0.18 -0.51 0.2 0.23 -0.16 -0.53 0.23 0.22 -0.17 -0.49 0.21 0.21 
  0 0.21 -0.49 1 0.43 0.21 -0.67 0.82 0.44 0.16 -0.38 0.89 0.42 
  0.5 0.29 -0.14 0.71 0.26 0.14 -0.22 0.58 0.22 0.05 -0.22 0.34 0.17 
  1 0.45 0 0.78 0.25 0.24 -0.06 0.56 0.19 0.12 -0.19 0.42 0.2 
 40 -1 -0.4 -0.68 -0.04 0.2 -0.27 -0.61 0.01 0.19 -0.2 -0.47 0.07 0.16 
  -0.5 -0.18 -0.41 0.13 0.17 -0.14 -0.39 0.16 0.17 -0.13 -0.39 0.14 0.17 
  0 0.16 -0.69 0.8 0.47 0.22 -0.85 0.97 0.54 0.19 -0.48 0.83 0.44 
  0.5 0.29 0.01 0.59 0.18 0.1 -0.17 0.39 0.17 -0.03 -0.3 0.22 0.16 
  1 0.46 0.13 0.81 0.21 0.17 -0.14 0.5 0.19 0.03 -0.24 0.34 0.17 
       
Appendix A.3: Bias and Error of group estimates: Mixture Proportion 3 (MP3) and 
Cluster Effect 1 (CE1) for the true model 
 
   Cluster Type 
   1 2 3 
   Bias Error Bias Error Bias Error
Cluster Number  Cluster Size  Cluster Effect  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean
30 20 -1 -0.43 -0.79 0 0.24 -0.33 -0.65 0.22 0.24 -0.25 -0.58 0.03 0.2 
  -0.5 -0.18 -0.51 0.2 0.23 -0.16 -0.53 0.23 0.22 -0.17 -0.49 0.21 0.21 
  0 0.21 -0.49 1 0.43 0.21 -0.67 0.82 0.44 0.16 -0.38 0.89 0.42 
  0.5 0.29 -0.14 0.71 0.26 0.14 -0.22 0.58 0.22 0.05 -0.22 0.34 0.17 
  1 0.45 0 0.78 0.25 0.24 -0.06 0.56 0.19 0.12 -0.19 0.42 0.2 
 40 -1 -0.4 -0.68 -0.04 0.2 -0.27 -0.61 0.01 0.19 -0.2 -0.47 0.07 0.16 
  -0.5 -0.18 -0.41 0.13 0.17 -0.14 -0.39 0.16 0.17 -0.13 -0.39 0.14 0.17 
  0 0.16 -0.69 0.8 0.47 0.22 -0.85 0.97 0.54 0.19 -0.48 0.83 0.44 
  0.5 0.29 0.01 0.59 0.18 0.1 -0.17 0.39 0.17 -0.03 -0.3 0.22 0.16 
  1 0.46 0.13 0.81 0.21 0.17 -0.14 0.5 0.19 0.03 -0.24 0.34 0.17 
60 20 -1 -0.47 -0.96 -0.04 0.26 -0.32 -0.67 0.05 0.22 -0.24 -0.59 0.12 0.23 
  -0.5 -0.21 -0.66 0.26 0.27 -0.18 -0.56 0.22 0.22 -0.15 -0.5 0.24 0.22 
  0 0.22 -0.49 0.79 0.39 0.28 -0.49 0.93 0.45 0.17 -0.56 0.85 0.42 
  0.5 0.29 -0.2 0.67 0.27 0.09 -0.28 0.51 0.25 0.05 -0.28 0.39 0.2 
  1 0.48 0.01 0.93 0.27 0.25 -0.13 0.56 0.22 0.12 -0.27 0.51 0.23 
 40 -1 -0.4 -0.71 -0.07 0.2 -0.26 -0.5 0.06 0.17 -0.17 -0.43 0.1 0.16 
  -0.5 -0.18 -0.42 0.1 0.15 -0.15 -0.41 0.12 0.16 -0.15 -0.43 0.09 0.16 
  0 0.14 -0.8 0.8 0.49 0.27 -0.92 0.9 0.53 0.3 -0.37 0.95 0.46 
  0.5 0.24 -0.08 0.52 0.19 0.06 -0.18 0.29 0.15 -0.03 -0.31 0.24 0.16 
  1 0.44 0.07 0.81 0.23 0.14 -0.16 0.44 0.19 0.06 -0.19 0.34 0.15 
90 20 -1 -0.29 -0.64 0.05 0.21 -0.32 -0.72 0.14 0.24 -0.32 -0.66 0.08 0.22 
  -0.5 -0.19 -0.57 0.18 0.22 -0.15 -0.48 0.23 0.22 -0.18 -0.51 0.2 0.23 
  0 0.13 -0.63 0.78 0.45 0.12 -0.59 0.81 0.44 0.16 -0.52 0.76 0.45 
  0.5 0.12 -0.31 0.49 0.23 0.12 -0.21 0.49 0.22 0.08 -0.34 0.41 0.22 
  1 0.31 -0.07 0.66 0.22 0.26 -0.15 0.59 0.22 0.28 -0.14 0.59 0.23 
 40 -1 -0.24 -0.54 0 0.17 -0.24 -0.56 0.09 0.19 -0.25 -0.53 0.04 0.17 
  -0.5 -0.11 -0.42 0.2 0.19 -0.13 -0.37 0.13 0.15 -0.11 -0.36 0.16 0.16 
  0 0.15 -0.84 0.88 0.53 0.18 -0.68 0.94 0.54 0.16 -0.9 0.92 0.54 
  0.5 0.09 -0.2 0.35 0.17 0.11 -0.21 0.42 0.18 0.08 -0.19 0.34 0.16 
  1 0.19 -0.08 0.44 0.17 0.19 -0.1 0.52 0.19 0.18 -0.11 0.5 0.19 
       
Appendix A.4: Bias and Error of group estimates: Mixture Proportion 4 (MP4) and 
Cluster Effect 1 (CE1) for the true model 
 
   Cluster Type 
   1 2 3 
   Bias Error Bias Error Bias Error
Cluster Number  Cluster Size  Cluster Effect  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean
30 20 -1 -0.29 -0.64 0.05 0.21 -0.32 -0.72 0.14 0.24 -0.32 -0.66 0.08 0.22 
  -0.5 -0.19 -0.57 0.18 0.22 -0.15 -0.48 0.23 0.22 -0.18 -0.51 0.2 0.23 
  0 0.13 -0.63 0.78 0.45 0.12 -0.59 0.81 0.44 0.16 -0.52 0.76 0.45 
  0.5 0.12 -0.31 0.49 0.23 0.12 -0.21 0.49 0.22 0.08 -0.34 0.41 0.22 
  1 0.31 -0.07 0.66 0.22 0.26 -0.15 0.59 0.22 0.28 -0.14 0.59 0.23 
 40 -1 -0.24 -0.54 0 0.17 -0.24 -0.56 0.09 0.19 -0.25 -0.53 0.04 0.17 
  -0.5 -0.11 -0.42 0.2 0.19 -0.13 -0.37 0.13 0.15 -0.11 -0.36 0.16 0.16 
  0 0.15 -0.84 0.88 0.53 0.18 -0.68 0.94 0.54 0.16 -0.9 0.92 0.54 
  0.5 0.09 -0.2 0.35 0.17 0.11 -0.21 0.42 0.18 0.08 -0.19 0.34 0.16 
  1 0.19 -0.08 0.44 0.17 0.19 -0.1 0.52 0.19 0.18 -0.11 0.5 0.19 
60 20 -1 -0.26 -0.68 0.12 0.24 -0.33 -0.72 0.12 0.24 -0.3 -0.67 0.15 0.26 
  -0.5 -0.15 -0.49 0.34 0.25 -0.15 -0.54 0.31 0.26 -0.12 -0.49 0.27 0.22 
  0 0.07 -0.76 0.86 0.48 0.16 -0.67 0.86 0.46 0.3 -0.46 0.89 0.4 
  0.5 0.12 -0.21 0.45 0.2 0.12 -0.25 0.47 0.22 0.1 -0.34 0.45 0.24 
  1 0.2 -0.2 0.58 0.24 0.25 -0.15 0.72 0.26 0.27 -0.09 0.64 0.22 
 40 -1 -0.23 -0.5 0.08 0.18 -0.23 -0.48 0.02 0.16 -0.21 -0.5 0.09 0.18 
  -0.5 -0.14 -0.44 0.15 0.19 -0.12 -0.36 0.16 0.16 -0.11 -0.42 0.17 0.16 
  0 0.2 -0.76 0.95 0.54 0.29 -0.74 0.95 0.54 0.27 -0.49 0.96 0.49 
  0.5 0.08 -0.21 0.37 0.18 0.09 -0.15 0.32 0.15 0.09 -0.19 0.34 0.16 
  1 0.17 -0.08 0.49 0.17 0.18 -0.16 0.48 0.19 0.18 -0.09 0.44 0.16 
90 20 -1 -0.32 -0.72 0.1 0.26 -0.35 -0.71 -0.04 0.23 -0.32 -0.7 0.05 0.23 
  -0.5 -0.16 -0.5 0.2 0.24 -0.16 -0.53 0.18 0.21 -0.16 -0.54 0.19 0.22 
  0 0.13 -0.76 0.81 0.48 0.24 -0.69 0.86 0.44 0.21 -0.64 0.81 0.43 
  0.5 0.12 -0.3 0.53 0.24 0.11 -0.25 0.48 0.21 0.09 -0.24 0.49 0.22 
  1 0.24 -0.13 0.66 0.24 0.22 -0.18 0.65 0.24 0.22 -0.17 0.55 0.21 
 40 -1 -0.21 -0.56 0.1 0.2 -0.21 -0.46 0.11 0.18 -0.24 -0.53 0.02 0.17 
  -0.5 -0.14 -0.44 0.22 0.19 -0.11 -0.4 0.22 0.19 -0.11 -0.41 0.16 0.17 
  0 0.2 -0.85 1 0.54 0.26 -0.73 1.05 0.53 0.33 -0.66 0.93 0.49 
  0.5 0.06 -0.23 0.34 0.18 0.04 -0.27 0.31 0.18 0.05 -0.26 0.37 0.18 
  1 0.13 -0.21 0.41 0.19 0.14 -0.13 0.42 0.18 0.13 -0.18 0.44 0.17 
       
Appendix A.5: Bias and Error of group estimates: Mixture Proportion 1 (MP1) and 
Cluster Effect 2 (CE2) for the true model 
 
   Cluster Type 
   1 2 3 
   Bias Error Bias Error Bias Error 
Cluster Number  Cluster Size  Cluster Effect  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean
30 20 -1 -0.21 -0.56 0.18 0.22 0 0 0 0 0 0 0 0 
  -0.5 -0.09 -0.48 0.28 0.23 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.01 -0.3 0.35 0.2 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.03 -0.26 0.26 0.16 
  1 0 0 0 0 0 0 0 0 0.14 -0.16 0.41 0.19 
 40 -1 -0.16 -0.36 0.05 0.13 0 0 0 0 0 0 0 0 
  -0.5 -0.03 -0.28 0.24 0.16 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.04 -0.19 0.22 0.12 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.05 -0.17 0.31 0.16 
  1 0 0 0 0 0 0 0 0 0.05 -0.2 0.29 0.14 
60 20 -1 -0.19 -0.53 0.21 0.22 0 0 0 0 0 0 0 0 
  -0.5 -0.07 -0.43 0.24 0.21 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0 -0.33 0.26 0.2 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.07 -0.23 0.38 0.19 
  1 0 0 0 0 0 0 0 0 0.11 -0.2 0.41 0.2 
 40 -1 -0.2 -0.44 0.01 0.16 0 0 0 0 0 0 0 0 
  -0.5 -0.06 -0.29 0.2 0.14 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.03 -0.21 0.28 0.14 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.05 -0.16 0.26 0.13 
  1 0 0 0 0 0 0 0 0 0.08 -0.14 0.25 0.12 
90 20 -1 -0.19 -0.54 0.15 0.22 0 0 0 0 0 0 0 0 
  -0.5 -0.07 -0.42 0.29 0.21 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.01 -0.3 0.37 0.2 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.06 -0.25 0.35 0.19 
  1 0 0 0 0 0 0 0 0 0.13 -0.17 0.37 0.16 
 40 -1 -0.15 -0.41 0.12 0.16 0 0 0 0 0 0 0 0 
  -0.5 -0.05 -0.27 0.19 0.14 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.03 -0.24 0.31 0.16 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.03 -0.21 0.24 0.14 
  1 0 0 0 0 0 0 0 0 0.08 -0.15 0.3 0.14 
 
  
       
Appendix A.6: Bias and Error of group estimates: Mixture Proportion 2 (MP2) and 
Cluster Effect 2 (CE2) for the true model 
 
   Cluster Type 
   1 2 3 
   Bias Error Bias Error Bias Error 
Cluster Number  Cluster Size  Cluster Effect  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean
30 20 -1 -0.19 -0.54 0.15 0.22 0 0 0 0 0 0 0 0 
  -0.5 -0.07 -0.42 0.29 0.21 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.01 -0.3 0.37 0.2 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.06 -0.25 0.35 0.19 
  1 0 0 0 0 0 0 0 0 0.13 -0.17 0.37 0.16 
 40 -1 -0.15 -0.41 0.12 0.16 0 0 0 0 0 0 0 0 
  -0.5 -0.05 -0.27 0.19 0.14 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.03 -0.24 0.31 0.16 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.03 -0.21 0.24 0.14 
  1 0 0 0 0 0 0 0 0 0.08 -0.15 0.3 0.14 
60 20 -1 -0.47 -0.87 -0.03 0.27 0 0 0 0 0 0 0 0 
  -0.5 -0.12 -0.57 0.43 0.29 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.04 -0.38 0.45 0.25 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.08 -0.29 0.41 0.21 
  1 0 0 0 0 0 0 0 0 0.22 -0.12 0.62 0.22 
 40 -1 -0.35 -0.66 -0.02 0.19 0 0 0 0 0 0 0 0 
  -0.5 -0.07 -0.41 0.23 0.19 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.04 -0.3 0.27 0.19 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.06 -0.25 0.33 0.17 
  1 0 0 0 0 0 0 0 0 0.13 -0.12 0.37 0.14 
90 20 -1 -0.44 -0.89 -0.03 0.27 0 0 0 0 0 0 0 0 
  -0.5 -0.17 -0.49 0.22 0.24 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.03 -0.38 0.37 0.24 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.1 -0.26 0.48 0.21 
  1 0 0 0 0 0 0 0 0 0.21 -0.15 0.65 0.24 
 40 -1 -0.34 -0.69 -0.05 0.2 0 0 0 0 0 0 0 0 
  -0.5 -0.1 -0.36 0.22 0.18 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.03 -0.2 0.37 0.17 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.04 -0.23 0.31 0.17 
  1 0 0 0 0 0 0 0 0 0.07 -0.22 0.29 0.16 
 
  
       
Appendix A.7: Bias and Error of group estimates: Mixture Proportion 3 (MP3) and 
Cluster Effect 2 (CE2) for the true model 
 
   Cluster Type 
   1 2 3 
   Bias Error Bias Error Bias Error
Cluster Number  Cluster Size  Cluster Effect  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean
30 20 -1 -0.44 -0.89 -0.03 0.27 0 0 0 0 0 0 0 0 
  -0.5 -0.17 -0.49 0.22 0.24 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.03 -0.38 0.37 0.24 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.1 -0.26 0.48 0.21 
  1 0 0 0 0 0 0 0 0 0.21 -0.15 0.65 0.24 
 40 -1 -0.34 -0.69 -0.05 0.2 0 0 0 0 0 0 0 0 
  -0.5 -0.1 -0.36 0.22 0.18 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.03 -0.2 0.37 0.17 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.04 -0.23 0.31 0.17 
  1 0 0 0 0 0 0 0 0 0.07 -0.22 0.29 0.16 
60 20 -1 -0.44 -0.88 0.01 0.29 0 0 0 0 0 0 0 0 
  -0.5 -0.17 -0.62 0.29 0.27 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.03 -0.41 0.4 0.24 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.1 -0.33 0.5 0.23 
  1 0 0 0 0 0 0 0 0 0.21 -0.15 0.57 0.22 
 40 -1 -0.36 -0.67 -0.02 0.2 0 0 0 0 0 0 0 0 
  -0.5 -0.12 -0.47 0.2 0.2 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.05 -0.25 0.33 0.17 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.02 -0.23 0.32 0.16 
  1 0 0 0 0 0 0 0 0 0.12 -0.18 0.41 0.17 
90 20 -1 -0.35 -0.73 -0.01 0.22 0 0 0 0 0 0 0 0 
  -0.5 -0.18 -0.52 0.17 0.21 0 0 0 0 0 0 0 0 
  0 0 0 0 0 -0.05 -0.43 0.33 0.22 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.15 -0.21 0.54 0.24 
  1 0 0 0 0 0 0 0 0 0.26 -0.18 0.67 0.26 
 40 -1 -0.26 -0.56 0 0.17 0 0 0 0 0 0 0 0 
  -0.5 -0.13 -0.4 0.2 0.19 0 0 0 0 0 0 0 0 
  0 0 0 0 0 -0.03 -0.3 0.22 0.16 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.08 -0.19 0.3 0.16 
  1 0 0 0 0 0 0 0 0 0.15 -0.2 0.44 0.19 
 
  
       
Appendix A.8: Bias and Error of group estimates: Mixture Proportion 4 (MP4) and 
Cluster Effect 2 (CE2) for the true model 
 
   Cluster Type 
   1 2 3 
   Bias Error Bias Error Bias Error
Cluster Number  Cluster Size  Cluster Effect  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean
30 20 -1 -0.35 -0.73 -0.01 0.22 0 0 0 0 0 0 0 0 
  -0.5 -0.18 -0.52 0.17 0.21 0 0 0 0 0 0 0 0 
  0 0 0 0 0 -0.05 -0.43 0.33 0.22 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.15 -0.21 0.54 0.24 
  1 0 0 0 0 0 0 0 0 0.26 -0.18 0.67 0.26 
 40 -1 -0.26 -0.56 0 0.17 0 0 0 0 0 0 0 0 
  -0.5 -0.13 -0.4 0.2 0.19 0 0 0 0 0 0 0 0 
  0 0 0 0 0 -0.03 -0.3 0.22 0.16 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.08 -0.19 0.3 0.16 
  1 0 0 0 0 0 0 0 0 0.15 -0.2 0.44 0.19 
60 20 -1 -0.35 -0.85 0.08 0.25 0 0 0 0 0 0 0 0 
  -0.5 -0.16 -0.58 0.17 0.22 0 0 0 0 0 0 0 0 
  0 0 0 0 0 -0.04 -0.38 0.25 0.2 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.13 -0.24 0.49 0.21 
  1 0 0 0 0 0 0 0 0 0.23 -0.07 0.52 0.19 
 40 -1 -0.26 -0.56 0 0.18 0 0 0 0 0 0 0 0 
  -0.5 -0.12 -0.35 0.15 0.14 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.01 -0.33 0.31 0.19 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.08 -0.16 0.34 0.16 
  1 0 0 0 0 0 0 0 0 0.17 -0.12 0.44 0.18 
90 20 -1 -0.4 -0.85 -0.03 0.24 0 0 0 0 0 0 0 0 
  -0.5 -0.21 -0.56 0.16 0.22 0 0 0 0 0 0 0 0 
  0 0 0 0 0 -0.03 -0.47 0.33 0.25 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.12 -0.3 0.47 0.22 
  1 0 0 0 0 0 0 0 0 0.23 -0.13 0.6 0.23 
 40 -1 -0.22 -0.51 0.06 0.19 0 0 0 0 0 0 0 0 
  -0.5 -0.15 -0.46 0.1 0.17 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0 -0.28 0.28 0.18 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.05 -0.21 0.32 0.17 
  1 0 0 0 0 0 0 0 0 0.18 -0.08 0.48 0.17 
 
  
       
Appendix A.9: Bias and Error of group estimates: Mixture Proportion 1 (MP1) and 
Cluster Effect 3 (CE3) for the true model 
 
   Cluster Type 
   1 2 3 
   Bias Error Bias Error Bias Error
Cluster Number  Cluster Size  Cluster Effect  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean
30 20 -1 0 0 0 0 0 0 0 0 -0.26 -0.53 0.08 0.19 
  -0.5 0 0 0 0 0 0 0 0 -0.19 -0.48 0.15 0.19 
  0 0 0 0 0 -0.07 -0.38 0.23 0.18 0 0 0 0 
  0.5 0.09 -0.24 0.41 0.21 0 0 0 0 0 0 0 0 
  1 0.26 -0.14 0.76 0.27 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.22 -0.41 0 0.13 
  -0.5 0 0 0 0 0 0 0 0 -0.17 -0.37 0.04 0.13 
  0 0 0 0 0 -0.09 -0.3 0.11 0.13 0 0 0 0 
  0.5 0.1 -0.18 0.36 0.15 0 0 0 0 0 0 0 0 
  1 0.2 -0.08 0.48 0.17 0 0 0 0 0 0 0 0 
60 20 -1 0 0 0 0 0 0 0 0 -0.3 -0.62 0.02 0.2 
  -0.5 0 0 0 0 0 0 0 0 -0.19 -0.5 0.11 0.19 
  0 0 0 0 0 -0.09 -0.36 0.19 0.18 0 0 0 0 
  0.5 0.08 -0.25 0.43 0.21 0 0 0 0 0 0 0 0 
  1 0.21 -0.09 0.54 0.2 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.21 -0.43 -0.02 0.14 
  -0.5 0 0 0 0 0 0 0 0 -0.18 -0.38 0.04 0.13 
  0 0 0 0 0 -0.11 -0.35 0.13 0.14 0 0 0 0 
  0.5 0.07 -0.16 0.33 0.15 0 0 0 0 0 0 0 0 
  1 0.16 -0.06 0.43 0.15 0 0 0 0 0 0 0 0 
90 20 -1 0 0 0 0 0 0 0 0 -0.26 -0.56 0.01 0.18 
  -0.5 0 0 0 0 0 0 0 0 -0.23 -0.54 0.04 0.17 
  0 0 0 0 0 -0.11 -0.47 0.2 0.19 0 0 0 0 
  0.5 0.09 -0.29 0.49 0.23 0 0 0 0 0 0 0 0 
  1 0.19 -0.09 0.5 0.2 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.21 -0.43 0 0.14 
  -0.5 0 0 0 0 0 0 0 0 -0.2 -0.44 0.03 0.14 
  0 0 0 0 0 -0.13 -0.36 0.09 0.14 0 0 0 0 
  0.5 0.07 -0.19 0.29 0.14 0 0 0 0 0 0 0 0 
  1 0.18 -0.06 0.42 0.15 0 0 0 0 0 0 0 0 
 
  
       
Appendix A.10: Bias and Error of group estimates: Mixture Proportion 2 (MP2) and 
Cluster Effect 3 (CE3) for the true model 
 
   Cluster Type 
   1 2 3 
   Bias Error Bias Error Bias Error
Cluster Number  Cluster Size  Cluster Effect  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean
30 20 -1 0 0 0 0 0 0 0 0 -0.26 -0.56 0.01 0.18 
  -0.5 0 0 0 0 0 0 0 0 -0.23 -0.54 0.04 0.17 
  0 0 0 0 0 -0.11 -0.47 0.2 0.19 0 0 0 0 
  0.5 0.09 -0.29 0.49 0.23 0 0 0 0 0 0 0 0 
  1 0.19 -0.09 0.5 0.2 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.21 -0.43 0 0.14 
  -0.5 0 0 0 0 0 0 0 0 -0.2 -0.44 0.03 0.14 
  0 0 0 0 0 -0.13 -0.36 0.09 0.14 0 0 0 0 
  0.5 0.07 -0.19 0.29 0.14 0 0 0 0 0 0 0 0 
  1 0.18 -0.06 0.42 0.15 0 0 0 0 0 0 0 0 
60 20 -1 0 0 0 0 0 0 0 0 -0.44 -0.72 -0.07 0.2 
  -0.5 0 0 0 0 0 0 0 0 -0.26 -0.51 0.11 0.18 
  0 0 0 0 0 -0.06 -0.43 0.26 0.19 0 0 0 0 
  0.5 0.25 -0.1 0.57 0.21 0 0 0 0 0 0 0 0 
  1 0.52 0.01 0.87 0.26 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.36 -0.6 -0.11 0.17 
  -0.5 0 0 0 0 0 0 0 0 -0.26 -0.46 -0.03 0.14 
  0 0 0 0 0 -0.15 -0.4 0.1 0.15 0 0 0 0 
  0.5 0.26 -0.05 0.55 0.18 0 0 0 0 0 0 0 0 
  1 0.5 0.15 0.8 0.19 0 0 0 0 0 0 0 0 
90 20 -1 0 0 0 0 0 0 0 0 -0.43 -0.74 -0.05 0.2 
  -0.5 0 0 0 0 0 0 0 0 -0.28 -0.56 -0.03 0.17 
  0 0 0 0 0 -0.1 -0.4 0.2 0.17 0 0 0 0 
  0.5 0.24 -0.09 0.58 0.21 0 0 0 0 0 0 0 0 
  1 0.48 0.03 0.87 0.24 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.35 -0.68 -0.05 0.19 
  -0.5 0 0 0 0 0 0 0 0 -0.24 -0.45 0.03 0.15 
  0 0 0 0 0 -0.1 -0.38 0.2 0.16 0 0 0 0 
  0.5 0.22 -0.08 0.52 0.18 0 0 0 0 0 0 0 0 
  1 0.48 0.13 0.77 0.2 0 0 0 0 0 0 0 0 
 
  
       
Appendix A.11: Bias and Error of group estimates: Mixture Proportion 3 (MP3) and 
Cluster Effect 3 (CE3) for the true model 
 
   Cluster Type 
   1 2 3 
   Bias Error Bias Error Bias Error
Cluster Number  Cluster Size  Cluster Effect  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean
30 20 -1 0 0 0 0 0 0 0 0 -0.43 -0.74 -0.05 0.2 
  -0.5 0 0 0 0 0 0 0 0 -0.28 -0.56 -0.03 0.17 
  0 0 0 0 0 -0.1 -0.4 0.2 0.17 0 0 0 0 
  0.5 0.24 -0.09 0.58 0.21 0 0 0 0 0 0 0 0 
  1 0.48 0.03 0.87 0.24 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.35 -0.68 -0.05 0.19 
  -0.5 0 0 0 0 0 0 0 0 -0.24 -0.45 0.03 0.15 
  0 0 0 0 0 -0.1 -0.38 0.2 0.16 0 0 0 0 
  0.5 0.22 -0.08 0.52 0.18 0 0 0 0 0 0 0 0 
  1 0.48 0.13 0.77 0.2 0 0 0 0 0 0 0 0 
60 20 -1 0 0 0 0 0 0 0 0 -0.46 -0.78 -0.12 0.21 
  -0.5 0 0 0 0 0 0 0 0 -0.26 -0.52 0 0.18 
  0 0 0 0 0 -0.1 -0.42 0.27 0.21 0 0 0 0 
  0.5 0.25 -0.01 0.53 0.17 0 0 0 0 0 0 0 0 
  1 0.54 0.17 0.93 0.23 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.34 -0.58 -0.07 0.16 
  -0.5 0 0 0 0 0 0 0 0 -0.25 -0.45 -0.03 0.13 
  0 0 0 0 0 -0.11 -0.35 0.11 0.15 0 0 0 0 
  0.5 0.23 -0.04 0.51 0.17 0 0 0 0 0 0 0 0 
  1 0.49 0.18 0.8 0.2 0 0 0 0 0 0 0 0 
90 20 -1 0 0 0 0 0 0 0 0 -0.34 -0.76 0.1 0.25 
  -0.5 0 0 0 0 0 0 0 0 -0.18 -0.49 0.14 0.19 
  0 0 0 0 0 -0.02 -0.43 0.4 0.24 0 0 0 0 
  0.5 0.12 -0.18 0.53 0.22 0 0 0 0 0 0 0 0 
  1 0.3 -0.1 0.63 0.22 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.28 -0.58 0.09 0.19 
  -0.5 0 0 0 0 0 0 0 0 -0.16 -0.38 0.13 0.18 
  0 0 0 0 0 -0.02 -0.31 0.23 0.17 0 0 0 0 
  0.5 0.11 -0.17 0.34 0.15 0 0 0 0 0 0 0 0 
  1 0.2 -0.08 0.51 0.18 0 0 0 0 0 0 0 0 
 
  
       
Appendix A.12: Bias and Error of group estimates: Mixture Proportion 4 (MP4) and 
Cluster Effect 3 (CE3) for the true model 
 
   Cluster Type 
   1 2 3 
   Bias Error Bias Error Bias Error
Cluster Number  Cluster Size  Cluster Effect  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean  Mean  2.5%  97.5%  Mean
30 20 -1 0 0 0 0 0 0 0 0 -0.34 -0.76 0.1 0.25 
  -0.5 0 0 0 0 0 0 0 0 -0.18 -0.49 0.14 0.19 
  0 0 0 0 0 -0.02 -0.43 0.4 0.24 0 0 0 0 
  0.5 0.12 -0.18 0.53 0.22 0 0 0 0 0 0 0 0 
  1 0.3 -0.1 0.63 0.22 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.28 -0.58 0.09 0.19 
  -0.5 0 0 0 0 0 0 0 0 -0.16 -0.38 0.13 0.18 
  0 0 0 0 0 -0.02 -0.31 0.23 0.17 0 0 0 0 
  0.5 0.11 -0.17 0.34 0.15 0 0 0 0 0 0 0 0 
  1 0.2 -0.08 0.51 0.18 0 0 0 0 0 0 0 0 
60 20 -1 0 0 0 0 0 0 0 0 -0.37 -0.74 -0.03 0.22 
  -0.5 0 0 0 0 0 0 0 0 -0.21 -0.59 0.1 0.21 
  0 0 0 0 0 -0.01 -0.39 0.36 0.22 0 0 0 0 
  0.5 0.13 -0.17 0.51 0.21 0 0 0 0 0 0 0 0 
  1 0.25 -0.1 0.66 0.22 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.27 -0.55 0.06 0.19 
  -0.5 0 0 0 0 0 0 0 0 -0.12 -0.33 0.13 0.16 
  0 0 0 0 0 -0.04 -0.29 0.31 0.19 0 0 0 0 
  0.5 0.1 -0.18 0.38 0.17 0 0 0 0 0 0 0 0 
  1 0.18 -0.12 0.42 0.16 0 0 0 0 0 0 0 0 
90 20 -1 0 0 0 0 0 0 0 0 -0.34 -0.76 0.04 0.24 
  -0.5 0 0 0 0 0 0 0 0 -0.17 -0.56 0.17 0.22 
  0 0 0 0 0 -0.03 -0.39 0.37 0.23 0 0 0 0 
  0.5 0.1 -0.26 0.47 0.21 0 0 0 0 0 0 0 0 
  1 0.26 -0.03 0.59 0.19 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.26 -0.6 0.11 0.2 
  -0.5 0 0 0 0 0 0 0 0 -0.13 -0.42 0.16 0.18 
  0 0 0 0 0 -0.03 -0.27 0.22 0.16 0 0 0 0 
  0.5 0.1 -0.17 0.35 0.17 0 0 0 0 0 0 0 0 
  1 0.17 -0.16 0.44 0.19 0 0 0 0 0 0 0 0 
 
  
       
 
Appendix A.13: Bias and Error of group estimates: Cluster Effect 4 (CE4) for the true 
model 
 
Columns (in row order; a blank leading cell repeats the value from the row above):
Mixture Proportion | Cluster Number | Cluster Size | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean
1 30 20 0.06 -0.07 0.22 0.35 0.02 -0.1 0.15 0.34 -0.01 -0.17 0.13 0.32 
  40 0.03 -0.07 0.18 0.36 -0.02 -0.15 0.12 0.41 -0.05 -0.2 0.09 0.43 
 60 20 0.05 -0.03 0.16 0.31 0.02 -0.09 0.1 0.33 -0.01 -0.1 0.09 0.41 
  40 0.05 -0.03 0.16 0.36 0.01 -0.09 0.09 0.44 -0.02 -0.12 0.06 0.39 
 90 20 0.04 -0.03 0.12 0.33 0.01 -0.08 0.09 0.34 -0.01 -0.13 0.07 0.39 
  40 0.05 -0.01 0.14 0.38 0 -0.08 0.08 0.34 -0.03 -0.13 0.04 0.42 
2 30 20 0.03 -0.09 0.15 0.23 0.01 -0.14 0.2 0.26 -0.01 -0.18 0.12 0.31 
  40 0.07 -0.04 0.21 0.27 0.01 -0.13 0.12 0.31 -0.03 -0.2 0.1 0.36 
 60 20 0.05 -0.05 0.14 0.23 0.01 -0.09 0.12 0.26 -0.02 -0.12 0.07 0.31 
  40 0.06 -0.01 0.15 0.2 0.01 -0.08 0.09 0.37 -0.04 -0.14 0.06 0.36 
 90 20 0.04 -0.03 0.13 0.25 0.01 -0.1 0.09 0.25 -0.01 -0.09 0.06 0.32 
  40 0.07 0 0.16 0.26 0.01 -0.07 0.09 0.36 -0.04 -0.13 0.05 0.35 
3 30 20 0 -0.14 0.12 0.27 0.02 -0.1 0.13 0.29 0.01 -0.16 0.15 0.28 
  40 0 -0.15 0.14 0.33 0.01 -0.11 0.14 0.34 0.01 -0.14 0.15 0.29 
 60 20 0.02 -0.09 0.11 0.29 0.01 -0.08 0.13 0.3 0.02 -0.09 0.13 0.34 
  40 0.01 -0.1 0.1 0.31 0 -0.11 0.12 0.41 0 -0.11 0.11 0.37 
 90 20 0.01 -0.08 0.09 0.26 0.01 -0.06 0.1 0.36 0.01 -0.1 0.09 0.3 
  40 0.01 -0.07 0.1 0.38 0.01 -0.09 0.11 0.36 0.01 -0.07 0.09 0.37 
4 30 20 0.01 -0.15 0.16 0.41 0 -0.15 0.14 0.38 0.02 -0.11 0.17 0.42 
  40 -0.01 -0.17 0.13 0.46 0 -0.15 0.13 0.46 0 -0.12 0.13 0.42 
 60 20 0.01 -0.11 0.11 0.36 0 -0.13 0.13 0.38 0 -0.13 0.11 0.35 
  40 0.02 -0.09 0.12 0.39 0.01 -0.09 0.13 0.43 0.01 -0.09 0.11 0.42 
 90 20 0.01 -0.06 0.1 0.29 0.01 -0.08 0.09 0.4 0.01 -0.08 0.09 0.41 
  40 0.01 -0.07 0.08 0.41 0.01 -0.08 0.08 0.41 0.01 -0.07 0.09 0.49 
 
       
 
Appendix A.14: Bias and Error of group estimates: Cluster Effect 5 (CE5) for the true 
model 
 
Columns (in row order; a blank leading cell repeats the value from the row above):
Mixture Proportion | Cluster Number | Cluster Size | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean
1 30 20 0.05 -0.21 0.35 0.8 0.01 -0.26 0.35 0.83 -0.04 -0.31 0.29 0.66 
  40 0.06 -0.19 0.32 0.82 0.01 -0.29 0.27 1.21 -0.02 -0.31 0.21 0.79 
 60 20 0.06 -0.08 0.19 0.92 0.02 -0.14 0.18 0.97 -0.01 -0.2 0.13 0.89 
  40 0.07 -0.11 0.33 0.96 0.01 -0.2 0.23 1.09 -0.02 -0.23 0.19 0.92 
 90 20 0.05 -0.09 0.2 0.95 0.01 -0.14 0.14 0.85 -0.01 -0.2 0.13 0.88 
  40 0.06 -0.04 0.23 1.07 0.01 -0.09 0.16 1.15 -0.01 -0.18 0.11 0.63 
2 30 20 0.07 -0.13 0.37 0.54 0.06 -0.15 0.32 0.53 0.03 -0.22 0.31 0.61 
  40 0.08 -0.09 0.31 0.36 0.03 -0.22 0.21 0.55 0.01 -0.25 0.17 0.66 
 60 20 0.07 -0.06 0.24 0.41 0.01 -0.16 0.17 0.61 -0.01 -0.17 0.13 0.64 
  40 0.06 -0.09 0.2 0.5 0.01 -0.16 0.14 0.62 -0.02 -0.21 0.13 0.68 
 90 20 0.06 -0.06 0.21 0.59 0.01 -0.15 0.17 0.56 -0.02 -0.19 0.11 0.7 
  40 0.06 -0.04 0.19 0.44 0.02 -0.1 0.11 0.68 0 -0.15 0.1 0.74 
3 30 20 0.02 -0.21 0.24 0.63 0.01 -0.24 0.27 0.58 0 -0.22 0.24 0.64 
  40 0.03 -0.19 0.25 0.81 0.04 -0.19 0.25 0.84 0.03 -0.16 0.3 0.69 
 60 20 0.02 -0.17 0.24 0.64 0.03 -0.19 0.23 0.65 0.03 -0.18 0.25 0.58 
  40 0.02 -0.16 0.21 0.73 0.03 -0.18 0.2 0.8 0.03 -0.16 0.2 0.76 
 90 20 0.03 -0.11 0.16 0.74 0.03 -0.12 0.18 0.76 0.03 -0.14 0.18 0.77 
  40 0.03 -0.11 0.17 0.65 0.02 -0.13 0.18 0.76 0.02 -0.13 0.16 0.78 
4 30 20 0.02 -0.32 0.35 0.87 0.02 -0.3 0.28 0.8 0.03 -0.28 0.32 0.95 
  40 0.04 -0.23 0.29 1.01 0.05 -0.25 0.33 0.96 0.05 -0.27 0.32 1.06 
 60 20 0.02 -0.23 0.22 1.01 0.02 -0.25 0.25 0.91 0.02 -0.22 0.22 0.97 
  40 0.02 -0.16 0.18 1.16 0.02 -0.14 0.19 0.97 0.01 -0.13 0.19 1.1 
 90 20 0.03 -0.1 0.17 1.01 0.03 -0.11 0.2 1.16 0.03 -0.1 0.19 0.95 
  40 0.01 -0.16 0.17 0.79 0.01 -0.14 0.16 0.94 0.01 -0.14 0.16 0.98 
 
       
 
Appendix A.15: Bias and Error of group estimates: Mixture Proportion 1 (MP1) and 
Cluster Effect 1 (CE1) for Misspecified model 
 
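Columns (in row order; a blank leading cell repeats the value from the row above):
Cluster Number | Cluster Size | Cluster Effect | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean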
30 20 -1 -0.47 -0.67 -0.21 0.14 -0.33 -0.6 -0.06 0.16 -0.18 -0.44 0.03 0.14 
  -0.5 -0.18 -0.47 0.06 0.16 -0.19 -0.42 0.01 0.14 -0.15 -0.42 0.09 0.16 
  0 0.53 0.33 0.79 0.14 0.67 0.4 0.94 0.16 -0.07 -0.32 0.14 0.15 
  0.5 0.36 0.09 0.62 0.15 0.17 -0.04 0.41 0.15 -0.04 -0.32 0.18 0.15 
  1 0.65 0.38 0.89 0.15 0.32 0.05 0.56 0.15 -0.01 -0.25 0.23 0.15 
 40 -1 -0.41 -0.58 -0.21 0.11 -0.28 -0.45 -0.12 0.11 -0.15 -0.3 0.04 0.1 
  -0.5 -0.16 -0.35 0.02 0.12 -0.15 -0.34 0.02 0.12 -0.15 -0.34 0.03 0.12 
  0 0.59 0.42 0.79 0.11 0.72 0.55 0.88 0.11 -0.11 -0.32 0.07 0.12 
  0.5 0.36 0.19 0.54 0.11 0.14 -0.05 0.34 0.12 -0.08 -0.24 0.1 0.11 
  1 0.63 0.44 0.84 0.12 0.3 0.12 0.49 0.11 -0.06 -0.26 0.15 0.12 
60 20 -1 -0.47 -0.76 -0.16 0.17 -0.34 -0.56 -0.11 0.15 -0.18 -0.46 0.08 0.18 
  -0.5 -0.2 -0.49 0.02 0.16 -0.17 -0.44 0.05 0.15 -0.16 -0.4 0.1 0.15 
  0 0.53 0.24 0.84 0.17 0.66 0.44 0.89 0.15 -0.07 -0.32 0.15 0.15 
  0.5 0.37 0.12 0.63 0.15 0.17 -0.08 0.41 0.15 -0.04 -0.32 0.2 0.16 
  1 0.63 0.4 0.9 0.14 0.32 0.02 0.6 0.18 0 -0.28 0.26 0.17 
 40 -1 -0.41 -0.6 -0.23 0.12 -0.29 -0.47 -0.1 0.12 -0.16 -0.31 0.02 0.11 
  -0.5 -0.15 -0.4 0.07 0.13 -0.13 -0.28 0.03 0.1 -0.13 -0.32 0.08 0.12 
  0 0.59 0.4 0.77 0.12 0.71 0.53 0.9 0.12 -0.1 -0.31 0.1 0.12 
  0.5 0.36 0.14 0.53 0.11 0.14 -0.04 0.32 0.12 -0.08 -0.25 0.13 0.12 
  1 0.63 0.45 0.84 0.11 0.3 0.1 0.49 0.12 -0.06 -0.23 0.15 0.12 
90 20 -1 -0.47 -0.72 -0.22 0.14 -0.3 -0.56 -0.07 0.16 -0.2 -0.46 0.04 0.16 
  -0.5 -0.18 -0.41 0.06 0.15 -0.13 -0.33 0.09 0.14 -0.16 -0.41 0.12 0.16 
  0 0.53 0.28 0.78 0.14 0.7 0.44 0.93 0.16 -0.09 -0.35 0.23 0.17 
  0.5 0.37 0.1 0.64 0.16 0.18 -0.1 0.43 0.16 -0.05 -0.3 0.2 0.16 
  1 0.64 0.39 0.86 0.15 0.32 0.05 0.61 0.16 0.01 -0.25 0.31 0.17 
 40 -1 -0.43 -0.66 -0.21 0.13 -0.31 -0.48 -0.09 0.12 -0.15 -0.33 0.01 0.11 
  -0.5 -0.18 -0.35 0.02 0.11 -0.15 -0.36 0.04 0.11 -0.14 -0.31 0.04 0.11 
  0 0.57 0.34 0.79 0.13 0.69 0.52 0.91 0.12 -0.11 -0.33 0.08 0.12 
  0.5 0.37 0.19 0.56 0.11 0.14 -0.04 0.33 0.11 -0.1 -0.31 0.08 0.12 
  1 0.65 0.45 0.87 0.12 0.28 0.06 0.46 0.12 -0.05 -0.24 0.12 0.1 
 
  
       
 
Appendix A.16: Bias and Error of group estimates: Mixture Proportion 2 (MP2) and 
Cluster Effect 1 (CE1) for Misspecified model 
 
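Columns (in row order; a blank leading cell repeats the value from the row above):
Cluster Number | Cluster Size | Cluster Effect | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean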
30 20 -1 -0.47 -0.72 -0.22 0.14 -0.3 -0.56 -0.07 0.16 -0.2 -0.46 0.04 0.16 
  -0.5 -0.18 -0.41 0.06 0.15 -0.13 -0.33 0.09 0.14 -0.16 -0.41 0.12 0.16 
  0 0.53 0.28 0.78 0.14 0.7 0.44 0.93 0.16 -0.09 -0.35 0.23 0.17 
  0.5 0.37 0.1 0.64 0.16 0.18 -0.1 0.43 0.16 -0.05 -0.3 0.2 0.16 
  1 0.64 0.39 0.86 0.15 0.32 0.05 0.61 0.16 0.01 -0.25 0.31 0.17 
 40 -1 -0.43 -0.66 -0.21 0.13 -0.31 -0.48 -0.09 0.12 -0.15 -0.33 0.01 0.11 
  -0.5 -0.18 -0.35 0.02 0.11 -0.15 -0.36 0.04 0.11 -0.14 -0.31 0.04 0.11 
  0 0.57 0.34 0.79 0.13 0.69 0.52 0.91 0.12 -0.11 -0.33 0.08 0.12 
  0.5 0.37 0.19 0.56 0.11 0.14 -0.04 0.33 0.11 -0.1 -0.31 0.08 0.12 
  1 0.65 0.45 0.87 0.12 0.28 0.06 0.46 0.12 -0.05 -0.24 0.12 0.1 
60 20 -1 -0.72 -0.93 -0.53 0.12 -0.62 -0.86 -0.41 0.13 -0.48 -0.71 -0.23 0.15 
  -0.5 -0.32 -0.5 -0.12 0.12 -0.3 -0.51 -0.03 0.14 -0.3 -0.52 -0.09 0.14 
  0 0.28 0.07 0.47 0.12 0.38 0.14 0.59 0.13 0.52 0.29 0.77 0.15 
  0.5 0.47 0.22 0.69 0.14 0.27 0.03 0.45 0.14 0.14 -0.09 0.34 0.13 
  1 0.88 0.66 1.09 0.12 0.62 0.42 0.86 0.14 0.3 0.07 0.54 0.15 
 40 -1 -0.68 -0.88 -0.47 0.12 -0.54 -0.67 -0.38 0.09 -0.42 -0.61 -0.22 0.11 
  -0.5 -0.29 -0.47 -0.11 0.11 -0.27 -0.45 -0.12 0.09 -0.25 -0.43 -0.04 0.11 
  0 0.32 0.12 0.53 0.12 0.46 0.33 0.62 0.09 0.58 0.39 0.78 0.11 
  0.5 0.49 0.32 0.67 0.11 0.29 0.1 0.46 0.11 0.07 -0.13 0.27 0.12 
  1 0.87 0.67 1.06 0.12 0.55 0.32 0.69 0.11 0.23 -0.01 0.42 0.12 
90 20 -1 -0.73 -0.97 -0.51 0.14 -0.62 -0.87 -0.36 0.17 -0.5 -0.71 -0.27 0.13 
  -0.5 -0.33 -0.55 -0.11 0.13 -0.29 -0.47 -0.08 0.12 -0.26 -0.49 0 0.14 
  0 0.27 0.03 0.49 0.14 0.38 0.13 0.64 0.17 0.5 0.29 0.73 0.13 
  0.5 0.48 0.25 0.68 0.14 0.31 0.05 0.57 0.16 0.14 -0.09 0.37 0.15 
  1 0.87 0.68 1.08 0.13 0.6 0.44 0.77 0.11 0.3 0.04 0.53 0.15 
 40 -1 -0.69 -0.89 -0.53 0.11 -0.54 -0.71 -0.36 0.1 -0.43 -0.58 -0.26 0.1 
  -0.5 -0.29 -0.45 -0.11 0.11 -0.27 -0.49 -0.1 0.12 -0.27 -0.45 -0.08 0.11 
  0 0.31 0.11 0.47 0.11 0.46 0.29 0.64 0.1 0.57 0.42 0.74 0.1 
  0.5 0.49 0.32 0.62 0.09 0.27 0.06 0.45 0.12 0.07 -0.13 0.24 0.11 
  1 0.87 0.69 1.03 0.1 0.57 0.39 0.74 0.11 0.24 0.05 0.44 0.11 
 
  
       
 
Appendix A.17: Bias and Error of group estimates: Mixture Proportion 3 (MP3) and 
Cluster Effect 1 (CE1) for Misspecified model 
 
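Columns (in row order; a blank leading cell repeats the value from the row above):
Cluster Number | Cluster Size | Cluster Effect | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean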
30 20 -1 -0.73 -0.97 -0.51 0.14 -0.62 -0.87 -0.36 0.17 -0.5 -0.71 -0.27 0.13 
  -0.5 -0.33 -0.55 -0.11 0.13 -0.29 -0.47 -0.08 0.12 -0.26 -0.49 0 0.14 
  0 0.27 0.03 0.49 0.14 0.38 0.13 0.64 0.17 0.5 0.29 0.73 0.13 
  0.5 0.48 0.25 0.68 0.14 0.31 0.05 0.57 0.16 0.14 -0.09 0.37 0.15 
  1 0.87 0.68 1.08 0.13 0.6 0.44 0.77 0.11 0.3 0.04 0.53 0.15 
 40 -1 -0.69 -0.89 -0.53 0.11 -0.54 -0.71 -0.36 0.1 -0.43 -0.58 -0.26 0.1 
  -0.5 -0.29 -0.45 -0.11 0.11 -0.27 -0.49 -0.1 0.12 -0.27 -0.45 -0.08 0.11 
  0 0.31 0.11 0.47 0.11 0.46 0.29 0.64 0.1 0.57 0.42 0.74 0.1 
  0.5 0.49 0.32 0.62 0.09 0.27 0.06 0.45 0.12 0.07 -0.13 0.24 0.11 
  1 0.87 0.69 1.03 0.1 0.57 0.39 0.74 0.11 0.24 0.05 0.44 0.11 
60 20 -1 -0.72 -0.93 -0.51 0.13 -0.59 -0.8 -0.32 0.14 -0.49 -0.69 -0.24 0.14 
  -0.5 -0.32 -0.53 -0.1 0.13 -0.28 -0.48 -0.09 0.13 -0.26 -0.48 -0.02 0.14 
  0 0.28 0.07 0.49 0.13 0.41 0.2 0.68 0.14 0.51 0.31 0.76 0.14 
  0.5 0.48 0.25 0.69 0.13 0.29 0.1 0.47 0.12 0.14 -0.1 0.31 0.11 
  1 0.87 0.66 1.08 0.14 0.61 0.35 0.84 0.15 0.3 0.06 0.53 0.14 
 40 -1 -0.67 -0.88 -0.5 0.11 -0.57 -0.74 -0.4 0.09 -0.43 -0.59 -0.29 0.1 
  -0.5 -0.3 -0.5 -0.12 0.11 -0.26 -0.47 -0.09 0.12 -0.28 -0.47 -0.13 0.11 
  0 0.33 0.12 0.5 0.11 0.43 0.27 0.6 0.09 0.57 0.41 0.71 0.1 
  0.5 0.5 0.32 0.68 0.11 0.3 0.13 0.49 0.11 0.06 -0.12 0.24 0.11 
  1 0.87 0.69 1.06 0.11 0.55 0.4 0.74 0.1 0.24 0.01 0.43 0.11 
90 20 -1 -0.64 -0.84 -0.4 0.13 -0.64 -0.86 -0.43 0.13 -0.63 -0.82 -0.43 0.13 
  -0.5 -0.32 -0.51 -0.1 0.13 -0.32 -0.5 -0.12 0.12 -0.32 -0.54 -0.08 0.14 
  0 0.36 0.16 0.6 0.13 0.36 0.14 0.57 0.13 0.37 0.18 0.57 0.13 
  0.5 0.33 0.12 0.53 0.12 0.35 0.14 0.57 0.13 0.31 0.12 0.51 0.13 
  1 0.63 0.38 0.87 0.14 0.62 0.39 0.85 0.14 0.62 0.37 0.82 0.13 
 40 -1 -0.57 -0.75 -0.37 0.11 -0.56 -0.72 -0.4 0.09 -0.55 -0.75 -0.39 0.11 
  -0.5 -0.28 -0.43 -0.15 0.1 -0.29 -0.48 -0.1 0.12 -0.28 -0.46 -0.11 0.1 
  0 0.43 0.25 0.63 0.11 0.44 0.28 0.6 0.09 0.45 0.25 0.61 0.11 
  0.5 0.28 0.11 0.46 0.1 0.28 0.11 0.45 0.11 0.29 0.13 0.46 0.1 
  1 0.57 0.39 0.76 0.11 0.54 0.34 0.72 0.12 0.56 0.37 0.79 0.11 
 
  
       
 
Appendix A.18: Bias and Error of group estimates: Mixture Proportion 4 (MP4) and 
Cluster Effect 1 (CE1) for Misspecified model 
 
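Columns (in row order; a blank leading cell repeats the value from the row above):
Cluster Number | Cluster Size | Cluster Effect | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean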
30 20 -1 -0.64 -0.84 -0.4 0.13 -0.64 -0.86 -0.43 0.13 -0.63 -0.82 -0.43 0.13 
  -0.5 -0.32 -0.51 -0.1 0.13 -0.32 -0.5 -0.12 0.12 -0.32 -0.54 -0.08 0.14 
  0 0.36 0.16 0.6 0.13 0.36 0.14 0.57 0.13 0.37 0.18 0.57 0.13 
  0.5 0.33 0.12 0.53 0.12 0.35 0.14 0.57 0.13 0.31 0.12 0.51 0.13 
  1 0.63 0.38 0.87 0.14 0.62 0.39 0.85 0.14 0.62 0.37 0.82 0.13 
 40 -1 -0.57 -0.75 -0.37 0.11 -0.56 -0.72 -0.4 0.09 -0.55 -0.75 -0.39 0.11 
  -0.5 -0.28 -0.43 -0.15 0.1 -0.29 -0.48 -0.1 0.12 -0.28 -0.46 -0.11 0.1 
  0 0.43 0.25 0.63 0.11 0.44 0.28 0.6 0.09 0.45 0.25 0.61 0.11 
  0.5 0.28 0.11 0.46 0.1 0.28 0.11 0.45 0.11 0.29 0.13 0.46 0.1 
  1 0.57 0.39 0.76 0.11 0.54 0.34 0.72 0.12 0.56 0.37 0.79 0.11 
60 20 -1 -0.61 -0.84 -0.41 0.13 -0.63 -0.85 -0.4 0.13 -0.61 -0.8 -0.4 0.13 
  -0.5 -0.28 -0.49 -0.05 0.14 -0.34 -0.58 -0.09 0.15 -0.32 -0.54 -0.03 0.15 
  0 0.39 0.16 0.59 0.13 0.37 0.15 0.61 0.13 0.39 0.2 0.6 0.13 
  0.5 0.31 0.09 0.53 0.13 0.32 0.04 0.52 0.15 0.3 0.1 0.54 0.13 
  1 0.6 0.36 0.83 0.14 0.62 0.39 0.86 0.14 0.63 0.38 0.84 0.14 
 40 -1 -0.57 -0.72 -0.41 0.1 -0.56 -0.72 -0.41 0.1 -0.55 -0.74 -0.39 0.11 
  -0.5 -0.26 -0.48 -0.11 0.12 -0.29 -0.46 -0.11 0.11 -0.28 -0.46 -0.11 0.11 
  0 0.43 0.28 0.59 0.1 0.44 0.28 0.59 0.1 0.45 0.26 0.61 0.11 
  0.5 0.3 0.15 0.52 0.11 0.3 0.12 0.46 0.1 0.26 0.11 0.46 0.11 
  1 0.59 0.42 0.74 0.1 0.57 0.4 0.77 0.12 0.56 0.4 0.74 0.1 
90 20 -1 -0.61 -0.82 -0.39 0.13 -0.61 -0.86 -0.38 0.14 -0.63 -0.84 -0.38 0.15 
  -0.5 -0.31 -0.54 -0.12 0.13 -0.29 -0.52 -0.09 0.14 -0.31 -0.53 -0.08 0.14 
  0 0.39 0.18 0.61 0.13 0.39 0.14 0.62 0.14 0.37 0.16 0.62 0.15 
  0.5 0.29 0.07 0.47 0.13 0.33 0.09 0.53 0.12 0.31 0.1 0.52 0.13 
  1 0.64 0.4 0.91 0.14 0.64 0.42 0.86 0.13 0.6 0.41 0.83 0.13 
 40 -1 -0.55 -0.7 -0.4 0.09 -0.55 -0.7 -0.37 0.1 -0.55 -0.74 -0.35 0.12 
  -0.5 -0.28 -0.43 -0.1 0.1 -0.29 -0.45 -0.12 0.11 -0.26 -0.44 -0.07 0.11 
  0 0.45 0.3 0.6 0.09 0.45 0.3 0.63 0.1 0.45 0.26 0.65 0.12 
  0.5 0.28 0.12 0.49 0.11 0.3 0.14 0.48 0.11 0.29 0.13 0.43 0.1 
  1 0.55 0.34 0.72 0.11 0.56 0.41 0.71 0.09 0.58 0.44 0.79 0.11 
 
  
       
 
Appendix A.19: Bias and Error of group estimates: Mixture Proportion 1 (MP1) and 
Cluster Effect 2 (CE2) for Misspecified model 
 
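Columns (in row order; a blank leading cell repeats the value from the row above):
Cluster Number | Cluster Size | Cluster Effect | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean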
30 20 -1 -0.34 -0.63 -0.09 0.17 0 0 0 0 0 0 0 0 
  -0.5 -0.05 -0.26 0.19 0.14 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.14 -0.11 0.4 0.16 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.04 -0.25 0.29 0.17 
  1 0 0 0 0 0 0 0 0 0.08 -0.15 0.32 0.15 
 40 -1 -0.3 -0.49 -0.1 0.13 0 0 0 0 0 0 0 0 
  -0.5 -0.03 -0.28 0.19 0.13 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.11 -0.08 0.29 0.11 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.04 -0.11 0.21 0.1 
  1 0 0 0 0 0 0 0 0 0.05 -0.11 0.2 0.1 
60 20 -1 -0.31 -0.63 -0.07 0.17 0 0 0 0 0 0 0 0 
  -0.5 -0.05 -0.32 0.22 0.16 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.12 -0.14 0.37 0.16 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.06 -0.19 0.36 0.18 
  1 0 0 0 0 0 0 0 0 0.08 -0.22 0.33 0.17 
 40 -1 -0.29 -0.5 -0.14 0.11 0 0 0 0 0 0 0 0 
  -0.5 -0.02 -0.22 0.16 0.12 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.13 -0.06 0.31 0.12 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.02 -0.17 0.19 0.11 
  1 0 0 0 0 0 0 0 0 0.04 -0.13 0.21 0.11 
90 20 -1 -0.33 -0.55 -0.05 0.15 0 0 0 0 0 0 0 0 
  -0.5 -0.06 -0.29 0.18 0.14 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.11 -0.12 0.38 0.16 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.04 -0.22 0.25 0.13 
  1 0 0 0 0 0 0 0 0 0.06 -0.2 0.37 0.17 
 40 -1 -0.28 -0.45 -0.09 0.12 0 0 0 0 0 0 0 0 
  -0.5 -0.01 -0.18 0.16 0.1 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.13 -0.08 0.34 0.12 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.03 -0.14 0.22 0.11 
  1 0 0 0 0 0 0 0 0 0.04 -0.16 0.22 0.12 
 
  
       
 
Appendix A.20: Bias and Error of group estimates: Mixture Proportion 2 (MP2) and 
Cluster Effect 2 (CE2) for Misspecified model 
 
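Columns (in row order; a blank leading cell repeats the value from the row above):
Cluster Number | Cluster Size | Cluster Effect | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean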
30 20 -1 -0.33 -0.55 -0.05 0.15 0 0 0 0 0 0 0 0 
  -0.5 -0.06 -0.29 0.18 0.14 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.11 -0.12 0.38 0.16 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.04 -0.22 0.25 0.13 
  1 0 0 0 0 0 0 0 0 0.06 -0.2 0.37 0.17 
 40 -1 -0.28 -0.45 -0.09 0.12 0 0 0 0 0 0 0 0 
  -0.5 -0.01 -0.18 0.16 0.1 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.13 -0.08 0.34 0.12 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.03 -0.14 0.22 0.11 
  1 0 0 0 0 0 0 0 0 0.04 -0.16 0.22 0.12 
60 20 -1 -0.62 -0.84 -0.38 0.14 0 0 0 0 0 0 0 0 
  -0.5 -0.21 -0.48 0.05 0.16 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.1 -0.11 0.3 0.13 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.21 0 0.46 0.15 
  1 0 0 0 0 0 0 0 0 0.43 0.17 0.68 0.15 
 40 -1 -0.57 -0.73 -0.43 0.1 0 0 0 0 0 0 0 0 
  -0.5 -0.15 -0.33 0.02 0.11 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.13 -0.05 0.3 0.1 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.17 -0.01 0.37 0.12 
  1 0 0 0 0 0 0 0 0 0.32 0.15 0.5 0.11 
90 20 -1 -0.6 -0.87 -0.3 0.17 0 0 0 0 0 0 0 0 
  -0.5 -0.23 -0.52 0.01 0.16 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.1 -0.17 0.35 0.16 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.24 -0.02 0.46 0.14 
  1 0 0 0 0 0 0 0 0 0.43 0.21 0.64 0.14 
 40 -1 -0.57 -0.77 -0.34 0.12 0 0 0 0 0 0 0 0 
  -0.5 -0.17 -0.36 0.01 0.11 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.13 -0.03 0.27 0.1 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.18 0.02 0.37 0.11 
  1 0 0 0 0 0 0 0 0 0.35 0.18 0.52 0.11 
 
  
       
 
Appendix A.21: Bias and Error of group estimates: Mixture Proportion 3 (MP3) and 
Cluster Effect 2 (CE2) for Misspecified model 
 
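Columns (in row order; a blank leading cell repeats the value from the row above):
Cluster Number | Cluster Size | Cluster Effect | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean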
30 20 -1 -0.6 -0.87 -0.3 0.17 0 0 0 0 0 0 0 0 
  -0.5 -0.23 -0.52 0.01 0.16 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.1 -0.17 0.35 0.16 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.24 -0.02 0.46 0.14 
  1 0 0 0 0 0 0 0 0 0.43 0.21 0.64 0.14 
 40 -1 -0.57 -0.77 -0.34 0.12 0 0 0 0 0 0 0 0 
  -0.5 -0.17 -0.36 0.01 0.11 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.13 -0.03 0.27 0.1 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.18 0.02 0.37 0.11 
  1 0 0 0 0 0 0 0 0 0.35 0.18 0.52 0.11 
60 20 -1 -0.63 -0.9 -0.42 0.14 0 0 0 0 0 0 0 0 
  -0.5 -0.22 -0.44 0.06 0.15 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.11 -0.1 0.36 0.14 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.22 -0.04 0.46 0.14 
  1 0 0 0 0 0 0 0 0 0.42 0.17 0.67 0.15 
 40 -1 -0.57 -0.75 -0.4 0.11 0 0 0 0 0 0 0 0 
  -0.5 -0.17 -0.35 -0.01 0.11 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.11 -0.08 0.32 0.11 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.19 0.03 0.36 0.1 
  1 0 0 0 0 0 0 0 0 0.34 0.15 0.55 0.12 
90 20 -1 -0.63 -0.85 -0.4 0.14 0 0 0 0 0 0 0 0 
  -0.5 -0.32 -0.48 -0.09 0.12 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0 -0.2 0.19 0.12 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.32 0.06 0.56 0.14 
  1 0 0 0 0 0 0 0 0 0.64 0.43 0.84 0.13 
 40 -1 -0.59 -0.78 -0.43 0.11 0 0 0 0 0 0 0 0 
  -0.5 -0.28 -0.44 -0.1 0.09 0 0 0 0 0 0 0 0 
  0 0 0 0 0 -0.03 -0.19 0.14 0.1 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.3 0.14 0.46 0.1 
  1 0 0 0 0 0 0 0 0 0.57 0.41 0.78 0.11 
 
  
       
 
Appendix A.22: Bias and Error of group estimates: Mixture Proportion 4 (MP4) and 
Cluster Effect 2 (CE2) for Misspecified model 
 
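Columns (in row order; a blank leading cell repeats the value from the row above):
Cluster Number | Cluster Size | Cluster Effect | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean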
30 20 -1 -0.63 -0.85 -0.4 0.14 0 0 0 0 0 0 0 0 
  -0.5 -0.32 -0.48 -0.09 0.12 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0 -0.2 0.19 0.12 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.32 0.06 0.56 0.14 
  1 0 0 0 0 0 0 0 0 0.64 0.43 0.84 0.13 
 40 -1 -0.59 -0.78 -0.43 0.11 0 0 0 0 0 0 0 0 
  -0.5 -0.28 -0.44 -0.1 0.09 0 0 0 0 0 0 0 0 
  0 0 0 0 0 -0.03 -0.19 0.14 0.1 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.3 0.14 0.46 0.1 
  1 0 0 0 0 0 0 0 0 0.57 0.41 0.78 0.11 
60 20 -1 -0.62 -0.8 -0.37 0.13 0 0 0 0 0 0 0 0 
  -0.5 -0.33 -0.57 -0.1 0.14 0 0 0 0 0 0 0 0 
  0 0 0 0 0 -0.01 -0.25 0.23 0.14 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.3 0.1 0.53 0.13 
  1 0 0 0 0 0 0 0 0 0.61 0.41 0.82 0.13 
 40 -1 -0.58 -0.74 -0.42 0.1 0 0 0 0 0 0 0 0 
  -0.5 -0.27 -0.46 -0.07 0.11 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.01 -0.16 0.16 0.1 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.29 0.14 0.47 0.1 
  1 0 0 0 0 0 0 0 0 0.56 0.34 0.74 0.12 
90 20 -1 -0.62 -0.85 -0.38 0.14 0 0 0 0 0 0 0 0 
  -0.5 -0.33 -0.54 -0.11 0.13 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0.04 -0.19 0.24 0.14 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.34 0.19 0.49 0.1 
  1 0 0 0 0 0 0 0 0 0.63 0.44 0.83 0.12 
 40 -1 -0.59 -0.75 -0.42 0.1 0 0 0 0 0 0 0 0 
  -0.5 -0.28 -0.44 -0.06 0.11 0 0 0 0 0 0 0 0 
  0 0 0 0 0 0 -0.15 0.15 0.1 0 0 0 0 
  0.5 0 0 0 0 0 0 0 0 0.29 0.1 0.48 0.11 
  1 0 0 0 0 0 0 0 0 0.57 0.4 0.74 0.1 
 
  
       
 
Appendix A.23: Bias and Error of group estimates: Mixture Proportion 1 (MP1) and 
Cluster Effect 3 (CE3) for Misspecified model 
 
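Columns (in row order; a blank leading cell repeats the value from the row above):
Cluster Number | Cluster Size | Cluster Effect | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean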
30 20 -1 0 0 0 0 0 0 0 0 -0.37 -0.59 -0.13 0.14 
  -0.5 0 0 0 0 0 0 0 0 -0.27 -0.45 -0.08 0.12 
  0 0 0 0 0 -0.08 -0.34 0.13 0.14 0 0 0 0 
  0.5 0.25 0.04 0.49 0.14 0 0 0 0 0 0 0 0 
  1 0.55 0.28 0.8 0.15 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.32 -0.49 -0.16 0.11 
  -0.5 0 0 0 0 0 0 0 0 -0.24 -0.42 -0.04 0.11 
  0 0 0 0 0 -0.11 -0.28 0.07 0.1 0 0 0 0 
  0.5 0.25 0.09 0.46 0.1 0 0 0 0 0 0 0 0 
  1 0.52 0.36 0.68 0.1 0 0 0 0 0 0 0 0 
60 20 -1 0 0 0 0 0 0 0 0 -0.36 -0.56 -0.13 0.13 
  -0.5 0 0 0 0 0 0 0 0 -0.28 -0.49 -0.07 0.13 
  0 0 0 0 0 -0.08 -0.3 0.13 0.13 0 0 0 0 
  0.5 0.28 0.06 0.51 0.14 0 0 0 0 0 0 0 0 
  1 0.55 0.34 0.78 0.14 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.31 -0.5 -0.13 0.12 
  -0.5 0 0 0 0 0 0 0 0 -0.25 -0.46 -0.08 0.12 
  0 0 0 0 0 -0.12 -0.28 0.06 0.12 0 0 0 0 
  0.5 0.26 0.06 0.44 0.11 0 0 0 0 0 0 0 0 
  1 0.53 0.39 0.69 0.1 0 0 0 0 0 0 0 0 
90 20 -1 0 0 0 0 0 0 0 0 -0.38 -0.67 -0.08 0.15 
  -0.5 0 0 0 0 0 0 0 0 -0.25 -0.45 0.01 0.14 
  0 0 0 0 0 -0.12 -0.39 0.18 0.15 0 0 0 0 
  0.5 0.25 0.03 0.5 0.14 0 0 0 0 0 0 0 0 
  1 0.56 0.28 0.79 0.15 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.32 -0.53 -0.11 0.13 
  -0.5 0 0 0 0 0 0 0 0 -0.25 -0.46 -0.03 0.12 
  0 0 0 0 0 -0.12 -0.3 0.06 0.11 0 0 0 0 
  0.5 0.27 0.06 0.46 0.12 0 0 0 0 0 0 0 0 
  1 0.55 0.38 0.72 0.12 0 0 0 0 0 0 0 0 
 
  
       
 
Appendix A.24: Bias and Error of group estimates: Mixture Proportion 2 (MP2) and 
Cluster Effect 3 (CE3) for Misspecified model 
 
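Columns (in row order; a blank leading cell repeats the value from the row above):
Cluster Number | Cluster Size | Cluster Effect | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean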
30 20 -1 0 0 0 0 0 0 0 0 -0.38 -0.67 -0.08 0.15 
  -0.5 0 0 0 0 0 0 0 0 -0.25 -0.45 0.01 0.14 
  0 0 0 0 0 -0.12 -0.39 0.18 0.15 0 0 0 0 
  0.5 0.25 0.03 0.5 0.14 0 0 0 0 0 0 0 0 
  1 0.56 0.28 0.79 0.15 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.32 -0.53 -0.11 0.13 
  -0.5 0 0 0 0 0 0 0 0 -0.25 -0.46 -0.03 0.12 
  0 0 0 0 0 -0.12 -0.3 0.06 0.11 0 0 0 0 
  0.5 0.27 0.06 0.46 0.12 0 0 0 0 0 0 0 0 
  1 0.55 0.38 0.72 0.12 0 0 0 0 0 0 0 0 
60 20 -1 0 0 0 0 0 0 0 0 -0.65 -0.82 -0.45 0.12 
  -0.5 0 0 0 0 0 0 0 0 -0.41 -0.59 -0.25 0.12 
  0 0 0 0 0 -0.09 -0.28 0.11 0.12 0 0 0 0 
  0.5 0.4 0.2 0.61 0.12 0 0 0 0 0 0 0 0 
  1 0.81 0.64 0.99 0.11 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.58 -0.74 -0.42 0.1 
  -0.5 0 0 0 0 0 0 0 0 -0.39 -0.55 -0.23 0.1 
  0 0 0 0 0 -0.1 -0.23 0.02 0.08 0 0 0 0 
  0.5 0.4 0.18 0.57 0.11 0 0 0 0 0 0 0 0 
  1 0.78 0.6 0.94 0.09 0 0 0 0 0 0 0 0 
90 20 -1 0 0 0 0 0 0 0 0 -0.66 -0.85 -0.46 0.12 
  -0.5 0 0 0 0 0 0 0 0 -0.41 -0.58 -0.21 0.11 
  0 0 0 0 0 -0.07 -0.26 0.12 0.11 0 0 0 0 
  0.5 0.4 0.2 0.59 0.12 0 0 0 0 0 0 0 0 
  1 0.83 0.63 1.03 0.13 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.59 -0.75 -0.44 0.09 
  -0.5 0 0 0 0 0 0 0 0 -0.39 -0.52 -0.26 0.08 
  0 0 0 0 0 -0.09 -0.24 0.08 0.1 0 0 0 0 
  0.5 0.39 0.24 0.53 0.09 0 0 0 0 0 0 0 0 
  1 0.8 0.64 0.93 0.09 0 0 0 0 0 0 0 0 
 
  
       
 
Appendix A.25: Bias and Error of group estimates: Mixture Proportion 3 (MP3) and 
Cluster Effect 3 (CE3) for Misspecified model 
 
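Columns (in row order; a blank leading cell repeats the value from the row above):
Cluster Number | Cluster Size | Cluster Effect | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean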
30 20 -1 0 0 0 0 0 0 0 0 -0.66 -0.85 -0.46 0.12 
  -0.5 0 0 0 0 0 0 0 0 -0.41 -0.58 -0.21 0.11 
  0 0 0 0 0 -0.07 -0.26 0.12 0.11 0 0 0 0 
  0.5 0.4 0.2 0.59 0.12 0 0 0 0 0 0 0 0 
  1 0.83 0.63 1.03 0.13 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.59 -0.75 -0.44 0.09 
  -0.5 0 0 0 0 0 0 0 0 -0.39 -0.52 -0.26 0.08 
  0 0 0 0 0 -0.09 -0.24 0.08 0.1 0 0 0 0 
  0.5 0.39 0.24 0.53 0.09 0 0 0 0 0 0 0 0 
  1 0.8 0.64 0.93 0.09 0 0 0 0 0 0 0 0 
60 20 -1 0 0 0 0 0 0 0 0 -0.67 -0.84 -0.51 0.11 
  -0.5 0 0 0 0 0 0 0 0 -0.4 -0.59 -0.19 0.12 
  0 0 0 0 0 -0.06 -0.25 0.15 0.11 0 0 0 0 
  0.5 0.43 0.25 0.59 0.1 0 0 0 0 0 0 0 0 
  1 0.83 0.64 1.04 0.12 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.61 -0.76 -0.45 0.1 
  -0.5 0 0 0 0 0 0 0 0 -0.4 -0.54 -0.24 0.09 
  0 0 0 0 0 -0.1 -0.26 0.07 0.09 0 0 0 0 
  0.5 0.38 0.24 0.51 0.09 0 0 0 0 0 0 0 0 
  1 0.8 0.65 0.98 0.1 0 0 0 0 0 0 0 0 
90 20 -1 0 0 0 0 0 0 0 0 -0.64 -0.84 -0.42 0.14 
  -0.5 0 0 0 0 0 0 0 0 -0.31 -0.51 -0.09 0.13 
  0 0 0 0 0 0.01 -0.24 0.2 0.12 0 0 0 0 
  0.5 0.31 0.12 0.51 0.12 0 0 0 0 0 0 0 0 
  1 0.65 0.44 0.91 0.14 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.57 -0.73 -0.39 0.11 
  -0.5 0 0 0 0 0 0 0 0 -0.3 -0.48 -0.14 0.1 
  0 0 0 0 0 0.02 -0.13 0.18 0.09 0 0 0 0 
  0.5 0.27 0.08 0.45 0.11 0 0 0 0 0 0 0 0 
  1 0.57 0.4 0.72 0.11 0 0 0 0 0 0 0 0 
 
  
       
 
Appendix A.26: Bias and Error of group estimates: Mixture Proportion 4 (MP4) and 
Cluster Effect 3 (CE3) for Misspecified model 
 
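Columns (in row order; a blank leading cell repeats the value from the row above):
Cluster Number | Cluster Size | Cluster Effect | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean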
30 20 -1 0 0 0 0 0 0 0 0 -0.64 -0.84 -0.42 0.14 
  -0.5 0 0 0 0 0 0 0 0 -0.31 -0.51 -0.09 0.13 
  0 0 0 0 0 0.01 -0.24 0.2 0.12 0 0 0 0 
  0.5 0.31 0.12 0.51 0.12 0 0 0 0 0 0 0 0 
  1 0.65 0.44 0.91 0.14 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.57 -0.73 -0.39 0.11 
  -0.5 0 0 0 0 0 0 0 0 -0.3 -0.48 -0.14 0.1 
  0 0 0 0 0 0.02 -0.13 0.18 0.09 0 0 0 0 
  0.5 0.27 0.08 0.45 0.11 0 0 0 0 0 0 0 0 
  1 0.57 0.4 0.72 0.11 0 0 0 0 0 0 0 0 
60 20 -1 0 0 0 0 0 0 0 0 -0.63 -0.88 -0.42 0.13 
  -0.5 0 0 0 0 0 0 0 0 -0.33 -0.54 -0.1 0.14 
  0 0 0 0 0 0.02 -0.18 0.28 0.14 0 0 0 0 
  0.5 0.33 0.11 0.54 0.13 0 0 0 0 0 0 0 0 
  1 0.67 0.47 0.89 0.13 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.58 -0.8 -0.41 0.11 
  -0.5 0 0 0 0 0 0 0 0 -0.29 -0.45 -0.12 0.1 
  0 0 0 0 0 0 -0.13 0.16 0.09 0 0 0 0 
  0.5 0.3 0.15 0.48 0.11 0 0 0 0 0 0 0 0 
  1 0.56 0.4 0.7 0.09 0 0 0 0 0 0 0 0 
90 20 -1 0 0 0 0 0 0 0 0 -0.65 -0.88 -0.41 0.14 
  -0.5 0 0 0 0 0 0 0 0 -0.32 -0.55 -0.11 0.14 
  0 0 0 0 0 -0.02 -0.24 0.19 0.13 0 0 0 0 
  0.5 0.31 0.15 0.51 0.11 0 0 0 0 0 0 0 0 
  1 0.64 0.41 0.84 0.12 0 0 0 0 0 0 0 0 
 40 -1 0 0 0 0 0 0 0 0 -0.58 -0.75 -0.4 0.1 
  -0.5 0 0 0 0 0 0 0 0 -0.29 -0.46 -0.09 0.11 
  0 0 0 0 0 0 -0.19 0.15 0.11 0 0 0 0 
  0.5 0.28 0.13 0.46 0.1 0 0 0 0 0 0 0 0 
  1 0.58 0.4 0.75 0.1 0 0 0 0 0 0 0 0 
 
  
       
 
Appendix A.27: Bias and Error of group estimates: Cluster Effect 4 (CE4) for 
Misspecified model 
 
Columns (in row order; a blank leading cell repeats the value from the row above):
Mixture Proportion | Cluster Number | Cluster Size | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean
1 30 20 0.09 -0.01 0.21 0.22 0.02 -0.09 0.14 0.36 -0.08 -0.21 0.05 0.44 
  40 0.09 -0.03 0.2 0.23 -0.01 -0.13 0.12 0.39 -0.12 -0.25 0.03 0.47 
 60 20 0.09 0.01 0.17 0.26 0.02 -0.07 0.09 0.27 -0.08 -0.17 0.01 0.44 
  40 0.11 0.04 0.18 0.23 0.01 -0.08 0.08 0.38 -0.09 -0.18 -0.01 0.47 
 90 20 0.08 0.02 0.15 0.25 0 -0.07 0.07 0.39 -0.08 -0.17 0.01 0.41 
  40 0.1 0.03 0.18 0.26 0 -0.06 0.08 0.4 -0.11 -0.18 -0.02 0.41 
2 30 20 0.05 -0.03 0.13 0.15 0 -0.11 0.11 0.21 -0.06 -0.21 0.06 0.3 
  40 0.09 0 0.17 0.15 0 -0.1 0.09 0.24 -0.09 -0.21 0.01 0.35 
 60 20 0.06 0 0.14 0.13 0 -0.09 0.09 0.21 -0.06 -0.15 0.02 0.28 
  40 0.08 0.02 0.13 0.15 0 -0.06 0.06 0.22 -0.09 -0.16 -0.02 0.3 
 90 20 0.06 0.02 0.1 0.14 0 -0.06 0.05 0.2 -0.07 -0.13 0 0.3 
  40 0.09 0.04 0.14 0.15 0 -0.05 0.05 0.23 -0.09 -0.16 -0.03 0.32 
3 30 20 0 -0.09 0.08 0.18 0.01 -0.08 0.1 0.2 0 -0.11 0.09 0.19 
  40 0 -0.09 0.09 0.22 0 -0.08 0.08 0.22 0 -0.09 0.09 0.22 
 60 20 0 -0.06 0.07 0.2 0 -0.06 0.07 0.19 0.01 -0.06 0.08 0.21 
  40 0 -0.07 0.07 0.23 0 -0.07 0.06 0.19 -0.01 -0.07 0.06 0.2 
 90 20 0 -0.06 0.04 0.17 0 -0.05 0.06 0.18 0 -0.07 0.05 0.17 
  40 0 -0.04 0.04 0.23 0 -0.07 0.06 0.2 0 -0.05 0.05 0.25 
4 30 20 0 -0.15 0.12 0.34 -0.01 -0.15 0.11 0.31 0.01 -0.11 0.12 0.34 
  40 -0.01 -0.14 0.09 0.4 -0.01 -0.14 0.11 0.37 -0.01 -0.15 0.12 0.39 
 60 20 -0.01 -0.1 0.09 0.36 -0.01 -0.1 0.1 0.32 0 -0.1 0.11 0.38 
  40 0.01 -0.08 0.1 0.33 0.01 -0.09 0.1 0.35 0 -0.07 0.1 0.33 
 90 20 0 -0.07 0.08 0.32 0 -0.07 0.08 0.33 0 -0.08 0.08 0.35 
  40 0 -0.08 0.07 0.36 0 -0.09 0.06 0.37 0 -0.07 0.07 0.32 
 
       
 
Appendix A.28: Bias and Error of group estimates: Cluster Effect 5 (CE5) for 
Misspecified model 
 
Columns (in row order; a blank leading cell repeats the value from the row above):
Mixture Proportion | Cluster Number | Cluster Size | Cluster Type 1: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 2: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean | Cluster Type 3: Bias Mean, Bias 2.5%, Bias 97.5%, Error Mean
1 30 20 0.09 -0.14 0.39 0.52 -0.01 -0.23 0.3 0.6 -0.12 -0.34 0.2 1.01 
  40 0.11 -0.12 0.4 0.47 0.01 -0.23 0.3 0.63 -0.1 -0.34 0.18 1.04 
 60 20 0.09 -0.05 0.2 0.48 0 -0.14 0.13 0.78 -0.1 -0.26 0.02 0.98 
  40 0.11 -0.07 0.32 0.53 0 -0.18 0.19 0.74 -0.11 -0.29 0.08 1.05 
 90 20 0.1 -0.04 0.21 0.45 0.01 -0.13 0.14 0.71 -0.08 -0.24 0.04 0.96 
  40 0.11 -0.01 0.23 0.44 0 -0.13 0.15 0.73 -0.1 -0.26 0.05 1.02 
2 30 20 0.09 -0.06 0.24 0.27 0.03 -0.14 0.2 0.44 -0.05 -0.24 0.17 0.62 
  40 0.09 -0.05 0.25 0.24 0 -0.15 0.17 0.5 -0.1 -0.3 0.08 0.68 
 60 20 0.09 -0.02 0.18 0.27 0 -0.12 0.11 0.45 -0.09 -0.22 0.06 0.62 
  40 0.09 -0.04 0.2 0.23 -0.01 -0.14 0.11 0.53 -0.11 -0.25 0.01 0.68 
 90 20 0.08 0 0.17 0.29 0 -0.11 0.1 0.5 -0.09 -0.2 0.02 0.67 
  40 0.09 0.01 0.16 0.25 0 -0.08 0.07 0.46 -0.11 -0.21 -0.03 0.66 
3 30 20 0 -0.14 0.15 0.4 -0.01 -0.17 0.14 0.38 -0.01 -0.16 0.15 0.39 
  40 0 -0.13 0.12 0.48 0 -0.12 0.16 0.46 0 -0.12 0.16 0.51 
 60 20 -0.01 -0.12 0.13 0.42 0 -0.13 0.13 0.48 0 -0.12 0.13 0.45 
  40 0 -0.12 0.1 0.4 0 -0.11 0.12 0.5 0 -0.11 0.12 0.4 
 90 20 0 -0.09 0.09 0.44 0.01 -0.09 0.1 0.45 0.01 -0.1 0.1 0.41 
  40 0 -0.08 0.1 0.44 -0.01 -0.08 0.09 0.51 -0.01 -0.08 0.08 0.47 
4 30 20 0 -0.26 0.25 0.69 0 -0.2 0.26 0.68 0 -0.23 0.24 0.69 
  40 0.01 -0.23 0.23 0.68 0.02 -0.22 0.24 0.67 0.02 -0.24 0.23 0.82 
 60 20 0 -0.19 0.16 0.68 0 -0.21 0.18 0.63 0 -0.2 0.19 0.69 
  40 0 -0.16 0.14 0.8 0 -0.14 0.15 0.67 -0.01 -0.14 0.14 0.63 
 90 20 0 -0.13 0.12 0.73 0 -0.13 0.16 0.8 0 -0.13 0.14 0.7 
  40 0 -0.14 0.13 0.77 0 -0.13 0.13 0.74 0 -0.14 0.13 0.72 
 
       
 
References 
Anderson, D. R. (2008). Model based inference in the life sciences: A primer on 
evidence. New York, NY: Springer. 
Asparouhov, T., & Muthén, B. (2008). Multilevel mixture models. In G. R. Hancock & K. 
M. Samuelsen (Eds.), Advances in latent variable mixture models (pp. 27-51). 
Charlotte, NC: Information Age Publishing, Inc. 
Bartholomew, D. J., & Knott, M. (1999). Latent variable models and factor analysis. 
London: Arnold. 
Bauer, D. J., & Curran, P. J. (2004). The integration of continuous and discrete latent 
variable models: Potential problems and promising opportunities. Psychological 
Methods, 9, 3-29. 
Bollen, K. A., & Curran, P. J. (2006). Latent curve models: A structural equation 
approach. New York, NY: John Wiley. 
Boscardin, C., Muthén, B., Francis, D., & Baker, E. (2008). Early identification of 
reading difficulties using heterogeneous developmental trajectories. Journal of 
Educational Psychology, 100, 192-208. 
Bozdogan, H. (1987). Model selection and Akaike's information criterion (AIC): The 
general theory and its analytical extensions. Psychometrika, 52, 345-370. 
Bryk, A. S., & Raudenbush, S. W. (1992). Hierarchical linear models. Newbury Park, 
CA: Sage. 
Burnham, K. P., & Anderson, D. R. (2004). Multimodel inference: Understanding AIC 
and BIC in model selection. Sociological Methods and Research, 33, 261-304. 
       
 
Burstein, L., Linn, R. L., & Capell, F. J. (1978). Analyzing multilevel data in the presence 
of heterogeneous within-class regressions. Journal of Educational Statistics, 3, 
347-383. 
Chudowsky, N., Chudowsky, V., & Kober, N. (2007). Answering the question that 
matters most: Has student achievement increased since No Child Left Behind? 
Washington DC: Center on Educational Policy. 
Clogg, C. C., & Goodman, L. A. (1985). Simultaneous latent structural analysis in several 
groups. In N. B. Tuma (Ed.), Sociological Methodology (pp. 81-110). San 
Francisco: Jossey-Bass Publishers. 
Croudace, T. J., Jarvelin, M. R., Wadsworth, M. E., & Jones, P. B. (2003). 
Developmental typology of trajectories to nighttime bladder control: 
Epidemiologic application of longitudinal latent class analysis. American Journal 
of Epidemiology, 157, 834-842. 
Dayton, C. M. (1991). Educational applications of latent class analysis. Measurement and 
Evaluation in Counseling and Development, 24, 131-141. 
Doran, H. C., & Lockwood, J. R. (2006). Fitting value-added models in R. Journal of 
Educational and Behavioral Statistics, 31, 205-230. 
Duncan, T. E., Duncan, S. C., & Strycker, L. A. (2006). An introduction to latent variable 
growth curve modeling: Concepts, issues, and applications (2nd ed.). Mahwah, 
NJ: Lawrence Erlbaum Associates. 
Enders, C. K., & Tofighi, D. (2008). The impact of misspecifying class-specific residual 
variances in growth mixture models. Structural Equation Modeling, 15, 75-95. 
       
 
Feldman, B. J., Masyn, K. E., & Conger, R. D. (2009). New approaches to studying 
problem behaviors: A comparison of methods for modeling longitudinal, 
categorical adolescent drinking data. Developmental Psychology, 45(3), 652-676. 
Goodman, L. A. (1974). Exploratory latent structure analysis using both identifiable and 
unidentifiable models. Biometrika, 61, 215-231. 
Hancock, G. R., & Lawrence, F. R. (2006). Using latent growth models to evaluate 
longitudinal change. In G. R. Hancock & R. O. Mueller (Eds.), Structural 
equation modeling: A second course (pp. 171-196). Greenwich, CT: Information 
Age Publishing, Inc. 
Henry, K., & Muthén, B. (2010). Multilevel latent class analysis: An application of 
adolescent smoking typologies with individual and contextual predictors. 
Structural Equation Modeling, 17, 193-215. 
Jo, B. (2002). Estimation of intervention effects with noncompliance: Alternative model 
specifications. Journal of Educational and Behavioral Statistics, 27, 385-409. 
Kreft, I. G. G., & de Leeuw, J. (1998). Introducing multilevel modeling. London, UK: 
Sage Publications. 
Kreft, I. G. G., de Leeuw, J., & Aiken, L. (1995). The effect of different forms of 
centering in hierarchical linear models. Multivariate Behavioral Research, 30, 1-22. 
Kreuter, F., & Muthén, B. (2008). Analyzing criminal trajectory profiles: Bridging 
multilevel and group-based approaches using growth mixture modeling. Journal 
of Quantitative Criminology, 24, 1-31. 
       
 
Kreuter, F., Yan, T., & Tourangeau, R. (2008). Good item or bad - can latent class 
analysis tell?: The utility of latent class analysis for the evaluation of survey 
questions. Journal of the Royal Statistical Society, Series A, 171, 723-738. 
Lazarsfeld, P. F. (1950). The logical and mathematical foundation of latent structure 
analysis. In S. A. Stouffer, L. Guttman, E. A. Suchman, P. F. Lazarsfeld, S. A. 
Star, & J. A. Clausen (Eds.), Measurement and prediction. Princeton: Princeton 
University Press. 
Lazarsfeld, P. F., & Henry, N. W. (1968). Latent structure analysis. Boston: Houghton 
Mifflin. 
Lazarus, S. S., Wu, Y., Altman, J., & Thurlow, M. L. (2010). The characteristics of low 
performing students on large-scale assessments. NCEO brief. Minneapolis: 
National Center on Educational Outcomes, University of Minnesota. 
Lindley, D. V., & Smith, A. F. M. (1972). Bayes estimates for the linear model. Journal 
of the Royal Statistical Society, Series B, 34, 1-41. 
Lockwood, J. R., & McCaffrey, D. F. (2007). Controlling for individual heterogeneity in 
longitudinal models, with applications to student achievement. Electronic Journal 
of Statistics, 1, 223-252. 
Lubke, G., & Neale, M. C. (2006). Distinguishing between latent classes and continuous 
factors: Resolution by maximum likelihood? Multivariate Behavioral Research, 
41, 499-532. 
Marsh, H. W., Lüdtke, O., Robitzsch, A., Trautwein, U., Asparouhov, T., Muthén, B., & 
Nagengast, B. (2009). Doubly-latent models of school contextual effects: 
Integrating multilevel and structural equation approaches to control measurement 
and sampling errors. Multivariate Behavioral Research, 44, 764-802. 
McLachlan, G. J., & Peel, D. (2000). Finite mixture models. New York: Wiley & Sons.  
McQuarrie, A. D. R., & Tsai, C. L. (1998). Regression and time series model selection. 
London, UK: World Scientific. 
Miles, J., & Shevlin, M. (2000). Applying regression and correlation: A guide for 
students and researchers. Thousand Oaks, CA: Sage. 
Muthén, B. (1989). Latent variable modeling in heterogeneous populations. 
Psychometrika, 54, 557-585. 
Muthén, B. (1991). Analysis of longitudinal data using latent variable models with 
varying parameters. In L. Collins & J. Horn (Eds.), Best methods for the analysis 
of change: Recent advances, unanswered questions, future directions (pp. 1-17). 
Washington DC: American Psychological Association. 
Muthén, B. (2000). Methodological issues in random coefficient growth modeling using a 
latent variable framework: Applications to the development of heavy drinking. In 
J. Rose, L. Chassin, C. Presson & J. Sherman (Eds.), Multivariate applications in 
substance use research (pp. 113-140). Hillsdale, NJ: Erlbaum. 
Muthén, B. (2001). Latent variable mixture modeling. In G. A. Marcoulides & R. E. 
Schumacker (Eds.), New developments and techniques in structural equation 
modeling (pp. 1-33). Mahwah, NJ: Lawrence Erlbaum Associates. 
Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related 
techniques for longitudinal data. In D. Kaplan (Ed.), Handbook of quantitative 
methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage 
Publications. 
Muthén, B. (2006). The potential of growth mixture modeling. Infant and Child 
Development, 15, 623-625. 
Muthén, B., & Asparouhov, T. (2009). Multilevel regression mixture analysis. Journal of 
the Royal Statistical Society, Series A, 172, 639-657. 
Muthén, B., Brown, C. H., Masyn, K., Jo, B., Khoo, S. T., Yang, C. C., Wang, C. P., 
Kellam, S., Carlin, J., & Liao, J. (2002). General growth mixture modeling for 
randomized preventive interventions. Biostatistics, 3, 459-475. 
Muthén, B., Khoo, S. T., Francis, D., & Kim Boscardin, C. (2003). Analysis of reading 
skills development from Kindergarten through first grade: An application of 
growth mixture modeling to sequential processes. In S. R. Reise & N. Duan (Eds.), 
Multilevel modeling: Methodological advances, issues, and applications (pp. 71-89). 
Mahwah, NJ: Lawrence Erlbaum Associates. 
Muthén, B. O., & Muthén, L. K. (2000). The development of heavy drinking and 
alcohol-related problems from ages 18 to 37 in a U.S. national sample. Journal of Studies 
on Alcohol, 61, 290-300. 
Muthén, B., & Shedden, K. (1999). Finite mixture modeling with mixture outcomes using 
the EM algorithm. Biometrics, 55, 463-469. 
Muthén, L. K., & Muthén, B. O. (2010). Mplus user's guide (V6.1). Los Angeles: Muthén 
& Muthén. 
Nagin, D. S. (1999). Analyzing developmental trajectories: A semi-parametric, 
group-based approach. Psychological Methods, 4, 139-157. 
       
154 
 
Nagin, D. S., & Land, K. C. (1993). Age, criminal careers, and population heterogeneity: 
Specification and estimation of a nonparametric, mixed Poisson model. 
Criminology, 31, 327-362. 
Nezlek, J. B., & Zyzniewski, L. E. (1998). Using hierarchical linear modeling to analyze 
group data. Group Dynamics: Theory, Research, and Practice, 2, 313-320. 
No Child Left Behind Act of 2001, 20 U.S.C. § 6161. 
Palardy, G., & Vermunt, J. K. (2010). Multilevel growth mixture models for classifying 
groups. Journal of Educational and Behavioral Statistics, 35, 532-565. 
Pollack, B. N. (1998). Hierarchical linear modeling and the "unit of analysis" problem: A 
solution for analyzing responses of intact group members. Group Dynamics, 2, 
299-312. 
Preacher, K. J., Wichman, A. L., MacCallum, R. C., & Briggs, N. E. (2008). Latent 
growth models. Quantitative Applications in the Social Sciences. Thousand Oaks, 
CA: Sage. 
Preacher, K., Zyphur, M., & Zhang, Z. (2010). A general multilevel SEM framework for 
assessing multilevel mediation. Psychological Methods, 15, 209-233. 
Quandt, R. E. (1958). The estimation of the parameters of a linear regression system 
obeying two separate regimes. Journal of the American Statistical Association, 
53, 873-880. 
Quandt, R. E., & Ramsey, J. B. (1972). Estimating mixtures of normal distributions and 
switching regressions. Journal of the American Statistical Association, 73, 730-752. 
 
       
 
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and 
data analysis methods (2nd ed.). Newbury Park, CA: Sage Publications. 
Sanders, W. L., & Rivers, J. C. (1996). Cumulative and residual effects of teachers on 
future student academic achievement. Knoxville, TN: University of Tennessee 
Value-Added Research Center. 
SAS Institute. (2008-2010). SAS, release 9.2 [Computer software]. Cary, NC: Author. 
Schaeffer, C. M., Petras, H., Ialongo, N., Masyn, K. E., Hubbard, S., Poduska, J., & 
Sheppard, K. (2006). A comparison of girls' and boys' aggressive-disruptive 
behavior trajectories across elementary school: Prediction to young adult 
antisocial outcomes. Journal of Consulting and Clinical Psychology, 74, 500-510. 
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461-464. 
Singer, J. D. (1999). Using SAS Proc Mixed to fit multilevel models, hierarchical 
models, and individual growth models. Journal of Educational and Behavioral 
Statistics, 23, 323-355. 
Skrondal, A., & Rabe-Hesketh, S. (2004). Generalized latent variable modeling: 
multilevel, longitudinal and structural equation models. London, England: 
Chapman & Hall/CRC. 
Springer, M. G., Ballou, D., Hamilton, L., Le, V., Lockwood, J. R., McCaffrey, D., 
Pepper, M., & Stecher, B. (2010). Teacher pay for performance: Experimental 
evidence from the project on incentives in teaching. Nashville, TN: National 
Center on Performance Incentives at Vanderbilt University. 
       
 
Titterington, D. M., Smith, A. F. M., & Makov, U. E. (1985). Statistical analysis of finite 
mixture distributions. Chichester, U.K.: John Wiley & Sons. 
Tofighi, D., & Enders, C. K. (2008). Identifying the correct number of classes in a growth 
mixture model. In G. R. Hancock & K. M. Samuelsen (Eds.), Advances in latent 
variable mixture models (pp. 317-341). Greenwich, CT: Information Age. 
U.S. Department of Education (2006). Letter to Chief State School Officers: Assessment 
Requirements of NCLB and the Growth Model Pilot Project. 
http://www2.ed.gov/policy/elsec/guid/secletter/060221.html 
U.S. Department of Education (2010). Race to the Top Program Executive Summary. 
http://www2.ed.gov/programs/racetothetop/executive-summary.pdf  
Verbeke, G., & Lesaffre, E. (1996). A linear mixed-effects model with heterogeneity in 
the random-effects population. Journal of the American Statistical Association, 
91, 217-221. 
Verbeke, G., & Molenberghs, G. (2000). Linear mixed models for longitudinal data. New 
York: Springer. 
Vermunt, J. K., & Magidson, J. (2002). Latent class cluster analysis. In J. A. Hagenaars & 
A. L. McCutcheon (Eds.), Applied latent class analysis (pp. 89-106). Cambridge, 
UK: Cambridge University Press. 
Vermunt, J. K., & Magidson, J. (2008). Latent GOLD 4.5 user's manual. Belmont, MA: 
Statistical Innovations. 
Wright, S. P., Horn, S. P., & Sanders, W. L. (1997). Teacher and classroom context 
effects on student achievement: Implications for teacher evaluation. Journal of 
Personnel Evaluation in Education, 11, 57-67. 
       
 
Wright, S. P., White, J. T., Sanders, W. L., & Rivers, J. C. (2010). SAS EVAAS statistical 
models. SAS Institute Inc. 
http://www.sas.com/resources/asset/SAS-EVAAS-Statistical-Models.pdf