A Crash Course on Social Psychology Research from Michael Zyphur: Regression - Pathway analysis - Model fit
Using Mplus in combination with R
Drawing a pathway graph/causal graph using the DiagrammeR package
While working on Project Harmony at MRC-IHRP, my primary task is building an R package that produces the tables and plots for alignment analysis (designed for the multi-factor categorical case) by extracting information from the Mplus output. Since I have the chance to get exposure to Mplus, why not teach myself a new tool and share it on my blog?
Through Mplus, I also saw a chance to dive into Social Psychology Research. What on earth? The same concepts as in statistics, but a whole new world of terminology!
My notes on Prof. Michael Zyphur’s workshop on Structural Equation and Multilevel Modeling in Mplus fall into five parts (the workshop ran for 5 days, so I divided the notes into five topics).
Here’s the first part (1) of the so-called ‘A Crash Course on Social Psychology Research.’
Structural Equation and Multilevel Modeling in Mplus
The material can be downloaded here
Reminder:
- Path analysis: regression for observed variables
- CFA: regression from latent --> observed variables
- SEM: regression among latent variables
- Multilevel models: regression at multiple 'levels'
- Latent growth: model change with latent variables
library(ggplot2) # ggplot, stat_function
p <- seq(-5,5,by=0.1)
df <- data.frame(p)
ggplot(data=df, aes(x=p))+
stat_function(fun=dnorm, args=list(mean=0, sd=sqrt(.2)), aes(colour = "mu=0,sigma2=.2")) +
stat_function(fun=dnorm, args=list(mean=0, sd=sqrt(1.0)), aes(colour = "mu=0,sigma2=1.0")) +
stat_function(fun=dnorm, args=list(mean=0, sd=sqrt(5.0)), aes(colour = "mu=0,sigma2=5.0")) +
stat_function(fun=dnorm, args=list(mean=-2, sd=sqrt(2.0)), aes(colour = "mu=-2,sigma2=2")) +
scale_y_continuous(limits=c(0,1.0)) +
scale_colour_manual("", values = c("palegreen", "orange", "olivedrab", "blue")) +
ylab("Density") +
ggtitle("PDF of Normal Distribution") +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5))
#generate the data
gibbs <- function (n, rho) {
  mat <- matrix(ncol = 2, nrow = n)
  x <- 0
  y <- 0
  mat[1, ] <- c(x, y)
  for (i in 2:n) {
    # conditional variance is 1 - rho^2, and rnorm() expects a standard deviation
    x <- rnorm(1, rho * y, sqrt(1 - rho^2))
    y <- rnorm(1, rho * x, sqrt(1 - rho^2))
    mat[i, ] <- c(x, y)
  }
  mat
}
bvn <- gibbs(10000, 0.98)
#setup
library(rgl) # plot3d, quads3d, lines3d, grid3d, par3d, axes3d, box3d, mtext3d
library(car) # dataEllipse
#process the data
hx <- hist(bvn[,2], plot=FALSE)
hxs <- hx$density / sum(hx$density)
hy <- hist(bvn[,1], plot=FALSE)
hys <- hy$density / sum(hy$density)
## [xy]max: so that there's no overlap in the adjoining corner
xmax <- tail(hx$breaks, n=1) + diff(tail(hx$breaks, n=2))
ymax <- tail(hy$breaks, n=1) + diff(tail(hy$breaks, n=2))
zmax <- max(hxs, hys)
#Basic scatterplot on the floor
## the base scatterplot
plot3d(bvn[,2], bvn[,1], 0, zlim=c(0, zmax), pch='.',
xlab='X', ylab='Y', zlab='', axes=FALSE)
par3d(scale=c(1,1,3))
#Histograms on the back walls
## manually create each histogram
for (ii in seq_along(hx$counts)) {
  quads3d(hx$breaks[ii]*c(.9,.9,.1,.1) + hx$breaks[ii+1]*c(.1,.1,.9,.9),
          rep(ymax, 4),
          hxs[ii]*c(0,1,1,0), color='gray80')
}
for (ii in seq_along(hy$counts)) {
  quads3d(rep(xmax, 4),
          hy$breaks[ii]*c(.9,.9,.1,.1) + hy$breaks[ii+1]*c(.1,.1,.9,.9),
          hys[ii]*c(0,1,1,0), color='gray80')
}
#Summary Lines
## I use these to ensure the lines are plotted "in front of" the
## respective dot/hist
bb <- par3d('bbox')
inset <- 0.02 # percent off of the floor/wall for lines
x1 <- bb[1] + (1-inset)*diff(bb[1:2])
y1 <- bb[3] + (1-inset)*diff(bb[3:4])
z1 <- bb[5] + inset*diff(bb[5:6])
## even with draw=FALSE, dataEllipse still pops up a dev, so I create
## a dummy dev and destroy it ... better way to do this?
###dev.new()
de <- dataEllipse(bvn[,1], bvn[,2], draw=FALSE, levels=0.95)
###dev.off()
## the ellipse
lines3d(de[,2], de[,1], z1, color='green', lwd=3)
## the two density curves, probability-style
denx <- density(bvn[,2])
lines3d(denx$x, rep(y1, length(denx$x)), denx$y / sum(hx$density), col='red', lwd=3)
deny <- density(bvn[,1])
lines3d(rep(x1, length(deny$x)), deny$x, deny$y / sum(hy$density), col='blue', lwd=3)
#Beautifications
grid3d(c('x+', 'y+', 'z-'), n=10)
box3d()
axes3d(edges=c('x-', 'y-', 'z+'))
outset <- 1.2 # place text outside of bbox *this* percentage
mtext3d('P(X)', edge='x+', pos=c(0, ymax, outset * zmax))
mtext3d('P(Y)', edge='y+', pos=c(xmax, 0, outset * zmax))
\[ y_i = \nu + \beta x_i + \epsilon_i \]
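To make the equation concrete, here is a minimal sketch in R that simulates data from it and recovers \(\nu\) and \(\beta\) with lm(). The sample size and the "true" values of nu and beta are my own illustrative assumptions, not numbers from the workshop.

```r
# Simulate y_i = nu + beta * x_i + epsilon_i and recover the parameters.
# All numbers here are illustrative assumptions.
set.seed(1)
n    <- 500
nu   <- 1.0   # intercept
beta <- 0.5   # slope
x    <- rnorm(n)
y    <- nu + beta * x + rnorm(n)  # epsilon_i ~ N(0, 1)

fit <- lm(y ~ x)
est <- coef(fit)      # "(Intercept)" estimates nu, "x" estimates beta
print(round(est, 3))
```

The estimates should land close to the assumed values, which is all the regression equation promises when its assumptions hold.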
Causal statements are helpful for action
Conditions for \(x \rightarrow y\) causality
Regression assesses the second, and helps with the third
Removing the effect of \(z\) in \(x \rightarrow y\) effect is
These all mean the same thing:
We somehow make \(z\) irrelevant for \(x \rightarrow y\)
Allows estimating independent effects
\[ y_i = \nu + \beta_1 x_i + \beta_2 z_i+ \epsilon_i \]
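A small simulation can show what adding \(z\) to the equation buys us. In this sketch \(z\) drives both \(x\) and \(y\), so the slope of \(x\) is inflated until \(z\) enters the model. The coefficients (0.8, 0.3, 0.6) are assumptions I picked for illustration.

```r
# z is a common cause of x and y; coefficients are illustrative assumptions.
set.seed(2)
n <- 2000
z <- rnorm(n)
x <- 0.8 * z + rnorm(n)
y <- 1 + 0.3 * x + 0.6 * z + rnorm(n)

naive    <- coef(lm(y ~ x))[["x"]]      # ignores z, so it absorbs part of z's effect
adjusted <- coef(lm(y ~ x + z))[["x"]]  # beta_1 with z held constant

print(round(c(naive = naive, adjusted = adjusted), 3))
```

Only the adjusted slope comes back near the assumed 0.3. This is the sense in which regression "controls for" \(z\), and it only works for the confounders we actually measure and include.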
Mplus Team
Mplus Overview
Mplus Features
Mplus Caveat
Example
Title:
Start input section with the heading and a colon, then type specific commands/options and end each statement with a semi-colon;
! To ignore a command line, start it with an exclamation mark
Data:
File is data.dat;
Variable:
Names are employ salary education height weight;
Usevariables are employ salary height education;
Analysis:
Estimator = ML;
Model:
Employ salary education on height;
Output:
Standardized;
Individual example.inp
cat(readLines('Examples/Day 1, Session 2 (Mplus and Estimation)/Individual Example.inp'), sep = '\n')
Title:
Bogus Mplus text;
Data:
File is individual.dat;
Analysis:
Estimator = ml;
!Estimator = Bayes;
Variable:
names are y x1 x2;
Model:
y on x1 x2;
Output:
Standardized sampstat TECH1 TECH8;
Summary example.inp
cat(readLines('Examples/Day 1, Session 2 (Mplus and Estimation)/Summary Example.inp'), sep = '\n')
Title:
Bogus Mplus text;
Data:
File is summary.dat;
Type = means stdeviations correlation;
Nobservations = 500;
Variable:
names are y x1 x2;
Analysis:
Estimator = ML;
Model:
y on x1 x2;
Output:
Standardized sampstat TECH1;
Individual.dat
cat(readLines('Examples/Day 1, Session 2 (Mplus and Estimation)/individual.dat', n = 20), sep = '\n')
-0.354517 0.573051 -0.175230
0.561655 -0.368095 1.090042
0.315551 -0.577052 0.425472
3.347049 1.088520 1.149353
-0.122389 -0.694153 -0.766538
-0.251276 -0.017487 -1.367410
-0.517996 -0.817974 -1.559255
1.888854 -0.658335 1.007614
0.461254 0.463916 -0.898300
2.237483 1.533398 0.180512
0.480991 -0.096545 -0.352276
0.165901 -1.341994 -1.445909
1.864947 1.027419 0.677408
-0.466245 -0.138712 -0.759287
2.567804 0.483444 0.959731
-0.024201 -0.507631 -0.517296
-1.912698 0.761720 -1.901134
-1.350069 -0.736562 2.318569
0.433773 0.723880 0.111837
-0.977083 0.155868 -0.897112
cat(readLines('Examples/Day 1, Session 2 (Mplus and Estimation)/summary.dat'), sep = '\n')
.485 .001 -.042
1.552 1.046 .978
1.0
.665 1.0
.427 .028 1.0
Usevariables are y x1 x2;
- Type = General is the default
- Type = Twolevel or Threelevel means multilevel
- Type = Random implies random slopes
- Estimator = ML, or Estimator = Bayes
Frequentist: Estimation uses data & model
Bayes: Estimation uses data, model, & prior prob.
Model: ON Command
ON refers to a regression slope \(\beta\)
Model: WITH Command
WITH refers to a covariance \(\Theta\) or \(\Psi\)
Model: BY Command
BY refers to factor loadings (slopes)
Model notation
Mplus notation for freeing and fixing estimates
y ON x;
* freely-estimated \(\beta\)
y ON x@.5;
* \(\beta\) is NOT estimated, but fixed to .5
[y@0];
* an intercept for y constrained to 0.0
f1 BY y*;
* * = freely estimate, so estimates factor loading for y on factor “f1”
f1 BY y1-y5@1;
* factor loadings for “f1” = 1 for variables y1, y2, y3, y4, y5
Labeling and constraining/fixing estimates
y ON x (b1);
y ON x z (b2);
Model Constraint
TITLE: this is an example of a two-group twin
model for continuous outcomes using parameter constraints
DATA: FILE = ex5.21.dat;
VARIABLE: NAMES = y1 y2 g;
GROUPING = g(1 = mz 2 = dz);
MODEL: [y1-y2] (1);
y1-y2 (var);
y1 WITH y2 (covmz);
MODEL dz: y1 WITH y2 (covdz);
MODEL CONSTRAINT:
NEW(a c e h);
var = a**2 + c**2 + e**2;
covmz = a**2 + c**2;
covdz = 0.5*a**2 + c**2;
h = a**2/(a**2 + c**2 + e**2);
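The three constraints can be solved by hand, which helps to see what Mplus is estimating. Below is a sketch in R that inverts them: subtracting covdz from covmz gives \(0.5a^2\), and the rest follows. The function name ace_from_moments and the input values are hypothetical, chosen only to make the arithmetic easy to follow.

```r
# Invert the MODEL CONSTRAINT equations:
#   var   = a^2 + c^2 + e^2
#   covmz = a^2 + c^2
#   covdz = 0.5 * a^2 + c^2
# Function name and inputs are hypothetical, for illustration only.
ace_from_moments <- function(var_y, cov_mz, cov_dz) {
  a2 <- 2 * (cov_mz - cov_dz)  # additive genetic variance
  c2 <- 2 * cov_dz - cov_mz    # shared-environment variance
  e2 <- var_y - cov_mz         # non-shared environment variance
  c(a2 = a2, c2 = c2, e2 = e2, h = a2 / (a2 + c2 + e2))
}

ace <- ace_from_moments(var_y = 1.0, cov_mz = 0.6, cov_dz = 0.4)
print(ace)
```

With these made-up inputs the heritability h is 0.4, i.e. 40% of the variance is attributed to the additive genetic component.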
Model Priors
For Bayes, we can specify prior probabilities
These are distributions…
Model:
y2 ON y1 (b1);
Model Priors:
b1~N(.25, 1);
We say “b1” is distributed as (\(\sim\)) normal (\(N\)) with mean and variance (\(\mu, \sigma^2\)) of .25 and 1
Mplus has defaults that are ‘diffuse priors’
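One practical trap when moving between Mplus and R: the prior above is stated with a variance, while R's normal functions take a standard deviation. A small sketch of what the N(.25, 1) prior implies for b1 (variable names are my own):

```r
# Mplus writes N(mean, variance); R's dnorm/qnorm expect a standard deviation.
prior_mean <- 0.25
prior_var  <- 1
prior_sd   <- sqrt(prior_var)

# Central 95% prior interval for b1
prior_interval <- qnorm(c(0.025, 0.975), mean = prior_mean, sd = prior_sd)
print(round(prior_interval, 2))
```

Here the variance is 1, so the distinction does not change anything, but with a prior such as N(0, 4) the standard deviation passed to R would be 2, not 4.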
Model Indirect
Our job is to
library(DiagrammeR) # grViz
grViz("
digraph causal{
# a 'graph' statement
graph [overlap = true, fontsize = 10]
# several 'node' statements
node [shape = box,
fontname = Helvetica]
WRS [label = 'Work Role Stress (WRS)']
WFC [label = 'Work Family Conflict (WFC)']
JD [label = 'Job Distress (JD)']
TI [label = 'Turnover Intentions (TI)']
LD [label = 'Life Distress (LD)']
# Edges
edge[color=black, arrowhead=vee]
WRS->WFC [label=<β<SUB>1</SUB>>]
WRS->TI [label=<β<SUB>2</SUB>>]
WRS->JD [label=<β<SUB>3</SUB>>]
WFC->JD [label=<β<SUB>4</SUB>>]
WFC->LD [label=<β<SUB>5</SUB>>]
JD->TI [label=<β<SUB>6</SUB>>]
JD->LD [label=<β<SUB>7</SUB>>]
TI->LD[dir=both, label=<ψ>]
d1->WFC
d1 [shape=plaintext,label='']
d2->JD
d2 [shape=plaintext,label='']
d3->TI
d3 [shape=plaintext,label='']
d4->LD
d4 [shape=plaintext,label='']
{rank = same; WRS; WFC}
{rank = same; TI; LD}
}")
Variable:
Model:
TI on JD WRS;
LD on JD WFC;
JD on WRS WFC;
WFC on WRS;
OR
TI LD on JD;
JD on WRS WFC;
TI WFC on WRS; LD on WFC;
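Each ON statement is one regression equation, so the idea can be sketched in R as a set of lm() fits, one per dependent variable. The data below are simulated with coefficients I made up for illustration. This is not a replacement for the joint ML fit in Mplus, which also estimates the residual covariance between TI and LD.

```r
# Path analysis as a set of regressions on observed variables.
# Simulated data; all coefficients are illustrative assumptions.
set.seed(3)
n   <- 5000
WRS <- rnorm(n)
WFC <- 0.6 * WRS + rnorm(n)
JD  <- 0.4 * WRS + 0.3 * WFC + rnorm(n)
TI  <- 0.5 * JD + 0.1 * WRS + rnorm(n)
LD  <- 0.7 * JD + 0.2 * WFC + rnorm(n)
d   <- data.frame(WRS, WFC, JD, TI, LD)

# One lm() per Mplus "ON" statement
fits <- list(
  TI  = lm(TI ~ JD + WRS, data = d),   # TI on JD WRS;
  LD  = lm(LD ~ JD + WFC, data = d),   # LD on JD WFC;
  JD  = lm(JD ~ WRS + WFC, data = d),  # JD on WRS WFC;
  WFC = lm(WFC ~ WRS, data = d)        # WFC on WRS;
)
slopes <- lapply(fits, function(f) round(coef(f)[-1], 2))
print(slopes)
```

The left-hand side of each formula is the dependent variable and the right-hand side lists its predictors, mirroring the Mplus syntax.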
Code in Mplus:
cat(readLines('Examples/Day 1, Session 3 (Path Analysis)/Grandey & Cropanzano 1999_Hai.inp'), sep = '\n')
Title:
Bogus Mplus text;
Data:
File is Grandey & Cropanzano 1999.txt;
Type = means stdeviations correlation;
Nobservations = 132;
Variable:
names are
WRS ! work role stress
FRS ! family role stress
WFC ! work-family conflict
FWC ! family-work conflict
JD ! job distress
FD ! family distress
LD ! life distress
TI ! turnover intentions
PPH ! poor physical health
SE; ! self-esteem
Usevariables are
WRS TI WFC JD LD;
Analysis:
Estimator = ML;
Model:
TI on JD WRS; ! left-side: dependent vbl; right-side: predictors
LD on JD WFC;
JD on WRS WFC;
WFC on WRS;
Output:
Standardized Sampstat TECH1 TECH8;
Mplus output
cat(readLines('Examples/Day 1, Session 3 (Path Analysis)/Grandey & Cropanzano 1999_Hai.out'), sep = '\n')
MODEL RESULTS ! Unstandardized Output
Two-Tailed
Estimate S.E. Est./S.E. P-Value
TI ON
JD 0.536 0.112 4.767 0.000 ! 1-unit increase on JD, .536 increase on TI
WRS 0.092 0.114 0.812 0.417
LD ON
JD 0.682 0.073 9.399 0.000
WFC 0.186 0.057 3.255 0.001
JD ON
WRS 0.381 0.077 4.947 0.000
WFC 0.296 0.060 4.947 0.000
WFC ON
WRS 0.631 0.098 6.458 0.000
LD WITH
TI -0.006 0.042 -0.151 0.880 ! Residual covariances
Intercepts
TI 0.539 0.277 1.950 0.051
WFC 1.853 0.256 7.228 0.000
JD 0.555 0.208 2.669 0.008
LD 0.373 0.185 2.019 0.043
Residual Variances
TI 0.745 0.092 8.124 0.000
WFC 0.800 0.098 8.124 0.000
JD 0.377 0.046 8.124 0.000
LD 0.311 0.038 8.124 0.000
QUALITY OF NUMERICAL RESULTS
Condition Number for the Information Matrix 0.141E-02
(ratio of smallest to largest eigenvalue)
STANDARDIZED MODEL RESULTS
STDYX Standardization ! Standardized Output (STDYX)
Two-Tailed
Estimate S.E. Est./S.E. P-Value
TI ON
JD 0.438 0.086 5.114 0.000
WRS 0.075 0.092 0.813 0.416
LD ON
JD 0.628 0.058 10.848 0.000
WFC 0.218 0.066 3.273 0.001
JD ON
WRS 0.376 0.073 5.178 0.000
WFC 0.376 0.073 5.178 0.000
WFC ON
WRS 0.490 0.066 7.408 0.000
LD WITH
TI -0.013 0.087 -0.151 0.880
Intercepts
TI 0.547 0.300 1.823 0.068
WFC 1.806 0.327 5.528 0.000
JD 0.688 0.288 2.389 0.017
LD 0.425 0.228 1.864 0.062
Residual Variances
TI 0.766 0.065 11.870 0.000
WFC 0.760 0.065 11.724 0.000
JD 0.579 0.065 8.854 0.000
LD 0.405 0.054 7.446 0.000
STDY Standardization
Two-Tailed
Estimate S.E. Est./S.E. P-Value
TI ON
JD 0.438 0.086 5.114 0.000
WRS 0.094 0.115 0.814 0.416
LD ON
JD 0.628 0.058 10.848 0.000
WFC 0.218 0.066 3.273 0.001
JD ON
WRS 0.472 0.089 5.279 0.000
WFC 0.376 0.073 5.178 0.000
WFC ON
WRS 0.615 0.078 7.844 0.000
LD WITH
TI -0.013 0.087 -0.151 0.880
Intercepts
TI 0.547 0.300 1.823 0.068
WFC 1.806 0.327 5.528 0.000
JD 0.688 0.288 2.389 0.017
LD 0.425 0.228 1.864 0.062
Residual Variances
TI 0.766 0.065 11.870 0.000
WFC 0.760 0.065 11.724 0.000
JD 0.579 0.065 8.854 0.000
LD 0.405 0.054 7.446 0.000
STD Standardization
Two-Tailed
Estimate S.E. Est./S.E. P-Value
TI ON
JD 0.536 0.112 4.767 0.000
WRS 0.092 0.114 0.812 0.417
LD ON
JD 0.682 0.073 9.399 0.000
WFC 0.186 0.057 3.255 0.001
JD ON
WRS 0.381 0.077 4.947 0.000
WFC 0.296 0.060 4.947 0.000
WFC ON
WRS 0.631 0.098 6.458 0.000
LD WITH
TI -0.006 0.042 -0.151 0.880
Intercepts
TI 0.539 0.277 1.950 0.051
WFC 1.853 0.256 7.228 0.000
JD 0.555 0.208 2.669 0.008
LD 0.373 0.185 2.019 0.043
Residual Variances
TI 0.745 0.092 8.124 0.000
WFC 0.800 0.098 8.124 0.000
JD 0.377 0.046 8.124 0.000
LD 0.311 0.038 8.124 0.000
R-SQUARE
Observed Two-Tailed
Variable Estimate S.E. Est./S.E. P-Value
TI 0.234 0.065 3.631 0.000
WFC 0.240 0.065 3.704 0.000
JD 0.421 0.065 6.436 0.000
LD 0.595 0.054 10.947 0.000
TECHNICAL 1 OUTPUT
PARAMETER SPECIFICATION
NU
TI WFC JD LD WRS
________ ________ ________ ________ ________
0 0 0 0 0
LAMBDA
TI WFC JD LD WRS
________ ________ ________ ________ ________
TI 0 0 0 0 0
WFC 0 0 0 0 0
JD 0 0 0 0 0
LD 0 0 0 0 0
WRS 0 0 0 0 0
THETA
TI WFC JD LD WRS
________ ________ ________ ________ ________
TI 0
WFC 0 0
JD 0 0 0
LD 0 0 0 0
WRS 0 0 0 0 0
ALPHA
TI WFC JD LD WRS
________ ________ ________ ________ ________
1 2 3 4 0
BETA
TI WFC JD LD WRS
________ ________ ________ ________ ________
TI 0 0 5 0 6
WFC 0 0 0 0 7
JD 0 8 0 0 9
LD 0 10 11 0 0
WRS 0 0 0 0 0
PSI
TI WFC JD LD WRS
________ ________ ________ ________ ________
TI 12
WFC 0 13
JD 0 0 14
LD 15 0 0 16
WRS 0 0 0 0 0
STARTING VALUES
NU
TI WFC JD LD WRS
________ ________ ________ ________ ________
0.000 0.000 0.000 0.000 0.000
LAMBDA
TI WFC JD LD WRS
________ ________ ________ ________ ________
TI 1.000 0.000 0.000 0.000 0.000
WFC 0.000 1.000 0.000 0.000 0.000
JD 0.000 0.000 1.000 0.000 0.000
LD 0.000 0.000 0.000 1.000 0.000
WRS 0.000 0.000 0.000 0.000 1.000
THETA
TI WFC JD LD WRS
________ ________ ________ ________ ________
TI 0.000
WFC 0.000 0.000
JD 0.000 0.000 0.000
LD 0.000 0.000 0.000 0.000
WRS 0.000 0.000 0.000 0.000 0.000
ALPHA
TI WFC JD LD WRS
________ ________ ________ ________ ________
2.120 3.430 2.520 2.730 2.500
BETA
TI WFC JD LD WRS
________ ________ ________ ________ ________
TI 0.000 0.000 0.000 0.000 0.000
WFC 0.000 0.000 0.000 0.000 0.000
JD 0.000 0.000 0.000 0.000 0.000
LD 0.000 0.000 0.000 0.000 0.000
WRS 0.000 0.000 0.000 0.000 0.000
PSI
TI WFC JD LD WRS
________ ________ ________ ________ ________
TI 0.490
WFC 0.000 0.530
JD 0.000 0.000 0.328
LD 0.000 0.000 0.000 0.387
WRS 0.000 0.000 0.000 0.000 0.635
What’s \(R^2\) for our variables?
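One way to see where the R-SQUARE table comes from: for each dependent variable, \(R^2\) is one minus its standardized residual variance. A quick check in R, using the STDYX residual variances copied from the output above:

```r
# Standardized (STDYX) residual variances copied from the Mplus output
std_resid_var <- c(TI = 0.766, WFC = 0.760, JD = 0.579, LD = 0.405)

# R-square = 1 - standardized residual variance
r2 <- 1 - std_resid_var
print(r2)
```

The results (.234, .240, .421, .595) match the R-SQUARE section of the output, so the model explains about 60% of the variance in LD but under a quarter of the variance in TI.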
grViz("
digraph causal{
# a 'graph' statement
graph [overlap = true, fontsize = 10]
# several 'node' statements
node [shape = box,
fontname = Helvetica]
WRS [label = 'Work Role Stress (WRS)']
WFC [label = 'Work Family Conflict (WFC)']
JD [label = 'Job Distress (JD)']
TI [label = 'Turnover Intentions (TI)']
LD [label = 'Life Distress (LD)']
# Edges
edge[color=black, arrowhead=vee]
WRS->WFC [label=<β<SUB>1</SUB>=.63/.49**>]
WRS->TI [label=<β<SUB>2</SUB>=.09/.08>]
WRS->JD [label=<β<SUB>3</SUB>=.38/.38**>]
WFC->JD [label=<β<SUB>4</SUB>=.30/.38**>]
WFC->LD [label=<β<SUB>5</SUB>=.19/.22**>]
JD->TI [label=<β<SUB>6</SUB>=.54/.44**>]
JD->LD [label=<β<SUB>7</SUB>=.68/.63**>]
TI->LD[dir=both, label=<ψ=-.006/-.013>]
d1->WFC
d1 [shape=plaintext,label='']
d2->JD
d2 [shape=plaintext,label='']
d3->TI
d3 [shape=plaintext,label='']
d4->LD
d4 [shape=plaintext,label='']
{rank = same; WRS; WFC}
{rank = same; TI; LD}
}")
For example: \(\beta_7\) = .68/.63**, i.e. the unstandardized estimate, then the STDYX estimate (standardized on both Y and X), with ** marking a significant p-value
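The link between the unstandardized and standardized columns can be reproduced by hand. The sketch below does it for WFC ON WRS, using only numbers from the output above. One assumption to flag: I take 0.635, the PSI starting value for WRS in TECH1, to be the sample variance of WRS.

```r
# WFC ON WRS, values copied from the Mplus output
b             <- 0.631  # unstandardized slope
resid_var     <- 0.800  # unstandardized residual variance of WFC
std_resid_var <- 0.760  # STDYX residual variance of WFC
var_x         <- 0.635  # assumed sample variance of WRS (TECH1 PSI starting value)

sd_y <- sqrt(resid_var / std_resid_var)  # total SD of WFC
sd_x <- sqrt(var_x)

stdyx <- b * sd_x / sd_y  # standardize on both y and x
stdy  <- b / sd_y         # standardize on y only
print(round(c(STDYX = stdyx, STDY = stdy), 3))
```

Both values agree with the output (.490 and .615). A STDYX slope reads as "SD change in y per SD change in x", while STDY keeps x in its original units, which is the more natural choice for binary predictors.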
cat(readLines('Examples/Day 1, Session 3 (Path Analysis)/Grandey & Cropanzano 1999 (no covariance).inp'), sep = '\n')
Title:
Bogus Mplus text;
Data:
File is Grandey & Cropanzano 1999.txt;
Type = means stdeviations correlation;
Nobservations = 132;
Variable:
names are
WRS ! work role stress
FRS ! family role stress
WFC ! work-family conflict
FWC ! family-work conflict
JD ! job distress
FD ! family distress
LD ! life distress
TI ! turnover intentions
PPH ! poor physical health
SE; ! self-esteem
Usevariables are
WRS WFC JD LD TI;
Analysis:
!Estimator = ML;
Model:
TI on JD WRS;
LD on JD WFC;
JD on WRS WFC;
WFC on WRS;
TI with LD@0;
Output:
Standardized sampstat Tech1 Tech8;
cat(readLines('Examples/Day 1, Session 3 (Path Analysis)/Grandey & Cropanzano 1999 (new model1).inp'), sep = '\n')
Title:
Bogus Mplus text;
Data:
File is Grandey & Cropanzano 1999.txt;
Type = means stdeviations correlation;
Nobservations = 132;
Variable:
names are
WRS ! work role stress
FRS ! family role stress
WFC ! work-family conflict
FWC ! family-work conflict
JD ! job distress
FD ! family distress
LD ! life distress
TI ! turnover intentions
PPH ! poor physical health
SE; ! self-esteem
Usevariables are
WRS WFC JD LD TI;
Analysis:
Estimator = ML;
Model:
WRS on WFC JD TI;
WFC on JD LD;
JD on TI LD;
Output:
Standardized Sampstat TECH1 TECH8;
cat(readLines('Examples/Day 1, Session 3 (Path Analysis)/Grandey & Cropanzano 1999 (new model2).inp'), sep = '\n')
Title:
Bogus Mplus text;
Data:
File is Grandey & Cropanzano 1999.txt;
Type = means stdeviations correlation;
Nobservations = 132;
Variable:
names are
WRS ! work role stress
FRS ! family role stress
WFC ! work-family conflict
FWC ! family-work conflict
JD ! job distress
FD ! family distress
LD ! life distress
TI ! turnover intentions
PPH ! poor physical health
SE; ! self-esteem
Usevariables are
WRS WFC JD LD TI;
Analysis:
Estimator = ML;
Model:
TI LD on JD;
JD on WRS WFC;
WRS with WFC;
Output:
Standardized Sampstat TECH1 TECH8;
ON
WITH
Lots of literature & many indices for ML estimator
Absolute fit
Relative/Incremental/Comparative fit
Information Criteria
Bayesians have less work on the topic …
In statistics programs we get fit from:
For comparing models that we estimate:
Information we have vs. what’s estimated
The difference is our degrees of freedom (df)
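As a worked count for the path model above, here is a sketch in R. It reflects my reading of how the pieces are counted when WRS is treated as an exogenous predictor: the saturated model has an intercept and a WRS slope for each dependent variable plus all residual variances and covariances, and the estimated model has the 16 numbered parameters in TECH1.

```r
# Degrees of freedom for the Grandey & Cropanzano path model.
# The counting scheme is my own reading, treat it as an assumption.
n_dep <- 4  # TI, WFC, JD, LD
n_exo <- 1  # WRS

h1_params <- n_dep +                  # intercepts
  n_dep * n_exo +                     # slopes on WRS
  n_dep * (n_dep + 1) / 2             # residual variances and covariances
h0_params <- 16                       # highest parameter number in TECH1

df <- h1_params - h0_params
print(c(h1 = h1_params, h0 = h0_params, df = df))
```

The two degrees of freedom correspond to the two paths the model leaves out: WRS to LD and WFC to TI.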
What about data/model fit ?
Analysis/Estimated model (\(H_0\)) | Unrestricted/Saturated/Alternate model (\(H_1\)) |
---|---|
Model: TI on JD WRS; LD on JD WFC; JD on WRS WFC; WFC on WRS; | Model: JD LD TI WRS WFC with JD LD TI WRS WFC; |
Baseline/Null Model |
---|
Model: |
Using \(\chi^2\) to Compare Estimated Models
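For two nested models, the difference between their \(\chi^2\) values is itself \(\chi^2\)-distributed, with degrees of freedom equal to the difference in df. A minimal sketch in R with made-up fit values (none of these numbers come from the workshop files):

```r
# Chi-square difference test for nested models; all inputs are hypothetical.
chisq_restricted <- 12.4; df_restricted <- 3  # model with one extra constraint
chisq_full       <- 4.1;  df_full       <- 2  # less restricted model

delta_chisq <- chisq_restricted - chisq_full
delta_df    <- df_restricted - df_full
p_value     <- pchisq(delta_chisq, df = delta_df, lower.tail = FALSE)

print(round(c(delta_chisq = delta_chisq, delta_df = delta_df, p = p_value), 4))
```

A small p-value says the extra constraint makes the fit significantly worse. The test only applies when one model is a restricted version of the other.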
RMSEA:
\[ \frac{\sqrt{\chi^2_{Estimated} - df_{Estimated}}}{\sqrt{df_{Estimated}\times (N-1)}}\]
CFI:
\[ \frac{(\chi^2_{Baseline} - df_{Baseline}) - (\chi^2_{Estimated} - df_{Estimated})}{\chi^2_{Baseline} - df_{Baseline}}\]
TLI:
\[ \frac{\chi^2_{Baseline}/df_{Baseline} - \chi^2_{Estimated}/df_{Estimated}}{\chi^2_{Baseline}/ df_{Baseline} - 1}\]
AIC, with \(k\) the number of observed variables:
\[ \chi^2_{Estimated} + k\times (k+1) - 2\times df_{Estimated} \]
BIC:
\[ \chi^2_{Estimated} + \ln(N)\times [k\times (k+1)/2 - df_{Estimated}] \]
Sample-size adjusted BIC:
\[ \chi^2_{Estimated} + \ln((N+2)/24)\times [k\times (k+1)/2 - df_{Estimated}] \]
For both AIC and BIC, no significance tests… see recommendations in interpreting differences
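To get a feel for the indices, here is a sketch in R that computes them from \(\chi^2\) values, written from the standard textbook definitions. The function name fit_indices and every input value are hypothetical. The \(\chi^2\)-based AIC and BIC count \(k(k+1)/2 - df\) free parameters, so they will not equal the log-likelihood-based values Mplus prints, although differences between models fitted to the same data should be comparable.

```r
# Fit indices from chi-square values; function name and inputs are hypothetical.
fit_indices <- function(chisq, df, chisq_b, df_b, n, k) {
  q <- k * (k + 1) / 2 - df  # free parameters in the covariance structure
  c(
    rmsea = sqrt(max(chisq - df, 0)) / sqrt(df * (n - 1)),
    cfi   = ((chisq_b - df_b) - (chisq - df)) / (chisq_b - df_b),
    tli   = (chisq_b / df_b - chisq / df) / (chisq_b / df_b - 1),
    aic   = chisq + 2 * q,
    bic   = chisq + log(n) * q,
    sabic = chisq + log((n + 2) / 24) * q
  )
}

idx <- fit_indices(chisq = 6, df = 2, chisq_b = 200, df_b = 10, n = 132, k = 5)
print(round(idx, 3))
```

With these made-up inputs the CFI looks fine (about .98) while the RMSEA does not (about .12), a reminder that the indices can disagree, especially with few degrees of freedom.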
MODINDICES command in Mplus
cat(readLines('Examples/Day 1, Session 4 (Model Fit)/Grandey & Cropanzano 1999 (ML) modindices.inp'), sep = '\n')
Title:
Bogus Mplus text;
Data:
File is Grandey & Cropanzano 1999 MonteCarlo.dat;
Variable:
names are JD LD TI WRS WFC;
Analysis:
Estimator = ML;
Model:
TI on JD WRS;
LD on JD WFC;
JD on WRS WFC;
WFC on WRS;
Output:
Standardized sampstat Tech1 Tech8 MODINDICES(1);
cat(readLines('Examples/Day 1, Session 4 (Model Fit)/Grandey & Cropanzano 1999 (Bayes).inp'), sep = '\n')
Title:
Bogus Mplus text;
Data:
File is Grandey & Cropanzano 1999 MonteCarlo.dat;
Variable:
names are JD LD TI WRS WFC;
Analysis:
Estimator = Bayes;
fbiterations=10000;
Processors=2;
Model:
TI on JD WRS;
LD on JD WFC;
JD on WRS WFC;
WFC on WRS;
Output:
Standardized sampstat Tech1 Tech8;
Grace, J. B., & Bollen, K. A. (2005). Interpreting the results from multiple regression and structural equation models. Bulletin of the Ecological Society of America, 86, 283-295.
Cohen, Cohen, West, & Aiken (2003). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences (3rd). Lawrence Erlbaum & Associates.
Myers, Well, & Lorch (2010). Research Design and Statistical Analysis (3rd). Routledge Academic.
Tabachnick & Fidell (2006). Using Multivariate Statistics (5th). Allyn & Bacon.
Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London, Series A 222, 309–368.
Mplus User’s Guide and Diagrammer Documentation
Streiner, D. L. 2005. Finding Our Way: An Introduction to Path Analysis. Canadian Journal of Psychiatry, 50, 115-122.
Wright, S. (1921). Correlation and causation. J. Agricultural Research 20: 557–585.
Wright, S. (1934). The method of path coefficients. Annals of Mathematical Statistics 5: 161–215.
Angrist, J. D., & Krueger, A. B. (2001). Instrumental variables and the search for identification: From supply and demand to natural experiments. Journal of Economic Perspectives, 15: 69-85.
Muthén, B. (2002). Beyond SEM: General latent variable modeling. Behaviormetrika, 29, 81-117.
Hu & Bentler (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives, Structural Equation Modeling, 6(1), 1-55.
Personality and Individual Differences, Volume 42, Issue 5. 2007.
Neyman, J., & Pearson, E. S. 1933. On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society, A, 231, 289-337.
Raftery, A. E. (1995). Bayesian model selection in social research. Sociological Methodology, 25, 111-16
Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773-795.
McDonald, R. P. (2010). Structural models and the art of approximation. Perspectives on Psychological Science, 5, 675-686.
Cheung, G. W., & Rensvold, R. B. (1999). Testing factorial invariance across groups: a reconceptualization and proposed new method. Journal of Management, 25, 1–27.
Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9, 233–255.
Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin, 103, 411-423.
Mulaik, S. A., & Millsap, R. E. (2000). Doing the four-step right. Structural Equation Modeling, 7, 36-73. (see other papers in the same issue).
Shapin, S. 2008. The scientific life: A moral history of a late modern vocation. Chicago University Press.
Porter, T. M. 1995. Trust in Numbers: The Pursuit of Objectivity in Science and Public Life.
Poovey, M. 1997. A History of the Modern Fact: Problems of Knowledge in the Sciences of Wealth and Society. Chicago.
McCloskey, D. N. 1992. If you’re so smart: The narrative of economic expertise. University of Chicago Press.
McCloskey, D. N. 1992. The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives.
Asparouhov, T., Muthén, B. & Morin, A. J. S. 2015. Bayesian structural equation modeling with cross-loadings and residual covariances: Comments on Stromeyer et al. Journal of Management, 41, 1561-1577.
If you see mistakes or want to suggest changes, please create an issue on the source repository.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/hai-mn/hai-mn.github.io, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Nguyen (2021, May 11). HaiBiostat: A Note on 5-day Workshop on Mplus ~ Day 1. Retrieved from https://hai-mn.github.io/posts/2021-05-11-5-day-mplus-workshop-michael-zyphur-day-1/
BibTeX citation
@misc{nguyen2021a, author = {Nguyen, Hai}, title = {HaiBiostat: A Note on 5-day Workshop on Mplus ~ Day 1}, url = {https://hai-mn.github.io/posts/2021-05-11-5-day-mplus-workshop-michael-zyphur-day-1/}, year = {2021} }