A Note on 5-day Workshop on Mplus ~ Day 1

Biostatistics Psychology/Sociology Mplus Pathway/Causal Graph

A Crash Course on Social Psychology Research from Michael Zyphur: Regression - Pathway analysis - Model fit
Combination of using Mplus and R
Drawing a pathway graph/causal graph using DiagrammeR package

Hai Nguyen
May 11, 2021

Motivation

Course outline

Structural Equation and Multilevel Modeling in Mplus

The material can be downloaded here

Reminder:
  - Path analysis: regression for observed variables  
  - CFA: regression from latent --> observed variables  
  - SEM: regression among latent variables  
  - Multilevel models: regression at multiple 'levels'  
  - Latent growth: model change with latent variables  

Part 1: Means, (Co)Variances, Regression

Distributions Imply Rules

library(ggplot2) # ggplot, stat_function, theme_bw

p <- seq(-5,5,by=0.1)

df <- data.frame(p)
ggplot(data=df, aes(x=p))+
  stat_function(fun=dnorm, args=list(mean=0, sd=sqrt(.2)), aes(colour = "mu=0,sigma2=.2")) + 
  stat_function(fun=dnorm, args=list(mean=0, sd=sqrt(1.0)), aes(colour = "mu=0,sigma2=1.0")) +
  stat_function(fun=dnorm, args=list(mean=0, sd=sqrt(5.0)), aes(colour = "mu=0,sigma2=5.0")) +
  stat_function(fun=dnorm, args=list(mean=-2, sd=sqrt(2.0)), aes(colour = "mu=-2,sigma2=2")) +
  scale_y_continuous(limits=c(0,1.0)) +
  scale_colour_manual("", values = c("palegreen", "orange", "olivedrab", "blue")) + 
  ylab("Density") +
  ggtitle("PDF of Normal Distribution") + 
  theme_bw() + 
  theme(plot.title = element_text(hjust = 0.5))

A variance \(\sigma^2\)

Graphical

Multivariate Parameters

#generate the data
gibbs<-function (n, rho) {
    mat <- matrix(ncol = 2, nrow = n)
    x <- 0
    y <- 0
    mat[1, ] <- c(x, y)
    for (i in 2:n) {
        # rnorm() takes a standard deviation; the conditional variance is 1 - rho^2
        x <- rnorm(1, rho * y, sqrt(1 - rho^2))
        y <- rnorm(1, rho * x, sqrt(1 - rho^2))
        mat[i, ] <- c(x, y)
    }
    mat
}
bvn <- gibbs(10000, 0.98)

#setup
library(rgl) # plot3d, quads3d, lines3d, grid3d, par3d, axes3d, box3d, mtext3d
library(car) # dataEllipse

#process the data
hx <- hist(bvn[,2], plot=FALSE)
hxs <- hx$density / sum(hx$density)
hy <- hist(bvn[,1], plot=FALSE)
hys <- hy$density / sum(hy$density)

## [xy]max: so that there's no overlap in the adjoining corner
xmax <- tail(hx$breaks, n=1) + diff(tail(hx$breaks, n=2))
ymax <- tail(hy$breaks, n=1) + diff(tail(hy$breaks, n=2))
zmax <- max(hxs, hys)

#Basic scatterplot on the floor
## the base scatterplot
plot3d(bvn[,2], bvn[,1], 0, zlim=c(0, zmax), pch='.',
       xlab='X', ylab='Y', zlab='', axes=FALSE)
par3d(scale=c(1,1,3))

#Histograms on the back walls
## manually create each histogram
for (ii in seq_along(hx$counts)) {
    quads3d(hx$breaks[ii]*c(.9,.9,.1,.1) + hx$breaks[ii+1]*c(.1,.1,.9,.9),
            rep(ymax, 4),
            hxs[ii]*c(0,1,1,0), color='gray80')
}
for (ii in seq_along(hy$counts)) {
    quads3d(rep(xmax, 4),
            hy$breaks[ii]*c(.9,.9,.1,.1) + hy$breaks[ii+1]*c(.1,.1,.9,.9),
            hys[ii]*c(0,1,1,0), color='gray80')
}

#Summary Lines
## I use these to ensure the lines are plotted "in front of" the
## respective dot/hist
bb <- par3d('bbox')
inset <- 0.02 # percent off of the floor/wall for lines
x1 <- bb[1] + (1-inset)*diff(bb[1:2])
y1 <- bb[3] + (1-inset)*diff(bb[3:4])
z1 <- bb[5] + inset*diff(bb[5:6])

## even with draw=FALSE, dataEllipse still pops up a dev, so I create
## a dummy dev and destroy it ... better way to do this?
###dev.new()
de <- dataEllipse(bvn[,1], bvn[,2], draw=FALSE, levels=0.95)
###dev.off()

## the ellipse
lines3d(de[,2], de[,1], z1, color='green', lwd=3)

## the two density curves, probability-style
denx <- density(bvn[,2])
lines3d(denx$x, rep(y1, length(denx$x)), denx$y / sum(hx$density), col='red', lwd=3)
deny <- density(bvn[,1])
lines3d(rep(x1, length(deny$x)), deny$x, deny$y / sum(hy$density), col='blue', lwd=3)

#Beautifications
grid3d(c('x+', 'y+', 'z-'), n=10)
box3d()
axes3d(edges=c('x-', 'y-', 'z+'))
outset <- 1.2 # place text outside of bbox *this* percentage
mtext3d('P(X)', edge='x+', pos=c(0, ymax, outset * zmax))
mtext3d('P(Y)', edge='y+', pos=c(xmax, 0, outset * zmax))

Covariance \(\sigma_{xy}\)

A Linear Regression Model

\[ y_i = \nu + \beta x_i + \epsilon_i \]

In Terms of Distributions?
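One way to read the regression model in terms of distributions is that its parameters are functions of means, variances, and covariances. The following is a minimal sketch on simulated data; the "true" values \(\nu = 1\) and \(\beta = 0.5\), the sample size, and the seed are arbitrary assumptions chosen for illustration.

```r
# Simulated data; nu = 1 and beta = 0.5 are arbitrary illustrative values
set.seed(1)
n <- 1000
x <- rnorm(n, mean = 2, sd = 1.5)
y <- 1 + 0.5 * x + rnorm(n, sd = 1)

# Slope and intercept computed from first and second moments only
beta_hat <- cov(x, y) / var(x)
nu_hat   <- mean(y) - beta_hat * mean(x)

# The same estimates from ordinary least squares
fit <- lm(y ~ x)
rbind(moments = c(nu = nu_hat, beta = beta_hat),
      lm      = unname(coef(fit)))
```

The two rows agree because the least-squares slope is the covariance of x and y divided by the variance of x.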

Why Regression? Causality

Regression assesses the second, and helps with the third

Controlling for variables

\[ y_i = \nu + \beta_1 x_i + \beta_2 z_i+ \epsilon_i \]
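A small simulation can illustrate what "controlling for" z does to the coefficient of x. This is only a sketch: the data-generating values (0.8, 0.5, 0.7), the sample size, and the seed are assumptions made up for the example.

```r
# z influences both x and y, so omitting z biases the coefficient of x
set.seed(2)
n <- 5000
z <- rnorm(n)
x <- 0.8 * z + rnorm(n)
y <- 1 + 0.5 * x + 0.7 * z + rnorm(n)

b_naive <- coef(lm(y ~ x))["x"]      # z omitted
b_adj   <- coef(lm(y ~ x + z))["x"]  # z controlled for

c(naive = unname(b_naive), adjusted = unname(b_adj))
```

With z in the model, the estimate for x should land near the simulated value of 0.5, while the naive estimate also absorbs part of the effect of z.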

With Path Diagrams

Summary

Part 2: Mplus and Estimation

Mplus Team

Mplus Overview

Mplus Features

Mplus Caveat

Example

Title: 
    Start input section with the heading and a colon, then type specific commands/options and end each statement with a semi-colon;
    ! To ignore a command line, start it with an exclamation mark
Data: 
    File is data.dat;
Variable: 
    Names are employ salary education height weight;
    Usevariables are employ salary height education;
Analysis:
    Estimator = ML;
Model:
    Employ salary education on height;
Output:
    Standardized;

Input Command Headings

Data

Individual example.inp

cat(readLines('Examples/Day 1, Session 2 (Mplus and Estimation)/Individual Example.inp'), sep = '\n')
Title: 
Bogus Mplus text;
Data: 
File is individual.dat;
Analysis: 
Estimator = ml;
!Estimator = Bayes;
Variable: 
names are y x1 x2;
Model:
y on x1 x2;
Output:
Standardized sampstat TECH1 TECH8;

Summary example.inp

cat(readLines('Examples/Day 1, Session 2 (Mplus and Estimation)/Summary Example.inp'), sep = '\n')
Title: 
Bogus Mplus text;
Data: 
File is summary.dat;
Type = means stdeviations correlation;
Nobservations = 500;
Variable: 
names are y x1 x2;
Analysis:
Estimator = ML;
Model:
y on x1 x2;
Output:
Standardized sampstat TECH1;

Individual.dat

cat(readLines('Examples/Day 1, Session 2 (Mplus and Estimation)/individual.dat', n = 20), sep = '\n')
   -0.354517     0.573051    -0.175230
    0.561655    -0.368095     1.090042
    0.315551    -0.577052     0.425472
    3.347049     1.088520     1.149353
   -0.122389    -0.694153    -0.766538
   -0.251276    -0.017487    -1.367410
   -0.517996    -0.817974    -1.559255
    1.888854    -0.658335     1.007614
    0.461254     0.463916    -0.898300
    2.237483     1.533398     0.180512
    0.480991    -0.096545    -0.352276
    0.165901    -1.341994    -1.445909
    1.864947     1.027419     0.677408
   -0.466245    -0.138712    -0.759287
    2.567804     0.483444     0.959731
   -0.024201    -0.507631    -0.517296
   -1.912698     0.761720    -1.901134
   -1.350069    -0.736562     2.318569
    0.433773     0.723880     0.111837
   -0.977083     0.155868    -0.897112
cat(readLines('Examples/Day 1, Session 2 (Mplus and Estimation)/summary.dat'), sep = '\n')
.485    .001    -.042
1.552   1.046   .978
1.0
.665    1.0
.427    .028    1.0
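Summary data are enough to fit the regression of y on x1 and x2, because the estimates depend only on means, standard deviations, and correlations. The sketch below rebuilds the covariance matrix from the values in summary.dat (assuming the variable order y, x1, x2 given in the input file) and solves the normal equations in base R; it is an illustration of the idea, not a reproduction of what Mplus does internally.

```r
# Values copied from summary.dat: means, standard deviations, correlations
m <- c(y = 0.485, x1 = 0.001, x2 = -0.042)
s <- c(y = 1.552, x1 = 1.046, x2 = 0.978)
R <- matrix(c(1.000, 0.665, 0.427,
              0.665, 1.000, 0.028,
              0.427, 0.028, 1.000),
            nrow = 3, byrow = TRUE,
            dimnames = list(names(m), names(m)))

# Covariance matrix: S = D R D, with D the diagonal matrix of SDs
S <- diag(s) %*% R %*% diag(s)
dimnames(S) <- dimnames(R)

# Regression of y on x1 and x2 from the moments
xs    <- c("x1", "x2")
b     <- solve(S[xs, xs], S[xs, "y"])   # unstandardized slopes
b0    <- m["y"] - sum(b * m[xs])        # intercept
b_std <- b * s[xs] / s["y"]             # slopes standardized on both y and x

list(intercept = unname(b0), slopes = b, standardized = b_std)
```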

Variable

Define

Analysis

Estimation and Inference

Frequentist: Estimation uses data & model

Bayes: Estimation uses data, model, & prior prob.
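The contrast can be sketched with the simplest possible case, the mean of a normal variable with known standard deviation. Everything here is an assumption for illustration: the simulated data, the known sigma of 2, and the normal prior with mean 0 and standard deviation 1.

```r
set.seed(3)
sigma <- 2                            # treated as known
y <- rnorm(20, mean = 1, sd = sigma)
n <- length(y)

# Frequentist (ML): uses the data and the model only
ml_mean <- mean(y)
ml_se   <- sigma / sqrt(n)

# Bayes: combines the same likelihood with a prior (conjugate normal prior)
prior_mean <- 0
prior_sd   <- 1
post_prec  <- 1 / prior_sd^2 + n / sigma^2
post_mean  <- (prior_mean / prior_sd^2 + n * ml_mean / sigma^2) / post_prec
post_sd    <- sqrt(1 / post_prec)

c(ml_mean = ml_mean, ml_se = ml_se, post_mean = post_mean, post_sd = post_sd)
```

The posterior mean is pulled from the ML estimate toward the prior mean, and the posterior standard deviation is smaller than the ML standard error because the prior adds information.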

Model

Model: ON Command

Model: WITH Command

Model: BY Command

Model notation

Mplus notation for freeing and fixing estimates

y ON x;
* freely-estimated \(\beta\)
y ON x@.5;
* \(\beta\) is NOT estimated, but fixed to .5
[y@0];
* the intercept of y is fixed to 0.0
f1 BY y*;
* the asterisk frees the parameter, so the factor loading of y on factor “f1” is estimated
f1 BY y1-y5@1;
* factor loadings for “f1” = 1 for variables y1, y2, y3, y4, y5

Labeling and constraining/fixing estimates

Model Constraint

TITLE:     this is an example of a two-group twin
           model for  continuous outcomes using parameter constraints

DATA:      FILE = ex5.21.dat;

VARIABLE:  NAMES = y1 y2 g;
           GROUPING = g(1 = mz 2 = dz);

MODEL:     [y1-y2]    (1);
           y1-y2      (var);
           y1 WITH y2 (covmz);

MODEL dz:  y1 WITH y2 (covdz);

MODEL CONSTRAINT:
           NEW(a c e h);
           var = a**2 + c**2 + e**2;
           covmz = a**2 + c**2;
           covdz = 0.5*a**2 + c**2;
           h = a**2/(a**2 + c**2 + e**2);
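The three constraints can be inverted by hand, which shows what the new parameters a, c, e and h mean. The sketch below uses made-up values for var, covmz and covdz (assumptions, not estimates from ex5.21.dat) and solves the same equations in R.

```r
# Hypothetical moments for illustration only
var_y <- 1.00   # variance of y1 and y2 (label 'var')
covmz <- 0.70   # twin covariance in the MZ group
covdz <- 0.45   # twin covariance in the DZ group

# Solve var = a^2 + c^2 + e^2, covmz = a^2 + c^2, covdz = 0.5*a^2 + c^2
a2 <- 2 * (covmz - covdz)   # additive genetic variance
c2 <- 2 * covdz - covmz     # shared environment variance
e2 <- var_y - covmz         # non-shared environment variance
h  <- a2 / (a2 + c2 + e2)   # heritability, as in MODEL CONSTRAINT

c(a2 = a2, c2 = c2, e2 = e2, h = h)
```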

Model Priors

Model Indirect

Output, Savedata, Plot, Montecarlo

Summary

Our job is to

Part 3: Path Analysis

History

Showcase

Structural Model Specification

Model diagram

library(DiagrammeR) # grViz

grViz("
digraph causal{

  # a 'graph' statement
  graph [overlap = true, fontsize = 10]

  # several 'node' statements
  node  [shape = box,
         fontname = Helvetica]
WRS [label = 'Work Role Stress (WRS)']
WFC [label = 'Work Family Conflict (WFC)']
JD [label = 'Job Distress (JD)']
TI [label = 'Turnover Intentions (TI)']
LD [label = 'Life Distress (LD)']

# Edges
edge[color=black, arrowhead=vee]
WRS->WFC [label=<&beta;<SUB>1</SUB>>]
WRS->TI [label=<&#946;<SUB>2</SUB>>]
WRS->JD [label=<&#946;<SUB>3</SUB>>]
WFC->JD [label=<&#946;<SUB>4</SUB>>]
WFC->LD [label=<&#946;<SUB>5</SUB>>]
JD->TI [label=<&#946;<SUB>6</SUB>>]
JD->LD [label=<&#946;<SUB>7</SUB>>]
TI->LD[dir=both, label=<&psi;>]
d1->WFC
d1 [shape=plaintext,label='']
d2->JD
d2 [shape=plaintext,label='']
d3->TI
d3 [shape=plaintext,label='']
d4->LD
d4 [shape=plaintext,label='']

{rank = same; WRS; WFC}
{rank = same; TI; LD}
}")

Variable:

Model:

TI on JD WRS;
LD on JD WFC;
JD on WRS WFC;
WFC on WRS;

OR

TI LD on JD;
JD on WRS WFC;
TI WFC on WRS; LD on WFC;

Code in Mplus:

cat(readLines('Examples/Day 1, Session 3 (Path Analysis)/Grandey & Cropanzano 1999_Hai.inp'), sep = '\n')
Title: 
Bogus Mplus text;
Data: 
File is Grandey & Cropanzano 1999.txt;
Type = means stdeviations correlation;
Nobservations = 132;
Variable:
names are 
WRS ! work role stress
FRS ! family role stress
WFC ! work-family conflict
FWC ! family-work conflict
JD ! job distress
FD ! family distress
LD ! life distress
TI ! turnover intentions
PPH ! poor physical health
SE; ! self-esteem

Usevariables are 
WRS TI WFC JD LD;
Analysis:
Estimator = ML;
Model:
TI on JD WRS; ! left-side: dependent vbl; right-side: predictors
LD on JD WFC;
JD on WRS WFC;
WFC on WRS;
Output:
Standardized Sampstat TECH1 TECH8;

Mplus output

cat(readLines('Examples/Day 1, Session 3 (Path Analysis)/Grandey & Cropanzano 1999_Hai.out'), sep = '\n')
MODEL RESULTS ! Unstandardized Output

                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value

 TI       ON
    JD                 0.536      0.112      4.767      0.000 ! 1-unit increase on JD, .536 increase on TI
    WRS                0.092      0.114      0.812      0.417

 LD       ON
    JD                 0.682      0.073      9.399      0.000
    WFC                0.186      0.057      3.255      0.001

 JD       ON
    WRS                0.381      0.077      4.947      0.000
    WFC                0.296      0.060      4.947      0.000

 WFC      ON
    WRS                0.631      0.098      6.458      0.000

 LD       WITH
    TI                -0.006      0.042     -0.151      0.880 ! Residual covariances

 Intercepts
    TI                 0.539      0.277      1.950      0.051
    WFC                1.853      0.256      7.228      0.000
    JD                 0.555      0.208      2.669      0.008
    LD                 0.373      0.185      2.019      0.043

 Residual Variances
    TI                 0.745      0.092      8.124      0.000
    WFC                0.800      0.098      8.124      0.000
    JD                 0.377      0.046      8.124      0.000
    LD                 0.311      0.038      8.124      0.000


QUALITY OF NUMERICAL RESULTS

     Condition Number for the Information Matrix              0.141E-02
       (ratio of smallest to largest eigenvalue)


STANDARDIZED MODEL RESULTS 


STDYX Standardization ! Standardized Output (STDYX)

                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value

 TI       ON
    JD                 0.438      0.086      5.114      0.000
    WRS                0.075      0.092      0.813      0.416

 LD       ON
    JD                 0.628      0.058     10.848      0.000
    WFC                0.218      0.066      3.273      0.001

 JD       ON
    WRS                0.376      0.073      5.178      0.000
    WFC                0.376      0.073      5.178      0.000

 WFC      ON
    WRS                0.490      0.066      7.408      0.000

 LD       WITH
    TI                -0.013      0.087     -0.151      0.880

 Intercepts
    TI                 0.547      0.300      1.823      0.068
    WFC                1.806      0.327      5.528      0.000
    JD                 0.688      0.288      2.389      0.017
    LD                 0.425      0.228      1.864      0.062

 Residual Variances
    TI                 0.766      0.065     11.870      0.000
    WFC                0.760      0.065     11.724      0.000
    JD                 0.579      0.065      8.854      0.000
    LD                 0.405      0.054      7.446      0.000


STDY Standardization

                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value

 TI       ON
    JD                 0.438      0.086      5.114      0.000
    WRS                0.094      0.115      0.814      0.416

 LD       ON
    JD                 0.628      0.058     10.848      0.000
    WFC                0.218      0.066      3.273      0.001

 JD       ON
    WRS                0.472      0.089      5.279      0.000
    WFC                0.376      0.073      5.178      0.000

 WFC      ON
    WRS                0.615      0.078      7.844      0.000

 LD       WITH
    TI                -0.013      0.087     -0.151      0.880

 Intercepts
    TI                 0.547      0.300      1.823      0.068
    WFC                1.806      0.327      5.528      0.000
    JD                 0.688      0.288      2.389      0.017
    LD                 0.425      0.228      1.864      0.062

 Residual Variances
    TI                 0.766      0.065     11.870      0.000
    WFC                0.760      0.065     11.724      0.000
    JD                 0.579      0.065      8.854      0.000
    LD                 0.405      0.054      7.446      0.000


STD Standardization

                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value

 TI       ON
    JD                 0.536      0.112      4.767      0.000
    WRS                0.092      0.114      0.812      0.417

 LD       ON
    JD                 0.682      0.073      9.399      0.000
    WFC                0.186      0.057      3.255      0.001

 JD       ON
    WRS                0.381      0.077      4.947      0.000
    WFC                0.296      0.060      4.947      0.000

 WFC      ON
    WRS                0.631      0.098      6.458      0.000

 LD       WITH
    TI                -0.006      0.042     -0.151      0.880

 Intercepts
    TI                 0.539      0.277      1.950      0.051
    WFC                1.853      0.256      7.228      0.000
    JD                 0.555      0.208      2.669      0.008
    LD                 0.373      0.185      2.019      0.043

 Residual Variances
    TI                 0.745      0.092      8.124      0.000
    WFC                0.800      0.098      8.124      0.000
    JD                 0.377      0.046      8.124      0.000
    LD                 0.311      0.038      8.124      0.000


R-SQUARE

    Observed                                        Two-Tailed
    Variable        Estimate       S.E.  Est./S.E.    P-Value

    TI                 0.234      0.065      3.631      0.000
    WFC                0.240      0.065      3.704      0.000
    JD                 0.421      0.065      6.436      0.000
    LD                 0.595      0.054     10.947      0.000


TECHNICAL 1 OUTPUT

     PARAMETER SPECIFICATION

           NU
              TI            WFC           JD            LD            WRS
              ________      ________      ________      ________      ________
                  0             0             0             0             0

           LAMBDA
              TI            WFC           JD            LD            WRS
              ________      ________      ________      ________      ________
 TI                 0             0             0             0             0
 WFC                0             0             0             0             0
 JD                 0             0             0             0             0
 LD                 0             0             0             0             0
 WRS                0             0             0             0             0

           THETA
              TI            WFC           JD            LD            WRS
              ________      ________      ________      ________      ________
 TI                 0
 WFC                0             0
 JD                 0             0             0
 LD                 0             0             0             0
 WRS                0             0             0             0             0

           ALPHA
              TI            WFC           JD            LD            WRS
              ________      ________      ________      ________      ________
                  1             2             3             4             0

           BETA
              TI            WFC           JD            LD            WRS
              ________      ________      ________      ________      ________
 TI                 0             0             5             0             6
 WFC                0             0             0             0             7
 JD                 0             8             0             0             9
 LD                 0            10            11             0             0
 WRS                0             0             0             0             0

           PSI
              TI            WFC           JD            LD            WRS
              ________      ________      ________      ________      ________
 TI                12
 WFC                0            13
 JD                 0             0            14
 LD                15             0             0            16
 WRS                0             0             0             0             0

     STARTING VALUES

           NU
              TI            WFC           JD            LD            WRS
              ________      ________      ________      ________      ________
                0.000         0.000         0.000         0.000         0.000

           LAMBDA
              TI            WFC           JD            LD            WRS
              ________      ________      ________      ________      ________
 TI             1.000         0.000         0.000         0.000         0.000
 WFC            0.000         1.000         0.000         0.000         0.000
 JD             0.000         0.000         1.000         0.000         0.000
 LD             0.000         0.000         0.000         1.000         0.000
 WRS            0.000         0.000         0.000         0.000         1.000

           THETA
              TI            WFC           JD            LD            WRS
              ________      ________      ________      ________      ________
 TI             0.000
 WFC            0.000         0.000
 JD             0.000         0.000         0.000
 LD             0.000         0.000         0.000         0.000
 WRS            0.000         0.000         0.000         0.000         0.000

           ALPHA
              TI            WFC           JD            LD            WRS
              ________      ________      ________      ________      ________
                2.120         3.430         2.520         2.730         2.500

           BETA
              TI            WFC           JD            LD            WRS
              ________      ________      ________      ________      ________
 TI             0.000         0.000         0.000         0.000         0.000
 WFC            0.000         0.000         0.000         0.000         0.000
 JD             0.000         0.000         0.000         0.000         0.000
 LD             0.000         0.000         0.000         0.000         0.000
 WRS            0.000         0.000         0.000         0.000         0.000

           PSI
              TI            WFC           JD            LD            WRS
              ________      ________      ________      ________      ________
 TI             0.490
 WFC            0.000         0.530
 JD             0.000         0.000         0.328
 LD             0.000         0.000         0.000         0.387
 WRS            0.000         0.000         0.000         0.000         0.635

What’s \(R^2\) for our variables?
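The R-SQUARE table can be recovered from the STDYX output: for each dependent variable, \(R^2\) equals one minus its standardized residual variance. A quick check in R, with the values copied from the output above:

```r
# Standardized (STDYX) residual variances from the Mplus output
std_resid_var <- c(TI = 0.766, WFC = 0.760, JD = 0.579, LD = 0.405)

# Share of variance explained by the predictors
r2 <- 1 - std_resid_var
r2
```

These match the R-SQUARE estimates reported by Mplus (0.234, 0.240, 0.421, 0.595).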

Report

grViz("
digraph causal{

  # a 'graph' statement
  graph [overlap = true, fontsize = 10]

  # several 'node' statements
  node  [shape = box,
         fontname = Helvetica]
WRS [label = 'Work Role Stress (WRS)']
WFC [label = 'Work Family Conflict (WFC)']
JD [label = 'Job Distress (JD)']
TI [label = 'Turnover Intentions (TI)']
LD [label = 'Life Distress (LD)']

# Edges
edge[color=black, arrowhead=vee]
WRS->WFC [label=<&beta;<SUB>1</SUB>=.63/.49**>]
WRS->TI [label=<&#946;<SUB>2</SUB>=.09/.08>]
WRS->JD [label=<&#946;<SUB>3</SUB>=.38/.38**>]
WFC->JD [label=<&#946;<SUB>4</SUB>=.30/.38**>]
WFC->LD [label=<&#946;<SUB>5</SUB>=.19/.22**>]
JD->TI [label=<&#946;<SUB>6</SUB>=.54/.44**>]
JD->LD [label=<&#946;<SUB>7</SUB>=.68/.63**>]
TI->LD[dir=both, label=<&psi;=-.006/-.013>]
d1->WFC
d1 [shape=plaintext,label='']
d2->JD
d2 [shape=plaintext,label='']
d3->TI
d3 [shape=plaintext,label='']
d4->LD
d4 [shape=plaintext,label='']

{rank = same; WRS; WFC}
{rank = same; TI; LD}
}")

For example, \(\beta_7\) = .68/.63** reads as: unstandardized estimate / standardized estimate (STDYX, which standardizes both Y and X), with ** marking a statistically significant p-value.
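The reported path coefficients can also be combined into indirect and total effects. The following sketch places the unstandardized estimates from the MODEL RESULTS section into a matrix B (rows are outcomes, columns are predictors) and uses the identity total = (I - B)^-1 - I, which holds for recursive models like this one. It gives point estimates only; standard errors would need something like the delta method or bootstrapping, which is not shown here.

```r
v <- c("WRS", "WFC", "JD", "TI", "LD")
B <- matrix(0, 5, 5, dimnames = list(v, v))

# Unstandardized estimates: B[outcome, predictor]
B["WFC", "WRS"] <- 0.631
B["JD",  "WRS"] <- 0.381
B["JD",  "WFC"] <- 0.296
B["TI",  "JD"]  <- 0.536
B["TI",  "WRS"] <- 0.092
B["LD",  "JD"]  <- 0.682
B["LD",  "WFC"] <- 0.186

# Total effects sum the direct path and all indirect paths
total <- solve(diag(5) - B) - diag(5)
dimnames(total) <- dimnames(B)
indirect <- total - B

round(total, 3)
```

For instance, the indirect effect of WRS on JD through WFC is the product of the two paths, 0.631 × 0.296, and the total effect adds the direct path of 0.381.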

What Does This Mean? How to Specify It?

cat(readLines('Examples/Day 1, Session 3 (Path Analysis)/Grandey & Cropanzano 1999 (no covariance).inp'), sep = '\n')
Title: 
Bogus Mplus text;
Data: 
File is Grandey & Cropanzano 1999.txt;
Type = means stdeviations correlation;
Nobservations = 132;
Variable:
names are 
WRS ! work role stress
FRS ! family role stress
WFC ! work-family conflict
FWC ! family-work conflict
JD ! job distress
FD ! family distress
LD ! life distress
TI ! turnover intentions
PPH ! poor physical health
SE; ! self-esteem

Usevariables are 
WRS WFC JD LD TI;
Analysis:
!Estimator = ML;
Model:
TI on JD WRS;
LD on JD WFC;
JD on WRS WFC;
WFC on WRS;
TI with LD@0;
Output:
Standardized sampstat Tech1 Tech8;

cat(readLines('Examples/Day 1, Session 3 (Path Analysis)/Grandey & Cropanzano 1999 (new model1).inp'), sep = '\n')
Title: 
Bogus Mplus text;
Data: 
File is Grandey & Cropanzano 1999.txt;
Type = means stdeviations correlation;
Nobservations = 132;
Variable:
names are 
WRS ! work role stress
FRS ! family role stress
WFC ! work-family conflict
FWC ! family-work conflict
JD ! job distress
FD ! family distress
LD ! life distress
TI ! turnover intentions
PPH ! poor physical health
SE; ! self-esteem

Usevariables are 
WRS WFC JD LD TI;
Analysis:
Estimator = ML;
Model:
WRS on WFC JD TI;
WFC on JD LD;
JD on TI LD;
Output:
Standardized Sampstat TECH1 TECH8;

cat(readLines('Examples/Day 1, Session 3 (Path Analysis)/Grandey & Cropanzano 1999 (new model2).inp'), sep = '\n')
Title: 
Bogus Mplus text;
Data: 
File is Grandey & Cropanzano 1999.txt;
Type = means stdeviations correlation;
Nobservations = 132;
Variable:
names are 
WRS ! work role stress
FRS ! family role stress
WFC ! work-family conflict
FWC ! family-work conflict
JD ! job distress
FD ! family distress
LD ! life distress
TI ! turnover intentions
PPH ! poor physical health
SE; ! self-esteem

Usevariables are 
WRS WFC JD LD TI;
Analysis:
Estimator = ML;
Model:
TI LD on JD;
JD on WRS WFC;
WRS with WFC;
Output:
Standardized Sampstat TECH1 TECH8;

Summary

Part 4: Model Fit, Selection, Modification, & Equivalence

Model fit

ML Estimator

Model Build

Analysis/Estimated model (\(H_0\))

Model:
TI on JD WRS;
LD on JD WFC;
JD on WRS WFC;
WFC on WRS;

Unrestricted/Saturated/Alternate model (\(H_1\))

Model:
JD LD TI WRS WFC with JD LD TI WRS WFC;

Baseline/Null Model

Model:
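The degrees of freedom of the estimated model are the difference between the number of parameters in the saturated model and in the estimated model. The count below is a sketch that assumes WRS is treated as an exogenous covariate whose mean and variance are not model parameters, which is consistent with the 16 parameters numbered in the TECH1 output of the path model.

```r
p_y <- 4  # dependent variables: TI, WFC, JD, LD
p_x <- 1  # exogenous covariate: WRS

# H1 (saturated, conditional on x): intercepts, slopes on x,
# and all residual variances and covariances
n_par_h1 <- p_y + p_y * p_x + p_y * (p_y + 1) / 2

# H0: 4 intercepts + 7 regression paths + 4 residual variances
#     + 1 residual covariance (TI with LD)
n_par_h0 <- 4 + 7 + 4 + 1

df_h0 <- n_par_h1 - n_par_h0
c(H1 = n_par_h1, H0 = n_par_h0, df = df_h0)
```

The two degrees of freedom correspond to the two paths left out of the estimated model: WFC to TI and WRS to LD.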

Model Diagnostics

Absolute Fit: A Model’s \(\chi^2\)

Using \(\chi^2\) to Compare Estimated Models
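For two nested models estimated with ML, the difference between their \(\chi^2\) values is itself \(\chi^2\)-distributed, with degrees of freedom equal to the difference in model degrees of freedom. A minimal sketch with hypothetical fit values (the numbers are invented for illustration and do not come from the workshop examples):

```r
# Hypothetical fit statistics for two nested models
chisq_restricted <- 10.5; df_restricted <- 3   # fewer free parameters
chisq_general    <- 2.3;  df_general    <- 2   # one extra path freed

delta_chisq <- chisq_restricted - chisq_general
delta_df    <- df_restricted - df_general

# Upper-tail probability of the chi-square difference
p_value <- pchisq(delta_chisq, df = delta_df, lower.tail = FALSE)
c(delta_chisq = delta_chisq, delta_df = delta_df, p_value = p_value)
```

A small p-value suggests that the restriction worsens fit, so the more general model would be preferred.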

Absolute Fit: SRMR

Absolute Fit: RMSEA

\[ \frac{\sqrt{\chi^2_{Estimated} - df_{Estimated}}}{\sqrt{df_{Estimated}\times (N-1)}}\]

Relative Fit: CFI

\[ \frac{(\chi^2_{Baseline} - df_{Baseline}) - (\chi^2_{Estimated} - df_{Estimated})}{\chi^2_{Baseline} - df_{Baseline}}\]

Relative Fit: TLI/NNFI

\[ \frac{\chi^2_{Baseline}/df_{Baseline} - \chi^2_{Estimated}/df_{Estimated}}{\chi^2_{Baseline}/ df_{Baseline} - 1}\]
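The three indices can be computed from the \(\chi^2\) and degrees of freedom of the estimated and baseline models. The helper below is a sketch, not Mplus's implementation: `fit_indices` is a hypothetical function name, the input values are invented, CFI is computed in its conventional form without square roots, and negative values of \(\chi^2 - df\) are truncated at zero, which is a common convention assumed here.

```r
fit_indices <- function(chisq_est, df_est, chisq_base, df_base, n) {
  # Noncentrality estimates, truncated at zero
  d_est  <- max(chisq_est - df_est, 0)
  d_base <- max(chisq_base - df_base, d_est, 0)

  rmsea <- sqrt(d_est / (df_est * (n - 1)))
  cfi   <- 1 - d_est / d_base
  tli   <- (chisq_base / df_base - chisq_est / df_est) /
           (chisq_base / df_base - 1)

  c(RMSEA = rmsea, CFI = cfi, TLI = tli)
}

# Hypothetical values for illustration
fit <- fit_indices(chisq_est = 6, df_est = 2,
                   chisq_base = 210, df_base = 10, n = 132)
round(fit, 3)
```

Note that some software uses N rather than N - 1 in the RMSEA denominator, so small differences from reported values are possible.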

Information Criteria: AIC

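\[ \chi^2_{Estimated} + k\times (k+1) - 2\times df_{Estimated} \]

where \(k\) is the number of observed variables, so that \(k(k+1)/2 - df_{Estimated}\) is the number of free parameters.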

Information Criteria: BIC

\[ \chi^2_{Estimated} + \ln(N)\times \left[k\times (k+1)/2 - df_{Estimated}\right] \]

Sample-size adjusted BIC:

\[ \chi^2_{Estimated} + \ln((N+2)/24)\times \left[k\times (k+1)/2 - df_{Estimated}\right] \]
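A sketch of the \(\chi^2\)-based information criteria, counting free parameters as \(q = k(k+1)/2 - df\) for \(k\) observed variables, so that AIC adds \(2q\), BIC adds \(\ln(N)\,q\), and the sample-size adjusted BIC adds \(\ln((N+2)/24)\,q\). The function name `ic_from_chisq` and the two sets of fit values are hypothetical; Mplus reports its criteria on the log-likelihood scale, so its values differ from these by a constant but rank models estimated on the same data in the same way.

```r
ic_from_chisq <- function(chisq, df, k, n) {
  q <- k * (k + 1) / 2 - df   # free parameters in the covariance structure
  c(AIC   = chisq + 2 * q,
    BIC   = chisq + log(n) * q,
    SABIC = chisq + log((n + 2) / 24) * q)
}

# Two hypothetical models fitted to the same 5 variables, N = 132
model_a <- ic_from_chisq(chisq = 6.0, df = 2, k = 5, n = 132)
model_b <- ic_from_chisq(chisq = 9.5, df = 3, k = 5, n = 132)
rbind(model_a, model_b)
```

In this invented comparison AIC favors model_a while BIC favors the more parsimonious model_b, which illustrates BIC's heavier penalty per parameter.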

For both AIC and BIC, no significance tests… see recommendations in interpreting differences

Modification Indices

cat(readLines('Examples/Day 1, Session 4 (Model Fit)/Grandey & Cropanzano 1999 (ML) modindices.inp'), sep = '\n')
Title: 
Bogus Mplus text;
Data:
File is Grandey & Cropanzano 1999 MonteCarlo.dat;
Variable:
names are JD LD TI WRS WFC;
Analysis:
Estimator = ML;
Model:
TI on JD WRS;
LD on JD WFC;
JD on WRS WFC;
WFC on WRS;
Output:
Standardized sampstat Tech1 Tech8 MODINDICES(1);

Bayesian Estimates of Fit

cat(readLines('Examples/Day 1, Session 4 (Model Fit)/Grandey & Cropanzano 1999 (Bayes).inp'), sep = '\n')
Title: 
Bogus Mplus text;
Data:
File is Grandey & Cropanzano 1999 MonteCarlo.dat;
Variable:
names are JD LD TI WRS WFC;
Analysis:
Estimator = Bayes;
fbiterations=10000;
Processors=2;
Model:
TI on JD WRS;
LD on JD WFC;
JD on WRS WFC;
WFC on WRS;
Output:
Standardized sampstat Tech1 Tech8;

Posterior Predictive Checking

Deviance Information Criterion

Model Selection

Further Readings

Part 1

Grace, J. B., & Bollen, K. A. (2005). Interpreting the results from multiple regression and structural equation models. Bulletin of the Ecological Society of America, 86, 283-295.

Cohen, Cohen, West, & Aiken (2003). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences (3rd). Lawrence Erlbaum & Associates.

Myers, Well, & Lorch (2010). Research Design and Statistical Analysis (3rd). Routledge Academic.

Tabachnick & Fidell (2006). Using Multivariate Statistics (5th). Allyn & Bacon.

Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London, Series A 222, 309–368.

Part 2

Mplus User’s Guide and DiagrammeR documentation

Part 3

Streiner, D. L. 2005. Finding Our Way: An Introduction to Path Analysis. Canadian Journal of Psychiatry, 50, 115-122.

Wright, S. (1921). Correlation and causation. J. Agricultural Research 20: 557–585.

Wright, S. (1934). The method of path coefficients. Annals of Mathematical Statistics 5: 161–215.

Angrist, J. D., & Krueger, A. B. (2001). Instrumental variables and the search for identification: From supply and demand to natural experiments. Journal of Economic Perspectives, 15: 69-85.

Muthén, B. (2002). Beyond SEM: General latent variable modeling. Behaviormetrika, 29, 81-117.

Part 4

Hu & Bentler (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives, Structural Equation Modeling, 6(1), 1-55.

Personality and Individual Differences, Volume 42, Issue 5. 2007.

Neyman, J., & Pearson, E. S. 1933. On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society, A, 231, 289-337.

Raftery, A. E. (1995). Bayesian model selection in social research. Sociological Methodology, 25, 111-16

Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773-795.

McDonald, R. P. (2010). Structural models and the art of approximation. Perspectives on Psychological Science, 5, 675-686.

Cheung, G. W., & Rensvold, R. B. (1999). Testing factorial invariance across groups: a reconceptualization and proposed new method. Journal of Management, 25, 1–27.

Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9, 233–255.

Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin, 103, 411-423.

Mulaik, S. A., & Millsap, R. E. (2000). Doing the four-step right. Structural Equation Modeling, 7, 36-73. (see other papers in the same issue).

Shapin, S. 2008. The scientific life: A moral history of a late modern vocation. University of Chicago Press.

Porter, T. M. 1995. Trust in Numbers: The Pursuit of Objectivity in Science and Public Life.

Poovey, M. 1997. A History of the Modern Fact: Problems of Knowledge in the Sciences of Wealth and Society. Chicago.

McCloskey, D. N. 1992. If you’re so smart: The narrative of economic expertise. University of Chicago Press.

Ziliak, S. T., & McCloskey, D. N. 2008. The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives. University of Michigan Press.

Asparouhov, T., Muthén, B. & Morin, A. J. S. 2015. Bayesian structural equation modeling with cross-loadings and residual covariances: Comments on Stromeyer et al. Journal of Management, 41, 1561-1577.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/hai-mn/hai-mn.github.io, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Nguyen (2021, May 11). HaiBiostat: A Note on 5-day Workshop on Mplus ~ Day 1. Retrieved from https://hai-mn.github.io/posts/2021-05-11-5-day-mplus-workshop-michael-zyphur-day-1/

BibTeX citation

@misc{nguyen2021a,
  author = {Nguyen, Hai},
  title = {HaiBiostat: A Note on 5-day Workshop on Mplus ~ Day 1},
  url = {https://hai-mn.github.io/posts/2021-05-11-5-day-mplus-workshop-michael-zyphur-day-1/},
  year = {2021}
}