ThurMod
This vignette demonstrates how analyses with the ThurMod package are performed. For the moment, we separate two different model types:
The Thurstonian factor model was introduced by Maydeu-Olivares & Böckenholt (2005); the Thurstonian IRT model was introduced by Maydeu-Olivares & Brown (2010). For a review, see Jansen & Schulze (2023a); for further extensions and a discussion of the model types, see Jansen & Schulze (2023b). Both the factor and the IRT model can be further differentiated. For this differentiation we use the design matrix \(A\), which represents all paired comparisons of a design. For a design of four items, this would be
[,1] [,2] [,3] [,4]
[1,] 1 -1 0 0
[2,] 1 0 -1 0
[3,] 1 0 0 -1
[4,] 0 1 -1 0
[5,] 0 1 0 -1
[6,] 0 0 1 -1
The design matrix forms the basis for this differentiation between designs (Jansen & Schulze, 2023b).
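As an illustration, such a design matrix can be reproduced with a few lines of base R (a sketch for a design of four items; the ThurMod functions shown below work directly on block definitions, so this is not needed in practice):
# Sketch: build the design matrix A for n = 4 items (one row per pair i < j)
n <- 4
pairs <- t(combn(n, 2))
A <- matrix(0, nrow = nrow(pairs), ncol = n)
A[cbind(seq_len(nrow(pairs)), pairs[, 1])] <- 1
A[cbind(seq_len(nrow(pairs)), pairs[, 2])] <- -1
A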
First, we load the package required for this vignette.
library(ThurMod)
Further, we simulate data of 1000 respondents with loadings between .30 and .95 and latent utility means between -1 and 1. We assume that the data result from rankings, that is, that transitivity between responses holds (the variance of the binary indicators is zero). Further, we assume uncorrelated traits. We will simulate data for all possible paired comparisons (see Jansen & Schulze, 2023b).
Set up the simulation conditions:
nfactor <- 4
nitem <- 12
nperson <- 1000
itf <- rep(1:4, 3)
varcov <- diag(1, 4)

# latent utility means
set.seed(69)
lmu <- runif(nitem, -1, 1)
loadings <- runif(nitem, 0.30, 0.95)
Next, we simulate a data set based on the true parameter values:
data <- sim.data(nfactor = nfactor, nitem = nitem, nperson = nperson, itf = itf,
                 varcov = varcov, lmu = lmu, loadings = loadings)

#save the file
write.table(data, 'myDataFile_f.dat', quote = FALSE, sep = " ", col.names = FALSE,
            row.names = FALSE)
The data set contains all \(12(12-1)/2 = 66\) possible paired comparison variables. With this data set we will perform analyses of the full design (IRT, CFA).
Full models include all paired comparisons. The estimation of these models can take a while, as with categorical data a large correlation matrix must be estimated.
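To get a sense of the size of the problem, the relevant counts for the 12-item full design can be computed directly:
# 12 items yield 66 paired comparison variables ...
choose(12, 2)
# ... and 2145 tetrachoric correlations among these binary variables
choose(choose(12, 2), 2)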
For all functions, the blocks we use must be defined. In a full model, there is only one block with all items:
blocks <- matrix(1:nitem, nrow = 1)
The blocks must be defined as a matrix where the rows correspond to the blocks. Only for a full design would a vector, for example 1:12, also work.
We will analyse the data with Mplus and lavaan. We can do this in two ways: first, in three separate steps; second, directly.
Way 1, step 1: Build syntax
# Mplus
syntax.mplus(blocks, itf, model = 'lmean', input_path = 'myFC_model_f.inp',
             data_path = "myDataFile_f.dat")

#lavaan
modelsyn <- syntax.lavaan(blocks, itf, model = 'lmean')
For step 2, these syntaxes must now be run:
# Mplus
system('mplus myFC_model_f.inp', wait = TRUE, show.output.on.console= FALSE)
#lavaan
results_lav1 <- lavaan::lavaan(modelsyn, data = data, ordered = TRUE, auto.fix.first = FALSE,
                               auto.var = TRUE, int.lv.free = TRUE, parameterization = "theta",
                               estimator = 'ULSMV')
If you replicate this, be patient; each run can take about 40 minutes, depending on your processing power.
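If you want to check the runtime on your machine, you can wrap the lavaan call from step 2 in system.time() (a sketch; the arguments are the same as above):
# Sketch: measure the runtime of the lavaan estimation from step 2
timing <- system.time(
  results_lav1 <- lavaan::lavaan(modelsyn, data = data, ordered = TRUE, auto.fix.first = FALSE,
                                 auto.var = TRUE, int.lv.free = TRUE, parameterization = "theta",
                                 estimator = 'ULSMV')
)
timing["elapsed"]  # elapsed time in seconds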
Step 3: Read results. For Mplus, you can either open the output file or use read.mplus:
# Mplus
results_mplus1 <- read.mplus(blocks, itf, model = 'lmean', output_path = "myFC_model_f.out")
unlist(results_mplus1$fit)
srmr cfi tli chisq df pvalue
0.0460 1.0000 1.0000 2131.5890 2171.0000 0.7231
blchisq bldf blpvalue rmsea rmsea_p rmsea.ci.lower
20861.0240 2145.0000 0.0000 0.0000 1.0000 0.0000
rmsea.ci.upper
0.0060
#lavaan
results_lav1
lavaan 0.6-11 ended normally after 229 iterations
Estimator ULS
Optimization method NLMINB
Number of model parameters 40
Number of observations 1000
Model Test User Model:
Standard Robust
Test Statistic 4941.945 2130.962
Degrees of freedom 2171 2171
P-value (Unknown) NA 0.726
Scaling correction factor 7.895
Shift parameter 1504.987
simple second-order correction
lavaan::fitmeasures(results_lav1)[c('rmsea.scaled', 'rmsea.ci.lower.scaled', 'rmsea.ci.upper.scaled',
                                    'rmsea.pvalue.scaled', 'cfi.scaled')]
rmsea.scaled rmsea.ci.lower.scaled rmsea.ci.upper.scaled rmsea.pvalue.scaled
0.000000000 0.000000000 0.005706838 1.000000000
cfi.scaled
1.000000000
The second way is to use fit.mplus or fit.lavaan; each function performs the above steps at once:
# Mplus
results_mplus2 <- fit.mplus(blocks, itf, model = 'lmean', input_path = 'myFC_model_f.inp',
                            output_path = "myFC_model_f.out", data_path = "myDataFile_f.dat")
unlist(results_mplus2$fit)
srmr cfi tli chisq df pvalue
0.0460 1.0000 1.0000 2131.5890 2171.0000 0.7231
blchisq bldf blpvalue rmsea rmsea_p rmsea.ci.lower
20861.0240 2145.0000 0.0000 0.0000 1.0000 0.0000
rmsea.ci.upper
0.0060
#lavaan
results_lav2 <- fit.lavaan(blocks, itf, model = 'lmean', data = data)
results_lav2
lavaan 0.6-11 ended normally after 229 iterations
Estimator ULS
Optimization method NLMINB
Number of model parameters 40
Number of observations 1000
Model Test User Model:
Standard Robust
Test Statistic 4941.945 2130.962
Degrees of freedom 2171 2171
P-value (Unknown) NA 0.726
Scaling correction factor 7.895
Shift parameter 1504.987
simple second-order correction
lavaan::fitmeasures(results_lav2)[c('rmsea.scaled', 'rmsea.ci.lower.scaled', 'rmsea.ci.upper.scaled',
                                    'rmsea.pvalue.scaled', 'cfi.scaled')]
rmsea.scaled rmsea.ci.lower.scaled rmsea.ci.upper.scaled rmsea.pvalue.scaled
0.000000000 0.000000000 0.005706838 1.000000000
cfi.scaled
1.000000000
For IRT models, the procedure is the same; just change model = 'lmean' to model = 'irt', for example:
# Mplus
results_mplus2irt <- fit.mplus(blocks, itf, model = 'irt', input_path = 'myFC_model_irt.inp',
                               output_path = "myFC_model_irt.out", data_path = "myDataFile_f.dat")
unlist(results_mplus2irt$fit)

#lavaan
results_lav2irt <- fit.lavaan(blocks, itf, model = 'irt', data = data)
results_lav2irt
lavaan::fitmeasures(results_lav2irt)[c('rmsea.scaled', 'rmsea.ci.lower.scaled',
                                       'rmsea.ci.upper.scaled', 'rmsea.pvalue.scaled',
                                       'cfi.scaled')]
scores <- lavaan::lavPredict(results_lav2irt)
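The object scores returned by lavaan::lavPredict() contains the estimated trait scores, one column per trait. A quick way to inspect them (a sketch):
# Sketch: inspect the estimated trait scores
head(scores)      # scores of the first respondents
colMeans(scores)  # mean score per trait
cor(scores)       # correlations among the estimated trait scores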
The first block models were introduced as multidimensional forced-choice blocks (Brown & Maydeu-Olivares, 2011). However, it was shown that the blocks need neither be multidimensional, nor must every paired comparison be contained only once (Jansen & Schulze, 2023b).
We first simulate a new data set. Set up the simulation conditions:
nfactor <- 5
nitem <- 30
nperson <- 1000
itf <- rep(1:5, 6)
varcov <- diag(1, 5)

# latent utility means
set.seed(69)
lmu <- runif(nitem, -1, 1)
loadings <- runif(nitem, 0.30, 0.95)
Next, we simulate a data set based on the true parameter values:
set.seed(1234)
data <- sim.data(nfactor = nfactor, nitem = nitem, nperson = nperson, itf = itf, varcov = varcov,
                 lmu = lmu, loadings = loadings)

#save the file
write.table(data, 'myDataFile.dat', quote = FALSE, sep = " ", col.names = FALSE, row.names = FALSE)
Next, we consider unlinked blocks of three items (triplets):
blocks <- matrix(sample(1:nitem, nitem), ncol = 3)
The blocks are defined by a matrix where the rows are the blocks and the number of columns is the number of items per block. Before we can fit the model, we must ensure that the data fit the syntax we create. The data simulated before assume that all items are in ascending order. This is important, as the order of the items determines the coding. We code binary items as \[\begin{equation} y_{l} = \begin{cases} 1 & \text{if item $i$ is chosen over item $j$} \\ 0 & \text{else}. \end{cases} \end{equation}\] In a ranking task, all choice alternatives are presented at once, but the coding scheme can be applied to each possible comparison. For example, consider \(n = 3\) items labeled {A, B, C}. Then we have the pairs {A, B}, {A, C}, {B, C}. If for {A, B} A is preferred over B, then \(y_{A,B}=1\) and 0 otherwise. This can be done for every paired comparison and ranking task (Maydeu-Olivares & Böckenholt, 2005): for the ranking {A, C, B} the binary outcomes are \(y_{A,B}=1\), \(y_{A,C}=1\), and \(y_{B,C}=0\). However, if the order of the items within a binary indicator is changed, the coding changes accordingly, e.g., \(y_{B,A}=0\), \(y_{C,A}=0\), and \(y_{C,B}=1\).
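This coding can be illustrated with a few lines of base R (a sketch; the objects below are only for illustration and not part of ThurMod):
# Sketch: binary outcomes implied by the ranking {A, C, B}
ranking <- c("A", "C", "B")             # position 1 = most preferred
pairs <- t(combn(c("A", "B", "C"), 2))  # pairs {A,B}, {A,C}, {B,C}
y <- as.integer(match(pairs[, 1], ranking) < match(pairs[, 2], ranking))
setNames(y, paste0("y_", pairs[, 1], pairs[, 2]))
# y_AB y_AC y_BC
#    1    1    0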
We can get an overview of all paired comparisons implied by the block design by using pair.combn:
pair.combn(blocks)
[,1] [,2]
[1,] 1 2
[2,] 4 23
[3,] 4 28
[4,] 5 14
[5,] 5 22
[6,] 8 11
[7,] 8 21
[8,] 9 3
[9,] 10 7
[10,] 13 3
[11,] 13 9
[12,] 14 22
[13,] 16 15
[14,] 16 18
[15,] 18 15
[16,] 19 17
[17,] 20 12
[18,] 20 27
[19,] 21 11
[20,] 24 17
[21,] 24 19
[22,] 25 1
[23,] 25 2
[24,] 26 7
[25,] 26 10
[26,] 27 12
[27,] 28 23
[28,] 29 6
[29,] 30 6
[30,] 30 29
See that some comparisons are not in ascending order. The simulated data assume that the first-named item (item \(i\)) has the smaller number, e.g., i3i9. The blocks we created, however, contain some comparisons where the first item has the larger number, e.g., i9i3. There are two ways to match the syntax to the data: First, we can simply sort the blocks via blocksort:
blocks_sorted <- blocksort(blocks)
pair.combn(blocks_sorted)
[,1] [,2]
[1,] 1 2
[2,] 1 25
[3,] 2 25
[4,] 3 9
[5,] 3 13
[6,] 4 23
[7,] 4 28
[8,] 5 14
[9,] 5 22
[10,] 6 29
[11,] 6 30
[12,] 7 10
[13,] 7 26
[14,] 8 11
[15,] 8 21
[16,] 9 13
[17,] 10 26
[18,] 11 21
[19,] 12 20
[20,] 12 27
[21,] 14 22
[22,] 15 16
[23,] 15 18
[24,] 16 18
[25,] 17 19
[26,] 17 24
[27,] 19 24
[28,] 20 27
[29,] 23 28
[30,] 29 30
Now all paired comparisons that can be derived are in ascending order.
The second way is to recode the corresponding binary indicators in the data; that is, we flip the coding scheme. We can simply recode the variables:
# Get names of binary indicators that have non-ascending names
tmp <- which(pair.combn(blocks)[,1] > pair.combn(blocks)[,2])

# get names
pair_names_b <- i.name(blocks)
pair_names_ori <- i.name(1:nitem)

# Rename
pair_names <- i.name(1:nitem)
if(length(tmp) != 0){
  tmp1 <- pair_names_b[tmp]
  tmp2 <- sub('^i.+i', 'i', tmp1)
  tmp3 <- tmp1
  for(j in 1:length(tmp)){
    tmp3 <- paste0(tmp2[j], sub(paste0(tmp2[j], '$'), '', tmp1[j]))
    pair_names[which(pair_names %in% tmp3)] <- pair_names_b[tmp[j]]
  }
}

tmp <- which(!names(data) %in% pair_names)
# Clone data
data_recoded <- data
# Recode and rename
data_recoded[,tmp] <- abs(data[,tmp] - 1)
names(data_recoded) <- pair_names
# Save data
write.table(data_recoded, 'myDataFile_rec.dat', quote = FALSE, sep = " ", col.names = FALSE,
            row.names = FALSE)
Both ways yield the same results. We have to add the argument data_full = TRUE, as we simulated all paired comparisons in the data file but only use some of them. We demonstrate only the model = 'irt' case; however, these analyses can also be performed for the CFA model types (model = 'lmean', model = 'uc', or model = 'simple').
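If all CFA variants are of interest, they can be fitted in one loop (a sketch, reusing blocks_sorted and data from above; estimation times add up accordingly):
# Sketch: fit the three CFA variants with fit.lavaan in one loop
cfa_models <- c('lmean', 'uc', 'simple')
results_cfa <- lapply(cfa_models, function(m) fit.lavaan(blocks_sorted, itf, model = m, data = data))
names(results_cfa) <- cfa_models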
# Blocks_sorted
# Mplus
results_mplus_b <- fit.mplus(blocks_sorted, itf, model = 'irt', input_path = 'myFC_model_bu.inp',
                             output_path = "myFC_model_bu.out", data_path = "myDataFile.dat",
                             data_full = TRUE)
unlist(results_mplus_b$fit)
srmr cfi tli chisq df pvalue
0.0400 1.0000 1.0000 345.9970 375.0000 0.8561
blchisq bldf blpvalue rmsea rmsea_p rmsea.ci.lower
4370.6990 435.0000 0.0000 0.0000 1.0000 0.0000
rmsea.ci.upper
0.0070
#lavaan
results_lav_b <- fit.lavaan(blocks_sorted, itf, model = 'irt', data = data)
results_lav_b
lavaan 0.6-11 ended normally after 126 iterations
Estimator ULS
Optimization method NLMINB
Number of model parameters 155
Number of equality constraints 78
Row rank of the constraints matrix 65
Number of observations 1000
Model Test User Model:
Standard Robust
Test Statistic 787.985 345.776
Degrees of freedom 375 375
P-value (Unknown) NA 0.858
Scaling correction factor 3.576
Shift parameter 125.438
simple second-order correction
lavaan::fitmeasures(results_lav_b)[c('rmsea.scaled', 'rmsea.ci.lower.scaled', 'rmsea.ci.upper.scaled',
                                     'rmsea.pvalue.scaled', 'cfi.scaled')]
rmsea.scaled rmsea.ci.lower.scaled rmsea.ci.upper.scaled rmsea.pvalue.scaled
0.00000000 0.00000000 0.00664782 1.00000000
cfi.scaled
1.00000000
lavaan::summary(results_lav_b)
lavaan 0.6-11 ended normally after 126 iterations
Estimator ULS
Optimization method NLMINB
Number of model parameters 155
Number of equality constraints 78
Row rank of the constraints matrix 65
Number of observations 1000
Model Test User Model:
Standard Robust
Test Statistic 787.985 345.776
Degrees of freedom 375 375
P-value (Unknown) NA 0.858
Scaling correction factor 3.576
Shift parameter 125.438
simple second-order correction
Parameter Estimates:
Standard errors Robust.sem
Information Expected
Information saturated (h1) model Unstructured
Latent Variables:
Estimate Std.Err z-value P(>|z|)
Trait1 =~
i1i2 (L1) 1.409 0.230 6.130 0.000
i1i25 (L1) 1.409 0.230 6.130 0.000
i6i29 (L6) 1.443 0.238 6.066 0.000
i6i30 (L6) 1.443 0.238 6.066 0.000
i7i26 (L_26) -1.188 0.150 -7.938 0.000
i8i11 (L_11) -1.455 0.277 -5.243 0.000
i8i21 (L_21) -1.051 0.231 -4.552 0.000
i10i26 (L_26) -1.188 0.150 -7.938 0.000
i11i21 (L11_) 0.403 0.106 3.816 0.000
i15i16 (L_16) -0.513 0.104 -4.925 0.000
i16i18 (L16) 0.513 0.104 4.925 0.000
Trait2 =~
i1i2 (L_2) 1.499 0.261 5.747 0.000
i2i25 (L2) -1.499 0.261 -5.747 0.000
i5i22 (L_22) 1.233 0.144 8.562 0.000
i7i10 (L7) -1.121 0.165 -6.813 0.000
i7i26 (L7) -1.121 0.165 -6.813 0.000
i12i20 (L12) -0.515 0.112 -4.584 0.000
i12i27 (L12_) -0.071 0.082 -0.861 0.389
i14i22 (L_22) 1.233 0.144 8.562 0.000
i17i19 (L17) -0.353 0.091 -3.877 0.000
i17i24 (L17) -0.353 0.091 -3.877 0.000
i20i27 (L_27) 0.444 0.107 4.136 0.000
Trait3 =~
i3i9 (L3) -0.172 0.170 -1.012 0.311
i3i13 (L3_L) 0.329 0.109 3.033 0.002
i4i23 (L_23) 0.754 0.246 3.069 0.002
i4i28 (L_28) 0.095 0.253 0.376 0.707
i8i11 (L8) -0.569 0.291 -1.954 0.051
i8i21 (L8) -0.569 0.291 -1.954 0.051
i9i13 (L_13) 0.502 0.180 2.792 0.005
i15i18 (L_18) 0.385 0.150 2.561 0.010
i16i18 (L_18) 0.385 0.150 2.561 0.010
i23i28 (L23_) -0.658 0.170 -3.871 0.000
Trait4 =~
i3i9 (L_9) 0.693 0.116 5.959 0.000
i4i23 (L4) -1.084 0.192 -5.630 0.000
i4i28 (L4) -1.084 0.192 -5.630 0.000
i5i14 (L_14) 0.949 0.132 7.205 0.000
i6i29 (L_29) 1.530 0.242 6.328 0.000
i9i13 (L9) -0.693 0.116 -5.959 0.000
i14i22 (L14) -0.949 0.132 -7.205 0.000
i17i19 (L_19) 0.625 0.097 6.458 0.000
i17i24 (L_24) 0.718 0.102 7.049 0.000
i19i24 (L19_) 0.092 0.071 1.310 0.190
i29i30 (L29) -1.530 0.242 -6.328 0.000
Trait5 =~
i1i25 (L_25) 1.716 0.251 6.838 0.000
i2i25 (L_25) 1.716 0.251 6.838 0.000
i5i14 (L5) -0.764 0.128 -5.948 0.000
i5i22 (L5) -0.764 0.128 -5.948 0.000
i6i30 (L_30) 1.366 0.220 6.205 0.000
i7i10 (L_10) 1.050 0.150 7.016 0.000
i10i26 (L10) -1.050 0.150 -7.016 0.000
i12i20 (L_20) 0.472 0.101 4.672 0.000
i15i16 (L15) -0.657 0.121 -5.451 0.000
i15i18 (L15) -0.657 0.121 -5.451 0.000
i20i27 (L20) -0.472 0.101 -4.672 0.000
i29i30 (L_30) 1.366 0.220 6.205 0.000
Covariances:
Estimate Std.Err z-value P(>|z|)
Trait1 ~~
Trait2 -0.372 0.104 -3.567 0.000
Trait3 0.102 0.198 0.514 0.607
Trait4 -0.394 0.089 -4.410 0.000
Trait5 -0.329 0.110 -2.996 0.003
Trait2 ~~
Trait3 -0.216 0.193 -1.123 0.261
Trait4 0.283 0.114 2.480 0.013
Trait5 0.298 0.123 2.425 0.015
Trait3 ~~
Trait4 -0.142 0.190 -0.747 0.455
Trait5 -0.202 0.205 -0.983 0.326
Trait4 ~~
Trait5 0.412 0.099 4.172 0.000
.i1i2 ~~
.i1i25 (e1) 1.000
.i2i25 (e_2) -0.713 0.310 -2.298 0.022
.i1i25 ~~
.i2i25 (e25) 0.113 0.223 0.507 0.612
.i3i9 ~~
.i3i13 (e3) 1.000 NA
.i9i13 (e_9) -0.921 0.235 -3.924 0.000
.i3i13 ~~
.i9i13 (e13) 0.824 0.219 3.763 0.000
.i4i23 ~~
.i4i28 (e4) 1.000
.i23i28 (e_23) -0.424 0.238 -1.784 0.074
.i4i28 ~~
.i23i28 (e28) 1.243 0.360 3.455 0.001
.i5i22 ~~
.i5i14 (e5) 1.000 NA
.i14i22 ~~
.i5i14 (e_14) -0.874 0.212 -4.113 0.000
.i5i22 ~~
.i14i22 (e22) 0.427 0.194 2.203 0.028
.i6i29 ~~
.i6i30 (e6) 1.000 NA
.i29i30 (e_29) -0.708 0.341 -2.080 0.038
.i6i30 ~~
.i29i30 (e30) 0.859 0.348 2.470 0.014
.i7i26 ~~
.i7i10 (e7) 1.000 NA
.i10i26 ~~
.i7i10 (e_10) -0.673 0.196 -3.442 0.001
.i7i26 ~~
.i10i26 (e26) 0.678 0.193 3.519 0.000
.i8i11 ~~
.i8i21 (e8) 1.000
.i11i21 (e_11) -0.189 0.176 -1.074 0.283
.i8i21 ~~
.i11i21 (e21) 1.212 0.387 3.132 0.002
.i12i20 ~~
.i12i27 (e12) 1.000 NA
.i20i27 (e_20) -1.639 0.358 -4.577 0.000
.i12i27 ~~
.i20i27 (e27) 1.147 0.277 4.138 0.000
.i15i16 ~~
.i15i18 (e15) 1.000
.i16i18 (e_16) -1.757 0.422 -4.160 0.000
.i16i18 ~~
.i15i18 (e18) 1.235 0.349 3.535 0.000
.i17i19 ~~
.i17i24 (e17) 1.000 NA
.i19i24 (e_19) -0.896 0.182 -4.914 0.000
.i17i24 ~~
.i19i24 (e24) 0.799 0.170 4.711 0.000
Intercepts:
Estimate Std.Err z-value P(>|z|)
.i1i2 0.000
.i1i25 0.000
.i6i29 0.000
.i6i30 0.000
.i7i26 0.000
.i8i11 0.000
.i8i21 0.000
.i10i26 0.000
.i11i21 0.000
.i15i16 0.000
.i16i18 0.000
.i2i25 0.000
.i5i22 0.000
.i7i10 0.000
.i12i20 0.000
.i12i27 0.000
.i14i22 0.000
.i17i19 0.000
.i17i24 0.000
.i20i27 0.000
.i3i9 0.000
.i3i13 0.000
.i4i23 0.000
.i4i28 0.000
.i9i13 0.000
.i15i18 0.000
.i23i28 0.000
.i5i14 0.000
.i19i24 0.000
.i29i30 0.000
Trait1 0.000
Trait2 0.000
Trait3 0.000
Trait4 0.000
Trait5 0.000
Thresholds:
Estimate Std.Err z-value P(>|z|)
i1i2|t1 0.722 0.108 6.667 0.000
i1i25|t1 -1.113 0.140 -7.948 0.000
i6i29|t1 -2.274 0.259 -8.769 0.000
i6i30|t1 -1.626 0.195 -8.342 0.000
i7i26|t1 0.101 0.073 1.385 0.166
i8i11|t1 0.215 0.082 2.641 0.008
i8i21|t1 -0.934 0.142 -6.562 0.000
i10i26|t1 0.483 0.084 5.739 0.000
i11i21|t1 -1.296 0.208 -6.220 0.000
i15i16|t1 0.194 0.073 2.663 0.008
i16i18|t1 1.071 0.136 7.884 0.000
i2i25|t1 -1.980 0.244 -8.101 0.000
i5i22|t1 -0.388 0.072 -5.400 0.000
i7i10|t1 -0.415 0.079 -5.230 0.000
i12i20|t1 1.170 0.110 10.667 0.000
i12i27|t1 0.868 0.083 10.406 0.000
i14i22|t1 -0.273 0.073 -3.753 0.000
i17i19|t1 -0.118 0.060 -1.949 0.051
i17i24|t1 0.384 0.065 5.948 0.000
i20i27|t1 -0.252 0.074 -3.381 0.001
i3i9|t1 -1.054 0.097 -10.882 0.000
i3i13|t1 0.685 0.072 9.570 0.000
i4i23|t1 -0.525 0.091 -5.740 0.000
i4i28|t1 -0.627 0.099 -6.344 0.000
i9i13|t1 1.727 0.166 10.373 0.000
i15i18|t1 1.234 0.125 9.836 0.000
i23i28|t1 -0.091 0.058 -1.555 0.120
i5i14|t1 -0.184 0.066 -2.762 0.006
i19i24|t1 0.457 0.065 7.045 0.000
i29i30|t1 0.588 0.113 5.213 0.000
Variances:
Estimate Std.Err z-value P(>|z|)
Trait1 1.000
Trait2 1.000
Trait3 1.000
Trait4 1.000
Trait5 1.000
.i1i2 (e1e2) 1.713 0.310 5.524 0.000
.i1i25 (e125) 1.113 0.223 4.984 0.000
.i2i25 (e225) 0.826 0.398 2.074 0.038
.i3i9 (e3e9) 1.921 0.235 8.185 0.000
.i3i13 (e313) 1.824 0.219 8.329 0.000
.i4i23 (e423) 1.424 0.238 5.987 0.000
.i4i28 (e428) 2.243 0.360 6.235 0.000
.i5i14 (e514) 1.874 0.212 8.819 0.000
.i5i22 (e522) 1.427 0.194 7.366 0.000
.i6i29 (e629) 1.708 0.341 5.015 0.000
.i6i30 (e630) 1.859 0.348 5.346 0.000
.i7i10 (e710) 1.673 0.196 8.557 0.000
.i7i26 (e726) 1.678 0.193 8.709 0.000
.i8i11 (e811) 1.189 0.176 6.750 0.000
.i8i21 (e821) 2.212 0.387 5.717 0.000
.i9i13 (e913) 1.745 0.374 4.668 0.000
.i10i26 (e102) 1.351 0.317 4.265 0.000
.i11i21 (e112) 1.401 0.454 3.087 0.002
.i12i20 (e1220) 2.639 0.358 7.370 0.000
.i12i27 (e1227) 2.147 0.277 7.747 0.000
.i14i22 (e142) 1.301 0.318 4.090 0.000
.i15i16 (e1516) 2.757 0.422 6.528 0.000
.i15i18 (e1518) 2.235 0.349 6.396 0.000
.i16i18 (e161) 2.993 0.707 4.234 0.000
.i17i19 (e171) 1.896 0.182 10.402 0.000
.i17i24 (e172) 1.799 0.170 10.605 0.000
.i19i24 (e192) 1.695 0.301 5.632 0.000
.i20i27 (e202) 2.786 0.572 4.866 0.000
.i23i28 (e232) 1.667 0.497 3.357 0.001
.i29i30 (e293) 1.567 0.559 2.801 0.005
Scales y*:
Estimate Std.Err z-value P(>|z|)
i1i2 0.478
i1i25 0.474
i6i29 0.477
i6i30 0.471
i7i26 0.546
i8i11 0.513
i8i21 0.516
i10i26 0.573
i11i21 0.800
i15i16 0.556
i16i18 0.539
i2i25 0.472
i5i22 0.580
i7i10 0.548
i12i20 0.579
i12i27 0.682
i14i22 0.572
i17i19 0.661
i17i24 0.660
i20i27 0.570
i3i9 0.637
i3i13 0.719
i4i23 0.542
i4i28 0.538
i9i13 0.623
i15i18 0.585
i23i28 0.690
i5i14 0.602
i19i24 0.766
i29i30 0.497
Defined Parameters:
Estimate Std.Err z-value P(>|z|)
L11 1.455 0.277 5.243 0.000
L13 -0.502 0.180 -2.792 0.005
L18 -0.385 0.150 -2.561 0.010
L19 -0.625 0.097 -6.458 0.000
L21 1.051 0.231 4.552 0.000
L22 -1.233 0.144 -8.562 0.000
L23 -0.754 0.246 -3.069 0.002
L24 -0.718 0.102 -7.049 0.000
L25 -1.716 0.251 -6.838 0.000
L26 1.188 0.150 7.938 0.000
L27 -0.444 0.107 -4.136 0.000
L28 -0.095 0.253 -0.376 0.707
L30 -1.366 0.220 -6.205 0.000
Constraints:
|Slack|
L11_L21 - (L11-L21) 0.000
L12_L27 - (L12-L27) 0.000
L3_L13 - (L3-L13) 0.000
L23_L28 - (L23-L28) 0.000
L19_L24 - (L19-L24) 0.000
L_2 - (-L2) 0.000
L_9 - (-L9) 0.000
L_10 - (-L10) 0.000
L_11 - (-L11) 0.000
L_13 - (-L13) 0.000
L_14 - (-L14) 0.000
L_16 - (-L16) 0.000
L_18 - (-L18) 0.000
L_19 - (-L19) 0.000
L_20 - (-L20) 0.000
L_21 - (-L21) 0.000
L_22 - (-L22) 0.000
L_23 - (-L23) 0.000
L_24 - (-L24) 0.000
L_25 - (-L25) 0.000
L_26 - (-L26) 0.000
L_27 - (-L27) 0.000
L_28 - (-L28) 0.000
L_29 - (-L29) 0.000
L_30 - (-L30) 0.000
e1e2 - (e1-e_2) 0.000
e3e9 - (e3-e_9) 0.000
e1e25 - (e1+e25) 0.000
e3e13 - (e3+e13) 0.000
e4e28 - (e4+e28) 0.000
e5e22 - (e5+e22) 0.000
e6e30 - (e6+e30) 0.000
e7e26 - (e7+e26) 0.000
e8e21 - (e8+e21) 0.000
e4e23 - (e4-e_23) 0.000
e5e14 - (e5-e_14) 0.000
e6e29 - (e6-e_29) 0.000
e7e10 - (e7-e_10) 0.000
e8e11 - (e8-e_11) 0.000
e12e27 - (e12+e27) 0.000
e15e18 - (e15+e18) 0.000
e17e24 - (e17+e24) 0.000
e2e25 - (-e_2+e25) 0.000
e9e13 - (-e_9+e13) 0.000
e12e20 - (e12-e_20) 0.000
e15e16 - (e15-e_16) 0.000
e17e19 - (e17-e_19) 0.000
e10e26 - (-e_10+e26) 0.000
e11e21 - (-e_11+e21) 0.000
e14e22 - (-e_14+e22) 0.000
e16e18 - (-e_16+e18) 0.000
e19e24 - (-e_19+e24) 0.000
e20e27 - (-e_20+e27) 0.000
e23e28 - (-e_23+e28) 0.000
e29e30 - (-e_29+e30) 0.000
e3 - (1) 0.000
e15 - (1) 0.000
e8 - (1) 0.000
e5 - (1) 0.000
e17 - (1) 0.000
e1 - (1) 0.000
e7 - (1) 0.000
e12 - (1) 0.000
e4 - (1) 0.000
e6 - (1) 0.000
scores <- lavaan::lavPredict(results_lav_b)
# Recoded data
# Mplus
results_mplus_brec <- fit.mplus(blocks, itf, model = 'irt', input_path = 'myFC_model_brecu.inp',
                                output_path = "myFC_model_brecu.out", data_path = "myDataFile_rec.dat",
                                data_full = TRUE)
unlist(results_mplus_brec$fit)
srmr cfi tli chisq df pvalue
0.0400 1.0000 1.0000 345.9970 375.0000 0.8561
blchisq bldf blpvalue rmsea rmsea_p rmsea.ci.lower
4370.6990 435.0000 0.0000 0.0000 1.0000 0.0000
rmsea.ci.upper
0.0070
#lavaan
results_lav_brec <- fit.lavaan(blocks, itf, model = 'irt', data = data_recoded)
results_lav_brec
lavaan 0.6-11 ended normally after 143 iterations
Estimator ULS
Optimization method NLMINB
Number of model parameters 155
Number of equality constraints 78
Row rank of the constraints matrix 65
Number of observations 1000
Model Test User Model:
Standard Robust
Test Statistic 787.985 345.776
Degrees of freedom 375 375
P-value (Unknown) NA 0.858
Scaling correction factor 3.576
Shift parameter 125.438
simple second-order correction
lavaan::fitmeasures(results_lav_brec)[c('rmsea.scaled', 'rmsea.ci.lower.scaled',
                                        'rmsea.ci.upper.scaled', 'rmsea.pvalue.scaled',
                                        'cfi.scaled')]
rmsea.scaled rmsea.ci.lower.scaled rmsea.ci.upper.scaled rmsea.pvalue.scaled
0.000000000 0.000000000 0.006647839 1.000000000
cfi.scaled
1.000000000
scores <- lavaan::lavPredict(results_lav_brec)
In the case of blocks, we must correct the fit indices, as there are redundancies among the thresholds and tetrachoric correlations. The redundancies can be determined with the function redundancies. We can also directly correct the fit with fit.correct:
#save fit measures
tmp <- lavaan::fitmeasures(results_lav_brec)
fit.correct(1000, blocks, tmp['chisq.scaled'], tmp['df.scaled'], tmp['baseline.chisq.scaled'],
            tmp['baseline.df.scaled'])
df.df_corr rmsea cfi
365 0 1
Now, we consider linked block designs (Jansen & Schulze, 2023b). The simplest way to construct these designs is to take the original unlinked design and link all blocks together (rank of the design \(r_D=N-1\), where \(N\) is the number of items; Jansen & Schulze, 2023b). The original blocks are:
blocks
[,1] [,2] [,3]
[1,] 25 1 2
[2,] 30 29 6
[3,] 16 18 15
[4,] 5 14 22
[5,] 24 19 17
[6,] 13 9 3
[7,] 26 10 7
[8,] 8 21 11
[9,] 20 27 12
[10,] 4 28 23
To get a linked design, we need extra blocks. Their number can be determined with count.xblocks:
count.xblocks(blocks)
[1] 5
Hence, we need five extra triplets in this case. We add for example the
following linking blocks:
blocks_extra <- matrix(c(25,30,16,5,25,13,26,8,20,4,29,19,1,3,21), ncol = 3, byrow = TRUE)
blocks_con <- rbind(blocks, blocks_extra)
We can determine the rank of the design matrix with rankA:
rankA(blocks_con)
[1] 29
The rank is 30-1=29. The function metablock also gives an overview of which blocks are linked.
metablock(blocks_con)
[[1]]
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
[29] 29 30
There is only one meta block; therefore, all items are linked. A counterexample arises if we do not add the fourth extra block:
blocks_extra <- matrix(c(25,30,16,5,25,13,26,8,20,1,3,21), ncol = 3, byrow = TRUE)
blocks_con <- rbind(blocks, blocks_extra)
Then the rank is
rankA(blocks_con)
[1] 27
Now the rank is 27 \(\neq 30-1=29\), and we have three “meta” blocks:
metablock(blocks_con)
[[1]]
[1] 17 19 24
[[2]]
[1] 4 23 28
[[3]]
[1] 1 2 3 5 6 7 8 9 10 11 12 13 14 15 16 18 20 21 22 25 26 27 29 30
Instead of manually constructing the blocks, ThurMod enables the user to get extra blocks via the function get.xblocks. We have to define whether the blocks should be multidimensional (else multidim = FALSE), and whether some items should not be considered (e.g., because they are negatively keyed):
blocks_extra <- get.xblocks(blocks, itf, multidim = TRUE, item_not = NULL)
blocks_con <- rbind(blocks, blocks_extra)
blocks_con
[,1] [,2] [,3]
[1,] 25 1 2
[2,] 30 29 6
[3,] 16 18 15
[4,] 5 14 22
[5,] 24 19 17
[6,] 13 9 3
[7,] 26 10 7
[8,] 8 21 11
[9,] 20 27 12
[10,] 4 28 23
[11,] 2 29 18
[12,] 14 17 13
[13,] 26 8 20
[14,] 4 30 3
[15,] 10 22 8
blocks_con_sorted <- blocksort(blocks_con)
If we take the blocks as is, we need to recode the data set:
# Get names of binary indicators that have non-ascending names
tmp <- which(pair.combn(blocks_con)[,1] > pair.combn(blocks_con)[,2])

# get names
pair_names_b <- i.name(blocks_con)
pair_names_ori <- i.name(1:nitem)

# Rename
pair_names <- i.name(1:nitem)
if(length(tmp) != 0){
  tmp1 <- pair_names_b[tmp]
  tmp2 <- sub('^i.+i', 'i', tmp1)
  tmp3 <- tmp1
  for(j in 1:length(tmp)){
    tmp3 <- paste0(tmp2[j], sub(paste0(tmp2[j], '$'), '', tmp1[j]))
    pair_names[which(pair_names %in% tmp3)] <- pair_names_b[tmp[j]]
  }
}

tmp <- which(!names(data) %in% pair_names)
# Clone data
data_recoded <- data
# Recode and rename
data_recoded[,tmp] <- abs(data[,tmp] - 1)
names(data_recoded) <- pair_names
# Save data
write.table(data_recoded, 'myDataFile_rec.dat', quote = FALSE, sep = " ", col.names = FALSE,
            row.names = FALSE)
The estimation of linked designs is done equivalently via:
# Blocks_sorted
# Mplus
results_mplus_b <- fit.mplus(blocks_con_sorted, itf, model = 'irt', input_path = 'myFC_model_b_con.inp',
                             output_path = "myFC_model_b_con.out", data_path = "myDataFile.dat",
                             data_full = TRUE)
unlist(results_mplus_b$fit)
srmr cfi tli chisq df pvalue
0.0460 0.9990 0.9990 926.5150 921.0000 0.4428
blchisq bldf blpvalue rmsea rmsea_p rmsea.ci.lower
9586.1610 990.0000 0.0000 0.0020 1.0000 0.0000
rmsea.ci.upper
0.0090
#lavaan
results_lav_b <- fit.lavaan(blocks_con_sorted, itf, model = 'irt', data = data)
results_lav_b
lavaan 0.6-11 ended normally after 148 iterations
Estimator ULS
Optimization method NLMINB
Number of model parameters 294
Number of equality constraints 191
Row rank of the constraints matrix 180
Number of observations 1000
Model Test User Model:
Standard Robust
Test Statistic 2309.927 926.019
Degrees of freedom 921 921
P-value (Unknown) NA 0.447
Scaling correction factor 4.658
Shift parameter 430.112
simple second-order correction
lavaan::fitmeasures(results_lav_b)[c('rmsea.scaled', 'rmsea.ci.lower.scaled', 'rmsea.ci.upper.scaled',
                                     'rmsea.pvalue.scaled', 'cfi.scaled')]
rmsea.scaled rmsea.ci.lower.scaled rmsea.ci.upper.scaled rmsea.pvalue.scaled
0.002335691 0.000000000 0.009340939 1.000000000
cfi.scaled
0.999415466
scores <- lavaan::lavPredict(results_lav_b)
# Recoded data
# Mplus
results_mplus_brec <- fit.mplus(blocks_con, itf, model = 'irt', input_path = 'myFC_model_brec.inp',
                                output_path = "myFC_model_brec.out", data_path = "myDataFile_rec.dat",
                                data_full = TRUE, byblock = FALSE)
unlist(results_mplus_brec$fit)
srmr cfi tli chisq df pvalue
0.0460 0.9990 0.9990 926.5160 921.0000 0.4428
blchisq bldf blpvalue rmsea rmsea_p rmsea.ci.lower
9586.1610 990.0000 0.0000 0.0020 1.0000 0.0000
rmsea.ci.upper
0.0090
#lavaan
results_lav_brec <- fit.lavaan(blocks_con, itf, model = 'irt', data = data_recoded)
results_lav_brec
lavaan 0.6-11 ended normally after 137 iterations
Estimator ULS
Optimization method NLMINB
Number of model parameters 294
Number of equality constraints 190
Row rank of the constraints matrix 180
Number of observations 1000
Model Test User Model:
Standard Robust
Test Statistic 2309.927 926.019
Degrees of freedom 921 921
P-value (Unknown) NA 0.447
Scaling correction factor 4.658
Shift parameter 430.112
simple second-order correction
lavaan::fitmeasures(results_lav_brec)[c('rmsea.scaled', 'rmsea.ci.lower.scaled',
                                        'rmsea.ci.upper.scaled', 'rmsea.pvalue.scaled',
                                        'cfi.scaled')]
rmsea.scaled rmsea.ci.lower.scaled rmsea.ci.upper.scaled rmsea.pvalue.scaled
0.002335673 0.000000000 0.009340934 1.000000000
cfi.scaled
0.999415475
scores <- lavaan::lavPredict(results_lav_brec)
Here, too, we must correct the fit indices, as there are redundancies among the thresholds and tetrachoric correlations. Again, the redundancies can be determined with the function redundancies. We directly correct the fit with fit.correct:
#save fit measures
tmp <- lavaan::fitmeasures(results_lav_brec)
fit.correct(1000, blocks_con, tmp['chisq.scaled'], tmp['df.scaled'], tmp['baseline.chisq.scaled'],
            tmp['baseline.df.scaled'])
df.df_corr rmsea cfi
9.060000e+02 4.703037e-03 9.976687e-01
The estimation of a full model will seldom be of interest; more likely, (linked) block designs are of interest. Especially for simulations with many items (more than 50), many binary variables are simulated and saved. While simulating all binary items is important (see Jansen & Schulze, 2023b), saving all of them is not. sim.data can also return only the variables that are of relevance. For that, we need to specify the argument variables. Assume we have the linked block design specified before. We only have to pass the item names of the sorted blocks as the variables argument:
#Get the relevant names
tmp_names <- i.name(blocks_con_sorted)

#same example as before
set.seed(1234)
data <- sim.data(nfactor = nfactor, nitem = nitem, nperson = nperson, itf = itf, varcov = varcov,
                 lmu = lmu, loadings = loadings, variables = tmp_names)

#save the file
write.table(data, 'myDataFile.dat', quote = FALSE, sep = " ", col.names = FALSE,
            row.names = FALSE)
For the Mplus analysis, the argument data_full must then be set to FALSE.
Taken together, we have demonstrated the basic steps of using the functions provided by ThurMod. Please report any bugs. If you miss functions that could be helpful for the analysis of Thurstonian forced-choice data, please let me know.
Brown, A., & Maydeu-Olivares, A. (2011). Item response modeling of forced-choice questionnaires. Educational and Psychological Measurement, 71(3), 460-502. https://doi.org/10.1177/0013164410375112
Jansen, M. T., & Schulze, R. (2023a). Item scaling of social desirability using conjoint measurement: A comparison of ratings, paired comparisons, and rankings. Manuscript in preparation.
Jansen, M. T., & Schulze, R. (2023b). The Thurstonian linked block design: Improving Thurstonian modeling for paired comparison and ranking data. Manuscript submitted.
Maydeu-Olivares, A., & Böckenholt, U. (2005). Structural equation modeling of paired-comparison and ranking data. Psychological Methods, 10(3), 285–304. https://doi.org/10.1037/1082-989X.10.3.285
Maydeu-Olivares, A., & Brown, A. (2010). Item response modeling of paired comparison and ranking data. Multivariate Behavioral Research, 45(6), 935-974. https://doi.org/10.1080/00273171.2010.531231