Simulating data and estimating Thurstonian IRT and factor models with ThurMod

Markus Thomas Jansen

2023-03-05

Introduction

In this vignette, the basic procedures in data-handling, simulation and estimation with the package ThurMod are performed. For the moment, we separate two different model types:

The Thurstonian factor model was introduced by Maydeu-Olivares & Böckenholt (2005), the Thurstonian IRT model was introduced by Maydeu-Olivares & Brown (2010). For a review see Jansen & Schulze (2023a). For further extensions and discussion about the model types see Jansen & Schulze (2023b). For example, the factor and IRT model both can be further differentiated. For the differentiation we use the design matrix \(A\) which represents all paired comparisons of a design. For a design of four items, this would be

     [,1] [,2] [,3] [,4]
[1,]    1   -1    0    0
[2,]    1    0   -1    0
[3,]    1    0    0   -1
[4,]    0    1   -1    0
[5,]    0    1    0   -1
[6,]    0    0    1   -1

The design matrix can be (Jansen & Schulze, 2023b):

Examples

First, we will load the package required in the vignette.

library(ThurMod)
The next step is to define the model we are interested in. For this we have to define the following aspects: We consider a model with four factors (traits), measured by 12 items. The item-to-factor relations are defined so that

Further we simulate data of 1000 respondents with loadings between .30 and .95 and latent utility means between -1 and 1. We assume that the data results from rankings, that is that transitivity between responses holds (the variance of the binary indicators is zero). Further, we assume uncorrelated data. We will simulate the data of all paired comparisons possible (see Jansen & Schulze, 2023b).

Set up the simulation conditions:

nfactor <- 4
nitem <- 12
nperson <- 1000
itf <- rep(1:4, 3)
varcov <- diag(1, 4)

# latent utility means
set.seed(69)
lmu <- runif(nitem, -1, 1)
loadings <- runif(nitem, 0.30, 0.95)

Next, we simulate a data set based on the true parameter values:

data <- sim.data(nfactor = nfactor, nitem = nitem, nperson = nperson, itf = itf,
                 varcov = varcov, lmu = lmu, loadings = loadings)

#save the file
write.table(data, 'myDataFile_f.dat', quote = FALSE, sep = " ", col.names = FALSE, row.names = 
              FALSE)

The data set contains all (12 )/2 = 66 possible paired comparison variables. With this data set we will perform analyses of the full design (IRT, CFA).

Full model

Full models include all paired comparisons. The estimation of these models can take a while, as with categorical data a large correlations matrix must be estimated.

For all functions, the blocks we use must be defined. In a full model, there is only one block with all items:

blocks <- matrix(1:nitem, nrow = 1)

The blocks must be defined as a matrix where the rows correspond to each block. Only for a full design, a vector, for example, 1:12 would work.

Thurstonian factor models

We will analyse the data with Mplus and lavaan. We can do this in two ways: First, in three separate steps, second, directly.

Way 1, step 1: Build syntax

# Mplus
syntax.mplus(blocks, itf, model = 'lmean', input_path = 'myFC_model_f.inp', data_path = 
               "myDataFile_f.dat")

#lavaan
modelsyn <- syntax.lavaan(blocks, itf, model = 'lmean')

For step 2, now these syntaxes must be run:

# Mplus
system('mplus myFC_model_f.inp', wait = TRUE, show.output.on.console= FALSE)

#lavaan
results_lav1 <- lavaan::lavaan(modelsyn, data = data, ordered = TRUE, auto.fix.first = FALSE,
                      auto.var = TRUE, int.lv.free = TRUE, parameterization = "theta",
                      estimator = 'ULSMV')

If you replicate this, be patient, it can take about 40 minutes each, depending on your processing power.

Step 3: Read results. For Mplus you can either open the output file, or use read.mplus

# Mplus
results_mplus1 <- read.mplus(blocks, itf, model = 'lmean', output_path = "myFC_model_f.out")
unlist(results_mplus1$fit)
          srmr            cfi            tli          chisq             df         pvalue 
        0.0460         1.0000         1.0000      2131.5890      2171.0000         0.7231 
       blchisq           bldf       blpvalue          rmsea        rmsea_p rmsea.ci.lower 
    20861.0240      2145.0000         0.0000         0.0000         1.0000         0.0000 
rmsea.ci.upper 
        0.0060 
#lavaan
results_lav1
lavaan 0.6-11 ended normally after 229 iterations

  Estimator                                        ULS
  Optimization method                           NLMINB
  Number of model parameters                        40
                                                      
  Number of observations                          1000
                                                      
Model Test User Model:
                                              Standard      Robust
  Test Statistic                              4941.945    2130.962
  Degrees of freedom                              2171        2171
  P-value (Unknown)                                 NA       0.726
  Scaling correction factor                                  7.895
  Shift parameter                                         1504.987
       simple second-order correction                             
lavaan::fitmeasures(results_lav1)[c('rmsea.scaled','rmsea.ci.lower.scaled','rmsea.ci.upper.scaled',
                                    'rmsea.pvalue.scaled','cfi.scaled')]
         rmsea.scaled rmsea.ci.lower.scaled rmsea.ci.upper.scaled   rmsea.pvalue.scaled 
          0.000000000           0.000000000           0.005706838           1.000000000 
           cfi.scaled 
          1.000000000 

The second way is to use fit_mplus or fit_lavaan, each function does the above steps at once:

# Mplus
results_mplus2 <- fit.mplus(blocks, itf, model = 'lmean', input_path = 'myFC_model_f.inp',
                            output_path = "myFC_model_f.out", data_path = "myDataFile_f.dat")
unlist(results_mplus2$fit)
          srmr            cfi            tli          chisq             df         pvalue 
        0.0460         1.0000         1.0000      2131.5890      2171.0000         0.7231 
       blchisq           bldf       blpvalue          rmsea        rmsea_p rmsea.ci.lower 
    20861.0240      2145.0000         0.0000         0.0000         1.0000         0.0000 
rmsea.ci.upper 
        0.0060 
#lavaan
results_lav2 <- fit.lavaan(blocks, itf, model = 'lmean', data = data)
results_lav2
lavaan 0.6-11 ended normally after 229 iterations

  Estimator                                        ULS
  Optimization method                           NLMINB
  Number of model parameters                        40
                                                      
  Number of observations                          1000
                                                      
Model Test User Model:
                                              Standard      Robust
  Test Statistic                              4941.945    2130.962
  Degrees of freedom                              2171        2171
  P-value (Unknown)                                 NA       0.726
  Scaling correction factor                                  7.895
  Shift parameter                                         1504.987
       simple second-order correction                             
lavaan::fitmeasures(results_lav2)[c('rmsea.scaled','rmsea.ci.lower.scaled','rmsea.ci.upper.scaled',
                                    'rmsea.pvalue.scaled','cfi.scaled')]
         rmsea.scaled rmsea.ci.lower.scaled rmsea.ci.upper.scaled   rmsea.pvalue.scaled 
          0.000000000           0.000000000           0.005706838           1.000000000 
           cfi.scaled 
          1.000000000 

Thurstonian IRT models

For IRT models, the procedure is the same, just change model = 'lmean' to model = 'irt', for example

# Mplus
results_mplus2irt <- fit.mplus(blocks, itf, model = 'irt', input_path = 'myFC_model_irt.inp',
                               output_path = "myFC_model_irt.out", data_path = "myDataFile_f.dat")
unlist(results_mplus2irt$fit)

#lavaan
results_lav2irt <- fit.lavaan(blocks, itf, model = 'irt', data = data)
results_lav2irt
lavaan::fitmeasures(results_lav2irt)[c('rmsea.scaled', 'rmsea.ci.lower.scaled',
                                       'rmsea.ci.upper.scaled','rmsea.pvalue.scaled',
                                       'cfi.scaled')]
scores <- lavaan::lavPredict(results_lav2irt)

Block models - unlinked

The first block models were introduced as multidimensional forced-choice blocks (Brown & Maydeu-Olivares, 2011). However, it was shown that the models must neither be multidimensional, nor must every paired comparison only be contained once (Jansen & Schulze, 2023b).

We first simulate a new data set. Set up the simulation conditions:

nfactor <- 5
nitem <- 30
nperson <- 1000
itf <- rep(1:5, 6)
varcov <- diag(1, 5)

# latent utility means
set.seed(69)
lmu <- runif(nitem, -1, 1)
loadings <- runif(nitem, 0.30, 0.95)

Next, we simulate a data set based on the true parameter values:

set.seed(1234)
data <- sim.data(nfactor = nfactor, nitem = nitem, nperson = nperson, itf = itf, varcov = varcov,
                 lmu = lmu, loadings = loadings)
#save the file
write.table(data,'myDataFile.dat', quote = FALSE, sep = " ", col.names = FALSE, row.names = FALSE)

Next, we consider unlinked blocks of three items (triplets):

blocks <- matrix(sample(1:nitem, nitem), ncol = 3)

The blocks are defined by a matrix where the rows are the blocks and the number of columns is the number of items per block. Before we can fit the model, we must ensure that the data fits to the syntax we create. The data simulated before assumes that all items are in ascending order. This is important, as the order of the items determine the coding. We code binary items as \[\begin{equation} y_{l} = \begin{cases} 1 & \text{if item $i$ is chosen over item $j$} \\ 0 & \text{else} .\\ \end{cases} \end{equation}\] In a ranking task, all choice alternatives are presented at once, but for each possible comparison, the coding scheme can be used. For example, consider \(n = 3\) items which are labeled as {A, B, C}. Then we have the pairs {A, B}, {A, C}, {B, C}. If for {A, B} A is preferred over B then \(y_{A,B}=1\) and 0 otherwise. This can be done for every paired comparison and ranking task (Maydeu-Olivares & Böckenholt, 2005): For {A,C,B} the binary outcomes are \(y_{A,B}=1\), \(y_{A,C}=1\), and \(y_{B,C}=0\). However, if the order of the binary item is changed, then the coding is changed accordingly, e.g. \(y_{B,A}=0\), \(y_{C,A}=0\), and \(y_{C,B}=1\).

We can get an overview over all possible comparisons by the block design, by using pair.combn:

pair.combn(blocks)
      [,1] [,2]
 [1,]    1    2
 [2,]    4   23
 [3,]    4   28
 [4,]    5   14
 [5,]    5   22
 [6,]    8   11
 [7,]    8   21
 [8,]    9    3
 [9,]   10    7
[10,]   13    3
[11,]   13    9
[12,]   14   22
[13,]   16   15
[14,]   16   18
[15,]   18   15
[16,]   19   17
[17,]   20   12
[18,]   20   27
[19,]   21   11
[20,]   24   17
[21,]   24   19
[22,]   25    1
[23,]   25    2
[24,]   26    7
[25,]   26   10
[26,]   27   12
[27,]   28   23
[28,]   29    6
[29,]   30    6
[30,]   30   29

See that we have some items, that are not in ascending order. The data simulated assumes, that the first named item (item \(i\)) has a smaller number e.g., i3i9. The blocks we created however, have some comparisons, where the first items have a larger number e.g. i9i3. There are two ways to fit the syntax to the data: First, we can simply sort the blocks via blocksort:

blocks_sorted <- blocksort(blocks)
pair.combn(blocks_sorted)
      [,1] [,2]
 [1,]    1    2
 [2,]    1   25
 [3,]    2   25
 [4,]    3    9
 [5,]    3   13
 [6,]    4   23
 [7,]    4   28
 [8,]    5   14
 [9,]    5   22
[10,]    6   29
[11,]    6   30
[12,]    7   10
[13,]    7   26
[14,]    8   11
[15,]    8   21
[16,]    9   13
[17,]   10   26
[18,]   11   21
[19,]   12   20
[20,]   12   27
[21,]   14   22
[22,]   15   16
[23,]   15   18
[24,]   16   18
[25,]   17   19
[26,]   17   24
[27,]   19   24
[28,]   20   27
[29,]   23   28
[30,]   29   30

Now all paired comparisons that can be derived are in ascending order.

The second way is to recode the corresponding binary indicators in the data, as if we flip the coding schema. We can just recode the variables:

# Get names of binary indicators that have non-ascending names
tmp <- which(pair.combn(blocks)[,1] > pair.combn(blocks)[,2])

# get names
pair_names_b <- i.name(blocks)
pair_names_ori <- i.name(1:nitem)

# Rename
pair_names <- i.name(1:nitem)
if(length(tmp) != 0){
  tmp1 <- pair_names_b[tmp]
  tmp2 <- sub('^i.+i','i', tmp1)
  tmp3 <- tmp1
  for(j in 1:length(tmp)){
    tmp3 <- paste0(tmp2[j], sub(paste0(tmp2[j],'$'), '', tmp1[j]))
    pair_names[which(pair_names %in% tmp3)] <- pair_names_b[tmp[j]]
  }
}

tmp <- which(!names(data) %in% pair_names)
# Clone data
data_recoded <- data
# Recode and rename
data_recoded[,tmp] <- abs(data[,tmp]-1)
names(data_recoded) <- pair_names

# Save data
write.table(data_recoded, 'myDataFile_rec.dat', quote = FALSE, sep = " ", col.names = FALSE,
            row.names = FALSE)

Both ways yield the same results. We have to add the argument data_full = TRUE, as we simulated all data in the data file, but only use some of the data. We demonstrate only the model = 'irt' case, however, also for the CFA model types, these analyses can be performed (model = 'lmean', model = 'uc', or model = 'simple').

# Blocks_sorted
# Mplus
results_mplus_b <- fit.mplus(blocks_sorted, itf, model = 'irt', input_path = 'myFC_model_bu.inp',
                             output_path = "myFC_model_bu.out", data_path = "myDataFile.dat",
                             data_full = TRUE)
unlist(results_mplus_b$fit)
          srmr            cfi            tli          chisq             df         pvalue 
        0.0400         1.0000         1.0000       345.9970       375.0000         0.8561 
       blchisq           bldf       blpvalue          rmsea        rmsea_p rmsea.ci.lower 
     4370.6990       435.0000         0.0000         0.0000         1.0000         0.0000 
rmsea.ci.upper 
        0.0070 
#lavaan
results_lav_b <- fit.lavaan(blocks_sorted, itf, model = 'irt', data = data)
results_lav_b
lavaan 0.6-11 ended normally after 126 iterations

  Estimator                                        ULS
  Optimization method                           NLMINB
  Number of model parameters                       155
  Number of equality constraints                    78
  Row rank of the constraints matrix                65
                                                      
  Number of observations                          1000
                                                      
Model Test User Model:
                                              Standard      Robust
  Test Statistic                               787.985     345.776
  Degrees of freedom                               375         375
  P-value (Unknown)                                 NA       0.858
  Scaling correction factor                                  3.576
  Shift parameter                                          125.438
       simple second-order correction                             
lavaan::fitmeasures(results_lav_b)[c('rmsea.scaled','rmsea.ci.lower.scaled','rmsea.ci.upper.scaled',
                                     'rmsea.pvalue.scaled', 'cfi.scaled')]
         rmsea.scaled rmsea.ci.lower.scaled rmsea.ci.upper.scaled   rmsea.pvalue.scaled 
           0.00000000            0.00000000            0.00664782            1.00000000 
           cfi.scaled 
           1.00000000 
lavaan::summary(results_lav_b)
lavaan 0.6-11 ended normally after 126 iterations

  Estimator                                        ULS
  Optimization method                           NLMINB
  Number of model parameters                       155
  Number of equality constraints                    78
  Row rank of the constraints matrix                65
                                                      
  Number of observations                          1000
                                                      
Model Test User Model:
                                              Standard      Robust
  Test Statistic                               787.985     345.776
  Degrees of freedom                               375         375
  P-value (Unknown)                                 NA       0.858
  Scaling correction factor                                  3.576
  Shift parameter                                          125.438
       simple second-order correction                             

Parameter Estimates:

  Standard errors                           Robust.sem
  Information                                 Expected
  Information saturated (h1) model        Unstructured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)
  Trait1 =~                                           
    i1i2      (L1)    1.409    0.230    6.130    0.000
    i1i25     (L1)    1.409    0.230    6.130    0.000
    i6i29     (L6)    1.443    0.238    6.066    0.000
    i6i30     (L6)    1.443    0.238    6.066    0.000
    i7i26   (L_26)   -1.188    0.150   -7.938    0.000
    i8i11   (L_11)   -1.455    0.277   -5.243    0.000
    i8i21   (L_21)   -1.051    0.231   -4.552    0.000
    i10i26  (L_26)   -1.188    0.150   -7.938    0.000
    i11i21  (L11_)    0.403    0.106    3.816    0.000
    i15i16  (L_16)   -0.513    0.104   -4.925    0.000
    i16i18   (L16)    0.513    0.104    4.925    0.000
  Trait2 =~                                           
    i1i2     (L_2)    1.499    0.261    5.747    0.000
    i2i25     (L2)   -1.499    0.261   -5.747    0.000
    i5i22   (L_22)    1.233    0.144    8.562    0.000
    i7i10     (L7)   -1.121    0.165   -6.813    0.000
    i7i26     (L7)   -1.121    0.165   -6.813    0.000
    i12i20   (L12)   -0.515    0.112   -4.584    0.000
    i12i27  (L12_)   -0.071    0.082   -0.861    0.389
    i14i22  (L_22)    1.233    0.144    8.562    0.000
    i17i19   (L17)   -0.353    0.091   -3.877    0.000
    i17i24   (L17)   -0.353    0.091   -3.877    0.000
    i20i27  (L_27)    0.444    0.107    4.136    0.000
  Trait3 =~                                           
    i3i9      (L3)   -0.172    0.170   -1.012    0.311
    i3i13   (L3_L)    0.329    0.109    3.033    0.002
    i4i23   (L_23)    0.754    0.246    3.069    0.002
    i4i28   (L_28)    0.095    0.253    0.376    0.707
    i8i11     (L8)   -0.569    0.291   -1.954    0.051
    i8i21     (L8)   -0.569    0.291   -1.954    0.051
    i9i13   (L_13)    0.502    0.180    2.792    0.005
    i15i18  (L_18)    0.385    0.150    2.561    0.010
    i16i18  (L_18)    0.385    0.150    2.561    0.010
    i23i28  (L23_)   -0.658    0.170   -3.871    0.000
  Trait4 =~                                           
    i3i9     (L_9)    0.693    0.116    5.959    0.000
    i4i23     (L4)   -1.084    0.192   -5.630    0.000
    i4i28     (L4)   -1.084    0.192   -5.630    0.000
    i5i14   (L_14)    0.949    0.132    7.205    0.000
    i6i29   (L_29)    1.530    0.242    6.328    0.000
    i9i13     (L9)   -0.693    0.116   -5.959    0.000
    i14i22   (L14)   -0.949    0.132   -7.205    0.000
    i17i19  (L_19)    0.625    0.097    6.458    0.000
    i17i24  (L_24)    0.718    0.102    7.049    0.000
    i19i24  (L19_)    0.092    0.071    1.310    0.190
    i29i30   (L29)   -1.530    0.242   -6.328    0.000
  Trait5 =~                                           
    i1i25   (L_25)    1.716    0.251    6.838    0.000
    i2i25   (L_25)    1.716    0.251    6.838    0.000
    i5i14     (L5)   -0.764    0.128   -5.948    0.000
    i5i22     (L5)   -0.764    0.128   -5.948    0.000
    i6i30   (L_30)    1.366    0.220    6.205    0.000
    i7i10   (L_10)    1.050    0.150    7.016    0.000
    i10i26   (L10)   -1.050    0.150   -7.016    0.000
    i12i20  (L_20)    0.472    0.101    4.672    0.000
    i15i16   (L15)   -0.657    0.121   -5.451    0.000
    i15i18   (L15)   -0.657    0.121   -5.451    0.000
    i20i27   (L20)   -0.472    0.101   -4.672    0.000
    i29i30  (L_30)    1.366    0.220    6.205    0.000

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)
  Trait1 ~~                                           
    Trait2           -0.372    0.104   -3.567    0.000
    Trait3            0.102    0.198    0.514    0.607
    Trait4           -0.394    0.089   -4.410    0.000
    Trait5           -0.329    0.110   -2.996    0.003
  Trait2 ~~                                           
    Trait3           -0.216    0.193   -1.123    0.261
    Trait4            0.283    0.114    2.480    0.013
    Trait5            0.298    0.123    2.425    0.015
  Trait3 ~~                                           
    Trait4           -0.142    0.190   -0.747    0.455
    Trait5           -0.202    0.205   -0.983    0.326
  Trait4 ~~                                           
    Trait5            0.412    0.099    4.172    0.000
 .i1i2 ~~                                             
   .i1i25     (e1)    1.000                           
   .i2i25    (e_2)   -0.713    0.310   -2.298    0.022
 .i1i25 ~~                                            
   .i2i25    (e25)    0.113    0.223    0.507    0.612
 .i3i9 ~~                                             
   .i3i13     (e3)    1.000       NA                  
   .i9i13    (e_9)   -0.921    0.235   -3.924    0.000
 .i3i13 ~~                                            
   .i9i13    (e13)    0.824    0.219    3.763    0.000
 .i4i23 ~~                                            
   .i4i28     (e4)    1.000                           
   .i23i28  (e_23)   -0.424    0.238   -1.784    0.074
 .i4i28 ~~                                            
   .i23i28   (e28)    1.243    0.360    3.455    0.001
 .i5i22 ~~                                            
   .i5i14     (e5)    1.000       NA                  
 .i14i22 ~~                                           
   .i5i14   (e_14)   -0.874    0.212   -4.113    0.000
 .i5i22 ~~                                            
   .i14i22   (e22)    0.427    0.194    2.203    0.028
 .i6i29 ~~                                            
   .i6i30     (e6)    1.000       NA                  
   .i29i30  (e_29)   -0.708    0.341   -2.080    0.038
 .i6i30 ~~                                            
   .i29i30   (e30)    0.859    0.348    2.470    0.014
 .i7i26 ~~                                            
   .i7i10     (e7)    1.000       NA                  
 .i10i26 ~~                                           
   .i7i10   (e_10)   -0.673    0.196   -3.442    0.001
 .i7i26 ~~                                            
   .i10i26   (e26)    0.678    0.193    3.519    0.000
 .i8i11 ~~                                            
   .i8i21     (e8)    1.000                           
   .i11i21  (e_11)   -0.189    0.176   -1.074    0.283
 .i8i21 ~~                                            
   .i11i21   (e21)    1.212    0.387    3.132    0.002
 .i12i20 ~~                                           
   .i12i27   (e12)    1.000       NA                  
   .i20i27  (e_20)   -1.639    0.358   -4.577    0.000
 .i12i27 ~~                                           
   .i20i27   (e27)    1.147    0.277    4.138    0.000
 .i15i16 ~~                                           
   .i15i18   (e15)    1.000                           
   .i16i18  (e_16)   -1.757    0.422   -4.160    0.000
 .i16i18 ~~                                           
   .i15i18   (e18)    1.235    0.349    3.535    0.000
 .i17i19 ~~                                           
   .i17i24   (e17)    1.000       NA                  
   .i19i24  (e_19)   -0.896    0.182   -4.914    0.000
 .i17i24 ~~                                           
   .i19i24   (e24)    0.799    0.170    4.711    0.000

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)
   .i1i2              0.000                           
   .i1i25             0.000                           
   .i6i29             0.000                           
   .i6i30             0.000                           
   .i7i26             0.000                           
   .i8i11             0.000                           
   .i8i21             0.000                           
   .i10i26            0.000                           
   .i11i21            0.000                           
   .i15i16            0.000                           
   .i16i18            0.000                           
   .i2i25             0.000                           
   .i5i22             0.000                           
   .i7i10             0.000                           
   .i12i20            0.000                           
   .i12i27            0.000                           
   .i14i22            0.000                           
   .i17i19            0.000                           
   .i17i24            0.000                           
   .i20i27            0.000                           
   .i3i9              0.000                           
   .i3i13             0.000                           
   .i4i23             0.000                           
   .i4i28             0.000                           
   .i9i13             0.000                           
   .i15i18            0.000                           
   .i23i28            0.000                           
   .i5i14             0.000                           
   .i19i24            0.000                           
   .i29i30            0.000                           
    Trait1            0.000                           
    Trait2            0.000                           
    Trait3            0.000                           
    Trait4            0.000                           
    Trait5            0.000                           

Thresholds:
                   Estimate  Std.Err  z-value  P(>|z|)
    i1i2|t1           0.722    0.108    6.667    0.000
    i1i25|t1         -1.113    0.140   -7.948    0.000
    i6i29|t1         -2.274    0.259   -8.769    0.000
    i6i30|t1         -1.626    0.195   -8.342    0.000
    i7i26|t1          0.101    0.073    1.385    0.166
    i8i11|t1          0.215    0.082    2.641    0.008
    i8i21|t1         -0.934    0.142   -6.562    0.000
    i10i26|t1         0.483    0.084    5.739    0.000
    i11i21|t1        -1.296    0.208   -6.220    0.000
    i15i16|t1         0.194    0.073    2.663    0.008
    i16i18|t1         1.071    0.136    7.884    0.000
    i2i25|t1         -1.980    0.244   -8.101    0.000
    i5i22|t1         -0.388    0.072   -5.400    0.000
    i7i10|t1         -0.415    0.079   -5.230    0.000
    i12i20|t1         1.170    0.110   10.667    0.000
    i12i27|t1         0.868    0.083   10.406    0.000
    i14i22|t1        -0.273    0.073   -3.753    0.000
    i17i19|t1        -0.118    0.060   -1.949    0.051
    i17i24|t1         0.384    0.065    5.948    0.000
    i20i27|t1        -0.252    0.074   -3.381    0.001
    i3i9|t1          -1.054    0.097  -10.882    0.000
    i3i13|t1          0.685    0.072    9.570    0.000
    i4i23|t1         -0.525    0.091   -5.740    0.000
    i4i28|t1         -0.627    0.099   -6.344    0.000
    i9i13|t1          1.727    0.166   10.373    0.000
    i15i18|t1         1.234    0.125    9.836    0.000
    i23i28|t1        -0.091    0.058   -1.555    0.120
    i5i14|t1         -0.184    0.066   -2.762    0.006
    i19i24|t1         0.457    0.065    7.045    0.000
    i29i30|t1         0.588    0.113    5.213    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
    Trait1            1.000                           
    Trait2            1.000                           
    Trait3            1.000                           
    Trait4            1.000                           
    Trait5            1.000                           
   .i1i2    (e1e2)    1.713    0.310    5.524    0.000
   .i1i25   (e125)    1.113    0.223    4.984    0.000
   .i2i25   (e225)    0.826    0.398    2.074    0.038
   .i3i9    (e3e9)    1.921    0.235    8.185    0.000
   .i3i13   (e313)    1.824    0.219    8.329    0.000
   .i4i23   (e423)    1.424    0.238    5.987    0.000
   .i4i28   (e428)    2.243    0.360    6.235    0.000
   .i5i14   (e514)    1.874    0.212    8.819    0.000
   .i5i22   (e522)    1.427    0.194    7.366    0.000
   .i6i29   (e629)    1.708    0.341    5.015    0.000
   .i6i30   (e630)    1.859    0.348    5.346    0.000
   .i7i10   (e710)    1.673    0.196    8.557    0.000
   .i7i26   (e726)    1.678    0.193    8.709    0.000
   .i8i11   (e811)    1.189    0.176    6.750    0.000
   .i8i21   (e821)    2.212    0.387    5.717    0.000
   .i9i13   (e913)    1.745    0.374    4.668    0.000
   .i10i26  (e102)    1.351    0.317    4.265    0.000
   .i11i21  (e112)    1.401    0.454    3.087    0.002
   .i12i20 (e1220)    2.639    0.358    7.370    0.000
   .i12i27 (e1227)    2.147    0.277    7.747    0.000
   .i14i22  (e142)    1.301    0.318    4.090    0.000
   .i15i16 (e1516)    2.757    0.422    6.528    0.000
   .i15i18 (e1518)    2.235    0.349    6.396    0.000
   .i16i18  (e161)    2.993    0.707    4.234    0.000
   .i17i19  (e171)    1.896    0.182   10.402    0.000
   .i17i24  (e172)    1.799    0.170   10.605    0.000
   .i19i24  (e192)    1.695    0.301    5.632    0.000
   .i20i27  (e202)    2.786    0.572    4.866    0.000
   .i23i28  (e232)    1.667    0.497    3.357    0.001
   .i29i30  (e293)    1.567    0.559    2.801    0.005

Scales y*:
                   Estimate  Std.Err  z-value  P(>|z|)
    i1i2              0.478                           
    i1i25             0.474                           
    i6i29             0.477                           
    i6i30             0.471                           
    i7i26             0.546                           
    i8i11             0.513                           
    i8i21             0.516                           
    i10i26            0.573                           
    i11i21            0.800                           
    i15i16            0.556                           
    i16i18            0.539                           
    i2i25             0.472                           
    i5i22             0.580                           
    i7i10             0.548                           
    i12i20            0.579                           
    i12i27            0.682                           
    i14i22            0.572                           
    i17i19            0.661                           
    i17i24            0.660                           
    i20i27            0.570                           
    i3i9              0.637                           
    i3i13             0.719                           
    i4i23             0.542                           
    i4i28             0.538                           
    i9i13             0.623                           
    i15i18            0.585                           
    i23i28            0.690                           
    i5i14             0.602                           
    i19i24            0.766                           
    i29i30            0.497                           

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|)
    L11               1.455    0.277    5.243    0.000
    L13              -0.502    0.180   -2.792    0.005
    L18              -0.385    0.150   -2.561    0.010
    L19              -0.625    0.097   -6.458    0.000
    L21               1.051    0.231    4.552    0.000
    L22              -1.233    0.144   -8.562    0.000
    L23              -0.754    0.246   -3.069    0.002
    L24              -0.718    0.102   -7.049    0.000
    L25              -1.716    0.251   -6.838    0.000
    L26               1.188    0.150    7.938    0.000
    L27              -0.444    0.107   -4.136    0.000
    L28              -0.095    0.253   -0.376    0.707
    L30              -1.366    0.220   -6.205    0.000

Constraints:
                                               |Slack|
    L11_L21 - (L11-L21)                          0.000
    L12_L27 - (L12-L27)                          0.000
    L3_L13 - (L3-L13)                            0.000
    L23_L28 - (L23-L28)                          0.000
    L19_L24 - (L19-L24)                          0.000
    L_2 - (-L2)                                  0.000
    L_9 - (-L9)                                  0.000
    L_10 - (-L10)                                0.000
    L_11 - (-L11)                                0.000
    L_13 - (-L13)                                0.000
    L_14 - (-L14)                                0.000
    L_16 - (-L16)                                0.000
    L_18 - (-L18)                                0.000
    L_19 - (-L19)                                0.000
    L_20 - (-L20)                                0.000
    L_21 - (-L21)                                0.000
    L_22 - (-L22)                                0.000
    L_23 - (-L23)                                0.000
    L_24 - (-L24)                                0.000
    L_25 - (-L25)                                0.000
    L_26 - (-L26)                                0.000
    L_27 - (-L27)                                0.000
    L_28 - (-L28)                                0.000
    L_29 - (-L29)                                0.000
    L_30 - (-L30)                                0.000
    e1e2 - (e1-e_2)                              0.000
    e3e9 - (e3-e_9)                              0.000
    e1e25 - (e1+e25)                             0.000
    e3e13 - (e3+e13)                             0.000
    e4e28 - (e4+e28)                             0.000
    e5e22 - (e5+e22)                             0.000
    e6e30 - (e6+e30)                             0.000
    e7e26 - (e7+e26)                             0.000
    e8e21 - (e8+e21)                             0.000
    e4e23 - (e4-e_23)                            0.000
    e5e14 - (e5-e_14)                            0.000
    e6e29 - (e6-e_29)                            0.000
    e7e10 - (e7-e_10)                            0.000
    e8e11 - (e8-e_11)                            0.000
    e12e27 - (e12+e27)                           0.000
    e15e18 - (e15+e18)                           0.000
    e17e24 - (e17+e24)                           0.000
    e2e25 - (-e_2+e25)                           0.000
    e9e13 - (-e_9+e13)                           0.000
    e12e20 - (e12-e_20)                          0.000
    e15e16 - (e15-e_16)                          0.000
    e17e19 - (e17-e_19)                          0.000
    e10e26 - (-e_10+e26)                         0.000
    e11e21 - (-e_11+e21)                         0.000
    e14e22 - (-e_14+e22)                         0.000
    e16e18 - (-e_16+e18)                         0.000
    e19e24 - (-e_19+e24)                         0.000
    e20e27 - (-e_20+e27)                         0.000
    e23e28 - (-e_23+e28)                         0.000
    e29e30 - (-e_29+e30)                         0.000
    e3 - (1)                                     0.000
    e15 - (1)                                    0.000
    e8 - (1)                                     0.000
    e5 - (1)                                     0.000
    e17 - (1)                                    0.000
    e1 - (1)                                     0.000
    e7 - (1)                                     0.000
    e12 - (1)                                    0.000
    e4 - (1)                                     0.000
    e6 - (1)                                     0.000
scores <- lavaan::lavPredict(results_lav_b)

# Recoded data
# Mplus
results_mplus_brec <- fit.mplus(blocks, itf, model = 'irt', input_path = 'myFC_model_brecu.inp',
                                output_path = "myFC_model_brecu.out", data_path = 
                                  "myDataFile_rec.dat", data_full = TRUE)
unlist(results_mplus_brec$fit)
          srmr            cfi            tli          chisq             df         pvalue 
        0.0400         1.0000         1.0000       345.9970       375.0000         0.8561 
       blchisq           bldf       blpvalue          rmsea        rmsea_p rmsea.ci.lower 
     4370.6990       435.0000         0.0000         0.0000         1.0000         0.0000 
rmsea.ci.upper 
        0.0070 
#lavaan
results_lav_brec <- fit.lavaan(blocks, itf, model = 'irt', data = data_recoded)
results_lav_brec
lavaan 0.6-11 ended normally after 143 iterations

  Estimator                                        ULS
  Optimization method                           NLMINB
  Number of model parameters                       155
  Number of equality constraints                    78
  Row rank of the constraints matrix                65
                                                      
  Number of observations                          1000
                                                      
Model Test User Model:
                                              Standard      Robust
  Test Statistic                               787.985     345.776
  Degrees of freedom                               375         375
  P-value (Unknown)                                 NA       0.858
  Scaling correction factor                                  3.576
  Shift parameter                                          125.438
       simple second-order correction                             
lavaan::fitmeasures(results_lav_brec)[c('rmsea.scaled', 'rmsea.ci.lower.scaled',
                                        'rmsea.ci.upper.scaled','rmsea.pvalue.scaled',
                                        'cfi.scaled')]
         rmsea.scaled rmsea.ci.lower.scaled rmsea.ci.upper.scaled   rmsea.pvalue.scaled 
          0.000000000           0.000000000           0.006647839           1.000000000 
           cfi.scaled 
          1.000000000 
scores <- lavaan::lavPredict(results_lav_brec)

In the case of blocks, we must correct the fit indices, as there are redundancies among the thresholds and tetrachoric correlations. The redundancies can be determined with the function redundancies. We can also directly correct the fit with fit.correct:

#save fit measures
tmp <- lavaan::fitmeasures(results_lav_brec)
fit.correct(1000, blocks, tmp['chisq.scaled'], tmp['df.scaled'], tmp['baseline.chisq.scaled'],
            tmp['baseline.df.scaled'])
df.df_corr      rmsea        cfi 
       365          0          1 

Block models - linked

Now, we consider linked block designs (Jansen & Schulze, 2023b). The simplest way to construct these designs is to take the original unlinked design and link all blocks together (rank of the design \(r_D=N-1\), where \(N\) is the number of items; Jansen & Schulze, 2023b). The original blocks are:

blocks
      [,1] [,2] [,3]
 [1,]   25    1    2
 [2,]   30   29    6
 [3,]   16   18   15
 [4,]    5   14   22
 [5,]   24   19   17
 [6,]   13    9    3
 [7,]   26   10    7
 [8,]    8   21   11
 [9,]   20   27   12
[10,]    4   28   23

To get a linked design we need extra blocks. The number can be determined with count.xblocks

count.xblocks(blocks)
[1] 5
Hence, we need five extra triplets in this case. We add for example the following linking blocks:
blocks_extra <- matrix(c(25,30,16,5,25,13,26,8,20,4,29,19,1,3,21), ncol = 3, byrow = TRUE)
blocks_con <- rbind(blocks, blocks_extra)

We can determine the rank of the design matrix with rankA:

rankA(blocks_con)
[1] 29

The rank is 30-1=29. The function metablock also gives a overview over which blocks are linked.

metablock(blocks_con)
[[1]]
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
[29] 29 30

There is only one meta block, therefore all items are linked. A counter example would be if we would not add block 4:

blocks_extra <- matrix(c(25,30,16,5,25,13,26,8,20,1,3,21), ncol = 3, byrow = TRUE)
blocks_con <- rbind(blocks, blocks_extra)

Then the rank is

rankA(blocks_con)
[1] 27

The rank is 30-1=29\(\neq\) 27. And we have three “meta” blocks.

metablock(blocks_con)
[[1]]
[1] 17 19 24

[[2]]
[1]  4 23 28

[[3]]
 [1]  1  2  3  5  6  7  8  9 10 11 12 13 14 15 16 18 20 21 22 25 26 27 29 30

Instead of manually constructing the blocks, ThurMod enables the user to get extra blocks via a function get.xblocks. We have to define if the blocks should be multidimensional (else multidim = FALSE), and if items should not be considered (e.g. because they are negatively keyed):

blocks_extra <- get.xblocks(blocks, itf, multidim = TRUE, item_not = NULL)
blocks_con <- rbind(blocks, blocks_extra)
blocks_con
      [,1] [,2] [,3]
 [1,]   25    1    2
 [2,]   30   29    6
 [3,]   16   18   15
 [4,]    5   14   22
 [5,]   24   19   17
 [6,]   13    9    3
 [7,]   26   10    7
 [8,]    8   21   11
 [9,]   20   27   12
[10,]    4   28   23
[11,]    2   29   18
[12,]   14   17   13
[13,]   26    8   20
[14,]    4   30    3
[15,]   10   22    8
blocks_con_sorted <- blocksort(blocks_con)

If we take the blocks as is, we need to recode the data set:

# Get names of binary indicators that have non-ascending names
tmp <- which(pair.combn(blocks_con)[,1] > pair.combn(blocks_con)[,2])

# get names
pair_names_b <- i.name(blocks_con)
pair_names_ori <- i.name(1:nitem)

# Rename
pair_names <- i.name(1:nitem)
if(length(tmp)!=0){
  tmp1 <- pair_names_b[tmp]
  tmp2 <- sub('^i.+i','i', tmp1)
  tmp3 <- tmp1
  for(j in 1:length(tmp)){
    tmp3 <- paste0(tmp2[j], sub(paste0(tmp2[j],'$'), '', tmp1[j]))
    pair_names[which(pair_names %in% tmp3)] <- pair_names_b[tmp[j]]
  }
}

tmp <- which(!names(data) %in% pair_names)
# Clone data
data_recoded <- data
# Recode and rename
data_recoded[,tmp] <- abs(data[,tmp]-1)
names(data_recoded) <- pair_names

# Save data
write.table(data_recoded, 'myDataFile_rec.dat', quote = FALSE, sep = " ", col.names = FALSE,
            row.names = FALSE)

The estimation of linked designs is done equivalent via

# Blocks_sorted
# Mplus
results_mplus_b <- fit.mplus(blocks_con_sorted,itf,model='irt',input_path='myFC_model_b_con.inp',
                             output_path="myFC_model_b_con.out",data_path="myDataFile.dat",
                             data_full = TRUE)
unlist(results_mplus_b$fit)
          srmr            cfi            tli          chisq             df         pvalue 
        0.0460         0.9990         0.9990       926.5150       921.0000         0.4428 
       blchisq           bldf       blpvalue          rmsea        rmsea_p rmsea.ci.lower 
     9586.1610       990.0000         0.0000         0.0020         1.0000         0.0000 
rmsea.ci.upper 
        0.0090 
#lavaan
results_lav_b <- fit.lavaan(blocks_con_sorted, itf, model = 'irt', data = data)
results_lav_b
lavaan 0.6-11 ended normally after 148 iterations

  Estimator                                        ULS
  Optimization method                           NLMINB
  Number of model parameters                       294
  Number of equality constraints                   191
  Row rank of the constraints matrix               180
                                                      
  Number of observations                          1000
                                                      
Model Test User Model:
                                              Standard      Robust
  Test Statistic                              2309.927     926.019
  Degrees of freedom                               921         921
  P-value (Unknown)                                 NA       0.447
  Scaling correction factor                                  4.658
  Shift parameter                                          430.112
       simple second-order correction                             
lavaan::fitmeasures(results_lav_b)[c('rmsea.scaled','rmsea.ci.lower.scaled','rmsea.ci.upper.scaled',
                                     'rmsea.pvalue.scaled', 'cfi.scaled')]
         rmsea.scaled rmsea.ci.lower.scaled rmsea.ci.upper.scaled   rmsea.pvalue.scaled 
          0.002335691           0.000000000           0.009340939           1.000000000 
           cfi.scaled 
          0.999415466 
scores <- lavaan::lavPredict(results_lav_b)

# Recoded data
# Mplus
results_mplus_brec <- fit.mplus(blocks_con, itf, model = 'irt', input_path = 'myFC_model_brec.inp',
                                output_path = "myFC_model_brec.out",data_path =
                                  "myDataFile_rec.dat",data_full = TRUE, byblock = FALSE)
unlist(results_mplus_brec$fit)
          srmr            cfi            tli          chisq             df         pvalue 
        0.0460         0.9990         0.9990       926.5160       921.0000         0.4428 
       blchisq           bldf       blpvalue          rmsea        rmsea_p rmsea.ci.lower 
     9586.1610       990.0000         0.0000         0.0020         1.0000         0.0000 
rmsea.ci.upper 
        0.0090 
#lavaan
results_lav_brec <- fit.lavaan(blocks_con, itf, model = 'irt', data = data_recoded)
results_lav_brec
lavaan 0.6-11 ended normally after 137 iterations

  Estimator                                        ULS
  Optimization method                           NLMINB
  Number of model parameters                       294
  Number of equality constraints                   190
  Row rank of the constraints matrix               180
                                                      
  Number of observations                          1000
                                                      
Model Test User Model:
                                              Standard      Robust
  Test Statistic                              2309.927     926.019
  Degrees of freedom                               921         921
  P-value (Unknown)                                 NA       0.447
  Scaling correction factor                                  4.658
  Shift parameter                                          430.112
       simple second-order correction                             
lavaan::fitmeasures(results_lav_brec)[c('rmsea.scaled', 'rmsea.ci.lower.scaled', 
                                        'rmsea.ci.upper.scaled', 'rmsea.pvalue.scaled',
                                        'cfi.scaled')]
         rmsea.scaled rmsea.ci.lower.scaled rmsea.ci.upper.scaled   rmsea.pvalue.scaled 
          0.002335673           0.000000000           0.009340934           1.000000000 
           cfi.scaled 
          0.999415475 
scores <- lavaan::lavPredict(results_lav_brec)

Here, we also must correct the fit indices, as there are redundancies among the thresholds and tetrachoric correlations. Again, the redundancies can be determined with the function redundancies. We directly correct the fit with fit.correct:

#save fit measures
tmp <- lavaan::fitmeasures(results_lav_brec)
fit.correct(1000, blocks_con, tmp['chisq.scaled'], tmp['df.scaled'], tmp['baseline.chisq.scaled'],
            tmp['baseline.df.scaled'])
  df.df_corr        rmsea          cfi 
9.060000e+02 4.703037e-03 9.976687e-01 

Simulating only relevant data

The estimation of a full model will most likely seldom be of interest. More likely, (linked) block designs are of interest. Especially for simulations with many items (more than 50), many binary variables are simulated and saved. While the simulation of all items is important (see Jansen & Schulze, 2023b), saving all items is not. sim.data can also return only items that are of relevance. For that we need to specify the argument variables. Assume we have the linked block design we specified before. We only have to give the item names of the sorted blocks and define it as the variables argument:

#Get the relevant names
tmp_names <- i.name(blocks_con_sorted)

#same example as before
set.seed(1234)
data <- sim.data(nfactor = nfactor, nitem = nitem, nperson = nperson, itf = itf, varcov = varcov,
                 lmu = lmu, loadings = loadings, variables = tmp_names)
#save the file
write.table(data, 'myDataFile.dat', quote = FALSE, sep = " ", col.names = FALSE,
            row.names = FALSE)

For the Mplus analysis, the argument data_full must then be set to FALSE.

Taken together, we have demonstrated the basic steps on how to use the functions given by ThurMod. Please report any bugs. If you miss functions that could be helpful for the analysis of Thurstonian forced-choice data, please let me know.

References

Brown, A., & Maydeu-Olivares, A. (2011). Item response modeling of forced-choice questionnaires. Educational and Psychological Measurement, 71(3), 460-502. https://doi.org/10.1177/0013164410375112

Jansen, M. T., & Schulze, R. (2023a). Item scaling of social desirability using conjoint measurement: A comparison of ratings, paired comparisons, and rankings. Manuscript in preparation.

Jansen, M. T., & Schulze, R. (2023b). The Thurstonian linked bock design: Improving Thurstonian modeling for paired comparison and ranking data. Manuscript submitted.

Maydeu-Olivares, A., & Böckenholt, U. (2005). Structural equation modeling of paired-comparison and ranking data. Psychological Methods, 10(3), 285–304. https://doi.org/10.1037/1082-989X.10.3.285

Maydeu-Olivares, A., & Brown, A. (2010). Item response modeling of paired comparison and ranking data. Multivariate Behavioural Research, 45(6), 935-974. https://doi.org/10.1080/00273171.2010.531231