Abstract
Political scientists often need to evaluate whether samples are comparable, for example, when analysing different countries or time points or when comparing data collected using different methods. A necessary condition for meaningful cross-group comparisons is the establishment of measurement invariance. One of the most frequently used procedures for establishing measurement invariance is multigroup confirmatory factor analysis. This method has been criticised in the literature because it may suggest that a model fits the data even though the model contains serious misspecifications. We present an alternative method to test for measurement invariance using the detection of local misspecifications and illustrate its use on two data sets, collected using paper-and-pencil and web modes, that assess value priorities often analysed in political science.
Notes
Partial invariance is supported when the parameters of at least two indicators per construct (i.e., loadings for partial metric invariance and loadings plus intercepts for partial scalar invariance) are equal across groups. According to Byrne et al (1989) and Steenkamp and Baumgartner (1998), partial invariance is sufficient for meaningful cross-group comparison (but for different views see de Beuckelaer and Swinnen, 2011; Steinmetz, 2011).
For alternative, stricter criteria, see Meade et al (2008). Recently, Muthén and Asparouhov (2013) and Asparouhov and Muthén (2013) proposed to use Bayesian and alignment methods to test for measurement invariance and partly rely on different global fit measures. However, we do not discuss these methods here and the reader is referred to the aforementioned web notes and to van de Schoot et al (2013) as well as to Cieciuch et al (2014).
A Jrule version that adopts the output of the Lisrel program (Jöreskog and Sörbom, 2001) was developed by van der Veld (available upon request from van der Veld, email: W.vanderVeld@socsci.ru.nl); a Jrule version that adopts the output of the Mplus program (Muthén and Muthén, 1998–2012) was developed by Oberski (2009).
Clearer guidelines as to which deviations may be tolerated should rely on future simulation studies. Recently, Oberski (2014) provided several guidelines to evaluate the sensitivity of parameters of interest to measurement (non)invariance (see also Meuleman, 2012, for a method to evaluate the sensitivity of latent means to scalar non-invariance).
However, see Cohen (1988, 1992) who suggests a cut-off of 0.8.
To the best of our knowledge, the literature is not clear about how many cross-loadings may be tolerated. However, the inclusion of cross-loadings for some of the groups is a threat to the assumption that the measurement operates similarly across groups.
The data, as well as further details about the sample and data collection, are available from the first author upon request.
Identifying each latent variable either by fixing its variance to 1 or by using one of the item loadings as a reference did not have any impact on the results of the invariance test.
Detailed output is available from the first author upon request.
Indeed, only two items served as indicators for this value, so we could not rely on partial scalar invariance.
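The partial invariance criterion stated in the notes above (equal parameters for at least two indicators per construct) can be illustrated with a minimal Python sketch. The function name, the construct labels, and the True/False flags are hypothetical and serve only as an illustration; in practice the flags would come from the invariance test itself. The sketch also shows why a construct with only two indicators cannot fall back on partial invariance: both indicators must be invariant, which amounts to full invariance.

```python
from typing import Dict, List


def supports_partial_invariance(
    invariant_flags: Dict[str, List[bool]], min_invariant: int = 2
) -> Dict[str, bool]:
    """For each construct, report whether at least `min_invariant`
    indicators have parameters judged equal across groups."""
    return {
        construct: sum(flags) >= min_invariant
        for construct, flags in invariant_flags.items()
    }


# Hypothetical input: True means the indicator's parameters (the loading for
# metric invariance, the loading plus intercept for scalar invariance) were
# judged equal across groups.
example_flags = {
    "construct_a": [True, True, False],  # two of three indicators invariant
    "construct_b": [True, False],        # only one of two indicators invariant
}

result = supports_partial_invariance(example_flags)
print(result)  # {'construct_a': True, 'construct_b': False}
```

The threshold of two is left as a parameter because, as the notes indicate, some authors hold different views on whether partial invariance is sufficient.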
References
Asparouhov, T. and Muthén, B.O. (2013) Multiple group factor analysis alignment. Mplus Web Note No. 18, Version 3, available at http://www.statmodel.com/examples/webnote.shtml, accessed 23 August 2013.
Bentler, P.M. and Bonett, D.G. (1980) ‘Significance tests and goodness of fit in the analysis of covariance structures’, Psychological Bulletin 88 (3): 588–606.
Bollen, K.A. (1989) Structural Equations with Latent Variables, New York: Wiley.
Brown, T.A. (2006) Confirmatory Factor Analysis for Applied Research, New York: Guilford Press.
Browne, M.W. and Cudeck, R. (1993) ‘Alternative Ways of Assessing Model Fit’, in K.A. Bollen and J.S. Long (eds.) Testing Structural Equation Models, Newbury Park, CA: Sage, pp. 136–162.
Byrne, B.M. (2004) ‘Testing for multigroup invariance using AMOS graphics: A road less traveled’, Structural Equation Modeling 11 (2): 272–300.
Byrne, B.M., Shavelson, R.J. and Muthén, B.O. (1989) ‘Testing for the equivalence of factor covariance and mean structures – The issue of partial measurement invariance’, Psychological Bulletin 105 (3): 456–466.
Byrne, B.M. and Stewart, S.M. (2006) ‘The MACS approach to testing for multigroup invariance of a second-order structure: A walk through the process’, Structural Equation Modeling 13 (2): 287–321.
Caprara, G.V., Schwartz, S.H., Capanna, C., Vecchione, M. and Barbaranelli, C. (2006) ‘Personality and politics: Values, traits, and political choice’, Political Psychology 27 (1): 1–28.
Chen, F.F. (2007) ‘Sensitivity of goodness of fit indexes to lack of measurement invariance’, Structural Equation Modeling 14 (3): 464–504.
Chen, F.F. (2008) ‘What happens if we compare chopsticks with forks? The impact of making inappropriate comparisons in cross-cultural research’, Journal of Personality and Social Psychology 95 (5): 1005–1018.
Cieciuch, J. and Davidov, E. (2012) ‘A comparison of the invariance properties of the PVQ-40 and the PVQ-21 to measure human values across German and Polish samples’, Survey Research Methods 6 (1): 37–48.
Cieciuch, J., Davidov, E., Schmidt, P., Algesheimer, R. and Schwartz, S.H. (2014) ‘Comparing results of an exact versus an approximate (Bayesian) measurement invariance test: A cross-country illustration with a scale to measure 19 human values’, Frontiers in Psychology 5: 982, doi:10.3389/fpsyg.2014.00982.
Cieciuch, J. and Schwartz, S.H. (2012) ‘The number of distinct basic values and their structure assessed by PVQ-40’, Journal of Personality Assessment 94 (3): 321–328.
Cieciuch, J., Schwartz, S.H. and Vecchione, M. (2013) ‘Applying the refined values theory to past data: What can researchers gain?’ Journal of Cross-Cultural Psychology 44 (8): 1215–1234.
Cohen, J. (1988) Statistical Power Analysis for the Behavioral Sciences, 2nd ed. New York: Academic Press.
Cohen, J. (1992) ‘A power primer’, Psychological Bulletin 112 (1): 155–159.
Davidov, E. and Depner, F. (2009) ‘Testing for measurement equivalence of human values across online and paper-and-pencil surveys’, Quality & Quantity 45 (2): 375–390.
Davidov, E., Meuleman, B., Cieciuch, J., Schmidt, P. and Billiet, J. (2014) ‘Measurement equivalence in cross-national research’, Annual Review of Sociology 40: 55–75.
de Beuckelaer, A. and Swinnen, G. (2011) ‘Biased Latent Variable Mean Comparisons Due to Measurement Noninvariance: A Simulation Study’, in E. Davidov, P. Schmidt and J. Billiet (eds.) Cross-Cultural Analysis: Methods and Applications, New York: Routledge, pp. 117–147.
De Leeuw, E.D. (2005) ‘To mix or not to mix data collection modes in surveys’, Journal of Official Statistics 21 (2): 233–255.
Dillman, D.A., Smyth, J.D. and Christian, L.M. (2009) Internet, Mail and Mixed-Mode Surveys. The Tailored Design Method, Hoboken, NJ: John Wiley & Sons.
Gordoni, G., Schmidt, P. and Gordoni, Y. (2012) ‘Measurement invariance across face-to-face and telephone modes: The case of minority-status collectivistic-oriented groups’, International Journal of Public Opinion Research 24 (2): 185–207.
Horn, J.L. and McArdle, J.J. (1992) ‘A practical and theoretical guide to measurement invariance in aging research’, Experimental Aging Research 18 (3): 117–144.
Hu, L.T. and Bentler, P.M. (1995) ‘Evaluating Model Fit’, in R. Hoyle (ed.) Structural Equation Modeling: Issues, Concepts, and Applications, Newbury Park, CA: Sage, pp. 76–99.
Hu, L.T. and Bentler, P.M. (1998) ‘Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification’, Psychological Methods 3 (4): 424–453.
Hu, L.T. and Bentler, P.M. (1999) ‘Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives’, Structural Equation Modeling 6 (1): 1–55.
Hu, L.T., Bentler, P.M. and Kano, Y. (1992) ‘Can test statistics in covariance structure-analysis be trusted?’ Psychological Bulletin 112 (2): 351–362.
Jowell, R., Roberts, C., Fitzgerald, R. and Eva, G. (2007) Measuring Attitudes Cross-Nationally. Lessons from the European Social Survey, London: Sage.
Jöreskog, K.G. (1969) ‘A general approach to confirmatory maximum likelihood factor analysis’, Psychometrika 34 (2): 183–202.
Jöreskog, K.G. (1971) ‘Simultaneous factor analysis in several populations’, Psychometrika 36 (4): 409–426.
Jöreskog, K.G. (1978) ‘Structural analysis of covariance and correlation matrices’, Psychometrika 43 (4): 443–477.
Jöreskog, K.G. and Sörbom, D. (2001) LISREL 8: User’s Reference Guide, Lincolnwood: Scientific Software International.
Kaplan, D. (1990) ‘Evaluating and modifying covariance structure models – A review and recommendation’, Multivariate Behavioral Research 25 (2): 137–155.
King, G., Murray, C.J.L., Salomon, J.A. and Tandon, A. (2004) ‘Enhancing the validity and cross-cultural comparability of measurement in survey research’, American Political Science Review 98 (1): 191–207.
Little, T.D., Slegers, D.W. and Card, N.A. (2006) ‘A non-arbitrary method of identifying and scaling latent variables in SEM and MACS models’, Structural Equation Modeling 13 (1): 59–72.
MacCallum, R.C., Browne, M.W. and Sugawara, H.M. (1996) ‘Power analysis and determination of sample size for covariance structure modeling’, Psychological Methods 1 (2): 130–149.
Marsh, H.W., Hau, K.T. and Grayson, D. (2005) ‘Goodness of Fit in Structural Equation Models’, in A. Maydeu-Olivares and J.J. McArdle (eds.) Contemporary Psychometrics, Mahwah, NJ: Lawrence Erlbaum Associates, pp. 275–340.
Marsh, H.W., Hau, K.T. and Wen, Z. (2004) ‘In search of golden rules: Comment on hypothesis-testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler’s (1999) findings’, Structural Equation Modeling 11 (3): 320–341.
Marsh, H.W., Lüdtke, O., Muthén, B.O., Asparouhov, T., Morin, A.J.S., Trautwein, U. and Nagengast, B. (2010) ‘A new look at the big five factor structure through exploratory structural equation modeling’, Psychological Assessment 22 (3): 471–491.
Meade, A.W., Johnson, E.C. and Braddy, P.W. (2008) ‘Power and sensitivity of alternative fit indices in tests of measurement invariance’, Journal of Applied Psychology 93 (3): 568–592.
Meredith, W. (1993) ‘Measurement invariance, factor analysis and factorial invariance’, Psychometrika 58 (4): 525–543.
Meuleman, B. (2012) ‘When are Intercept Differences Substantively Relevant in Measurement Invariance Testing?’ in S. Salzborn, E. Davidov and J. Reinecke (eds.) Methods, Theories, and Empirical Applications in the Social Sciences: Festschrift for Peter Schmidt, Heidelberg: Springer VS, pp. 97–104.
Millsap, R.E. (2011) Statistical Approaches to Measurement Invariance, New York: Routledge.
Millsap, R.E. and Everson, H.T. (1993) ‘Methodology review: Statistical approaches for assessing measurement bias’, Applied Psychological Measurement 17 (4): 297–334.
Muthén, B.O. and Asparouhov, T. (2013) BSEM measurement invariance analysis. Mplus Web Note No. 17, available at http://www.statmodel.com/examples/webnote.shtml, accessed 11 January 2013.
Muthén, L.K. and Muthén, B.O. (1998–2012) Mplus User’s Guide, 7th edn, Los Angeles, CA: Muthén & Muthén.
Oberski, D.L. (2009) Jrule for Mplus version 0.91 (beta) [Computer software], available at https://github.com/daob/JruleMplus/wiki, accessed 1 June 2015.
Oberski, D.L. (2012) ‘Comparability of Survey Measurements’, in L. Gideon (ed.) Handbook of Survey Methodology for the Social Sciences, New York: Springer, pp. 477–498.
Oberski, D.L. (2014) ‘Evaluating sensitivity of parameters of interest to measurement invariance in latent variable models’, Political Analysis 22 (1): 45–60.
Piurko, Y., Schwartz, S.H. and Davidov, E. (2011) ‘Basic personal values and the meaning of left-right political orientations in 20 countries’, Political Psychology 32 (4): 537–561.
Podsakoff, P.M., MacKenzie, S.B. and Podsakoff, N.P. (2012) ‘Sources of method bias in social science research and recommendations on how to control for it’, Annual Review of Psychology 63: 539–569.
Revilla, M.A. and Saris, W.E. (2012) ‘A comparison of the quality of questions in a face-to-face and a web survey’, International Journal of Public Opinion Research 25 (2): 242–253.
Saris, W.E. and Gallhofer, I.N. (2007) Design, Evaluation, and Analysis of Survey Research, Hoboken, NJ: John Wiley & Sons.
Saris, W.E. and Hagenaars, J.A. (1997) ‘Mode Effects in the Standard Eurobarometer Questions’, in W.E. Saris and M. Kaase (eds.) Eurobarometer. Measurement Instruments for Opinions in Europe, Mannheim: ZUMA, pp. 87–100.
Saris, W.E., Satorra, A. and Sörbom, D. (1987) ‘The detection and correction of specification errors in structural equation models’, Sociological Methodology 17: 105–129.
Saris, W.E., Satorra, A. and van der Veld, W.M. (2009) ‘Testing structural equation models or detection of misspecifications?’ Structural Equation Modeling 16 (4): 561–582.
Schwartz, S.H. (1992) ‘Universals in the content and structure of values: Theoretical advances and empirical tests in 20 countries’, Advances in Experimental Social Psychology 25: 1–65, doi:10.1016/s0065-2601(08)60281-6.
Schwartz, S.H., Caprara, G.V., Vecchione, M., Bain, P., Bianchi, G., Caprara, M.G., Cieciuch, J., Kirmanoglu, H., Baslevent, C., Lönnqvist, J.-E., Mamali, C., Manzi, J., Pavlopoulos, V., Posnova, T., Schoen, H., Silvester, J., Tabernero, C., Torres, C., Verkasalo, M., Vondráková, E., Welzel, C. and Zaleski, Z. (2014) ‘Basic personal values underlie and give coherence to political values: A cross national study in 15 countries’, Political Behavior 36 (4): 899–930.
Schwartz, S.H., Cieciuch, J., Vecchione, M., Davidov, E., Fischer, R., Beierlein, C., Ramos, A., Verkasalo, M., Lönnqvist, J.-E., Demirutku, K., Dirilen-Gumus, O. and Konty, M. (2012) ‘Refining the theory of basic individual values’, Journal of Personality and Social Psychology 103 (4): 663–688.
Schwartz, S.H., Melech, G., Lehmann, A., Burgess, S., Harris, M. and Owens, V. (2001) ‘Extending the cross-cultural validity of the theory of basic human values with a different method of measurement’, Journal of Cross-Cultural Psychology 32 (5): 519–542.
Schmitt, N. and Kuljanin, G. (2008) ‘Measurement invariance: Review of practice and implications’, Human Resource Management Review 18 (4): 210–222.
Sörbom, D. (1989) ‘Model modification’, Psychometrika 54 (3): 371–384.
Steenkamp, J.-B.E.M. and Baumgartner, H. (1998) ‘Assessing measurement invariance in cross-national consumer research’, Journal of Consumer Research 25 (1): 78–90.
Steinmetz, H. (2011) ‘Estimation and Comparison of Latent Means across Cultures’, in E. Davidov, P. Schmidt and J. Billiet (eds.) Cross-Cultural Analysis: Methods and Applications, New York: Routledge, pp. 85–116.
van de Schoot, R., Kluytmans, A., Tummers, L., Lugtig, P., Hox, J. and Muthén, B. (2013) ‘Facing off with Scylla and Charybdis: A comparison of scalar, partial, and the novel possibility of approximate measurement invariance’, Frontiers in Psychology 4: 770, doi:10.3389/fpsyg.2013.00770.
van der Veld, W.M. and Saris, W.E. (2011) ‘Causes of Generalized Social Trust’, in E. Davidov, P. Schmidt and J. Billiet (eds.) Cross-Cultural Analysis: Methods and Applications, New York: Routledge, pp. 207–247.
van der Veld, W.M., Saris, W.E. and Satorra, A. (2008) JRule 2.0: User manual. Unpublished document.
Van de Vijver, F.J.R. and Poortinga, Y.H. (1997) ‘Towards an integrated analysis of bias in cross-cultural assessment’, European Journal of Psychological Assessment 13 (1): 29–37.
Vandenberg, R.J. (2002) ‘Toward a further understanding of and improvement in measurement invariance methods and procedures’, Organizational Research Methods 5 (2): 139–158.
Vandenberg, R.J. and Lance, C.E. (2000) ‘A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research’, Organizational Research Methods 3 (1): 4–70.
Vecchione, M., Caprara, G.V., Schwartz, S.H., Cieciuch, J., Schoen, H., Silvester, J., Bain, P., Bianchi, G., Kirmanoglu, H., Baslevent, C., Mamali, C., Manzi, J., Pavlopoulos, V., Posnova, T., Torres, C., Verkasalo, M., Lönnqvist, J.-E., Vondráková, E. and Alessandri, G. (2015) ‘Personal values and political activism: A cross-national study’, British Journal of Psychology 106 (1): 84–106.
Acknowledgements
The work of the first, second, and fourth authors was supported by the University Research Priority Program (URPP) ‘Social Networks’, University of Zürich. The work of the third author was supported by the Netherlands Organization for Scientific Research (NWO) [Vici grant 453-10-002]. The second author would like to thank the EUROLAB, GESIS, Cologne, for their hospitality during work on this article. The authors would also like to thank Lisa Trierweiler for the English proof of the manuscript.
Cite this article
Cieciuch, J., Davidov, E., Oberski, D. et al. Testing for measurement invariance by detecting local misspecification and an illustration across online and paper-and-pencil samples. Eur Polit Sci 14, 521–538 (2015). https://doi.org/10.1057/eps.2015.64
DOI: https://doi.org/10.1057/eps.2015.64