A confirmatory factor analysis of the attitude towards mathematics scale using multiply imputed datasets

Article history: Received 5 November 2016; received in revised form 15 January 2017; accepted 15 January 2017

This study re-examined, via confirmatory factor analysis (CFA), the construct validity of the PISA 2012 attitude towards mathematics scale using multiply imputed datasets. Data were drawn from the Malaysian sample of PISA 2012; specifically, the sample comprised 4247 students from 135 Malaysian secondary schools. Prior to conducting the CFA, missing data resulting from the questionnaire rotation design were multiply imputed using the predictive mean matching (PMM) method via the R package Multivariate Imputation by Chained Equations (MICE). Subsequently, Mardia's multivariate normality test was performed using the R package MVN. Since the attitude towards mathematics scale was hypothesized to consist of ten constructs, a ten-factor congeneric CFA model was then built using the R package lavaan.survey, which accommodates multiply imputed data, survey weights and non-normality of data through its Maximum Likelihood Robust estimation. After several rounds of theory-guided model specification, several items with low loadings or cross-loadings, and one construct that was low in both Composite Reliability (CR) and Average Variance Extracted (AVE), were eliminated. Examination of various goodness-of-fit indices indicated that the final nine-factor congeneric CFA model provided a good fit to the data.


Introduction
Over the past few decades, attitudes towards mathematics have continually received great attention among researchers due to their significant relationships with students' achievement (Behr, 1973; Cheung, 1988; Kibrislioglu, 2015; Quaye, 2015; Tarim and Akdeniz, 2008). Attitudes towards mathematics are classified as one of the affective domains in mathematics (Palacios et al., 2014) and are usually measured through the integration of various constructs related to emotions, feelings and values (Aiken and Dreger, 1961; Aiken, 1972, 1974, 1979; Fennema and Sherman, 1976; Huang and Lin, 2015; Tapia and Marsh, 2004; OECD, 2012; Tezer and Ozcan, 2015). For example, pleasure and fear of mathematics are among the earliest constructs included in an instrument for measuring attitudes towards mathematics, introduced by Aiken and Dreger (1961). A decade later, Aiken proposed new constructs for attitudes towards mathematics, namely enjoyment of mathematics (Aiken, 1972) and value of mathematics (Aiken, 1974). By 1979, the constructs for attitudes towards mathematics had grown to include more affective domains, such as enjoyment of mathematics, mathematical motivation, value-utility of mathematics and fear of mathematics (Aiken, 1979).
The work of Aiken motivated other scholars to develop further instruments for measuring attitudes towards mathematics, such as the Fennema-Sherman Mathematics Attitude Scales (FSMAS) (Fennema and Sherman, 1976) and the Attitude toward Mathematics Inventory (ATMI) (Tapia and Marsh, 2004). The FSMAS focuses on gender differences in attitudes towards mathematics and their effects on achievement. It consists of nine scales, including attitude toward success in mathematics, mathematics as a male domain, mother/father, teacher, confidence in learning mathematics, mathematics anxiety, effectance motivation in mathematics and mathematics usefulness (Fennema and Sherman, 1976). Besides the FSMAS, the ATMI is also among the most recognized instruments for measuring attitudes towards mathematics. Made up of 49 items, the ATMI comprises six constructs, namely confidence/self-concept, anxiety, utility-value of mathematics, enjoyment of mathematics, motivation, and parents' and teachers' expectations (Tapia and Marsh, 2004).
Unlike previous studies, which predominantly focus on attitudes towards mathematics scales for primary and secondary school students, Huang and Lin (2015) measure attitudes towards calculus among university students and introduce the Attitude Toward Calculus Inventory (ATCI). The constructs of the ATCI include self-confidence, motivation, value and enjoyment. This variation in the constructs used for measuring attitudes towards mathematics, however, has sparked contentious debate among earlier scholars regarding the validity of the constructs, especially when they are used in a context different from the one in which they were developed (Abdul et al., 2013; OECD, 2012; Palacios et al., 2014).
In response, earlier scholars have adopted various strategies to establish the construct validity of attitudes towards mathematics scales in different contexts (Abdul et al., 2013; OECD, 2012; Palacios et al., 2014). For example, Abdul et al. (2013) tested the construct validity of the ATMI instrument, which was developed in the United States of America, in South Australia using CFA. Their results showed that the ATMI scale is a reliable tool for use in the South Australian context. Similarly, the validity and reliability of the ATMI in the context of Turkish culture have also been demonstrated (Tabuk and Haciömeroğlu, 2015).
In PISA 2012 specifically, the internal consistencies of the constructs of the attitude towards mathematics scale, which was developed via Item Response Theory (IRT) scaling of Likert-type items, are assessed using Cronbach's alpha. In addition, since various countries participate in PISA 2012, correlations between the constructs are computed to support the cross-country validity of the constructs (OECD, 2012). Despite the various efforts taken by the OECD to ensure the validity of the attitudes towards mathematics scale in PISA 2012, little attention has been paid to the possible effect of employing multiply imputed datasets on construct validity and reliability.
In this paper, we therefore re-examine the construct validity and reliability of the PISA 2012 attitudes towards mathematics scale using multiply imputed datasets. We use CFA, a method commonly used to test the hypothesized theoretical relationship between observed items and their underlying latent constructs. Specifically, CFA is used in this study to test the factorial structure of the constructs of the attitudes towards mathematics scale.

Methodology
Data for this study were drawn from the Malaysian sample of PISA 2012. Specifically, we used a sample of 4247 students from 135 Malaysian national secondary schools. Unlike the scale development of attitudes towards mathematics in PISA 2012, which used samples from all types of schools, this study focused on a subsample consisting of students from Malaysian national secondary schools only. This choice was made mainly because Malaysian national secondary schools differ markedly from other types of schools, such as residential, technical and private schools, so that statistical analysis pooling all school types might yield misleading results (OECD, 2012).
As mentioned earlier, the PISA 2012 attitudes towards mathematics scale encompassed ten constructs. The items for each construct are presented in Table 1.
All items for each construct were scored on either a 4-point or a 5-point scale, except the ST48 items, which used a "Forced Choice" format in which students had to choose between two possible answers.
Each item of the PISA 2012 attitudes towards mathematics scale is subject to approximately 33% missing data due to the student questionnaire rotation design (OECD, 2012), in which only two-thirds of the students answered each item. Thus, prior to conducting the CFA, we handled the missing data via multiple imputation. The imputation was conducted using the R package MICE, with which five imputed datasets were generated. We employed predictive mean matching (PMM) because it is well suited to imputing ordinal data and retains the original distribution of the items. After imputation, we conducted Mardia's multivariate normality test using the R package MVN. The CFA was then performed using the R package lavaan.survey, which can take into account multiply imputed data, sampling weights and non-normality of data through its Maximum Likelihood Robust (MLR) estimation (Van Buuren and Groothuis-Oudshoorn, 2011).
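The idea behind PMM can be illustrated with a minimal sketch. It is not the MICE implementation: it uses a single fully observed predictor, ordinary least squares without the parameter draws that full PMM adds, and hypothetical data, and the function names and the donor count `k` are assumptions made for illustration. What it does show is why PMM keeps ordinal items on their original scale: every imputed value is copied from an observed donor.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (single predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(xs, ys, k, rng):
    """Fill None entries of ys with values borrowed from observed donors."""
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b = fit_line([x for x, _ in observed], [y for _, y in observed])
    # Each donor is (predicted value, observed value).
    donors = [(a + b * x, y) for x, y in observed]
    completed = []
    for x, y in zip(xs, ys):
        if y is not None:
            completed.append(y)
            continue
        target = a + b * x
        # Keep the k donors whose predictions are closest to the target,
        # then copy the observed value of one of them at random.
        nearest = sorted(donors, key=lambda d: abs(d[0] - target))[:k]
        completed.append(rng.choice(nearest)[1])
    return completed


def multiply_impute(xs, ys, m=5, k=5, seed=0):
    """Generate m completed copies of ys (five in the study)."""
    rng = random.Random(seed)
    return [pmm_impute(xs, ys, k, rng) for _ in range(m)]


# Hypothetical data: a 4-point item with one third of responses missing.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
ys = [1, 1, None, 2, 2, None, 3, 3, None, 4, 4, None]
datasets = multiply_impute(xs, ys, m=5, k=3, seed=1)
for completed in datasets:
    print(completed)
```

Because the donors are observed responses, the imputed values can only be 1, 2, 3 or 4 here, never a fractional or out-of-range score.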
Table 1: Items for each construct of the attitudes towards mathematics scale

Construct  Item     Statement
…
           …        Making an effort in mathematics is worth it because it will help me in the work that I want to do later on
           ST29Q05  Learning mathematics is worthwhile for me because it will improve my career <prospects, chances>
           ST29Q07  Mathematics is an important subject for me because I need it for what I want to study later on
           ST29Q08  I will learn many things in mathematics that will help me get a job
SUBNORM    ST35Q01  Most of my friends do well in mathematics
           ST35Q02  Most of my friends work hard at mathematics
           ST35Q03  My friends enjoy taking mathematics tests
           ST35Q04  My parents believe it's important for me to study mathematics
           ST35Q05  My parents believe that mathematics is important for my career
           ST35Q06  My parents like mathematics
MATHEFF    ST37Q01  Using a <train timetable> to work out how long it would take to get from one place to another
           ST37Q02  Calculating how much cheaper a TV would be after a 30% discount
           ST37Q03  Calculating how many square meters of tiles you need to cover a floor
           ST37Q04  Understanding graphs presented in newspapers
           ST37Q05  Solving an equation like 3x + 5 = 17
           ST37Q06  Finding the actual distance between two places on a map with a 1:10 000 scale
           ST37Q07  Solving an equation like 2(x + 3) = (x + 3)(x - 3)
           ST37Q08  Calculating the petrol consumption rate of a car
ANXMAT     ST42Q01  I often worry that it will be difficult for me in mathematics classes
           ST42Q03  I get very tense when I have to do mathematics homework
           ST42Q05  I get very nervous doing mathematics problems
           ST42Q08  I feel helpless when doing a mathematics problem
           ST42Q10  I worry that I will get poor <grades> in mathematics
SCMAT      ST42Q02  I am just not good at mathematics
           ST42Q04  I get good <grades> in mathematics
           ST42Q06  I learn mathematics quickly
           ST42Q07  I have always believed that mathematics is one of my best subjects
           ST42Q09  In my mathematics class, I understand even the most difficult work
FAILMAT    ST44Q01  I'm not very good at solving mathematics problems
           ST44Q03  My teacher did not explain the concepts well this week
           ST44Q04  This week I made bad guesses on the quiz
           ST44Q05  Sometimes the course material is too hard
           ST44Q07  The teacher did not get students interested in the material
           ST44Q08  Sometimes …
…

Since the attitude towards mathematics scale was hypothesized to consist of ten constructs, we began the CFA by building a ten-factor congeneric model. We then performed model specification based on examination of standardized factor loadings, residuals, modification indices and goodness-of-fit indices. As recommended by Sellin and Keeves (1997), any item with a factor loading of less than 0.3 was dropped from the model. In addition, as suggested by Hair et al. (2010), any pair of items with a standardized residual exceeding 4.0 in absolute value was also eliminated from the model. We also removed cross-loaded items based on modification indices.
Despite its popularity as a fit index in CFA, the chi-square statistic is rarely used on its own as an absolute indicator of model fit because it is sensitive to large sample sizes (Hair et al., 2010). Therefore, to determine model fit, we used other goodness-of-fit indices, namely the standardized root mean square residual (SMSR), the root mean square error of approximation (RMSEA), the Comparative Fit Index (CFI) and the Tucker-Lewis Index (TLI) (Hair et al., 2010). The cut-off values for the fit indices recommended by Hair et al. (2010) are shown in Table 2. In addition, we determined the validity and reliability of the CFA model using Composite Reliability (CR) and Average Variance Extracted (AVE), with 0.6 and 0.5 as their respective cut-off values (Hair et al., 2010). Throughout the process of building the CFA model, however, all decisions regarding model specification were guided by the theories underlying this study, as strongly emphasized by Hair et al. (2010).
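To make the CR and AVE criteria concrete, the following minimal sketch computes both from the standardized loadings of a single construct, using the commonly cited formulas in which each item's error variance is one minus its squared loading. The function names and the loadings are hypothetical and are not taken from the study's results.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(loadings)
    # With standardized loadings, each error variance is 1 - loading^2.
    error = sum(1 - l ** 2 for l in loadings)
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


# Hypothetical standardized loadings for a five-item construct.
loadings = [0.70, 0.65, 0.60, 0.62, 0.58]
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(f"CR = {cr:.2f} (cut-off 0.6), AVE = {ave:.2f} (cut-off 0.5)")
```

With these illustrative loadings the construct clears the CR cut-off while falling short of the AVE cut-off, which shows that the two criteria can disagree for a set of moderately loading items.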

Multivariate normality test
Mardia's multivariate normality test showed that multivariate skewness and kurtosis for all five imputed data sets were significant, indicating that the imputed data sets did not follow multivariate normal distributions. Therefore, we used MLR estimation in the CFA in order to take into account the non-normality of data.
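For reference, the standard textbook form of Mardia's statistics is sketched below, where $x_i$ is the $p$-dimensional response vector of student $i$ and $n$ is the sample size. The symbols are introduced here for illustration; whether the MVN package applies any small-sample correction is not stated in the text and is not assumed.

```latex
\[
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
S = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\top}
\]
% Multivariate skewness and kurtosis
\[
b_{1,p} = \frac{1}{n^{2}} \sum_{i=1}^{n}\sum_{j=1}^{n}
  \left[ (x_i - \bar{x})^{\top} S^{-1} (x_j - \bar{x}) \right]^{3}, \qquad
b_{2,p} = \frac{1}{n} \sum_{i=1}^{n}
  \left[ (x_i - \bar{x})^{\top} S^{-1} (x_i - \bar{x}) \right]^{2}
\]
% Asymptotic reference distributions under multivariate normality
\[
\frac{n\, b_{1,p}}{6} \sim \chi^{2}_{p(p+1)(p+2)/6}, \qquad
\frac{b_{2,p} - p(p+2)}{\sqrt{8p(p+2)/n}} \sim N(0, 1)
\]
```

A significant result on either statistic, as found for all five imputed datasets, is evidence against multivariate normality.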

The confirmatory factor analysis of attitudes towards Mathematics
Following the normality test, we built a ten-factor congeneric CFA model. After several rounds of theory-guided model specification, some of the items had to be eliminated because of large residuals, low factor loadings, cross-loadings, or low values of both CR and AVE. The eliminated items for each construct were as follows: for SUBNORM, ST35Q01, ST35Q02 and ST35Q03; for MATHEFF, ST37Q01, ST37Q04 and ST37Q08; for ANXMAT, ST42Q10; for SCMAT, ST42Q02; for MTWKETH, ST46Q01, ST46Q02, ST46Q03 and ST46Q04; for MATINFC, ST48Q01 and ST48Q03; and for MATBEH, ST49Q01, ST49Q02, ST49Q05, ST49Q06 and ST49Q07. No item was eliminated from the INTMAT and INSTMOT constructs. FAILMAT was removed from the model because the factor loadings for all its items (ST44Q01, ST44Q03 and ST44Q08) were low, yielding low values of both CR and AVE. Therefore, the final CFA model comprised only nine constructs, as shown in Fig. 1.

Fig. 1: A confirmatory factor analysis of the attitude towards mathematics scale

As expected given the large sample size, the final CFA model had a significant Satorra-Bentler (SB) chi-square (SB χ²(524) = 3612.92, p < 0.01). We thus examined the other goodness-of-fit indices to determine model fit, and the findings were as follows: RMSEA = 0.037 (90% confidence interval = 0.036 to 0.038), CFI = 0.93, TLI = 0.92 and SMSR = 0.04. All goodness-of-fit indices satisfied their respective cut-off values, indicating that the model provided a good fit to the data.

Subsequently, we examined the validity and reliability of the constructs. Based on Table 3, all items had factor loadings exceeding the cut-off value of 0.3, indicating that all retained items were adequate indicators of their respective constructs (Hair et al., 2010). The CR values ranged from 0.73 to 0.85, above the cut-off value of 0.6, while the AVE values were between 0.40 and 0.59.
Even though the AVE values for the MATHEFF, ANXMAT and SCMAT constructs were less than the cut-off value of 0.5, we retained them in the final CFA model because their CR values were high. Moreover, these constructs were considered to have a theoretical rationale for explaining the attitude towards mathematics scale. Following the recommendation of Hair et al. (2010), theory must be prioritized when carrying out model specification in CFA.
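As a rough consistency check on the fit statistics reported in this section, the RMSEA point estimate can be approximated from the chi-square value, its degrees of freedom and the sample size. The sketch below uses the standard unadjusted formula; treating it as an approximation to the robust, survey-weighted value produced by lavaan.survey is an assumption, and the function name is hypothetical.

```python
import math


def rmsea(chi_square, df, n):
    """RMSEA = sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Values reported in the text: SB chi-square(524) = 3612.92, n = 4247 students.
value = rmsea(3612.92, 524, 4247)
print(round(value, 3))  # 0.037
```

The result agrees with the reported RMSEA of 0.037 to three decimal places, which suggests that the reported chi-square, degrees of freedom and sample size are mutually consistent.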

Conclusion
In this study, we re-examined, via CFA, the construct validity and reliability of the PISA 2012 attitudes towards mathematics scale using multiply imputed datasets of students from Malaysian national secondary schools. We validated the constructs of the scale by first building a ten-factor congeneric CFA model using MLR estimation in the R package lavaan.survey. After several cycles of theory-guided model specification, which involved eliminating several items with low loadings or cross-loadings, deleting one construct, examining various goodness-of-fit indices and inspecting CR and AVE values, we found that the final nine-factor CFA model provided a good fit to the data.
In sum, the final CFA model revealed that the attitudes towards mathematics scale exhibited a factor structure different from the one originally constructed by the OECD (OECD, 2012). This result substantiates the importance of re-examining the existing attitudes towards mathematics scale to ensure that it can be validly and reliably used in different contexts, especially when the analysis involves multiply imputed datasets. Hence, we suggest that, before conducting further statistical analysis involving the PISA 2012 attitudes towards mathematics scale and multiply imputed datasets, the validity and reliability of the scale should be re-examined and the scale amended accordingly.