Psychometric evidence of statistical self-efficacy instrument based on postgraduate

Self-efficacy shapes how students plan their performance and maintain their personal well-being, and their feelings and motivation influence how successfully they complete the tasks given to them. This study aimed to assess the psychometric properties of a statistical self-efficacy instrument among 241 postgraduate students who were selected using purposive sampling. A survey research design was applied, and respondents were given the Statistical Self-Efficacy (SSE) questionnaire, which originally had 22 items. The data were analyzed using the Rasch Model, which produced good item and person reliability. Item fit and unidimensionality of the instrument were also tested to determine whether the psychometric properties of the SSE were fulfilled. The Rasch analysis showed that the SSE instrument met the central assumptions of item fit and unidimensionality. The analysis also covered item polarity, the Wright Map, reliability and the separation index. Although past researchers have studied self-efficacy, some limitations, including those concerning psychometric properties, are addressed in this research. Thus, the Rasch Model analysis provides empirical evidence on the SSE for future studies, particularly regarding its psychometric aspects.


Introduction
Self-efficacy is vital in psychology for defining beliefs about one's behavior, which in turn influence motivation, action and the social environment. How students think and motivate themselves is related to their self-efficacy (Goulão, 2014). The sources of efficacy are mainly individual experience, vicarious experience, social encouragement, and physical and emotional conditions (Bandura and Locke, 2003). Likewise, self-efficacy is needed to make any outcome a success, through behavior and approaches that are shaped by students' beliefs about their knowledge of the subject matter. In addition, self-efficacy boosts students' motivation in learning statistics through their awareness of how to control that motivation. As a gateway to the development of knowledge and skills, self-efficacy has an impact on students' participation in class, which leads to their academic achievement (Richardson et al., 2012; Ferla et al., 2009). Therefore, statistical self-efficacy, in the context of this study, influences students' motivation and resilience in learning statistics. It may affect the cognitive or the affective domain of the learning process. Even at the tertiary level, students' grade point average is correlated with their self-efficacy as well as their learning strategies (Bartimote-Aufflick et al., 2015). Looking specifically at self-efficacy in academic settings, it is appropriate to measure self-efficacy within a particular subject matter, and students' outcomes and demographic factors such as gender should also be explored (Nielsen et al., 2017). Differences in the analysis can arise either from the items or from the positive attribute of self-efficacy (Scherer and Siddiq, 2015).
Furthermore, to generalize the findings meaningfully, statistical and psychometric evidence for the responses is preferable in a study. Information on the respondents' self-efficacy scales should be sufficient to provide the required evidence (Kreiner, 2013).
The justification for providing psychometric evidence lies in the problems of replicating scales from past studies (Gaudiano and Herbert, 2003; Smith and Betz, 2000). In addition, self-efficacy has been measured widely, but mostly in more general contexts, such as General Self-Efficacy (GSE) (Schwarzer and Jerusalem, 1995) and the Scale of Perceived Social Self-Efficacy (Smith and Betz, 2000). Hence, measuring self-efficacy through Rasch Model analysis is important. Although past research has made efforts to establish the psychometric properties of statistical self-efficacy, differences in research settings, particularly in the respondents, offer further insight into the matter. Moreover, although rigorous studies have been conducted, gaps remain in the validation aspect (Ward et al., 2002).
The Rasch Model is commonly used to provide extensive information on the individuals and items of an assessment or instrument (Matore et al., 2018a; Maat et al., 2016). The robustness of an instrument can be analyzed using the Rasch Model. Abd-el-fattah (2015) noted that measures of self-efficacy are limited when they are too broad or not related to a particular domain. Some issues of self-efficacy in academic settings have been debated among researchers (Nielsen et al., 2017). Therefore, this study was conducted to provide psychometric evidence for a self-efficacy instrument using the Rasch Model. A quantitative approach was applied in gathering the data.

Methodology
A survey research design was implemented with 255 postgraduate students as the respondents of the study. The characteristics of the respondents should be diverse across the targeted population (Ayob and Yassin, 2017). However, after data cleaning and screening, only 241 respondents were retained for measuring the SSE. Selected using a purposive sampling technique, the respondents were postgraduate students enrolled in various master's programs such as mathematics education, psychology, science education, measurement and evaluation, leadership, counseling and others.
As part of the graduation requirement, they were required to enroll in one statistics course, which was conducted in the second semester. The SSE consists of 14 items adapted from Schneider (2011). The original instrument was tested twice and produced an acceptable Cronbach's alpha of 0.902 in the first administration, which increased to 0.935 in the second part of the study (Schneider, 2011).
The psychometric properties of the SSE, involving assumptions of Item Response Theory (IRT) such as reliability (Apple, 2013), item fit and unidimensionality (Effendi and Zamri, 2015), were also examined. Item fit is determined in order to ensure the sustainability of the items. The Rasch Model is a probabilistic model that predicts the success of an event using maximum likelihood estimation, after transforming ordinal data into logits (Matore et al., 2018b).
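As a rough illustration of the probabilistic model described above, the dichotomous Rasch model expresses the probability of a successful response as a logistic function of the difference between person ability and item difficulty, both in logits. The sketch below assumes the dichotomous form for simplicity (a Likert-type instrument such as the SSE would normally be analyzed with a polytomous extension), and the function names are illustrative rather than taken from the study.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of success under the dichotomous Rasch model.

    Both arguments are in logits; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def to_logit(proportion: float) -> float:
    """Natural-log odds of a proportion strictly between 0 and 1."""
    return math.log(proportion / (1.0 - proportion))


# A person whose ability equals the item difficulty has an even chance.
print(rasch_probability(1.0, 1.0))  # 0.5

# Converting the probability back to log-odds recovers ability - difficulty.
print(round(to_logit(rasch_probability(2.0, 0.5)), 6))  # 1.5
```

Because the model depends only on the difference between ability and difficulty, persons and items can be placed on the same logit scale.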

Results and discussion
The results cover the analyses based on the Rasch Model, which include Item Fit, Unidimensionality and the Wright Map. Each analysis is reported against the cut-off point for the corresponding assumption. Table 1 shows the item measure order, which displays the logit measures for the SSE items. The main statistics considered in measuring item fit are the Mean Square (MNSQ) and Zstd. These criteria can be used to detect outlying or misfitting items. In order to establish item validity for the SSE, the Infit-Outfit Mean Square analysis and the Point Measure Correlation (PTMEA Corr) need to be analyzed. Fisher (2007) suggested that the Infit-Outfit Mean Square (MNSQ) values should be between 0.77 and 1.30. Any items beyond this range should be modified or removed in order to ensure that the psychometric properties are fulfilled.
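The mean-square statistics described above can be sketched as follows. This is a minimal illustration using the standard definitions (outfit as the unweighted mean of squared standardized residuals, infit as the information-weighted version); the function names and the toy data are hypothetical, and the 0.77 to 1.30 range is the one cited from Fisher (2007). It is not the software routine used in the study.

```python
def fit_mean_squares(observed, expected, variance):
    """Return (infit, outfit) mean squares for a single item.

    observed: response of each person to the item
    expected: model-expected response for each person
    variance: model variance of each response
    """
    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: unweighted mean of the squared standardized residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variance)) / len(observed)
    # Infit: squared residuals summed, then divided by the summed variance.
    infit = sum(squared_residuals) / sum(variance)
    return infit, outfit


def is_misfit(mnsq, low=0.77, high=1.30):
    """Flag a mean-square value that falls outside the accepted range."""
    return not (low <= mnsq <= high)


# Toy data: four persons, each with a model probability of 0.5.
infit, outfit = fit_mean_squares([1, 0, 1, 1], [0.5] * 4, [0.25] * 4)
print(infit, outfit)    # 1.0 1.0
print(is_misfit(1.47))  # True
```

A value near 1.0 means the responses vary about as much as the model expects, while values above the upper bound point to responses that are noisier than predicted.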

Item fit
Furthermore, the Zstd value, which is expressed in standardized (normal) units, is also used to test item fit with the model. However, the Zstd value can be disregarded if the MNSQ value is acceptable. Moreover, the PTMEA Corr values indicate the direction of the item in measuring the construct (Hanafi et al., 2014). As shown in Table 2, all Infit-Outfit values for the SSE items are within the suggested range of 0.77 to 1.30 except item D12 ("Distinguish between a population parameter and a sample statistics"), which has a value of 1.47. Its Zstd value is also greater than 0, which suggests that responses to this item are less predictable than the model expects.

Unidimensionality
The unidimensionality characteristic can be used to identify the measurement alignment of the construct. Aziz et al. (2013) proposed that at least 40% of the raw variance explained by measures should be achieved.
Based on Table 3, 55.9% of the raw variance is explained by measures, which is evidence that the instrument is measuring in the right direction. The value does not reach 60% because of item disturbance or noise of 8.9%, as shown by the unexplained variance in the 1st contrast.
Using the Rasch model, the logit values were obtained as log-odds, which are based on the natural logarithm. These logits are located along the variable line, which separates the distribution of the items from that of the respondents, as shown in Fig. 1. The left side of the vertical ruler represents the distribution of the respondents. Each "#" symbol represents two respondents, while each "." represents one respondent. All 14 items of the SSE are located on the right side of the ruler. The distribution of the respondents' ability is not well mapped to the items of the SSE. The person mean, represented by the "M" on the left side of the ruler, is higher than the item mean. Four respondents at the maximum location show a logit value of +7.45, and one respondent with a logit value of +1.45 is at the minimum location. The location of these respondents on the ruler is related to their agreement with items D6 ("Identify the factors that influence power") and D13 ("Identify when the mean, median and mode should be used as a measure of central tendency"). Item D6 is the most difficult item for the respondents to agree with, whereas item D13 is the easiest among all 14 items.
Table 4 shows the statistical summary for the 241 respondents, which indicates an excellent person reliability value of 0.90 (Fisher, 2007). This suggests that the SSE instrument would be consistent if given to another sample with the same characteristics (Arasinah et al., 2015).
The person separation index is 3.06, which exceeds the value of 2 suggested by Jones and Fox (1998) and Linacre (2002) and reflects the number of groups into which respondents can be categorized according to their ability in responding to the SSE. In addition, the person mean is +0.96, which indicates that these respondents are aware of the importance of their self-efficacy in statistics.

Reliability and separation index
Likewise, the item reliability of the SSE shows an outstanding value of 0.98, as displayed in Table 4. The item separation index of 6.34 exceeds the cut-off point of 3 (Linacre, 2002) and indicates 6 strata of respondents' ability in SSE.
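The reported reliability and separation values can be cross-checked against each other, since in Rasch measurement the two are commonly related by R = G^2 / (1 + G^2), where G is the separation index and R the reliability. The sketch below assumes this standard relationship holds for the reported figures; the function names are illustrative.

```python
import math


def reliability_from_separation(separation: float) -> float:
    """Reliability implied by a separation index: R = G^2 / (1 + G^2)."""
    return separation ** 2 / (1.0 + separation ** 2)


def separation_from_reliability(reliability: float) -> float:
    """Separation implied by a reliability below 1: G = sqrt(R / (1 - R))."""
    return math.sqrt(reliability / (1.0 - reliability))


# Person separation of 3.06 corresponds to the reported reliability of 0.90.
print(round(reliability_from_separation(3.06), 2))  # 0.9

# Item separation of 6.34 corresponds to the reported reliability of 0.98.
print(round(reliability_from_separation(6.34), 2))  # 0.98
```

Under this assumption, the reported person and item values are mutually consistent once rounding is taken into account.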

Conclusion
This study provides psychometric evidence for the statistical self-efficacy instrument, particularly among postgraduate students. This psychometric evidence supports the improvement of the SSE instrument. Every characteristic examined in the Rasch model analysis indicates that the self-efficacy instrument can be used in similar studies within the same research context. The construct of self-efficacy can be clarified in terms of its flaws and the responses. The hierarchy of item difficulty and the ability of the respondents are measured on a common continuum, so that self-efficacy is well assessed in terms of Item Response Theory (IRT). Some limitations do occur in this study, including uncertain responses from the respondents, such as choosing the same Likert scale point for all items, and the need to monitor the distribution of the online version of the questionnaire. To overcome these challenges, the involvement of the researcher during the distribution of the questionnaire is highly recommended. Monitoring respondents as they answer the items would help to avoid uniform selection of Likert scale points. The restriction of using the online version can be addressed if the researcher fixes the time and venue of the data collection process.
Whatever the challenges are, a wider exploration of self-efficacy is recommended for future studies. For instance, Differential Item Functioning analysis is suggested in order to identify bias or unfairness concerning individuals. Future research could also use Confirmatory Factor Analysis for item and construct validation. Therefore, more extensive research on self-efficacy is highly recommended to improve the instrument and to examine its relationship with other important areas.

Effendi M and Zamri KA (2015). Psychometric assessment on Adversity Quotient instrument (IKBAR) among polytechnic students using Rasch model. In the International Conference on Education and Educational Technologies (EET), Institute for Natural Sciences and Engineering, Barcelona, Spain: 52-57.

[Fig. 1 annotations: respondents with high ability; respondents with low ability; the most difficult item; the easiest item]