A review of the methods in the evaluation of faculty performance

Colleges and universities continue to face institutional constraints and increasing demands for accountability. To improve outputs, each institution needs to establish a faculty performance evaluation program. To provide adequate and unbiased evaluation programs, administrators must involve faculty members in determining the evaluation's purpose, as well as its scope, sources of data, participants, and assessment of effectiveness. Assessing the performance evaluation program helps to determine its effectiveness in promoting faculty development and productivity. Because pedagogical work has many dimensions, it is better to use multiple measures involving multiple sources for evaluation. Evidence or data can be collected from students, colleagues, and chairs, or from faculty members themselves. Faculty evaluation programs should be reviewed annually to check how well they fit the institution's purposes for evaluation.


Introduction
The progress of nations has become dependent on their knowledge, advanced technology, and educated human resources capable of creativity and production. This progress is also based on the efficiency of University education, which in turn depends on the efficiency of the University staff members who are responsible for preparing the human cadres that advance the development process in society (Jumia'an et al., 2018).
To improve Universities' outputs, evaluation of the educational process in all its components, especially the performance of faculty members, has been found necessary; it aims to raise their competencies and to correct imbalances, if any (White, 1995).
Performance evaluation is a formal process that informs employees of their assigned duties and responsibilities and of the traits, qualities, and characteristics desired. It also identifies employees' potential for growth and advancement in various respects (de Almeida, 2017). Today, a system for evaluating the capabilities and performance of faculty members is an obvious need, but in practice such a system is not easy to establish. One of the most challenging issues that Universities face is establishing appropriate methods for evaluating faculty members' performance (Jesarati et al., 2013; Lyde et al., 2016). Licata (1986) and McKeachie (1987) offered the following general guidelines for establishing successful evaluation programs (Fig. 1):

Guidelines for faculty evaluation programs
A) Make sure the purpose of evaluation is clear. Tie all aspects of the process to the purpose.
B) Involve faculty in all aspects of evaluation.
C) Make administrative commitment to the evaluation process go hand in hand with commitment to due process, including written and published criteria for evaluation and appeal.
D) Attempt to balance institutional needs with individual faculty needs.
E) Link evaluation to faculty development and rewards. For instance, some institutions offer more liberal sabbaticals to professors agreeing to more frequent evaluation.
F) Apply all evaluation procedures consistently and fairly.
G) Include multiple sources of faculty data in evaluation.
H) Bring evaluation policies and practices into conformity with established civil rights guidelines.
I) When using existing programs (used successfully at other institutions), tailor them to meet local needs and traditions.
J) Include several levels of review and appeal.
In summary, using guidelines in the evaluation process accomplishes three goals: (a) it reopens the lines of communication between faculty and administration on faculty effectiveness; (b) it helps minimize faculty resistance to evaluation; and (c) it permits the integration of evaluation into decision-making and development processes on campus. All three goals need to be incorporated in any faculty evaluation planning.

Purpose of faculty performance evaluation
Faculty evaluation has been defined by Miller (1987) as either a process designed to improve faculty performance (a developmental process) or a procedure that assists in making personnel decisions (a reviewing process). Another particular concern is evaluating the performance and vitality of tenured faculty members (Licata, 1986). Vitality refers to the faculty member's ability to continue growing and interest in doing so. The author observes that this concern is increasing in light of the advancing ages of professors at most institutions and decreasing job mobility.
As emphasized by Seldin (1984), the cornerstone of any evaluation must be its purpose. The purpose of evaluation shapes the questions asked, the sources of data utilized, the depth of analysis, and the dissemination of findings. The author further asserted that evaluation systems provide constructive feedback to the professor and often create a kind of dissatisfaction that motivates the professor to improve. Chances for faculty improvement increase when feedback is immediate and when the professor wants to improve and knows how to bring about the improvement.
Although most institutions identify faculty improvement as their primary goal, Moomaw (1977) believed that most evaluation systems do not stimulate and support faculty development effectively. He cited the lack of connection between evaluation and development activities, and the absence of faculty involvement in the process of evaluation as the chief reasons for the uneven, or poor, effectiveness of programs at most institutions.

Goals for the annual faculty performance evaluation plan
At Ohio University, administrators held that yearly goals and objectives provide the foundation and direction for annual faculty development, performance enhancement, and evaluation. These goals and objectives are agreed upon by the faculty members, and the outcomes provide evidence of faculty achievement in meeting them. Although the plan is oriented toward the individual faculty member, the process provides an opportunity to coordinate and integrate objectives with the mission, goals, and priorities of the University, College, and Departments, as well as with the respective promotion and tenure criteria. The system is designed to meet individual and collective needs. In 2013, the Annual Faculty Performance Evaluation Policy of the College of Health Sciences and Professions, Ohio University, established the following goals (Table 1).
The formative goals of the performance plan are intended to:
A) Assist faculty members in identifying and targeting objectives for professional growth.

B) Assist faculty members in identifying and obtaining resources needed to accomplish objectives.
C) Assist faculty members in identifying professional objectives that will move them toward promotion and/or tenure (when appropriate).
D) Recognize and support individual differences and preferences among faculty members in terms of their unique abilities in teaching, scholarly endeavors, and service.
E) Serve as a basis for feedback to faculty members about methods, behaviors, and outcomes that can enhance performance.
The summative goals of the performance plan are intended to:
A) Identify role expectations of faculty members (i.e., teaching, scholarly endeavors, and service) to assure that each fully understands how performance will be reviewed on an annual basis.
B) Review faculty performance based on mutually agreed upon objectives, action plans, and outcomes.
C) Provide a basis of mutual understanding for annual review and salary adjustments that reflect the strengths and unique contributions of each faculty member.

Table 1
Summative goals | Formative goals
To assure how performance will be reviewed on an annual basis | To identify resources to accomplish objectives
To review faculty performance based on objectives | To identify professional objectives
To provide a basis for annual review and salary adjustment | To identify individual differences in teaching
To provide a basis for faculty promotion | To give feedback about outcomes

Outcome-based faculty performance evaluation
Luguador (2015) stated that outcome-based faculty performance evaluation provides a holistic approach in education for determining the actual accomplishment and attainment of outputs among faculty members, based on their documented and submitted records and reports. According to the author, this is one way of eliminating bias and subjectivity in performance ratings. It aims to exercise fairness and transparency, making the evaluation process more reliable and truthful. The author added that the criteria for evaluation must always be well formulated and disseminated in order to capture the actual performance of the people being assessed. Before implementation, the criteria need to be validated and presented to the employees concerned so that they can react to and comment on any areas they find ambiguous or confusing. Especially in academic institutions, where performance is highly valued because of the nature of the teaching profession, determining how faculty provide services to students and to the organization is of utmost concern for continuous improvement.

Methods of faculty performance evaluation
Measuring the quality and accountability of teaching effectiveness in higher education has a lengthy and well-researched history (Costin et al., 1971; Arreola, 2000). Still, the questions of what "effective" means and how it is measured continue to challenge College and University faculty and administrators, particularly with regard to personnel decisions (McKeachie, 1997; Arreola, 2000; Sproule, 2000). Student ratings of instruction are the most commonly used measure of teaching effectiveness (Gustad, 1961; McKeachie, 1997). However, teaching effectiveness as a measurable construct is more complex (Young and Shaw, 1999). What is the standard and who sets it? How is it measured objectively? How are the measurement results used? Fiscal constraints and the desire for better student outcomes contribute to increasing demands for accountability for student learning, thereby increasing the importance of evaluating University teaching effectiveness (McCarthy et al., 2011). When considered holistically, teaching effectiveness can account for teaching skills and student learning, as well as the process of improving both (Lyde et al., 2016) (Fig. 2).
In general, two forms of assessment are used to evaluate teaching, each for a different purpose. Summative assessment is often used to judge teaching performance in ways that affect personnel decisions such as the awarding of tenure or promotion, but it may not be helpful to the instructor (Raths and Preskill, 1982). Formative assessment, by contrast, assists the instructor by providing information about teaching strengths and areas for improvement (Chism, 1999). A clear definition of formative assessment is "…a process used by teachers and students during instruction that provides feedback to adjust ongoing teaching and learning to improve students' achievement of intended instructional outcomes…" (McManus, 2008). By this definition, formative assessment is not an evaluation apart from teaching but rather an integral part of the teaching and learning process (Lyde et al., 2016). Table 2 shows the different formative and summative methods of assessment described in the literature.
The most recent methods of performance evaluation reported in the literature are as follows.
Using student outcomes to evaluate teaching
Fenwick (2001) suggested that student outcomes can be used to evaluate teaching when employed carefully and thoughtfully. The author stated that, whether focused generally on overall program improvement or specifically on faculty development, student outcome information appears to be most productively used by faculty working as a group toward a collective vision. The goal is to build sufficient trust among teachers that they are willing to open up, share areas of weakness and strength, and work collegially to address such issues. However, according to the author, evaluation processes that are productive, valid, and reliable are usually labor-intensive and time-consuming. Collecting meaningful data about student outcomes may demand increased paperwork from instructors who already feel overburdened. For this reason, the use of student outcomes for evaluating teaching and programs should be sparing and periodic rather than continuous, and dependent on time, resources, and recognition from the institution. The author concluded that any method of judging teaching is problematic when it becomes the sole measure, and recommended that any effective program of ongoing faculty development should employ student outcomes in combination with student evaluations of the course, peer classroom observation, peer evaluation of the course syllabus and materials, and instructors' self-assessments. Finally, the author stressed that student outcome information should ultimately be used to support and improve teaching, not to contribute to faculty stress, fear, and alienation in an age obsessed with accountability.

Survey of 12 strategies to measure teaching effectiveness
Using national standards to guide the definition and measurement of effective teaching, Berk (2005) reviewed twelve potential sources of evidence to measure teaching effectiveness: (a) student ratings, (b) peer ratings, (c) self-evaluation, (d) videos, (e) student interviews, (f) alumni ratings, (g) employer ratings, (h) administrator ratings, (i) teaching scholarship, (j) teaching awards, (k) learning outcome measures, and (l) teaching portfolios. The author confirmed the necessity of using multiple sources to measure faculty performance, since such a strategy builds on the strengths of all sources while compensating for the weaknesses of any single source. The author proposed a unified conceptualization of teaching effectiveness that uses multiple sources of evidence, such as student ratings, peer ratings, and self-evaluation, to provide an accurate and reliable base for formative and summative decisions. This triangulation of sources was recommended in view of the complexity of measuring the act of teaching and the variety of direct and indirect sources and tools used to produce the evidence.

Meta-evaluation of a 5-year experience
Mohammadi et al. (2015) presented a model for evaluating faculty members' activities based on a meta-evaluation of the existing system. The reliability of the current faculty members' activities metrics system was investigated in the Medical School of Tehran University of Medical Sciences. Semi-structured interviews were conducted with reference to meta-evaluation standards. A questionnaire based on the interview results was designed and delivered to faculty members. Finally, the components of the model were extracted from content analysis of the interviews and factor analysis of the questionnaire, and finalized in a focus group session with experts. The authors found that the reliability of the current system was 0.99 (P < 0.05). They reported that the final model had six dimensions (mission alignment, accuracy, explicit, satisfaction, appropriateness, and constructiveness) derived from factor analysis of the questionnaire and nine factors (consensus, self-reporting, web-based system, evaluation period, minimum expectancies, analysis intervals, verifiers, flexibility, and decision making) obtained via qualitative content analysis of the interviews. They concluded that the model covered conceptual and executive aspects and recommended it for Medical Schools.

An investigation of performance evaluation index
Jesarati et al. (2013) proposed a model for investigating a performance evaluation index. They suggested that the first and most important factor is teaching, followed by research and development, consulting and professional services, scientific and administrative services index, extracurricular and educational activities, and training and cultural activities. The least important factor is cultural and training activities. According to the research results and the experience gained during implementation, the authors suggested the following:
A) A summary and abstract of the research should be made systematically available to all faculty members, managers, departments, and deans of the University and should be binding upon them; each should be responsible for completing a functional certificate recorded in their workbook. In this way, self-reporting and self-regulation in the performance of faculty members can be improved.
B) The supervision and evaluation office of the University should use approved indicators in evaluating faculty members' performance.
C) Indicators of faculty members' performance evaluation should be communicated clearly to faculty and academic staff.
D) Managers and officials of the University should pay close attention to the main indicators and subdivisions of the proposed model in faculty members' evaluation and promotion in rank.

Laska (2016) proposed observation as a method to supplement and verify the accuracy of the other methods. The author stated that observation is performed in classrooms for the purpose of impartial and objective collection of accurate information, and added that observation provides direct and constructive feedback about professional practice. It helps observers to identify good behavior and professional practice, as well as the professional attitudes and practices that require further assessment and improvement. Classroom observation can be done in two ways: direct classroom observation or video recording. Data are collected from diverse sources, including direct observation, interviewing, and consulting. The author concluded that observation is intended to ensure accountability, improve performance, and increase professional development opportunities.

Lyde et al. (2016) stated that, to continue supporting the development of new faculty members' teaching effectiveness and to improve the skills of experienced faculty members, the policies and procedures used to evaluate teaching performance were clarified according to best practices in the literature.
Such clarification, the faculty believed, would support formative development of teaching while continuing to produce a summative score suitable for personnel decisions. The result of these changes was a multisource method for evaluation (MME). The MME comprises three primary data sources: student evaluations, instructor reflections describing attributes of their own teaching such as the teaching philosophy, and a formative external review. While the faculty perceived the MME as a useful tool, they still believed it operates more to produce a summative product than to work as a formative process. According to the results, a more formative process would be supported by addressing several factors, including the timing of reflections, accountability from year to year, and mentoring. Addressing these constraints may make the proposed MME a more appropriate tool for formative review of teaching. When attempting to increase the formative qualities of a policy or process similar to the MME, the authors recommended that academic departments fix a schedule of due dates that keeps work evenly distributed throughout the year and encourages an ongoing cycle of reflection and development. This would not only reduce the proportion of reflective work that occurs when the annual performance review portfolio is due, but also help faculty to reflect during the teaching semesters, giving them opportunities to identify challenges and adjust accordingly. They also recommended adding criteria to the performance score levels that support faculty in demonstrating connectedness among the elements of the MME portfolio. For example, how are the student evaluation scores related to or reflective of the teaching philosophy? Or, how does the professional development plan demonstrate a connection to the student evaluation results or the teaching philosophy? Currently, the MME policy only considers reflections related to student and external peer reviews.
Finally, they stressed that systemic peer mentoring or guidance (not a requirement) is needed in the MME policy and in the academic department culture.
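To make the idea of a summative score drawn from several data sources concrete, the following is a minimal sketch in Python. It is not the scoring procedure of Lyde et al. (2016) or of any other cited study; the weights, the common rating scale, and the names WEIGHTS and summative_score are assumptions introduced purely for illustration.

```python
from statistics import mean

# Hypothetical weights for three data sources. The reviewed literature does
# not prescribe a weighting; these values are assumptions for illustration.
WEIGHTS = {
    "student_evaluations": 0.5,
    "instructor_reflection": 0.25,
    "external_review": 0.25,
}


def summative_score(ratings, weights=WEIGHTS):
    """Return the weighted mean of the per-source average ratings.

    `ratings` maps each source name to a list of ratings, all assumed to be
    on the same numeric scale (for example 1 to 5).
    """
    missing = set(weights) - set(ratings)
    if missing:
        # Refuse to score when a required source was not collected.
        raise ValueError(f"missing sources: {sorted(missing)}")
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * mean(ratings[name]) for name in weights)
    return weighted_sum / total_weight


# Invented example ratings for one instructor.
example = {
    "student_evaluations": [4.0, 4.5, 3.5],
    "instructor_reflection": [4.0],
    "external_review": [3.0, 5.0],
}
print(summative_score(example))
```

A single number of this kind serves only the summative purpose; the formative value of each source lies in the underlying comments and reflections, which a composite score cannot capture.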

Feedback and recommendations
Evaluation of teaching can have many purposes, including collecting feedback for teaching improvement, developing a portfolio for job applications, or gathering data as part of personnel decisions, such as reappointment or promotion and tenure. Most of the methods described above can be used for all of these functions. In general, efforts to collect information for improvement can be informal and focus on specific areas an individual instructor wishes to develop. Information for job applications involves presenting one's best work and meeting the requirements outlined in job ads. However, when the purpose of evaluation is personnel decision making, it is important to use a comprehensive and systematic process. Because there are many dimensions to pedagogical work, it is best to use multiple measures involving multiple sources of data to evaluate the range of instructional activities. Evidence or data can be collected from students, colleagues, and chairs, or from faculty members themselves.
We recommend that administrators and method planners first look to the sources that already exist in their departments. They should start with student ratings together with one or more other sources that their faculty can embrace and that reflect best practices in teaching, weigh the pluses and minuses of the different sources, and finally decide which combination of sources should be used for both formative and summative decisions and which should be used for one type of decision but not the other, such as peer ratings. They must make sure that faculty stakeholders are involved in every step of the process. Whatever combination of sources they choose, they should take the time and make the effort to design, execute, and report the results appropriately. The accuracy of faculty evaluation decisions hinges on the integrity of the process and on the reliability and validity of the evidence collected. Finally, once the faculty performance evaluation program has been established and implemented, the institution should review and, if necessary, revise it annually to determine whether the program meets the yearly goals and objectives and whether outcomes provide evidence of faculty achievement in meeting those objectives.