Research

A number of research projects addressing many of the challenges posed by existing and new conceptions of assessment are underway at CARPE. Details of each project can be accessed below. Through its research programme, CARPE contributes to critiques of policy and to policy making pertaining to all aspects of assessment.

Current Research Projects

COVID-19 Related Research Projects at CARPE

1a. LC Calculated Grades: Teachers' Reflections on the Process and on Assessment

Project Directors: Audrey Doyle, Zita Lysaght and Michael O'Leary.

This project is focused on post-primary teachers’ reflections on their experiences of estimating marks/ranks for their students as part of the LC2020 Calculated Grades process in schools. The project also explores teachers’ reflections on the role they play in assessment. A survey of a voluntary sample of teachers took place during the Autumn of 2020. DCU ethical clearance for the project and a plain language statement describing it are available here and here.

1b. LC2021 Accredited Grades

Project Directors: Audrey Doyle, Zita Lysaght and Michael O'Leary.

A survey of teachers' reflections on the process and on assessment will begin in November 2021. Teachers wishing to participate can access the survey at: (to be announced). A plain language statement pertaining to the study can be accessed here.

2. Remote Proctoring

Project Directors: Gemma Cherry (CARPE), Oksana Noumenko (Prometric) and Michael O'Leary (CARPE).

Remote or online proctoring refers to the use of technology in lieu of face-to-face proctoring when examinations are administered online. Using classical test theory (CTT) and item response theory (IRT) methods, CARPE and Prometric personnel are conducting a study to investigate the psychometric equivalence of the scores achieved by candidates taking a professional licensure examination in the US via remote and in-person proctoring modes. The data used for the analyses come from administrations of the examination pre- and post-Covid-19.

3. High Stakes Assessment in the Era of COVID-19: Interruption, Transformation or Regression?

Project Directors: Louise Hayward (University of Glasgow) and Michael O’Leary (CARPE).

This special issue of Assessment in Education: Principles, Policy and Practice will be edited by Louise and Michael and seeks to contribute to the debate about what we have learned from the COVID experience and how that learning might inform the future of high stakes assessment both for individual nations and internationally. The special issue will be published in 2022.

4. A Conceptual Framework for Exploring Change in Teacher Assessment Agency

Project Directors: Louise Hayward (University of Glasgow), Jana Groß Ophoff (University of Tuebingen), Sotiria Kanavidou (University of Southampton), Michael O’Leary (CARPE) and Dennis Opposs (Ofqual).

This research is focused on how teacher agency in assessment played out in the high stakes context of terminal examinations at the end of secondary education in England, Germany, Greece, Scotland and Ireland through an analysis of key policy documents. The work is being conducted under the auspices of the International Congress for School Effectiveness and Improvement (ICSEI).

5. High Stakes Examinations in the Era of Covid-19

Project Directors: Vasiliki Pitsia & Michael O’Leary (CARPE); Marguerite Clarke, Diego Armando Luna Bazaldua, Julia Liberman and Victoria Levin (World Bank).

This research sets out to capture the diversity of responses across countries when the outbreak of Covid-19 in early 2020 placed plans to hold high stakes examinations at the end of post-primary school in jeopardy. Countries from different parts of the world and with different economic profiles are organised into two main categories: (a) those that continued with their usual examinations and (b) those that did not. The report describes the different approaches adopted by each of these countries, reviews the evidence on what worked well and what did not, and highlights lessons learned. The project is a joint venture between researchers at CARPE and the World Bank.

_________________________________________________________________________________________________________

Remote Proctoring in Credentialing Examination Contexts

Research Memos on Remote Proctoring

Project Directors: Paula Lehane & Conor Scully (CARPE)

Two memos prepared for Prometric updated the research brief on remote proctoring submitted to Prometric by Karakolidis, O’Leary and Scully in 2017. Focusing on research published over the past five years, memo one examined literature on the psychometric properties of remote proctoring (RP) examinations, the candidate experience, and test security. In the second memo, consideration was given to policies, procedures and regulations (including legal regulations) that apply when remote proctoring is used for online licensure and certification tests. Recommendations are provided in both memos for how Prometric can develop and administer RP assessments in line with best practice. The reports are not currently available to the public.

_________________________________________________________________________________________________________

Twenty Five Years of Research on Leaving Certificate Assessment

Project Directors: Michael O'Leary (CARPE) and Gillian O'Connor

The project is focused on developing a structured/searchable database of all academic papers and research reports that refer to LC assessment published between 1995 and 2020. Each entry contains the full citation, an abstract, type of publication, key themes explored and details pertaining to methodology, sample size and key informants for empirical studies. The database currently has 100 entries, and all are hyperlinked to a digital copy of the paper/report. The database will be available on the CARPE website in Autumn 2021.

_________________________________________________________________________________________________________

Assessment of Bullying in the Workplace Project

Project Directors: Zita Lysaght, Angela Mazzone (Anti Bullying Centre), Michael O'Leary & Conor Scully with Anastasios Karakolidis (ERC), Paula Lehane, Larry Ludlow (Boston College), Sebastian Moncaleano (Boston College) & Vasiliki Pitsia

This project is focused on creating a measurement instrument that can be used to assess people's ability to identify bullying in the workplace. For the purposes of this study, workplace bullying is conceptualised as behaviours that involve an imbalance of power, that are repeated over time, that are intentional and that make the target feel threatened, humiliated, stressed, or unsafe at work. The research is a collaborative venture involving CARPE and the Anti Bullying Centre (ABC) at DCU. When developed, the workplace bullying instrument will be made freely available to organisations interested in planning programmes of professional development for staff. DCU ethical clearance for the project and a plain language statement describing it are available here and here.

_________________________________________________________________________________________________________

Assessment of Learning about Well-Being Project

Project Directors: Darina Scully (School of Human Development), Nisha Crosbie/Deirdre O'Brien (School of Psychology) & Michael O’Leary (CARPE)

The significance of the wellbeing of the child/young person for developmental and educational outcomes is unequivocal. There is an abundance of instruments in existence that purport to measure various aspects of wellbeing, or an individual's subjective state of wellbeing. However, a heretofore understudied area is how young people's knowledge and understanding of the concept can be assessed. Wellbeing has been identified as a key curricular area in the reformed Junior Cycle programme, and the NCCA's Guidelines for Wellbeing in Junior Cycle (2017) call for the use of a wide variety of approaches in assessing students' learning in this area. Consequently, the development of tools that can aid student and teacher judgement making about students' progress in knowing about and understanding wellbeing may prove very useful. With this in mind, this study seeks to examine the potential use of scenarios/vignettes to achieve this.

_________________________________________________________________________________________________________

Student Experience of Feedback

Project Leaders: Michael O'Leary, Zita Lysaght & Sean McGrath (Glanmire Community College)

This study is being conducted jointly by CARPE and Glanmire Community College, Cork and is designed to gather data on how second year students in school experience feedback from their teachers. Using an online questionnaire, the study aims to gather data from students on variables such as how often they receive feedback and what types of feedback they find most useful.

_________________________________________________________________________________________________________

Assessment for Learning and Teaching (ALT) Project

Project Director: Zita Lysaght

The Assessment for Learning and Teaching (ALT) project has its roots in assessment challenges identified from research conducted in the Irish context. This research highlighted: (a) the dearth of assessment instruments nationally and internationally to capture changes in children’s learning arising from exposure to, and engagement with, Assessment for Learning (AfL) pedagogy; (b) the nature and extent of the professional challenges that teachers face when trying to implement AfL with fidelity; and (c) the urgent need for a programme of continuous professional development to be designed to support teachers, at scale, to learn about AfL and integrate it into their day-to-day practice.

Since the initiation of the ALT project, significant progress has been made in all three areas. The Assessment for Learning Audit instrument (AfLAi) has been used across a range of Irish primary schools and in educational systems in Australia, Norway, Malaysia, Chile and South Africa. Work is currently underway in adapting the AfLAi for use in secondary schools and by students in both primary and secondary settings. The research-focused Assessment for Learning Measurement instrument (AfLMi), first developed in 2013, is being updated with data from almost 600 Irish primary teachers. Programmes of professional development continue to be implemented in pre-service undergraduate teacher education, in postgraduate teacher education and as part of site-based in-service teacher education.

_________________________________________________________________________________________________________

Minecraft in Irish Primary and Post-Primary Schools

Project Directors: Paula Lehane (CARPE) & Deirdre Butler (Institute of Education)

Minecraft is a ‘sandbox’ video game first released to the public in 2009, where players control a virtual avatar in a Lego-like world made up of blocks that can be moved to construct buildings and used to create items and structures. It is currently the second most popular video game of all time, with more than 100,000,000 copies sold worldwide. Schools in many countries, including the United States of America and Sweden, have decided to integrate the education version of the game (MinecraftEdu) into their curricula. MinecraftEdu is a platform that allows students in schools to freely explore, imagine and create in virtual environments and collaborative worlds that have special features specifically designed for classroom use. In DCU, the Institute of Education (IoE) has a dedicated Minecraft Studio (opened in December 2018) that student teachers can use to explore how innovative virtual and physical learning spaces can transform the curriculum and engage young people with new educational environments. CARPE is currently working with the IoE to develop research projects that will investigate the possible value of Minecraft in Irish primary and post-primary settings.

_________________________________________________________________________________________________________

Inter-Rater Reliability in the Objective Structured Clinical Examination (PhD Project)

Project Director: Conor Scully (PhD Candidate); Project Supervisors: Michael O'Leary, Mary Kelly & Zita Lysaght

Conor's thesis will examine the issues of inter-rater reliability and validity in the Objective Structured Clinical Examination (OSCE), an assessment format common in medicine and nursing. Using a mixed-methods approach, he will seek to understand how OSCE assessors interpret and understand student performances in the exam. It is hoped that this understanding will allow for more reliable inferences to be made on the basis of OSCE scores and a higher quality assessment overall.
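As an illustration only of what inter-rater reliability quantifies, the sketch below computes Cohen's kappa, one widely used chance-corrected agreement statistic, for two assessors scoring the same performances. The project description does not state which statistics the thesis uses; the function name, the pass/fail categories and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical scores to the same candidates."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of candidates.")
    n = len(rater_a)

    # Observed agreement: proportion of candidates given the same category by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected chance agreement, from each rater's marginal category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2

    if p_expected == 1.0:
        # Both raters used one and the same category throughout; kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical pass/fail judgements by two assessors on eight OSCE performances.
assessor_1 = ["pass", "pass", "fail", "pass", "fail", "pass", "fail", "pass"]
assessor_2 = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass"]
kappa = cohens_kappa(assessor_1, assessor_2)
print(round(kappa, 3))
```

In this made-up example the assessors agree on six of eight performances (75%), but because some of that agreement would be expected by chance, kappa is noticeably lower than the raw agreement rate.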

_________________________________________________________________________________________________________

Embedding the Assessment of Emotional Intelligence within Collaborative Problem-Solving Tasks: An Exploratory Study (PhD Project)

Project Director: Deirdre Dennehy (PhD Candidate); Project Supervisors: Michael O'Leary, Zita Lysaght

Emotional Intelligence (EI) assessment is significant against the background of global interest in the assessment of transversal skills, which were previously termed 21st century skills. The term transversal skills refers to key abilities and aptitudes which are transferable across all areas of modern life and are pertinent to overall successful functioning in a digitalised society (May et al., 2015; Munro, 2017). Historically, the type of knowledge that was esteemed by humanity was content and knowledge-based. However, the development of the globalized market and advances in technology have altered the skills that are required for many careers. Today, many jobs require individuals to collaborate, communicate and use their interpersonal skills to a high level. An individual who is skilled in perceiving, managing and using their emotions will flourish in these types of problem-solving environments. As a result, the domain of EI education and assessment has attracted substantial interest from economic organisations and educational settings alike.

However, there are significant limitations facing current EI measures. The majority are text-based and assess EI in isolation. This may not adequately reflect how an individual exhibits their EI skills in real life, as these are frequently demonstrated in tandem with other important cognitive skills like problem-solving. There is a need therefore for the development of authentic high-fidelity EI assessments which capture the dynamics of true human interaction. The current study will attempt to embed an EI assessment within an existing problem-solving assessment. Technology will assist in creating an evaluation which is both time-efficient and user-friendly. This PhD project aims to contribute to this field of research by serving as an exploratory blueprint for the future development of authentic EI assessments.

_________________________________________________________________________________________________________

Measuring Non-Cognitive Factors

Project Directors: Lisa Abrams (Virginia Commonwealth University), Mark Morgan (DCU) & Michael O'Leary (CARPE)

Cognitive skills involve conscious intellectual effort, such as thinking, reasoning, or remembering. In contrast, non-cognitive skills are related to other important interpersonal or ‘soft skills’ like motivation, integrity, persistence, resilience and interpersonal interaction. These non-cognitive factors are associated with an individual’s personality, temperament, and attitudes. Research at the international, national and school level is increasingly looking at the value of non-cognitive skills and at how education systems impact their development. Demand for these skills will continue to change as economies and labour market needs evolve, with trends such as automation causing fundamental shifts, so this is an issue that should be addressed by researchers and those in industry.

_________________________________________________________________________________________________________

Teacher Assessment Literacy - Scale Development Project

Project Directors: Zita Lysaght, Darina Scully, Anastasios Karakolidis, Vasiliki Pitsia, Paula Lehane & Michael O’Leary (CARPE)

Assessment literacy (Stiggins, 1991) has long been viewed as an important characteristic of effective teachers. Assessment literacy can be defined as “an individual's understandings of the fundamental assessment concepts and procedures deemed likely to influence educational decisions” (Popham, 2011, p. 267). Correct use of different assessment types and forms, accurate administration and scoring of tests, and appropriate interpretation of student performance all form part of a teacher’s assessment literacy. At present, very few objective measures of teacher assessment literacy exist. Through this research project, CARPE is attempting to address that gap by developing a scale to measure primary teachers’ assessment literacy in Ireland.

Completed Research Projects

Assessment of Critical Thinking in Dublin City University (ACT@DCU)

Project Director: Michael O'Leary (CARPE)

ACT@DCU investigated the extent to which an online test developed by the Educational Testing Service (ETS) in the United States to assess critical thinking in higher education was suitable for use in DCU. Findings from the initial validation study of the test using data from DCU students can be read here.

Over time, the intention is that data from the test will help to facilitate conversations among staff regarding pedagogy, curricula and educational interventions to improve teaching and learning of critical thinking (CT); be integrated with other non-cognitive and co-curricular indicators of student success at DCU; and provide evidence of institutional and program-level learning outcomes in CT.

_________________________________________________________________________________________________________

NCCA Assessment of Live Remote Proctoring

Project directors at CARPE: Gemma Cherry, Michael O'Leary and Darina Scully.

This study, conducted under the auspices of the National Commission for Certifying Agencies (NCCA) in cooperation with CARPE and Prometric, was undertaken to evaluate the extent to which credentialing testing programs in the US using remote proctoring were meeting NCCA Standards. Live remote proctoring (LRP) was defined by the Commission as remote proctoring that occurs with a person actively watching and monitoring a candidate during the time of the test administration and that provides safeguards for exam integrity and validity similar to in-person proctoring. Nine programs volunteered to participate and, in June 2020, submitted self-study reports, including a technical report, that compared outcomes based on LRP and other delivery methods (computer-based testing and paper-based testing). A subset of the NCCA Standards was used to evaluate each program’s self-study report. The report published in February 2021 can be accessed here.

_________________________________________________________________________________________________________

The use of cross-national achievement surveys for education policy reform in the European Union: Ireland

Project Leaders: Anne Looney, Michael O'Leary, Gerry Shiel & Darina Scully

This research contributed to a book volume that examined the range and salience of different international achievement surveys for policy design and reform within European countries: Germany, France, Italy, Netherlands, Sweden, Finland, Ireland, Poland, Estonia, and Slovakia. Collectively, the national profiles provide a critical analysis of the use (and misuses) of cross-national achievement surveys for monitoring educational outcomes and policy formation.

_________________________________________________________________________________________________________

Assessment of Transversal Skills in STEM

Project Partners: CARPE, NIDL (National Institute for Digital Learning), CASTeL (Centre for the Advancement of STEM Teaching and Learning), and representatives from education ministries in the following countries: Ireland, Austria, Cyprus, Belgium, Slovenia, Spain, Finland and Sweden

This was an ambitious DCU-led project that secured €2.34 million in Erasmus+ funding. Involving 8 EU countries (Ireland, Austria, Cyprus, Belgium, Slovenia, Spain, Finland and Sweden) and working with 120 schools across Europe, the partners devised, tested and scaled new digital assessments for STEM education that engaged and enhanced students’ transversal skills such as teamwork, communication and discipline-specific critical thinking. CARPE personnel worked with DCU colleagues to provide the theoretical and operational frameworks of the research (report #5). CARPE was also responsible for a review and synthesis of the research literature on STEM formative digital assessment (report #3) and for a report on virtual learning environments (VLEs) and digital tools for implementing formative assessment in STEM (report #4). These reports highlight how students can best be scaffolded towards the development of key STEM skills and how digital tools can capture the evidence for this and augment teaching practices to help provide constructive feedback on student progress. A paper outlining the workings of the project to date was published by the European Association of Distance Teaching Universities (The Netherlands) in October 2021.

_________________________________________________________________________________________________________

Interviews as a Selection Tool for Initial Teacher Education

Project Directors: Paula Lehane, Zita Lysaght & Michael O'Leary (CARPE)

Even when other factors such as student background and prior attainment are controlled for, having a ‘good’ teacher is one of the most important predictors of student success (Slater et al., 2009). Therefore, the goal of Initial Teacher Education (ITE) in Ireland should be to produce these ‘good’ teachers for employment in primary and post-primary schools. To achieve this, the admissions procedures for ITE programmes have a responsibility to select those applicants who are most suited to the profession and most likely to succeed in the required preparatory courses.

Many countries, including Ireland, now consider a range of admission criteria and selection tools when screening applicants for entry to ITE. Most Irish institutions use applicant performance on an interview as a selection tool for postgraduate ITE (Darmody & Smyth, 2016). However, research on the efficacy of interviews as a selection measure for ITE programmes is mixed. CARPE conducted an in-depth literature review that synthesises what research has found about the efficacy, or otherwise, of interviews as a selection mechanism for university based postgraduate programmes of teacher education. Based on this review, recommendations for future practice and policy were formulated. An article based on this research and published in 2021 can be accessed here.

_________________________________________________________________________________________________________

Irish primary and post-primary students’ performance at the upper levels of achievement in mathematics and science across national and international assessments (PhD Project)

Project Director: Vasiliki Pitsia (PhD Candidate); Project Supervisors: Michael O'Leary, Gerry Shiel, Zita Lysaght

High achievement at school is a strong predictor of students’ future professional and social success, and of a country’s future economic development and sustainability. High achievement in mathematics and science has been linked to building a knowledge society and driving sustainable economic growth, while also delivering social recovery. Therefore, it is important that educational systems promote and reward high achievement, especially the knowledge and skills that are deemed necessary for developing a smart economy and for living and working in the 21st century. While, on average, students in Ireland have often performed well on national and international assessments of mathematics and science, there is a notable absence of higher-achieving students (those who score at the highest proficiency levels). This study undertook an in-depth investigation of the nature of high achievement in mathematics and science in Ireland, using large-scale databases from the Programme for International Student Assessment (PISA), the Trends in International Mathematics and Science Study (TIMSS), Irish National Assessments and Irish state examinations (Junior and Leaving Certificates).

This PhD project contributes to this field of research by addressing the following research questions:

- What are the background characteristics of high achievers in mathematics and science in national and international assessments in Ireland and how do these characteristics differ from those of their counterparts in countries with average achievement similar to Ireland?

- Which factors at the student, home, class, and school level can predict high mathematics and science performance in national and international assessments in Ireland?

- Which subdomains of mathematics and science do high achievers in Ireland do well on, and which aspects do they struggle with? Are there factors at the student, home, class, and school level that may predict higher or lower performance of high achievers in Ireland in specific subdomains of mathematics and science?

_________________________________________________________________________________________________________

Multimedia Items in Technology-Based Assessments (PhD Project)

Project Director: Paula Lehane (PhD Candidate); Project Supervisors: Michael O'Leary, Mark Brown, Darina Scully

Using digital devices and technology to conduct assessments in educational settings has become more and more prevalent in recent times. Indeed, it now seems inevitable that future assessments in education will be administered using these media (OECD, 2013). Therefore, it is essential that educational researchers know how to design reliable and appropriate technology-based assessments (TBAs). However, no guidelines for the design of TBAs exist. Although TBAs have many medium-unique items, including multimedia objects such as animations and videos, their impact on test-taker performance and behaviour, particularly in relation to attentional allocation and information processing, has yet to be fully clarified.

This PhD project contributes to this growing field of research by addressing the following research questions:

- How do test-takers allocate attention in TBAs that include multimedia items?

- What is the impact of multimedia items on test-taker performance in TBAs?

- Is there a difference in test-taker performance and attentional allocation behaviours in TBAs involving different types of multimedia items?

- What are the meaningful relationships, patterns and clusters in performance data that can be used to assess and score problem-solving skills in TBAs?

_________________________________________________________________________________________________________

Test Specifications in Certification and Licensure Assessments

Project Directors: Michael O’Leary (CARPE), Lisa Abrams (Virginia Commonwealth University) & Katherine Reynolds (Boston College)

Specifying test content, often in the form of professional knowledge, skills and judgements (KSJs), prior to item development is fundamental to test quality in the field of certification and licensure. Alignment between test items and KSJs can serve as a critical piece of content-related validity evidence for a testing program. Alignment studies, common in high-stakes achievement testing, are less frequent in credentialing and licensure. This research explored the application of the Webb model (2006), a popular alignment approach in educational settings, for use in professional testing. The Webb model provides four indices of alignment: categorical concurrence, depth of knowledge consistency, range of knowledge correspondence and balance of representation. Together, these four indices can be taken as evidence of alignment between assessment items and KSJs, providing content validity evidence for a testing program. This form of validity evidence is particularly important, given that US test developers have a legal mandate to ensure test content is reflective of the knowledge, skills and judgements in a given profession. A paper outlining how a Webb alignment study might be carried out in a professional testing context and how such a study proceeds in practice was published in 2020 (available here).
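To give a sense of what two of these indices capture, the sketch below computes a balance-of-representation index and a range-of-knowledge proportion for a single content domain. It follows the usual statement of Webb's balance index, 1 minus half the summed absolute differences between 1/O and I_k/H (O = objectives with at least one item, I_k = items aligned to objective k, H = total aligned items). It is a minimal illustration, not the procedure used in the published paper; the function names, KSJ labels and item counts are hypothetical.

```python
def balance_of_representation(items_per_objective):
    """Balance index for one domain: 1.0 means items are spread evenly over the hit objectives.

    items_per_objective maps each objective (here, a KSJ) to the number of items aligned to it.
    Only objectives with at least one aligned item enter the index.
    """
    hits = [count for count in items_per_objective.values() if count > 0]
    if not hits:
        raise ValueError("No items are aligned to any objective.")
    n_objectives_hit = len(hits)  # O
    n_item_hits = sum(hits)       # H
    deviation = sum(abs(1 / n_objectives_hit - count / n_item_hits) for count in hits)
    return 1 - deviation / 2


def range_of_knowledge(items_per_objective):
    """Proportion of objectives in the domain that have at least one aligned item."""
    if not items_per_objective:
        raise ValueError("The domain has no objectives.")
    n_hit = sum(1 for count in items_per_objective.values() if count > 0)
    return n_hit / len(items_per_objective)


# Hypothetical item-to-KSJ alignment counts for two versions of a test.
even_form = {"KSJ-1": 3, "KSJ-2": 3, "KSJ-3": 3, "KSJ-4": 0}
skewed_form = {"KSJ-1": 7, "KSJ-2": 1, "KSJ-3": 1, "KSJ-4": 0}

even_balance = balance_of_representation(even_form)
skewed_balance = balance_of_representation(skewed_form)
coverage = range_of_knowledge(skewed_form)
print(round(even_balance, 3), round(skewed_balance, 3), coverage)
```

Both hypothetical forms cover the same three of four KSJs, so their range-of-knowledge proportion is identical; the balance index is what distinguishes the form that concentrates most of its items on a single KSJ.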

_________________________________________________________________________________________________________

Standardised Assessment in Reading and Mathematics Project

Project Directors: Michael O’Leary (CARPE), Zita Lysaght (DCU IoE), Deirbhile Nic Craith (INTO) & Darina Scully (CARPE)

Since the publication of the Assessment Guidelines for Primary Schools in 2007, there has been a stronger focus on assessment in primary schools. There are many forms of assessment, of which standardised testing is one. Standardised tests have gained in importance since 2012, when schools became obliged to forward the results of standardised tests to the Department of Education and Skills.

The purpose of this research was to explore the use of standardised tests in literacy and numeracy in primary schools in Ireland (ROI). Issues addressed include teachers’ understanding of standardised tests, how standardised tests are used formatively and diagnostically and the experiences of schools in reporting on the results of standardised tests. Data on teachers' professional development needs with respect to standardised testing were also gathered. Following a year-long development and piloting process, a questionnaire was distributed in hard copy and online to a random sample of 5,000 teachers in May 2017. Over 1,500 teachers returned completed questionnaires and the findings were released in June 2019, along with a number of policy recommendations to help address the needs and concerns of teachers regarding the use of standardised tests in primary schools. The report is available online from CARPE.

_________________________________________________________________________________________________________

Animations for Large Scale Testing Programmes Project

Project Director: Anastasios Karakolidis (PhD Candidate); Project Supervisors: Michael O’Leary and Darina Scully

Although technology provides a great range of opportunities for facilitating assessment, text is usually the main, if not the only, means used to explain the context, present the information, and communicate the question in a testing process. Written language is often a good fit for measuring simple knowledge-based constructs that can be clearly communicated via text (such as historical events). Nevertheless, when assessments provide test takers with plenty of sophisticated information in order to measure complex constructs, text may not be suitable for facilitating this process (Popp, Tuzinski, & Fetzer, 2016). Animations could be a pioneering way of presenting complex information that cannot be easily communicated by text/written language. However, research literature on the use of animations in assessment is currently scarce.

Anastasios' recently completed PhD project focused on (a) the development and validation of an animation-based assessment instrument, (b) the investigation of test-takers’ views about this instrument and (c) the examination of the extent to which this animated test provides a more valid assessment of test-takers’ knowledge, skills and abilities, compared to a parallel text-based test. His preliminary findings will be published after September 2019.

___________________________________________________________________________________________________________

Computer Based Examinations for Leaving Certificate Computer Science

Project Director: Paula Lehane (CARPE) with the National Council for Curriculum and Assessment (NCCA)

In line with the recommendations of the Digital Strategy for Schools (Department of Education and Skills [DES], 2015), a more formal approach to the study of technology and computing in second-level schools has been established thanks to the newly developed Computer Science (CS) curriculum for Leaving Certificate students. In September 2018, forty schools were selected to trial the implementation of this subject which will culminate in an ‘end-of-course computer-based examination’ in 2020 (National Council for Curriculum and Assessment [NCCA]). This examination will represent 70% of a student’s overall CS grade.

The use of a computer-based exam (CBE) for the assessment of CS students is a significant departure from tradition for the Leaving Certificate programme. All other subjects in the Leaving Certificate involving an end-of-course examination employ paper-based tests. The planned CBE for CS will represent the first of its kind in the Irish education system when it is introduced in 2020. The challenge of developing and delivering a high-stakes CBE is also magnified by the inherent difficulties associated with the evaluation of students’ knowledge and learning in computing courses (Kallia, 2018). Therefore, to ensure that the pending CS exam delivers a CBE in a responsible manner that preserves the fairness, validity, utility and credibility of the Leaving Certificate examination system, CARPE was commissioned by the NCCA to write a report outlining the factors pertaining to the design, development and deployment of this CBE that will need to be considered. The aim of this report is to guide the decisions of policy-makers and other relevant stakeholders. The report is available here.

________________________________________________________________________________________________________

Assessment in the re-developed Primary School Curriculum

Project Directors: Zita Lysaght, Darina Scully and Michael O'Leary (CARPE); Damian Murchan (TCD) & Gerry Shiel (ERC)

The National Council for Curriculum and Assessment (NCCA) is working with teachers and early childhood practitioners, school leaders, parents and children, management bodies, researchers and other stakeholders to develop a high-quality curriculum for the next 10-15 years. A discussion paper written by researchers from CARPE, TCD and the ERC highlights the importance of aligning assessment, learning and teaching in curricular reform and implementation. It is available to read here.

_________________________________________________________________________________________________________

The Leaving Certificate as Preparation for Third Level Education Project

Project Directors: Darina Scully & Michael O'Leary (CARPE)

The Leaving Certificate Examination (LCE) plays a crucial role in the process of how people are selected for third level education. However, the extent to which the Leaving Certificate Programme (LCP) as a whole (i.e. 5th and 6th year + the examination) provides students with a good preparation for their Third Level education is unclear. This project aimed to shed some light on this issue.

For those who sat the LCE in 2017, their experiences of 5th and 6th year and of preparing for and taking the LCE were still fresh in their minds as they started college in September 2017. They also had a good understanding of what was being required of them in college by March 2018. With this in mind, this project gathered data from first year students at DCU in April 2018 who were in a position to offer important insights that can be used to evaluate the LCP and its relevance to first year in college.

Findings from the study are available online from CARPE.

_________________________________________________________________________________________________________

State-of-the-art in Digital Technology-Based Assessment Project

Project Directors: Michael O'Leary, Darina Scully, Anastasios Karakolidis & Vasiliki Pitsia

Following an invitation to contribute to a special issue of the European Journal of Education, a peer-reviewed journal covering a broad spectrum of topics in education, CARPE completed an article on the state-of-the-art in digital technology-based assessment. The article spans advances in the automated scoring of constructed responses, the assessment of complex 21st century skills in large-scale assessments, and innovations involving high fidelity virtual reality simulations. An "early view" of the article was published online in April 2018, with the special issue (focused on the extent to which assessments are fit for their intended purposes) due to be published in June 2018.

_________________________________________________________________________________________________________

Learning Portfolios in Higher Education Project

Project Directors: Darina Scully, Michael O'Leary (CARPE) & Mark Brown (NIDL)

The ePortfolio is often lauded as a powerful pedagogical tool, and consequently, is rapidly becoming a central feature of contemporary education. Learning portfolios are a specific type of ePortfolio that may also include drafts and 'unpolished work', with the focus on both the process of compiling the portfolio and the finished product. It has been hypothesized that learning portfolios may be especially suited to the development and assessment of integrated, cross-curricular knowledge and generic skills/attributes (e.g. critical thinking, creativity, communication, emotional intelligence), as opposed to disciplinary knowledge in individual subject areas. This is of particular interest in higher education contexts, as universities and other third-level institutions face growing demands to bridge a perceived gap between what students learn and what is valued by employers.

In conjunction with the NIDL, CARPE have completed a comprehensive review examining the state of the field regarding learning portfolio use in third level education. Specifically, this review (i) evaluates the extent to which there is sufficient empirical support for the effectiveness of these tools, (ii) highlights potential challenges associated with their implementation on a university-wide basis and (iii) offers a series of recommendations with respect to ‘future-proofing’ the practice.

The review was formally launched in February 2018, and has garnered a great deal of attention in the intervening months. A roundtable to discuss possible research opportunities within DCU on the basis of the findings is due to be held in May 2018. In addition, selected findings will be disseminated at various international conferences, including EdMedia in June 2018 (Amsterdam, Netherlands) and the World Education Research Association (WERA) in August 2018 (Cape Town, South Africa). The review is also in the process of being adapted and translated into Chinese by Prof. Junhong Xiao of Shantou Radio and Television University, with the translated article to feature in an upcoming edition of the peer-reviewed journal Distance Education in China, and CARPE have recently acquired funding to support an additional translation into Spanish.

_________________________________________________________________________________________________________

Validity Evidence in Maintenance of Certification (MOC) Assessments

Project Director: Michael O’Leary (CARPE)

In the United States, Maintenance of Certification (MOC) was created in response to public health research in the 1990s revealing “significant variations in healthcare practices” among physicians, many of which lead to preventable negative patient outcomes (Chung, Clapham, & Lalonde, 2011, p. 3). A critical component of MOC is the cognitive exam, which until recently was typically administered by its respective medical specialty board in a secure environment near the end of a 10-year cycle.

Criticism of medical specialty boards’ Maintenance of Certification (MOC) 10-year exams has spurred the development of shorter, more frequent assessments. These assessment programs, such as MOCA Minute or Knowledge Check-In, aim to reduce examinee burden and provide better alignment to physician practice. But how can we tell if these forms of assessment are “better” than the traditional, 10-year exam? The answer is not straightforward; however, in this research a validity-based framework for addressing this question is proposed, emphasising validity evidence with respect to content, criteria, and consequences. The work was presented to Prometric clients in Baltimore in 2018.

_________________________________________________________________________________________________________

Situational Judgement Tests (SJTs) Project

Project Directors: Anastasios Karakolidis, Michael O'Leary, Darina Scully (CARPE) & Steve Williams (Prometric)

Originating in and most commonly associated with personnel selection, Situational Judgement Tests (SJTs) can be loosely defined as assessment instruments comprised of items that (i) present a job-related situation, and (ii) require respondents to select an appropriate behavioural response to that situation. Traditionally, SJTs are assumed to measure tacit, as opposed to declarative knowledge; or as Wagner and Sternberg (1985) put it: “intelligent performance in real-world pursuits… a kind of ‘street smarts’ that helps people cope successfully with problems, constraints and realities of day-to-day life.” Debate about the precise nature of the construct(s) underlying SJTs persists.

In recent years, the use of SJTs for selection, training and development purposes has increased rapidly; however, these instruments are still not well understood. Experts continually debate issues such as how SJTs should be developed, and how they should be scored. For example, although it is common to score SJTs based on test-takers' ability to identify the best response to each given situation, it has been argued (e.g. Stemler, Aggarwal & Nithyanand, 2016) that it may be more appropriate to distinguish between test-takers based on their ability to avoid the worst option.

In collaboration with our funders, Prometric, this project investigated the use of an SJT designed using the 'critical incident approach' for the training and development of Prometric employees. Specifically, the project sought to explore validity evidence for the SJT as a measure of successful job performance across two different keying approaches (consensus vs. expert judgement) and five different scoring approaches (match best, match worst, match total, mismatch penalty and avoid total). The findings suggest that scoring approaches focused on the ability to identify the worst response are associated with moderate criterion-related validity. Furthermore, they underline the psychometric difficulties associated with critical incident SJTs. These findings were presented at the European Association of Test Publishers (E-ATP) conference in September 2017 (Noordwijk, Netherlands).
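As a rough illustration of how scoring approaches of this kind can differ, the sketch below scores one test-taker under three of the named approaches. The exact scoring rules used in the project are not described here, so the definitions are assumptions made for illustration only: 'match best' counts items where the option chosen as best matches the keyed best option, 'match worst' does the same for the keyed worst option, and 'match total' is their sum. The key, the item names and the function name are all hypothetical.

```python
from typing import Dict

# Hypothetical answer key: for each item, the options keyed as best and worst.
KEY: Dict[str, Dict[str, str]] = {
    "item1": {"best": "A", "worst": "D"},
    "item2": {"best": "C", "worst": "B"},
}


def score_sjt(responses: Dict[str, Dict[str, str]],
              key: Dict[str, Dict[str, str]] = KEY) -> Dict[str, int]:
    """Score one test-taker under three assumed scoring rules."""
    match_best = 0
    match_worst = 0
    for item, keyed in key.items():
        picked = responses.get(item, {})  # unanswered items earn nothing
        if picked.get("best") == keyed["best"]:
            match_best += 1
        if picked.get("worst") == keyed["worst"]:
            match_worst += 1
    return {
        "match_best": match_best,
        "match_worst": match_worst,
        "match_total": match_best + match_worst,
    }


# A test-taker who identifies both best options but only one worst option.
example = {
    "item1": {"best": "A", "worst": "C"},
    "item2": {"best": "C", "worst": "B"},
}
print(score_sjt(example))
```

Under these assumed rules, two test-takers with the same 'match best' score can still be separated by how well they identify the worst option, which is the kind of distinction the argument above turns on.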

_________________________________________________________________________________________________________

Three vs. Four Option Multiple-Choice Items Project

Project Directors: Darina Scully, Michael O’Leary (CARPE) & Linda Waters (Prometric)

A strong body of research spanning 30+ years suggests that the optimal number of response options for a multiple-choice item is three (one key and two distractors). Three-option multiple-choice items require considerably less time to construct and to administer than their four- or five-option counterparts. Furthermore, they facilitate broader content coverage and greater reliability through the inclusion of additional items. Curiously, however, the overwhelming majority of test developers have paid little heed to these factors. Indeed, it is estimated that fewer than 1% of contemporary high-stakes assessments contain three-option items (Edwards, Arthur & Bruce, 2012).

This phenomenon has often been commented on, but never satisfactorily explained. It is likely that fears of guessing have played a role, given that chance selection of the correct response theoretically rises from 20% (five options) or 25% (four options) to 33% when the number of response options is reduced to three. However, distractor analyses across various contemporary high-stakes assessments reveal that more than 90% of four- and five-option items have at least one non-functioning distractor. That is, most of the time, when test-takers need to guess, they do not do so blindly; rather, they eliminate at least one implausible distractor and guess from the remaining options. As such, the majority of four- and five-option items effectively operate as three-option items.
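The arithmetic behind this argument can be made concrete with a short sketch. It assumes that a guess is uniform over whichever options the test-taker has not eliminated, which is a simplification; the function name is hypothetical.

```python
from fractions import Fraction


def guess_probability(n_options: int, n_eliminated: int = 0) -> Fraction:
    """Chance of guessing the key when guessing uniformly among the
    options left after eliminating implausible distractors."""
    remaining = n_options - n_eliminated
    if n_eliminated < 0 or remaining < 1:
        raise ValueError("at least one option (the key) must remain")
    return Fraction(1, remaining)


# Blind guessing on five-, four- and three-option items.
for n in (5, 4, 3):
    print(f"{n} options: {float(guess_probability(n)):.2f}")

# A four-option item with one non-functioning distractor offers the same
# guessing chance as a three-option item.
print(guess_probability(4, n_eliminated=1) == guess_probability(3))
```

On this simplified view, removing a distractor that nobody selects does not change the effective guessing chance, which is why the feared increase may not materialise in practice.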

In collaboration with our funders, Prometric, a study comparing item performance indices and distractor functioning (based on responses from more than 1,000 test candidates) across 20 stem-equivalent three- and four-option items from a high-stakes certification assessment was conducted. Findings from the project were disseminated at the Association of Test Publishers (ATP) Conference in March 2017 (Scottsdale, Arizona) and are being used to inform the development of future items for a number of Prometric's examinations.

_________________________________________________________________________________________________________

Higher-Order Thinking in Multiple-Choice Items (HOT MC Items) Project

Project Directors: Darina Scully & Michael O'Leary (CARPE)

The nature of assessment can exert a powerful influence on students’ learning behaviours. Indeed, students who experience assessments that require them to engage in higher-order thinking processes (i.e. those represented by higher levels of Bloom’s (1956) Taxonomy, such as application, analysis and synthesis) are more likely to adopt more meaningful, holistic approaches to future study, as opposed to engaging in mere surface-level or ‘rote-learning’ techniques (Leung, Mok & Wong, 2008). It is often assumed that multiple-choice items are incapable of assessing higher-order thinking; or indeed, anything beyond recall/recognition, given that the correct answer is provided amongst the response options. However, a more correct assertion may be that multiple-choice items measuring higher-order processes are simply rarely constructed. It is true that MC items, like all assessment formats, are associated with some limitations, but it may be possible to construct these items at higher levels, provided certain strategies are followed. MC items remain attractive to and frequently used by educators and test developers due to their objective and cost-efficient nature; as such, it is worthwhile putting time and effort into identifying and disseminating these strategies within the assessment community.

This project involved a comprehensive review of the extant literature that (a) has investigated the capacity of multiple-choice items to measure higher-order thinking or (b) has offered strategies or guidance on how to do so. An article based on this review was published in the peer-reviewed journal Practical Assessment, Research and Evaluation in May 2017, and the work has also contributed to the development of training and development materials for Prometric's test developers.

_________________________________________________________________________________________________________

Practice Tests in Large Scale Testing Programmes Project

Project Directors: Anastasios Karakolidis, Darina Scully & Michael O’Leary (CARPE)

This project was focused on developing a research brief reviewing the key findings arising from the literature regarding the efficacy of practice tests. This brief was published in the summer 2017 edition of Clear Exam Review, and the findings are also being used to inform Prometric's practices surrounding the development and provision of practice test materials.

_________________________________________________________________________________________________________

Feedback in Large Scale Testing Programmes Project

Project Directors: Michael O'Leary & Darina Scully (CARPE)

In recent years there has been increasing pressure on test developers to provide diagnostic information that can assist unsuccessful test takers to improve future performance and assist academic and training institutions in evaluating the success of their programmes and identifying areas that may need to be modified (Haberman & Sinharay, 2010; Haladyna & Kramer, 2004). This growing demand for diagnostic feedback is also evident in the Standards for Educational and Psychological Testing, which states that “candidates who fail may profit from information about the areas in which their performance was especially weak” (AERA, APA & NCME, 2014, p. 176). Test developers face a substantial challenge in attempting to meet this demand, whilst simultaneously upholding their ethical responsibility – also outlined in the Standards – to ensure that any test data that are reported and shared with stakeholders, or used to make educational, certification or licensure decisions are accurate, reliable and valid.

CARPE have conducted a review of the literature on the issues involved in reporting test sub-scores, including the identification of a number of approaches (e.g. scale anchoring, level descriptors and graphical methods) that can be taken when reporting in large scale testing contexts. These findings are being used to inform Prometric's practices surrounding the provision of feedback to unsuccessful test candidates.

_________________________________________________________________________________________________________

Partial Credit for Multiple Choice Items Project

Project Directors: Darina Scully & Michael O’Leary (CARPE)

Multiple-choice test developers have typically shown a strong preference for the use of the single-best answer response format and number-correct scoring. Despite this, some measurement experts have expressed dissatisfaction with these methods, on the basis that they assume a sharp dichotomy between knowledge and lack of knowledge. That is, the entire model fails to take into account the varying degrees of partial knowledge a test-taker may possess on an item-by-item basis. This is regrettable, as information regarding test-takers’ partial knowledge levels may contribute significantly to the estimation of true proficiency levels (DeAyala, 1992).

In response to this criticism, a number of alternative testing models that facilitate the allocation of partial credit have been proposed (e.g. Ben-Simon, Budescu & Nevo, 1997; Frary, 1989; Lau, Lau, Hong & Usop, 2011). Their exact nature varies considerably, but all share the aim of maximizing the information efficiency of individual items, and increasing precision of measurement. CARPE have conducted a literature review focusing on three approaches that facilitate the allocation of partial credit; namely: option-weighted scoring, confidence-weighted responding, and the liberal multiple-choice item format. To date, findings regarding the application of these approaches have been complex and equivocal, with no one method emerging as uniformly superior. Ultimately, whether or not it is worth pursuing these strategies depends on a combination of multiple factors, such as the overall purpose of the assessment, the overall difficulty (pass rate) of the test, the cognitive complexity of the items, and the particular psychometric properties that are most valued by the test developer.
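To show what the allocation of partial credit can look like in practice, the sketch below contrasts number-correct scoring with one simple form of option-weighted scoring. It is an illustration under stated assumptions rather than any of the cited models: the items, the key, the option weights and the function names are all hypothetical, and in a real programme the weights would be derived from expert judgement or empirical data.

```python
from typing import Dict

# Hypothetical answer key and option weights. The key earns full credit;
# some distractors are assumed to reflect partial knowledge.
KEY: Dict[str, str] = {"q1": "A", "q2": "C"}
WEIGHTS: Dict[str, Dict[str, float]] = {
    "q1": {"A": 1.0, "B": 0.5, "C": 0.0, "D": 0.0},
    "q2": {"A": 0.0, "B": 0.0, "C": 1.0, "D": 0.25},
}


def number_correct(responses: Dict[str, str]) -> int:
    """Dichotomous scoring: one point per item answered with the key."""
    return sum(1 for item, choice in responses.items() if KEY.get(item) == choice)


def option_weighted(responses: Dict[str, str]) -> float:
    """Partial-credit scoring: each chosen option earns its assumed weight."""
    return sum(WEIGHTS.get(item, {}).get(choice, 0.0)
               for item, choice in responses.items())


# This test-taker misses q1 but picks a partially correct distractor.
answers = {"q1": "B", "q2": "C"}
print(number_correct(answers), option_weighted(answers))
```

Under number-correct scoring this test-taker is indistinguishable from one who chose a wholly wrong option on q1, whereas the weighted score retains that information.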