How are EER judgements made?

Making judgements of TEO performance and quality is a structured process involving several stages.

5.1 The judgement process

Wrapping up enquiry

The enquiry phase ends when all relevant information has been received and considered. The evaluation team will at this point have formed a complete evidential file, which either includes the key information required or formally notes the source of that material.

In some cases, crucial material may still be missing at this late stage. NZQA will advise the TEO of any resulting delay to the judgement phase and indicate next steps and timelines.
If, while completing the enquiry phase, NZQA determines that some information received from the TEO is deficient in any respect, the TEO will be advised. Depending on the circumstances, the TEO might be given the opportunity to provide further information.

Alternatively, NZQA might decide that the information gap or weakness is itself evidence, and close off the evidential file.


Once the evidential file has been completed, the evaluators begin the judgement phase, or synthesis. Synthesis aims to make sense of a wide range of information in order to reach a set of value judgements about the TEO.

The evaluation team will set aside sufficient time to consider all the relevant evidence and arrive at defensible judgements that form the basis for the EER report.

In many instances, this discussion will occur on the TEO’s premises after the conclusion of the enquiry phase. For more complex EERs, however, or when key information arrives later than expected, NZQA will schedule a separate synthesis meeting in its own offices.

The meeting falls into two parts:

  • Identifying and interpreting key evidence.
  • Reaching judgement.

Free and frank discussion

Synthesis meetings are conducted in the spirit of free and frank discussion. Usually, only those on the assigned evaluation team will be allowed to participate.

The evaluation team will note its emerging findings in a document, or set of documents, which will be added to the evidential file.

This record will note the ratings assigned and the key factors leading to those ratings. Any important points of internal disagreement as to findings or ratings will be noted.

The synthesis is an essential formative output for the EER process overall. It does not, however, bind the Lead Evaluator to the judgements arrived at during the team meeting. In the course of writing the report, the Lead Evaluator may revise one or more of the ratings.

Whenever significant changes of this kind are proposed, the Lead Evaluator will re-consult with fellow team members and consider their feedback before finalising the draft.

Consensus is not always reached or required. The final decision rests with the Lead Evaluator, who presents the resulting report as a recommendation to the Manager, Evaluation.

Steps to judgement

First, the evaluators review the range of evidence collected on the case to date. Since this information will directly affect the judgements, every effort is made to ensure that this evidence is complete and is being correctly interpreted.

Next, the evaluators will discuss the emerging findings. What are the most important things we know about the performance of the organisation?

Two kinds of judgements

Judgements will then be drafted. In every EER, these take two forms: ratings (which are attached to focus areas and key evaluation questions) and assessments of the overall quality of the TEO (i.e. statements of confidence).

  • For the ratings, the scale runs from Excellent to Poor.
  • For the statements of confidence, the scale runs from Highly Confident to Not Confident.

In most EERs, decisions on ratings will precede those on statements of confidence, because the latter are dependent on the former.

That is, statements of confidence indicate what NZQA considers the most important aspects of a TEO’s performance and its capability in self-assessment. As such, in reaching statements of confidence, evaluators will carefully select from the overall findings and ratings, highlighting some aspects and remaining silent on others.

By its nature, EER judgement is complex and responsive to the context of the TEO under review. The evaluative sequence indicated above may be varied or revised as circumstances require.

5.2 The judgement guidelines

Evaluating the range of information

During the judgement phase, the evaluators will look at the full range of information surfaced by their enquiry and consider its value.

The information considered can be a single fact, a set of related facts, or anything else that relates to the performance of the TEO, either in part or in its entirety.

Once EER information has been considered, grouped and valued, it becomes a finding.

To make a finding, the evaluators will ask themselves three main questions:

  • What is its materiality?
  • How representative is it?
  • How should it be weighted (and rated)?

Final judgements are reserved until all the relevant facts are known, properly understood and correctly valued.


Materiality refers to the relative importance or significance of any (emergent) finding.

Decisions on materiality depend on several factors. These commonly include:

  • the TEO context (kaupapa, sector, industry, programmes, stakeholders)
  • the size and impact of the finding
  • how effectively the TEO has identified, analysed and managed the finding.

For example, consider the materiality of high programme completion rates. We know that high completion rates often tell us something about the quality of a TEO. But what exactly do they tell us? That depends.

Materiality is always relative. What is significant for one TEO or programme may not be significant for another.

Materiality 1 – completion rates

In a two-day first aid skills course, a completion rate of over 90 per cent is common. Evidence that a first aid skills provider regularly reports 90 per cent completion rates suggests only that, in this respect, the provider is in line with industry norms. The materiality, i.e. the relative significance of this data set, may therefore be slight.

In contrast, a 90 per cent completion rate in a degree programme would be viewed very differently. International research indicates that the higher the level of a programme, the greater the non-completion rate tends to be. The materiality of an outlier result like this will therefore be much greater.

For the degree programme, NZQA would need to seek further explanation. Ideally, this explanation would come from the TEO itself, through its self-assessment. The high completion rate could be attributable to a range of causes, such as:

  • exceptional teaching and management (which would reflect well on the TEO)
  • lax assessment practices (which would reflect poorly on the TEO)
  • an unusually high-performing cohort (which might not reflect at all on the TEO).

Materiality 2 – breach of Code of Practice

A large provider of international education has committed a breach of the Code of Practice. The breach itself relates to missing evidence on a single student file and may be a one-off error. On the other hand, some of the students at the provider are under the age of 18.

The materiality in this case is not immediately clear. The evaluators will need to balance the size of the problem against its possible impact on ‘vulnerable learners’. And they will need to determine how well the TEO has self-managed the problem.


Representativeness refers to the relative ‘normality’ of a finding. That is, how much does a particular finding tell us about the TEO as a whole?

If the finding is very similar to other findings made in the same EER about the same TEO, then it can be considered highly representative. At the other extreme, the finding may be a complete outlier.

If most findings in an EER are representative, you can more easily generalise from them. A cluster of closely related findings will very often lead to a single, consistent EER outcome.

Findings not representative

If some findings are not representative, then it becomes more likely that the overall EER results will be mixed.

For example, if a large TEO has a high-performing programme, NZQA will look to see what inferences can reasonably be drawn. Can the same qualities be seen in other programmes within the same TEO? If so, why? If not, why not?

A second (negative) example: a TEO runs five programmes, three in hospitality, two in information technology (IT). Two of the hospitality programmes, and one of the IT, are selected as focus areas. All the programmes in scope reveal significant weaknesses in internal moderation. From this, NZQA might reasonably infer that moderation is a systemic problem across the TEO. This finding is therefore representative.

Alternatively, if only the hospitality programmes show moderation failings, then it would be unreasonable to assume the presence of moderation weaknesses in the IT programme that was not in scope. In this instance, the finding of moderation flaws might be considered as representative of the hospitality programmes – but not of the TEO as a whole.


When the evaluators have confirmed the findings, they can begin the ratings process.

At this point, the evaluators will already have reached preliminary views on:

  • the key findings
  • the materiality of these findings
  • the representativeness of these findings.

As an intermediate step, before ratings are assigned, the evaluators will consider how the findings will be ‘weighted’.

In most EERs, some of the findings will be more significant than others, because of their size, reach, impact, risk or value.

Whenever weighting influences an EER judgement, this will be made clear in the wording (and ratings) of the EER report.

Weightier finding, more impact

In the course of their discussion, evaluators will identify which findings should carry the most weight. The greater the weight a finding carries, the greater its impact on the ratings (and statements of confidence).

For example: a medium-sized PTE offers five programmes. Four are highly performing, one is poorly performing. Trainees are distributed equally across the programmes. In terms of overall numbers, then, the TEO would seem to be doing well.

However, there have been several (upheld) complaints from trainees in the ‘poor’ programme. Moderation results for that programme have been consistently bad for the past three years. The Tertiary Education Commission has reduced funding to this programme for a breach of the Investment Rules.

In this (imaginary) scenario, NZQA might reasonably place a disproportionate weight on this single programme’s under-performance in the overall EER outcome.

The act of judgement

EER looks for the best fit between the findings and the rating rubrics. Each rating comes with a descriptor, which indicates the kinds of features usually found at that level. Taken together, the rating descriptors guide the evaluators towards choosing one ‘level’ of quality rather than another.

The text of the relevant section of the report should always give the reader a common-sense explanation of why a rating of Good (rather than, say, Marginal) has been assigned.

Testing emergent ratings

When ratings are first formed, they are considered emergent. They need to be checked and tested. To do so, NZQA evaluators will ask themselves questions such as:

  • Is there any other evidence which might confirm/disconfirm these findings?
  • (If a process or outcome) is it reproducible/sustainable?
  • (If a flaw or gap) is this a systemic weakness or an occasional error?
  • (If a flaw or gap) has this been self-identified/self-managed by the TEO?

Until the EER report is finalised for publication, NZQA remains open to changing any of its ratings. New evidence may come to light. Or an inadvertent error of fact or interpretation may have occurred.

NZQA encourages TEOs to question the ratings whenever they seem unclear or unwarranted. The feedback process is described later in this guide.

Statements of confidence

NZQA expresses two statements of confidence about every TEO: one for educational performance and one for capability in self-assessment.

At some level, the statements are interdependent. NZQA assumes that every well-performing TEO is underpinned by authentic, ongoing self-assessment. A TEO maintains its quality of delivery through strong internal processes. Without strong performance outcomes, self-assessment is meaningless; without strong self-assessment, performance cannot be sustained.

Yet within our reports, the two statements are separated. Each carries specific meaning.

For example: a PTE has poor course completion results two years in a row. The result in the second year is, however, noticeably better than in the first, because an internal review between the moderation rounds improved the TEO’s assessment practices.

In this instance, NZQA might decide to rate capability in self-assessment higher than performance. Performance is still delivering weak results, but the TEO’s self-assessment has shown its resilience through proactive engagement and some evidence of process improvement.

NZQA is reluctant to invest a high degree of confidence in a TEO without compelling, comprehensive evidence of:

  • high-quality processes and resources
  • a track record of excellence
  • a sustainable operating model.

How ratings and statements of confidence differ

Ratings evaluate the TEO’s performance and its capability in self-assessment from the time of the previous EER to the present. They tend to be weighted towards more recent evidence (though, as always, context matters). Ratings are based on the present moment, looking back.

Statements of confidence summarise the most significant findings in the report. They reach a comprehensive judgement on the quality of the performance and capability in self-assessment of the TEO as a whole. They conclude by estimating the probable future quality of the TEO. As such, they are not an executive summary, but a consolidated view of ‘what matters most’ in the TEO as a whole. Statements of confidence are based on the present moment, looking forward.

Links to other sections of the guide

  • What is external evaluation and review?
  • How does EER begin?
  • How are EERs planned?
  • How does EER enquiry occur?
  • How are EER findings reported?
  • What happens next?
  • The process of external evaluation and review

