Kirkpatrick’s system of evaluation has been widely used in the area of professional training for over 40 years. This system consists of four steps or levels of increasing complexity. Kirkpatrick’s four levels can be summarized as follows:
Level 1 — Reaction
This level represents the feelings of the learners about the training received. Typical evaluation forms present a familiar series of questions in which the learner is asked to rate various aspects of the training on some kind of quantitative scale. While most questions are given in an objective form, some space is generally allowed for additional comments on matters not addressed by the other questions.
Kirkpatrick emphasizes that this level of evaluation "does not include a measurement of any learning that takes place" (Kirkpatrick, 1976, p. 18-2).
Level 2 — Learning
Kirkpatrick defines learning in this context as "the principles, facts, and skills which were understood and absorbed by the conferees" (Kirkpatrick, 1976, p. 18-11). In other words, the learning he describes corresponds to Bloom’s (1956) Knowledge category and subcategories. Kirkpatrick recommends that this level of evaluation include before-and-after testing, as well as a control group when possible, in order to assess the actual impact of the training. He also recommends the use of objective questions to provide quantifiable data, which can then be subjected to statistical analysis.
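The before-and-after design with a control group can be sketched in a few lines of Python. This is only an illustrative sketch, not a procedure taken from Kirkpatrick: the function names `mean_gain` and `training_effect` are hypothetical, the scores are made up for the example, and a real evaluation would add a significance test on top of the raw difference.

```python
from statistics import mean


def mean_gain(pre, post):
    """Average per-learner change from the pre-test to the post-test."""
    if len(pre) != len(post):
        raise ValueError("pre and post must hold one score per learner")
    return mean(after - before for before, after in zip(pre, post))


def training_effect(trained_pre, trained_post, control_pre, control_post):
    """Gain of the trained group minus gain of the control group.

    Subtracting the control group's gain removes improvement that would
    have happened without the training (practice effects, time on the job).
    """
    return mean_gain(trained_pre, trained_post) - mean_gain(control_pre, control_post)


# Made-up test scores (percent correct), for illustration only.
trained_pre = [50, 60, 55, 65]
trained_post = [70, 80, 75, 85]
control_pre = [52, 58, 56, 64]
control_post = [57, 63, 61, 69]

effect = training_effect(trained_pre, trained_post, control_pre, control_post)
print(f"Estimated training effect: {effect} points")
```

With these invented numbers, the trained group gains 20 points and the control group gains 5, so the estimate attributable to the training is 15 points.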
Level 3 — Behavior (also called Transfer)
At this evaluation level, the focus is on behavioral changes that are brought about by the learning which has presumably taken place. Kirkpatrick saw this as a way to quantify the common knowledge that there is often "a big difference between knowing principles and techniques and using them" (Kirkpatrick, 1976, p. 18-16). Here again, before-and-after testing, a control group, and statistical analysis are recommended. In addition, he suggests appraisal by persons other than the individual being evaluated to improve the objectivity of the results. He also recommends a post-training appraisal three months or more after the training has been completed in order to assess the lasting effect of behavioral changes resulting from the training.
[Figure: Kirkpatrick’s four evaluation levels]
Level 4 — Results
This is the vaguest of Kirkpatrick’s levels. The desired results can vary greatly from one type of training program to another, and therefore the testing used to determine the degree to which those results have been met varies as well. For this reason, in the context of job-related training, Kirkpatrick suggests that evaluations focus on the first three levels: "From an evaluation standpoint, it would be best to evaluate training programs directly in terms of results desired. There are, however, so many complicating factors that it is extremely difficult, if not impossible, to evaluate certain kinds of programs in terms of results. Therefore, it is recommended that training directors evaluate in terms of reaction, learning, and behavior" (Kirkpatrick, 1976, p. 18-21).
Three assumptions associated with Kirkpatrick’s system are "implicit in the minds of researchers and trainers, although to all appearances unintended by Kirkpatrick himself when the model was proposed" (Alliger & Janak, 1989, p. 332). These assumptions are:
1. Levels are hierarchical, with each providing more information than the last,
2. There is a causal relationship between each successive level, and
3. There is a positive correlation between levels.
Alliger and Janak challenge the validity of these assumptions with a detailed analysis of the available literature.
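The third assumption is the easiest to test empirically: given paired measurements from two levels (for example, a reaction rating and a learning gain for each learner), one can compute a correlation coefficient. The sketch below uses the standard Pearson formula; the function name `pearson_r` and the sample values are hypothetical and are not drawn from Alliger and Janak's analysis.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two paired samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two paired samples with at least two values each")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sample is constant")
    return covariation / (spread_x * spread_y)


# Made-up per-learner data, for illustration only.
reaction_ratings = [4.5, 3.0, 4.0, 2.5, 5.0]  # Level 1, on a 1-5 scale
learning_gains = [12, 15, 9, 14, 10]          # Level 2, points gained on a test

r = pearson_r(reaction_ratings, learning_gains)
print(f"Correlation between reaction and learning: {r:.2f}")
```

A value near +1 would be consistent with the assumed positive correlation, while a value near zero or below would not. A correlation alone says nothing about the second assumption, since it cannot establish that one level causes the next.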
Hard data and soft data
It can be useful to divide results into categories of "hard data" and "soft data" (Phillips, 1996). Hard data, the kind traditionally used to evaluate performance, include things such as output (units produced, tasks completed, etc.), quality (waste, defects, etc.), time (project completion time, overtime, etc.), and cost (overhead, variable costs, etc.). Soft data are more subjective and harder to translate into a monetary value. They include work habits (punctuality, safety, etc.), work climate (grievances, job satisfaction, etc.), attitudes (loyalty, perception of responsibilities, etc.), new skills (decisions made, conflicts avoided, etc.), development (promotions, performance ratings, etc.), and initiative (implementation of new ideas, employee suggestions, etc.).
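When results are recorded for analysis, the two groups can be kept apart with a simple lookup table. The sketch below encodes only the categories and examples listed above; the names `HARD_DATA`, `SOFT_DATA`, and `data_kind` are hypothetical and are not part of Phillips's terminology.

```python
# Categories and examples as listed in the text (after Phillips, 1996).
HARD_DATA = {
    "output": ["units produced", "tasks completed"],
    "quality": ["waste", "defects"],
    "time": ["project completion time", "overtime"],
    "cost": ["overhead", "variable costs"],
}

SOFT_DATA = {
    "work habits": ["punctuality", "safety"],
    "work climate": ["grievances", "job satisfaction"],
    "attitudes": ["loyalty", "perception of responsibilities"],
    "new skills": ["decisions made", "conflicts avoided"],
    "development": ["promotions", "performance ratings"],
    "initiative": ["implementation of new ideas", "employee suggestions"],
}


def data_kind(category):
    """Return "hard" or "soft" for a result category named in the text."""
    key = category.strip().lower()
    if key in HARD_DATA:
        return "hard"
    if key in SOFT_DATA:
        return "soft"
    raise KeyError(f"unknown result category: {category!r}")


print(data_kind("Quality"))
print(data_kind("work climate"))
```

Tagging each measure this way makes it easy to report the directly quantifiable results separately from those that first need a judgment about their monetary value.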