
Validity

  • Definition
  • Empirical validity
  • Theoretical or construct validity
  • Criterion validity
  • Synthetic validity
  • References and more reading

Definition

A test is valid if it measures what it is supposed to measure. Assessing validity focuses on the content of the test, its quality, its pertinence, and what can be inferred from the scores. The analysis of validity follows that of reliability, which is necessary but not sufficient for validity. Three categories of validity need to be considered: (1) empirical validity, which includes content validity and face validity; (2) theoretical or construct validity, which includes contrasted groups, convergent validity, developmental changes, correlation with other versions of the test, internal consistency, inter-correlation, factorial analysis and structural equation modeling; and (3) criterion validity, which includes predictive validity and concurrent validity. Beyond these three groups there are other forms of validity, such as synthetic validity, which includes meta-analysis.

Empirical validity

This is the form of validity that appeared first. The goal of empirical validity is to assess whether a test covers everything it should cover for the people who take it. This procedure is appropriate for tests of knowledge and skills, but not for tests of personality (or aptitudes). Unlike knowledge tests, personality tests cannot be built on uniform content. Answers to the items can vary considerably between participants: the same test will yield different “good” answers for different persons. Under these circumstances it is difficult to validate the dimensions measured by the test just by examining its content. The two procedures described below therefore apply mainly to knowledge tests.

Content validity

The focus of content validity is whether the items of the test represent a good sample of all the items that could be presented to the examinees. Content validity is evaluated by a group of experts who score each item. The expert judgments are then aggregated into ratings, and only the most representative items are kept.
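
The aggregation of expert judgments can be sketched as follows. This is only an illustration: the 1-to-4 relevance scale, the cut-off of 3.0, the item labels and the function name select_items are all assumptions made for the example, not part of the procedure described above.

```python
from statistics import mean

def select_items(ratings, threshold=3.0):
    """Keep the items whose mean expert rating reaches `threshold`.

    ratings: dict mapping an item label to the list of expert scores.
    The threshold is a hypothetical cut-off chosen for this example.
    """
    averages = {item: mean(scores) for item, scores in ratings.items()}
    return {item: avg for item, avg in averages.items() if avg >= threshold}

# Hypothetical ratings from three experts on a 1-4 relevance scale.
ratings = {
    "item_1": [4, 4, 3],
    "item_2": [2, 1, 2],
    "item_3": [3, 4, 4],
}
kept = select_items(ratings)
print(sorted(kept))  # labels of the items judged representative
```

In practice the cut-off and the rating scale would be fixed by the test developers before the experts score the items.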

Face validity

The focus of face validity is whether the technique “looks” valid to the examinees who take it, to the persons who use it, or to other observers. It does not refer to what the test actually measures but to what it superficially appears to measure. To assess face validity, the persons who take the test are asked whether the content seems related to the purpose of the measurement. Face validity influences the attitudes of the persons toward the test. If the test has good face validity, one can expect cooperation from the persons when taking the test or making use of its results.

Theoretical or construct validity

Theoretical or construct validation procedures examine whether the measured data really correspond to the theory of the authors. These procedures started in the 1950s with the work of Cronbach and Meehl, who made explicit the importance of formulating hypotheses and theories in psychology that need to be rationally and thoroughly checked. The labels of a test might suggest constructs and contents that are in fact different from what is expected. Or constructs with different labels might in fact have very similar content. Theoretical validation is a way to bring these problems to light. Several commonly used procedures are presented below.

Contrasted groups

The overall results of the test applied to different groups are examined, on the hypothesis that these groups should obtain different results. This validation procedure is particularly appropriate for personality tests in organizations, where the behaviors of people vary across positions. For instance, the results of a sales force should be consistently different from those of the accounting department. If the results of the first group are significantly different from those of the second, the test is able to contrast them.
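
One possible way to compare two groups is a two-sample test on their mean scores. The sketch below computes Welch's t statistic, which is a choice made for this example (the text does not name a specific test); the scores and the function name welch_t are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for the difference between two group means."""
    var_a = variance(group_a)  # sample variance (n - 1 denominator)
    var_b = variance(group_b)
    standard_error = sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical scores of two departments on one dimension of the test.
sales = [62, 58, 65, 70, 61, 66]
accounting = [48, 52, 45, 50, 55, 47]

t = welch_t(sales, accounting)
print(f"t = {t:.2f}")
```

A large absolute value of t suggests that the test contrasts the two groups. Deciding whether the difference is significant additionally requires the t distribution with the appropriate degrees of freedom, which this sketch leaves out.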

Convergent validity

With this type of validation, correlation coefficients are computed between two techniques. If the dimensions on each side are correlated, the two techniques can be said to measure similar dimensions and to display convergent validity. If, however, there is no relation between the dimensions, this implies that they measure different constructs.
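
As a minimal sketch, assuming the correlation coefficient is the Pearson product-moment correlation and that the same persons took both techniques, the computation could look like this. The scores and the names pearson_r, technique_a and technique_b are invented for the example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # Note: a constant list gives a zero sum of squares and a division error.
    return covariation / sqrt(ss_x * ss_y)

# Hypothetical scores of six persons on a similar dimension of two techniques.
technique_a = [12, 15, 9, 20, 17, 11]
technique_b = [14, 16, 10, 19, 18, 10]

r = pearson_r(technique_a, technique_b)
print(f"r = {r:.2f}")
```

A coefficient close to 1 would support convergent validity for this pair of dimensions; a coefficient close to 0 would suggest that they measure different constructs.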

Developmental changes

This validity procedure is used with children of different ages. It validates the chronological development of intelligence. It is not used with adults or with personality tests.

Correlation with other versions of the test

This procedure is used to assess the incremental validity of different versions of a test. The different techniques or tests are taken by the same persons and correlations between the measurements are computed. Moderate correlations are expected: correlations that are too high would mean that the tests do not have sufficient incremental validity, since one version adds little to what the other already measures.

Internal consistency

The focus is on the items that constitute the technique. One wants to check that each item measures the dimension to which it is attached and that it differentiates persons on that dimension. A correlation is computed between each item and the test total score. This procedure is frequently used for personality tests; it is close to the reliability procedure that measures homogeneity.

Inter-correlation

This procedure calculates the correlations between all dimensions. The inter-correlation results are presented in matrices. A high correlation between two dimensions means that they measure about the same construct. Conversely, a correlation close to zero shows that the two dimensions represent different constructs.

Factorial analysis

This technique is used to identify core personality dimensions within a test or within a battery of tests. Exploratory Factor Analysis (EFA) is applied to detect clusters among a large number of adjectives in a lexicon and to arrive at three to seven “big” dimensions. When applied to validate the dimensions of a test, it is called Confirmatory Factor Analysis (CFA). Factor analysis might reveal a strong relation between some domains and suggest some simplifications.

Structural equation modeling

This procedure is the most recent and sophisticated of the validity techniques. It makes it possible to take into consideration all variables, the relations between them, and their reliability.

Criterion validity

Criterion-related validity measures whether a test effectively predicts the performance of a person in a given activity, as captured by a performance indicator (the criterion). To measure criterion validity, a correlation coefficient is calculated between the test score and the criterion score. Two forms of criterion validity need to be considered.
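
As one concrete reading of this coefficient, assuming it is the Pearson product-moment correlation (the text only speaks of a correlation coefficient), the validity coefficient between the test scores x_i and the criterion scores y_i of n persons can be written as below. The symbols are notation introduced for this illustration.

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Here \bar{x} and \bar{y} are the mean test score and the mean criterion score. A value near 0 indicates that the test says little about the criterion; the further the value is from 0, the stronger the linear relation.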

The first is predictive criterion validity, or predictive validity. The test is administered before the criterion is measured. In this situation, a prognosis is made about the performance of a person in a future situation, given a prior measurement by a test. Predictive validity is measured by the correlation coefficient between the test results and the values of the criterion once the person is in the position. This procedure is useful in selection, counseling and personal development.

The second is concurrent criterion validity. It generally applies to people who are already in a position and for whom the two measurements are taken simultaneously. This procedure is useful when analyzing an existing situation instead of predicting a result. It makes it possible to construct a criterion against which a predictive validity study can then be conducted.

The criterion can be of different kinds. For intelligence tests, it is often examination results. For aptitude tests, it can be the results of a specific training. In an organizational context, measures of success, or non-performance criteria such as absenteeism, are more often used. A contrasted-group procedure makes it possible to identify the criteria that emerge from different groups of people who have shown consistent performance in the organization. This last method is often used with personality tests. Another method is to obtain the performance results from the organization, most of the time from the manager. Of great concern is how criteria emerge from the organization, taking into consideration the various factors of a work situation.

Criterion validity studies are the most widely used procedures and also the most practical ones for validating the use of a test in a professional situation. Several precautions need to be considered, such as the homogeneity of the analyzed group and the linearity of the relation between the personality dimension and the criterion.

Synthetic validity

The synthetic procedures emerged in the late 1970s (Schmidt and Hunter, 1977) to show that aptitude tests and General Mental Ability (GMA) tests could have higher validity than individual studies suggested. For similar positions, insufficient or variable results from single studies do not allow generalization. By aggregating analyses, and thus working with larger samples, meta-analyses yield higher validity coefficients and more confidence in generalizing the results.
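
The aggregation idea can be illustrated with a bare-bones sketch: the validity coefficients of several studies are averaged with sample-size weights, and the observed variation between studies is compared with the variation expected from sampling error alone. The study figures and the function name bare_bones_meta are hypothetical, and the sketch deliberately omits the corrections (for example for criterion unreliability or range restriction) that full validity generalization procedures apply.

```python
def bare_bones_meta(studies):
    """Aggregate (sample_size, observed_r) pairs from several studies.

    Returns the sample-size-weighted mean correlation, the weighted
    observed variance of the correlations, and the variance expected
    from sampling error alone.
    """
    k = len(studies)
    total_n = sum(n for n, _ in studies)
    mean_r = sum(n * r for n, r in studies) / total_n
    observed_var = sum(n * (r - mean_r) ** 2 for n, r in studies) / total_n
    mean_n = total_n / k
    sampling_var = (1 - mean_r ** 2) ** 2 / (mean_n - 1)
    return mean_r, observed_var, sampling_var

# Hypothetical validity studies: (number of persons, observed correlation).
studies = [(50, 0.10), (80, 0.35), (120, 0.25), (60, 0.40)]

mean_r, observed_var, sampling_var = bare_bones_meta(studies)
print(f"weighted mean r   = {mean_r:.3f}")
print(f"observed variance = {observed_var:.4f}")
print(f"sampling variance = {sampling_var:.4f}")
```

When the observed variance is not much larger than the sampling variance, the differences between studies can largely be attributed to small samples, which is the argument for generalizing the aggregated result.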

It must be said that meta-analyses are still recent techniques and that their limits need to be better understood. Notably, the environmental variables must be better known, as well as the performance criteria selected by the organization.

References and more reading

On the importance of construct validation procedures:

  • Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281-302.

On convergent and divergent validity see:

  • Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.

More about structural equation modeling:

  • Campbell, J. P. (1990a). Modeling the performance prediction problem in industrial and organizational psychology. In M. D. Dunnette & L. M. Hough (Eds.), Handbook of Industrial and Organizational Psychology (2nd ed., Vol. 1, pp. 687-732). Palo Alto, CA: Consulting Psychologists Press.

About the emergence of performance criteria in an organization see:

  • McCormick, E. J. (1979). Job analysis: Methods and applications. New York: AMACOM.
  • McCormick, E. J. (1983). Job and task analysis. In M. D. Dunnette (Ed.), Handbook of Industrial and Organizational Psychology (pp. 651-696). New York: Wiley.
  • Campbell, J. P. (1990a). Modeling the performance prediction problem in industrial and organizational psychology. In M. D. Dunnette & L. M. Hough (Eds.), Handbook of Industrial and Organizational Psychology (2nd ed., Vol. 1, pp. 687-732). Palo Alto, CA: Consulting Psychologists Press.
  • Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741-749.

About meta-analyses see:

  • Glass, G. V. (1976). Primary, secondary, and meta-analysis of research. Educational Researcher, 5, 3-8.
  • Schmidt, F. L., & Hunter, J. E. (1977). Development of a general solution to the problem of validity generalization. Journal of Applied Psychology, 62, 529-540.
  • Schmidt, F. L. (1992). What do data really mean? Research findings, meta-analysis, and cumulative knowledge in psychology. American Psychologist, 47, 1173-1181.

About generalization of studies see:

  • Pearlman, K., Schmidt, F. L., & Hunter, J. E. (1980). Validity generalization results for tests used to predict job proficiency and training success in clerical occupations. Journal of Applied Psychology, 65, 373-406.
  • Schmidt, F. L., Gast-Rosenberg, L., & Hunter, J. E. (1980). Validity generalization results for computer programmers. Journal of Applied Psychology, 65, 643-661.

About generalization of meta-analyses see:

  • Salgado, J. F., Ones, D. S., & Viswesvaran, C. (2001). Predictors used for personnel selection: An overview of constructs, methods and techniques. In N. Anderson, D. S. Ones, H. K. Sinangil & C. Viswesvaran (Eds.), International Handbook of Work and Organizational Psychology (Vol. 1). London, UK: Sage.

About the environmental variables and other organizational variables that need to be better known, see:

  • Algera, J. A., Jansen, P. S., Roe, R. A., & Vijn, P. (1984). Validity generalization: Some critical remarks on the Schmidt-Hunter procedure. Journal of Occupational Psychology, 57, 197-210.
