
Intelligence     2.86 places
Perseverance     3.32 places
Kindliness       3.55 places
Conceit          3.57 places
Courage          3.69 places
Humor            3.90 places
Deceitfulness    4.14 places

This means that in the long run a stranger will place a given individual in a group of twenty persons not over three or four positions away from the place to which other strangers would assign him. The individual's physiognomy, however little it may actually reveal of his personality, nevertheless suggests rather definite characteristics to those whom he meets, and to that degree determines their reaction toward him, expectations of him, and belief in him. The definiteness or agreement of these impressions seems also to vary with the trait in question; it is high for intelligence and perseverance, low for humor and deceitfulness, and intermediate for kindliness, conceit, and courage. Our own results, however, must be taken only as suggestive, rather than as general, since they may easily have been determined partly by the particular set of photographs we used and by our particular and diverse sets of judges.[3]
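The measure of disagreement behind these figures, described in footnote 2, can be sketched in a few lines. This is only an illustration: the ten placements of photograph A are the ones given in that footnote, while the function name `average_deviation` is an assumption introduced here and not part of the original study.

```python
def average_deviation(positions):
    """Average distance of the judges' placements from their own mean position."""
    center = sum(positions) / len(positions)
    return sum(abs(p - center) for p in positions) / len(positions)

# Placements of photograph A by the ten judges, as listed in footnote 2.
photo_a = [9, 11, 5, 8, 9, 12, 7, 8, 7, 14]

print(sum(photo_a) / len(photo_a))   # standard position of A: 9.0
print(average_deviation(photo_a))    # average deviation: 2.0
```

Each figure in the table would then be the average of twenty such values, one for each photograph, for the trait in question.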

Results of this character, and many similar ones which we are accumulating, suggest, however, an interesting set of problems. It is psychologically as interesting to inquire just what impressions people actually receive from one's physiognomy and expression, as it is to ask whether these impressions are correct. One's ultimate vocational accomplishment often depends on the first impression he creates, the type of reception his appearance invites, even though there may be no necessary connection whatever between appearance and mental constitution. Vocational success depends not only on the traits one really possesses, but also somewhat on the traits one is believed to possess.

It is also interesting to observe that high correlations exist between some of the traits as judged merely on the basis of photographs. Let 1.00 be taken to indicate complete correspondence between two orders of merit, so that the highest in the one scale is also the highest in the other scale, the second in one the second in the other, and so on; then -1.00 will indicate a completely reversed order, the best in one class being the poorest in the other, etc.; a coefficient of 0 will mean only a chance relationship, i.e., none at all. Then from 1.00 through 0 to -1.00 we have represented all possible degrees of correspondence.[4] These figures are called "coefficients of correlation," and can easily be computed by proper statistical methods. In the present case the coefficients for all combinations of two traits are as follows:

                 Intelligence         Humor  Perseverance    Kindliness       Conceit       Courage
Humor                     .47
Perseverance              .88           .33
Kindliness                .76           .65           .39
Conceit                   .28          -.03           .08          -.56
Courage                   .89           .43           .79           .72          -.25
Deceitfulness            -.11          -.28          -.02          -.69           .66          -.49

It will be seen that the intelligent, humorous, persevering, kindly, and courageous countenances tend to be the same ones, and that the faces suggesting the opposites or low degrees of these traits also tend to be very much the same ones. This is indicated by the high positive coefficients between these traits. But conceit and deceitfulness show negative or very low positive correlation with all traits except each other. In this latter case the correlation is positive and high (.66). Other interesting relations between these judgments of character can be inferred from the table of coefficients. But it should be remembered that we are not here dealing with traits as demonstrably present, but only as judged on the basis of facial characteristics and expression. The actual relation between the physiognomic details and the true character of the individual displaying them is a totally different matter. The close correlations between the several desirable traits and between the several undesirable traits, as found in this table of coefficients, seem to have a further significance and suggest that the observers do not judge each trait on the basis of particular and specific physiognomic details. They seem, rather, to get a general impression of favorableness or unfavorableness, and to rank the photographs on the basis of this general impression, no matter which trait is being judged.

It is a common practice for employers, superintendents, agencies, etc., to request the applicant for a position to send his or her photograph for inspection. The urgency of some of these requests and the emphasis placed on them seem to indicate that the photograph is believed to be valuable not only for its service in revealing the general features but also for some further and more specific indications which it affords. Very few attempts seem to have been made to test actually the value of judgments of character when they are based on photographs rather than on acquaintance. Experiments recently conducted yield some interesting preliminary data on this question. The question proposed was: "What relation exists between the judgments which strangers form, on the basis of an individual's photograph, and the judgments which acquaintances make on the basis of daily familiarity and long observation?"[5]

All the members of a group of college women were judged by twenty-four of their associates, for a number of more or less definite characteristics. The twenty-five individuals constituting the group were arranged in an order of merit for each trait, by each of the twenty-four judges. Only one arrangement, for one trait, was made by any one judge within a given week. The judgments were thus distributed over a considerable interval so that judgments for one trait might influence as slightly as possible the judgments of later traits. All these twenty-four judgments were then averaged for each trait, and the final position of each person in each trait thus determined by the consensus of opinion of the judges. This measure is then a combined estimate on the basis of actual conduct and behavior.
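The pooling step just described, averaging each person's position over all judges and then ordering the group by those averages, can be sketched as follows. The three judges, the four persons W to Z, and the function name `consensus_order` are hypothetical and serve only to illustrate the procedure; the study itself used twenty-four judges and twenty-five persons.

```python
def consensus_order(rankings):
    """Average each person's position over all judges, then order by that average.

    rankings: list of dicts, one per judge, mapping person -> position (1 = highest).
    Returns (average position per person, persons in consensus order).
    """
    people = list(rankings[0])
    average = {p: sum(r[p] for r in rankings) / len(rankings) for p in people}
    order = sorted(people, key=lambda p: average[p])
    return average, order

# Hypothetical placements, for illustration only (not data from the study).
judges = [
    {"W": 1, "X": 2, "Y": 3, "Z": 4},
    {"W": 2, "X": 1, "Y": 4, "Z": 3},
    {"W": 1, "X": 3, "Y": 2, "Z": 4},
]

average, order = consensus_order(judges)
print(order)  # persons from highest to lowest consensus position
```

No single judge need agree with the resulting order; it is the averaged position that fixes each person's final place.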

Photographs of all the members of the group were then secured, all of them taken by the same photographer, in the same style and size. These photographs were now judged, by a group of twenty-five men and a group of twenty-five women, all of whom were totally unacquainted with the individuals who were being judged. These strangers arranged the photographs in order of merit for the various traits of character, just as the earlier group of judges had arranged the names of the members of the group, with all of whom they were acquainted. The various arrangements of the photographs were then averaged, yielding for each photograph an average position in each trait. We thus have three measures of the group of college women: (1) the judgments of their intimate associates; (2) the judgments of twenty-five men, on the basis of photographs; and (3) the judgments of twenty-five women, on the basis of photographs. All of these measures may be compared with each other, and correlated so as to show their respective amounts of correspondence. The results are as follows:

Judgments by Associates Compared with the Judgments of the Photographs

Trait              By 25 Men   By 25 Women   Average
Neatness              .03          .07         .05
Conceit               .10          .27         .19
Sociability           .29          .29         .29
Humor                 .21          .45         .33
Likeability           .30          .45         .38
Intelligence          .42          .61         .51
Refinement            .50          .52         .51
Beauty                .60          .49         .55
Snobbishness          .58          .53         .56
Vulgarity             .61          .69         .65
Average               .36          .43         .40

The correspondence between judgments of acquaintance and judgments of photographs is seen to vary with the trait in question. Such traits as neatness, conceit, sociability, humor, and likeability, important as they are for vocational success or failure, show very low correlation. The judgments of the photographs tell almost nothing at all of the nature of the impression which the individual makes on her acquaintances, her true character. With the remaining traits (beauty, intelligence, refinement, snobbishness, and vulgarity) the coefficients are considerably larger, and suggest that the photographs tend to be judged by the strangers in somewhat the same way as the individuals are judged by their acquaintances.

Two points of special importance should be noted in this connection. The first is that these correlations are not between the judgments of single individuals. It is the combined or group judgment of twenty-five judges which is required to yield these coefficients which even then average only about .40 correlation with the estimates of associates. The following table shows the ability of ten judges, chosen at random, to estimate these characteristics through the examination of the photographs. In securing this table the arrangement made by each individual judge was correlated with the established order as determined by the estimates of associates, in the case of the three traits—intelligence, neatness and sociability.

Individual Correctness of Judges in Estimating

Judge      Intelligence   Neatness   Sociability
I               .51          .11         .39
II              .11          .10         .08
III             .15          .29         .05
IV             -.27          .06         .49
V               .08          .24         .08
VI              .43          .41         .28
VII             .04          .11         .02
VIII            .39         -.09         .32
IX              .22         -.08         .00
X               .30          .02         .55
Average         .19          .11         .22

These random samples of individual judicial capacity show at once how unreliable individual judgment is in these matters. The individual judges vary widely among themselves and they also depart widely from the established order. Moreover, a judge who may happen to show a reasonable degree of correctness in judging sociability may be very far away from correctness in judging the other traits, or may, indeed, judge in quite the reverse of the correct order. To have accepted the verdicts of a single judge would not only have been manifestly unfair to the individual but also hazardous to the employer. The combined impressions of twenty-five judges are here required for the correlations for even half of the traits to reach over .38.

The second point to be noted is that even under these circumstances the coefficients are far from perfect, even for those traits in which they are the highest. Only if beauty, snobbishness, or vulgarity are the traits which are crucial, are judgments of the photographs reliable enough to be worth considering. It would appear that the vocations which depend markedly on these characteristics are exceedingly few. And even here, although the reliance on coefficients of .55 might in all probability aid the employer in decreasing the percentage of the snobbish or the vulgar among his employees, grave injustice would most certainly be done to those many individuals who constitute exceptions and keep the correlations from being perfect. Only when correlation coefficients are very high can their indications be applied in the guidance of individuals (as distinguished from the selection of groups) with safety and justice.

Dean Schneider reports an attempt to verify the principles of a certain system of physiognomics by putting them to an actual test. He writes:

"A group in the scientific management field affirmed that an examination of physical characteristics such as the shape of the fingers and shape of the head, disclosed aptitudes and abilities. For example, a directive, money-making executive will have a certain shaped head and hand. A number of money-making executives were picked at random and their physical characteristics charted.We do not find that they conform at all to any law. Also we found men who had the physical characteristics that ought to make them executives, but they were anything but executives. A number of tests of this kind gave negative results. We were forced to the conclusion that this system was not reliable."

We must content ourselves on this point by insisting that the formulated facts of physiognomy are so unsupported, contradictory, and extravagant that the vocational psychologist cannot afford to trifle with them. General impressions on the basis of the totality of an individual's appearance, bearing, and behavior we shall always tend to receive. Whether one judges more accurately by an analytic recording of each detail or by ignoring these in favor of his own more or less unanalyzed total impression has never been demonstrated. Under any circumstances one is likely to look about for such details as may lend support to the total impression. But it is quite unjustifiable—though perhaps commercially expedient—to pretend that the judgment is really based on the details selected.

The life of him who bases his expectations of human conduct on the physiognomy of his neighbors is bound to be full of delightful as well as fearful surprises. I shall never forget the practical lesson in the principles of physiognomics I learned when watching a shipload of immigrants pass the physical and mental examinations at Ellis Island. Admission to the new land, and to the theater of their vocational plans, depended on the results of these examinations. Ellis Island is perhaps the one place in the world where principles of individual psychology are most in demand, and where such principles as are relied on lead to results of the most serious human consequences. I watched the line file past the preliminary gate, by the inspectors who scrutinized them still more carefully, and on into the inner room where the suspected ones were submitted to more searching examination. One young woman stood out among her companions as easily the most comely and attractive of the women. She was the only one of that shipload who was finally certified as an imbecile, and refused admission to the mainland.

The physiognomic analyses, then, do not merit serious consideration as instruments of vocational guidance and selection. The mere facts of physical structure, contour, shape, texture, proportion, color, etc., yield no more information concerning capacities and interests than did the incantations of the primitive medicine-man or the absurd charts of the phrenologists. In so far as character and ability may be determined by facts of structure, it is by the minute structure of the microscopic elements of the brain and other vital tissues, about which we now know exceedingly little. We shall therefore dismiss from further consideration the futile attempts to diagnose mental constitution on the basis of bodily structure, and turn to the more reliable and scientifically conceived methods of inferring the individual's mental traits from his behavior or his actual performance when tests are made under controlled conditions.

FOOTNOTES:

[1] An interesting review of the origin and development of phrenology and other systems of character analysis is given by Joseph Jastrow, in an article in Popular Science Monthly, June, 1915.

[2] To make clear the way in which these figures are secured, and to show concretely what they mean, suppose that the twenty photographs are lettered A, B, C, D, etc. They are to be arranged in an order by each judge according to his judgment of the intelligence of the individuals, the individuals being unknown to the judges. Suppose that the ten judges place photograph A respectively in the following positions: 9, 11, 5, 8, 9, 12, 7, 8, 7, 14. The average of these ten positions is 9, which we then take as the standard or most probable position of photograph A. Only two of the judges actually place A in the ninth position. The other eight judges all vary more or less from this position. We then find how much each judge varies from the average of the group, and the ten variations are respectively 0, 2, 4, 1, 0, 3, 2, 1, 2, 5 positions. The average of these individual variations is 2.0 positions. This figure indicates how closely the ten judges agree in their estimates of photograph A, a small average deviation indicating close agreement. In this way we find for each of the twenty photographs its average deviation; and if the twenty figures thus secured are in their turn averaged we secure an approximate measure of the disagreement of the judges when estimating the intelligence suggested by the photographs. Similarly we may compute average deviations for any other trait which is judged. These final figures are the ones which are given in the table, each of them being the average of twenty photographs as judged by ten persons.

[3] In such experiments the actual magnitude of the measure of variation becomes larger as the number of judges is reduced, the number of photographs increased, or the photographs so selected as to resemble one another more closely.

[4] Since such coefficients of correlation will be frequently used throughout the book as measures of the amount of correspondence or relationship between two things, it may be well at this point to indicate briefly how they are computed. Suppose that, as arranged in order on the basis of their final averages, the photographs stand in the following positions for the two traits, courage and kindliness:

Photo   Courage   Kindliness     d     d^2
A           2          5         3       9
B           5          1         4      16
C          10         13         3       9
D           1          4         3       9
E           7          6         1       1
F          11          8         3       9
G          14         10         4      16
H          20         15         5      25
I          16         12         4      16
J           4          2         2       4
K           8         14         6      36
L           3          3         0       0
M          12         20         8      64
N          15         11         4      16
O          17         18         1       1
P           9          7         2       4
Q           6         17        11     121
R          13          9         4      16
S          18         16         2       4
T          19         19         0       0

A formula is provided by mathematicians which enables us to compute the degree of resemblance between these two orders. There are, in fact, several formulae for such purposes, all of which yield substantially the same results. The one used in this case was $r = 1.00 - \frac{6\sum d^2}{n(n^2 - 1)}$. In this formula $r$ stands for the coefficient of correlation for which we are working; $d$ is the difference between the positions which each of the photographs receives in the two traits; $\sum d^2$ means the sum of these differences when each has been squared or multiplied by itself; $n$ means the number of cases, which is in this case 20, since there are that number of photographs. When these substitutions are made and the equation solved, the result will be the measure of resemblance, which will lie somewhere between +1.00 and -1.00, as explained in the text. This calculation is carried out here for the two sample traits, for the convenience of readers who may not be familiar with statistical methods.

When the several values under $d^2$ are added their sum is 376. This, multiplied by 6, according to the formula, gives 2256. The denominator of the fraction is, since there are 20 cases, 7980; for 7980 is 20 times 399, which in turn is $20^2 - 1$. Dividing 2256 by 7980 gives us .28. When this is subtracted from 1.00 it gives us .72, which is the measure of correlation between the two orders. Since it is very high it suggests that the two traits are judged in much the same way.


[5] These experiments were conducted by Lucy G. Cogan, M. A., to whom I am indebted for permission to use the results in advance of their more detailed publication in her forthcoming paper on "Judgments of Character on the Basis of Photographs."
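The rank-difference computation of footnote 4 can be checked with a short sketch. The formula and the two orders of the twenty photographs (A to T) for courage and kindliness are taken from that footnote; the function name `rank_difference_correlation` is an assumption made here for illustration, and the differences are computed directly from the two listed orders.

```python
def rank_difference_correlation(order_a, order_b):
    """Return (r, sum of squared differences) with r = 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(order_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(order_a, order_b))
    r = 1.0 - (6 * sum_d2) / (n * (n ** 2 - 1))
    return r, sum_d2

# Positions of photographs A..T in the two traits, as listed in footnote 4.
courage    = [2, 5, 10, 1, 7, 11, 14, 20, 16, 4, 8, 3, 12, 15, 17, 9, 6, 13, 18, 19]
kindliness = [5, 1, 13, 4, 6, 8, 10, 15, 12, 2, 14, 3, 20, 11, 18, 7, 17, 9, 16, 19]

r, sum_d2 = rank_difference_correlation(courage, kindliness)
print(sum_d2)       # 376
print(round(r, 2))  # 0.72
```

The result agrees with the coefficient of .72 entered for courage and kindliness in the table of coefficients.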

Barren as phrenology and physiognomics were of formulable and useful results, they nevertheless served the purpose of directing attention toward the study of individual differences in mental characteristics as a distinct branch of inquiry. The next step consisted in the semi-experimental plan of observing the individual's behavior under a variety of uncontrolled circumstances or on more carefully planned occasions, in the endeavor to secure more or less exact quantitative expressions of the degree to which he displayed certain types of ability. Underlying the various abilities and involved in them there were assumed to lie a limited number of faculties or powers of the mind. Each individual was conceived to possess much the same faculties, but in varying degrees or amounts or forms. Attention, memory, apperception, reasoning, will, feeling, etc., were the fundamental "faculties"; and differences in character were thought of as depending upon the varying amounts and interrelations of these fundamental faculties. In the endeavor to discover types of experiment which would measure these "faculties" it was found, in time, that a given "faculty" did not appear, on close examination, to be as unitary as it was formerly supposed to be. It was seen that to have a good memory for one kind of material did not at once signify a good memory for every sort of thing. Determination in one direction did not imply the general quality of resoluteness. It began to be realized that attention, memory, discrimination, and the other "faculties" are very much more highly specialized than these general names indicate.
The unitary soul had early been split up into the list of "faculties" or categories, and now these in turn came each to be split up into finer and finer aptitudes and tendencies, until, in the radical reaction of recent years, we find the human mind described as made up of an infinite number of independent connections or bonds between more or less specific stimulus and more or less definite response. The old "faculties" came now to be looked on as descriptive terms for certain rather general and abstracted characteristics of these multitudinous and detailed reaction tendencies, rather than as in themselves agents or powers or forces, as they were formerly conceived.

During this change in theoretical description and continuing into our present era of compromise and revision, methods were developed of measuring the amount and quality, or, more simply conceived, the speed, strength and regularity of mental and motor ability. Beginning in the form of experiments on sensory discrimination, reaction time and imagery type, and combined with physiological measurements of motor strength, rapidity and fatigue, these experiments developed, in certain hands, into what are now known as "mental tests." Since the principle and method of mental and physical tests is the chief characteristic of the present status of vocational psychology, and since the work of the immediate future seems destined to develop mainly in this same direction, we may profitably consider at this point the history and development of the mental test. We may later take up the general principle and theory of the test as an instrument of psychological analysis and diagnosis, with special reference to the requirements and implications of such tests as may be of service in vocational psychology. We shall then be in position to review the special vocational tests that have as yet been proposed, to evaluate their outstanding results, and to point to some of the more immediate prospects and problems under consideration by those interested in the application of psychological tests in vocational analysis and guidance.

We may begin with an account of the first definite attempt to explore systematically the personality of individuals by the method of tests. The "Columbia Freshman Tests" are of especial interest in the history of vocational psychology, since in their formulation and plan explicit thought was given to the practical use to which the results of tests might be put by the individuals examined, and by the statistical study of the results by students of the subject. In 1894, under the guidance of Professor Cattell, there was instituted the plan of testing the students of Columbia College during their first and fourth academic years. A description of the tests employed was published by Cattell and Farrand in 1896, and a statistical study of results was published by Wissler in 1901.

The motive back of these tests is well expressed in the following paragraph which was also used as material for a test of logical memory:

"Tests such as we are now making are of value both for the advancement of science and for theinformation of the student who is tested. It is of importance for science to learn how people differ and on what factors these differences depend. If we can disentangle the complex influences of heredity and environment we may be able to apply our knowledge to guide human development. Then it is well for each of us to know in what way he differs from others. We may thus in some cases correct defects and develop aptitudes which we might otherwise neglect."

The nature of these Columbia tests and the method of recording and reporting them are indicated in the forms which were printed and used for this special purpose. (Samples of these are given in the Appendix.) They are given here not so much for the sake of the enumeration of the tests, since many of these are no longer in common use, but because of their historic interest for vocational psychology and because of the general plan outlined in them. In general this plan is that of accumulating measurements of a large number of individuals and thus showing each one how he compares with the normal or average, or where he stands in the general curve of distribution of the members of the group. These tests were applied to the same individuals on their entrance to and their graduation from college, in order to indicate changes that might have been made during the intervening period.

Especially interesting also are other blanks containing additional data, such as age, health, physical characteristics, physiognomic features, enumeration of stigmata, etc. In addition to the tests and measurements, the examiner, both before and after the interview, recorded his general impression of the individual, in the terms indicated on the blank form. We shall have occasion to refer to these judgments of general impression in more detail when we come to consider the use of the interview and the testimonial in vocational psychology. Account was also taken of the gymnasium records of the student, as to nationality, birth, parentage, habits, health, etc.

The Columbia tests may be thought of as representative of several similar projects developed in this country and in Germany, France and England by many workers. The names of Galton, Cattell, Kraepelin, Binet, Henri, and Jastrow stand out conspicuously in the early history of mental tests. The first step was thus the invention, description and trial of a great number of miscellaneous tests, with little analysis of the tests themselves, the nature of the functions tested by them, or their relation to each other. Aside from the strictly motor and physical tests, those devised were mainly of so-called intellectual character: measurements of speed and accuracy with which certain definite tasks could be accomplished. They were, moreover, very simple in character, not necessarily related to the work of daily life, with only a single or but a few trials made on each individual. Tests of affective and volitional factors were slower in developing. Little account was taken of interests, instinctive and emotional characteristics, attitudes, adaptation, methods of attack, limits of ability after practice, or many other aspects of individuality which later work has shown to be important.

The next step in the development of tests consisted in the coöperative effort to standardize the nature and methods, the conditions and mode of record. Many hands had part in this process, until in recent years, through publication, comparison and discussion of the subject, fairly uniform principles of technique, record, and treatment of measures have been agreed upon. This made possible the comparison of results secured by different investigators, and facilitated the statistical treatment of the data, so that later work might profit by what had already been tried or accomplished by earlier workers. After many years of this sort of coöperative work, another series of studies was inaugurated to attempt what has come to be known as "testing the tests." These studies proceeded by examining into the degree to which the various tests correlate with each other, with other indications of the individual's ability, with age, sex, health, education, school standing, special training, etc. Such questions as the following will suggest the problems involved in "testing the tests."

1. Which of the various tests correlate with each other?

2. What correlation exists between mental and motor abilities?

3. Do the tests measure fundamental qualities or general powers of the individual, or specialized capacities, or perhaps mainly the effect of general or special training?

4. If they measure general qualities, which of the existing tests are the best for this purpose?

5. How many trials are needed to afford a reliable index of the individual's ability?

6. What are the principal incidental factors that influence the result of tests?

7. Which tests are most easily influenced or disturbed by extraneous factors?

8. Can tests of the simpler laboratory type be used to indicate the individual's ability as shown in his daily work and play?

9. How simple or complex should the various tests be in order to give the best results?

10. How many tests, and which, are required to give a fairly correct picture of the individual's psychological make-up?

11. To what degree do preliminary trials indicate the final capacity of an individual?

12. Does the intercorrelation of tests change in any way with practice, repetition, and familiarity with the material?

13. Just what mental functions may the particular tests be said to measure?

14. How important are these functions in practical, educational and vocational life?

15. By what amounts and in what various ways do individuals differ among themselves in such abilities as the tests measure?

16. Are there other important aspects of psychological constitution and equipment for which there now exist no adequate tests?
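The first of these questions, whether two tests correlate with each other, can be illustrated with a short computation. The sketch below is only an illustration in Python, not the procedure of any particular investigator: it assumes that each test yields one score per person, that there are no tied scores, and it uses the familiar rank-difference formula for the coefficient of correlation. The function names and the sample scores are invented for the example.

```python
def ranks(scores):
    """Return the order-of-merit rank of each score (1 = highest).

    Assumes there are no tied scores.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def rank_correlation(test_a, test_b):
    """Rank-difference coefficient of correlation between two tests.

    test_a[i] and test_b[i] are the scores of the same person i.
    Returns 1.0 for identical orders, -1.0 for completely reversed orders.
    """
    n = len(test_a)
    if n != len(test_b) or n < 2:
        raise ValueError("need two equally long lists of at least two scores")
    ranks_a, ranks_b = ranks(test_a), ranks(test_b)
    sum_squared_diff = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_squared_diff / (n * (n * n - 1))


# Invented scores of five persons on two hypothetical tests.
memory_test = [12, 15, 9, 20, 17]
speed_test = [31, 35, 28, 40, 38]
coefficient = rank_correlation(memory_test, speed_test)
```

In this invented case the two tests place the five persons in the same order, so the coefficient is 1.0; a test pair that ranked them in exactly opposite orders would give -1.0.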

The investigation of these numerous problems has resulted in the accumulation of a considerable literature of mental tests. Many of the earlier forms of tests were abandoned because of their unsatisfactory or meaningless character. Others have been retained and improved in form, and many new ones are constantly being devised and elaborated, described and standardized. The precautions to be observed, the instructions to be given, and the methods of record and interpretation have been presented in various books and manuals. The tests have been developed for more and more complex functions, and now relate not only to relatively simple capacities but to highly elaborate and subtle forms of achievement. As rapidly as is consistent with accuracy, norms and standards of performance for different ages, school grades, vocational requirements, etc., are being accumulated and reported. Typical charts of age norms in selected tests are given in the Appendix.

As the tests have thus developed they have been organized for a variety of special purposes, such as for school measurement, educational diagnosis, clinical examination, laboratory experiment, and more recently for the purposes of vocational guidance and selection. Among the first of these to develop systematically, and also the ones with the most immediate vocational application, are the graded intelligence scales, which shall be our next concern.

An important step in the history of general tests is represented by the accumulation of norms and standards of performance for the different selected tests, and the arrangement of scales of tests with increasing difficulty, as further aids in fixing the individual's status.

After a standardized and tested form of test has been selected, norms of performance are accumulated by applying the test to large numbers of persons of the same general type. The classification may be on the basis of age, school grade, occupation, nationality, etc. In this way it becomes possible to determine for a given individual how he compares with other members of his group; whether he is above or below the average, and how far; whether he would belong among the best ten, or the poorest ten, or the third ten, etc., of one hundred selected at random. Such norms also reveal to what degree the tested ability varies with the other factors, on the basis of which the group was selected, as age, sex, education, size, health, race, etc.
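The comparison of an individual with the norms of his group can be sketched as follows. This is a minimal illustration in Python, assuming that higher scores mean better performance and that the norm group is simply a list of scores; the function name `standing` and the sample data are invented for the example and do not come from any published set of norms.

```python
def standing(score, norm_scores):
    """Describe where a score falls among the scores of a norm group.

    Returns the distance from the group average, the percentile position,
    and the "ten" (1 = best ten of a hundred, 10 = poorest ten) in which
    the individual would be expected to fall.
    """
    n = len(norm_scores)
    if n == 0:
        raise ValueError("the norm group is empty")
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    # Count half of the tied scores as falling below the individual.
    percentile = 100.0 * (below + 0.5 * equal) / n
    average = sum(norm_scores) / n
    ten_from_top = min(10, int((100 - percentile) // 10) + 1)
    return {
        "difference_from_average": score - average,
        "percentile": percentile,
        "ten_from_top": ten_from_top,
    }


# Invented norm group: one hundred persons with scores 1 through 100.
norm_group = list(range(1, 101))
report = standing(95, norm_group)
```

Here a score of 95 lies 44.5 points above the group average and falls among the best ten of the hundred.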

As rapidly as reliable norms are established, it becomes possible to select for each age, school grade, occupation, etc., a set of tests which the average person of that age, schooling or calling should be able to perform to a certain known degree of proficiency. Failure to accomplish this indicates performance lower than that expected and, in so far as success is dependent solely on mental ability, indicates inferior capacity. Similarly, ability to do more than the average or normal record requires indicates a capacity that is precocious, rare, and superior.

In this way are derived standard graded scales which represent a decided advance in the science of psychological diagnosis. There are three rather different forms in which attempts have been made to secure such scales. In one form the scale consists of a series of steps, each step consisting of different sorts of performance; that is, different tests or tasks are used. These tasks are arranged in groups, each group representing tests which should be passed acceptably by individuals of the given age, school grade, etc. In another form of scale the type of task is the same throughout, but the different points on the scale are represented by increasingly difficult specimens of material. The scale thus presents graded steps of difficulty in doing the same general sort of thing. In the third form the task remains precisely the same throughout, and performance is measured in terms of the time in which the task can be completed and the accuracy which is displayed. Sometimes, in scales of this type, although the instructions are always the same, the test is performed with varying degrees of approximation to a qualitative standard, and the steps may then consist of these graded qualitative achievements.

As representative of the first form of scale we may refer to the widely used Binet-Simon scale for the determination of mental age. Whatever we mean by intelligence, it is a characteristic which is essential to vocational activity. It is furthermore a characteristic which normally tends to increase in its degree or manifestation from infancy up to at least ten or twelve years of age. Beyond that point there are, to be sure, striking individual differences in that characteristic which we call intelligence, but beyond this point it does not seem so dependent on the physical age of the organism. Five-year-old children tend to be pretty much alike in intelligence. At least, the change from five years to seven years is commonly attended by very apparent growth in this respect, and a five-year-old is more like other five-year-olds in the things he can do than he is like seven-year-olds.

Experiment and observation show that the ages up to ten or twelve tend to indicate rather definite mental status, in the long run, although, to be sure, children of a given age vary considerably from one another. But beyond this point the age of an individual is not by any means an indication of the sort or degree of ability to be expected of him. The further we go beyond this point, the less significant becomes the mere statement of the individual's age. We may thus indicate the mental attainment of a child of less than twelve years by stating the average age of children who can do the things, know the facts, display the abilities that he can. This figure we will use to indicate his mental age as distinguished from his chronological or physical or actual age. A record-blank which enumerates the tests comprising the Binet-Simon scale is given in the Appendix. Those who may be interested in using this or similar scales should familiarize themselves with some of the many books and manuals that have been written concerning them, the methods of using them, their characteristic results and their evaluation. These scales will be again considered in a later section, when we discuss the measures of general intelligence as they relate to vocational guidance and selection.
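The notion of mental age can be made concrete with a small sketch. The Python code below is an illustration only and is not the scoring rule of the Binet-Simon scale: it assumes a hypothetical table giving the average score of children at each age, assumes that these averages rise steadily with age, and reads off by straight-line interpolation the age at which a given score is the average performance. The table values and the function name `mental_age` are invented.

```python
def mental_age(score, age_norms):
    """Estimate the age at which `score` is the average performance.

    age_norms maps an age in years to the average score of children of
    that age; the averages are assumed to increase strictly with age.
    Scores beyond the table are held at the youngest or oldest age listed.
    """
    ages = sorted(age_norms)
    if score <= age_norms[ages[0]]:
        return float(ages[0])
    if score >= age_norms[ages[-1]]:
        return float(ages[-1])
    for younger, older in zip(ages, ages[1:]):
        low, high = age_norms[younger], age_norms[older]
        if low <= score <= high:
            # Interpolate between the two neighboring age averages.
            return younger + (older - younger) * (score - low) / (high - low)


# Hypothetical averages, invented for the example.
hypothetical_norms = {5: 10.0, 6: 14.0, 7: 18.0, 8: 22.0}

chronological_age = 8.0
estimated_mental_age = mental_age(16.0, hypothetical_norms)
years_behind = chronological_age - estimated_mental_age
```

With these invented figures, a child of eight who scores 16 performs like the average child of six and a half, and so stands a year and a half behind his chronological age.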

Other scales than the Binet-Simon series have been proposed, and this series has itself undergone modifications at the hands of later investigators, changes calculated to render it more reliable and adaptable. Much work is now being done in the attempt to develop scales or sets of tests which will reveal characteristic differences among people whose mentality has gone beyond the point which the juvenile scales reach.

The work of Trabue in standardizing the "completion test" so that individuals may be quantitatively compared on the basis of it may serve as an example of the second form of scale. This particular test consists in requiring the individual to supply meaningful words or phrases in the blank spaces formed by mutilating logical text. It is similar to the simple exercise sometimes found in elementary text books of grammar and spelling. It seems that the ability to supply the missing words or phrases quickly in such mutilated material calls for the exercise of a type of ability which correlates to a high degree with most other measures of intelligence. Individual differences as shown by school grades, age, opinion of teachers, estimates of associates, results of other mental tests, etc., are readily and with considerable reliability revealed in the individual's ability to perform this type of test. This investigator has, after much preliminary labor, constructed a form of this test in which the material gradually increases in difficulty from beginning to end. Efficiency in the test may be measured by the point one can reach in the text in a given time. This test has been standardized, not on the basis of physical age, as in the case of the Binet-Simon scale, but on the basis of school grade, from the second grade through the high school, some four or six years beyond the point where the Binet-Simon scale ceases to be useful. A copy of this test is also given in the Appendix. Those who wish to use it should consult the original description of it, for technique, precautions, norms, and interpretation.

A good example of the third form of scale is to be found in Sylvester's standardization of the "form-board" test. The "form-board" is one of the most useful tests in detecting intellectual defect that is so pronounced as to constitute the individual a "mental defective." Out of a solid base board are cut various geometrical forms, such as diamonds, stars, squares, triangular blocks, etc. These blocks are placed alongside the base from which they have been cut. The task is that of replacing all the blocks in their appropriate places, with the greatest possible speed. The test tends to reveal characteristic defects in understanding instructions, perceiving the general and specific situations, profiting by experience, recognizing form and size and other space relations, etc. The individual may work blind-folded or may use his eyes.

In the standardized form the sizes, shapes and positions are uniformly adopted and the technique of instruction and procedure is specified. Under these conditions the time required to complete the task by normal children of the ages five to fourteen years has been recorded. Sylvester presents a curve based on the examination of 1,537 normal children. The curve shows the average time of performance for each age and also indicates the range of performance for each age. In the case of a given individual it is thus easy, by referring to the standard table of norms, to determine whether he is up to the normal record for his age, whether he is within the normal range of variation for this age, and how deficient or precocious he may be in this respect. Tables of this type are now being accumulated for a great variety of single standard tests.
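The use of such a table of norms can be sketched in a few lines. The Python code below is a hedged illustration only: the figures in the table are invented and are not Sylvester's published norms, and the names `HYPOTHETICAL_NORMS` and `compare_to_norm` are made up for the example. Each age is given an average time and the limits of the normal range, all in seconds, and shorter times mean better performance.

```python
# Invented figures: age -> (average time, fastest normal, slowest normal),
# all in seconds. These are NOT the published form-board norms.
HYPOTHETICAL_NORMS = {
    6: (28.0, 18.0, 42.0),
    8: (20.0, 14.0, 30.0),
    10: (16.0, 11.0, 24.0),
}


def compare_to_norm(age, seconds, norms):
    """Compare one child's completion time with the norm for his age.

    Returns a verbal status and the number of seconds by which the child
    is slower (positive) or faster (negative) than the age average.
    """
    average, fastest, slowest = norms[age]
    if seconds < fastest:
        status = "faster than the normal range"
    elif seconds > slowest:
        status = "slower than the normal range"
    else:
        status = "within the normal range"
    return status, seconds - average


status, seconds_from_average = compare_to_norm(8, 35.0, HYPOTHETICAL_NORMS)
```

With these invented norms, an eight-year-old who needs 35 seconds is 15 seconds slower than the average for his age and falls outside the normal range of variation.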

In addition to scales of this type, which proceed by setting for the individual a graded series of tasks and determining his success in their accomplishment, there is a further type of graded scale which is now represented by several standard specimens. This is the type of scale which is designed to afford an instrument for the measurement of such products as the actual work of the individual incidentally yields. Thorndike's "Scale for the Measurement of Handwriting" is the model on which many of the later scales of this type have been based. In this scale actual specimens of handwriting are arranged in a graduated series in such a way that the steps from specimen to specimen are equally appreciable or noticeable, and in this sense uniform. When such a scale extends from an actual zero point, it is possible to "measure" the quality of handwriting in quite the same way as that in which one measures the height of an individual or the length of a table. The quantitative measure consists in the statement of the number of stages which intervene between that quality of product represented by the specimen and the zero point of the scale. The position assigned to the specimen being measured is determined by moving the specimen along the graded series of standards until a point is reached where the specimen seems, on the basis of direct inspection, to belong. Such scales have been formulated for various special forms of school work, such as handwriting, drawing, arithmetic, literary composition, mechanical construction, etc. By such means it is possible not only to measure the "general intelligence" of the worker, but also his actual ability in creating a definite type of product. There seems to be no limit to the possibilities of scales of this form, and their value in determining the more definite and particular capacities, whether from the point of view of original endowment or from the point of view of the effects of training, is obvious.

These various scales for measuring general intelligence have been used chiefly for the purposes of educational diagnosis, in determining the degree of backwardness of children in the grades, their need for special educational attention, or the hopelessness of further pedagogical effort with them. But it is obvious at once that tests of this type are of great use to an employer in eliminating, from among the candidates for work, those who are hopelessly mentally defective, feeble-minded, and irresponsible. There are many sorts of work in which the employment of feeble-minded persons, unrecognizable as such by their physical traits or by a casual inspection, not only entails loss and annoyance but may constitute a positive danger and constant menace to those who rely on the defective individual. Such work as that of delivery boys, messengers, domestic servants, nurses, elevator operators, drivers, motormen, etc., may be cited as instances of work into which the feeble-minded easily slip, unless there is some standardized means of recognizing them.

The importance of detecting these incompetents and keeping them from work in which their irresponsibility means economic waste and personal and social danger is of distinct vocational interest. Studies of cases brought to the Clearing House for Mental Defectives in New York City show that of the first two hundred and eighty-one feeble-minded women of child-bearing age, about two-thirds had been engaged in some form of economic labor in which their incompetence was distinctly dangerous to those associated with them. The following table shows how these two hundred and eighty-one feeble-minded women had been employed:
