The spatial character of high and low tones.

Originally published in Journal of Experimental Psychology, 13, 278-85, 1930.

By Carroll C. Pratt

Harvard University

In all sensory phenomena one may detect some degree of spatial orientation. The different modalities do not, however, have equal shares of spatial material. The sense of hearing has come off rather badly in this respect. Sounds, in spite of their volumic proportions, are not spread out in space, and localization of them frequently reveals curious inaccuracies and vagaries. And yet for the differentiation of certain tonal qualities it becomes necessary to draw upon terms which are of purely spatial origin. The tones at one end of the pitch-continuum are called high, and those at the other end are called low, and qualitative changes along this continuum are spoken of as rising or falling inflections, ascending or descending cadences, downward or upward movements. The conspicuous fitness of such phrases is immediately apparent.

The application to a given sense impression, of adjectives which belong strictly to impressions from other modalities, occasions as a rule little comment. Colors are called warm and odors heavy not because the psychological thermometers and balances are raised or tipped, but rather because by fairly obvious associations they remind one of experiences in other sense departments. It has been suggested that these cross-modality references give evidence that the varieties of sensory experience have their common origin in an undifferentiated matrix of preperceptual stuff, but it is probably wiser in our present state of knowledge to look upon them simply as evidence of perceived resemblances, or, in some cases, as instances of actual fusions of qualities from different sense departments, as when one calls a flavor puckery.[1] It has not been so simple a matter, however, to discover the associative bond which has led so uniformly to the application of the terms high and low to tonal pitch.

Stumpf has found that adjectives meaning high and low (or words closely related in meaning) have been applied to tones in almost every known language.[2] But why should tones be characterized as high or low? Do these characteristics refer to differences in spatial height and depth? The answer to this second question has been almost without exception in the negative. A high tone does not mean a tone which is high in space. The phrase is merely figurative, and must be accounted for in terms of secondary criteria such as, e.g., the apparent localization of high vocal tones in the head and low ones in the chest. The composer Berlioz makes sport of such explanations and reminds his readers that high and low tones for the pianist lie in the horizontal directions of right and left and that the violoncellist must reach downward to produce high tones, and suggests that those composers of opera who use descending passages for a person falling downstairs have stupidly transferred to the tones the arbitrary downward character of the printed notes on the staff. And yet Stumpf, convinced that there can be no intrinsic height and depth in tones, has felt obliged to argue that here again some associative mechanism, strangely obscure and elusive, has been at work. Even Wundt was forced to agree with Stumpf in calling these terms metaphorical when applied to tones [3], and most psychologists who have given the matter any thought have expressed similar views [4].

What are the associative bonds or the relations of similarity which cause tones to be so appropriately designated as high and low? Stumpf has put himself to great pains to discover possible clues. Since there are no verbal expressions, aside from letters of the alphabet, for tonal qualities, language has borrowed from various sense departments words which apply to impressions accompanied by feelings similar to those to which tones give rise [5].  Thus the affective character of low tones is gloomy and dark while that of high tones is sharp and bright, even painful in very high tones, as though the ear were pierced by a needle [6]. Low tones give the impression of voluminousness and massiveness as contrasted with the thinness and smallness of high tones [7].  The times for Anklingen and Abklingen of low tones are longer than for high tones, with the result that rapid passages in the upper part of the scale seem light and airy while passages at the same tempo in the bass sound heavy, clumsy, and labored [8].  Even these few examples, among others which Stumpf cites [9], enable one to discern the associative trail which language has followed in the selection of spatial metaphors for pitch-differences. Thin, small, light, airy: these are terms suitable for objects which, if not always found at high altitudes, are at any rate up and away from the ground. Dark and gloomy objects tend to be nearer the surface of the earth, the massive parts of a structure support the smaller parts, and heavy objects are generally lower in space than light ones.

Such was the type of explanation which Stumpf was forced to adopt in his attempt to account for the use of the words high and low for tonal quality, and most psychologists have been inclined to accept his view [10]. Titchener, e.g., refers to Stumpf's discussion of the question, and then warns the reader that in the study of tonal sensation he must not be misled by the frequent use of metaphor and analogy; that it is not altogether clear how the adjectives high and low have come to be applied to tones, but that it is quite clear that they must not be taken as indicative of a spatial character in pitch [11].

To account for cross-modality references by means of association appears, in many cases, plausible enough. If association yields eventually to a more satisfactory principle of explanation it will not be difficult to understand why association was tried. But the most ingenious arguments from association, even in the hands of Stumpf, fail to carry conviction when applied to tonal height and depth. Some of this doubt finds expression in the reservations which Titchener makes when he says that "it is not altogether easy to see" [12] and "we cannot yet say certainly" [13] how the terms high and low came to be applied to tones. Even in the face of this doubt it is legitimate, of course, to suppose that the associative items which Stumpf and others have mentioned serve in some way to reinforce those tonal properties which lead to spatial characterizations. The doubt attaches rather to explanations couched solely in terms of such items. Does not the universality of spatial characterization point to some factor more fundamental than any which derive from cross-modality analogy and supplementation? The experiments about to be reported suggest that the most obvious explanation, but the one universally rejected [14], is very likely the correct one, viz., that prior to any associative addition there exists in every tone an intrinsic spatial character which leads directly to the recognition of differences in height and depth along the pitch-continuum [15].

Observers were asked to locate on a numbered scale running from the floor to the ceiling the position of tones coming from a Western Electric No. 2-A Audiometer. The scale was 2½ meters in height and divided into 14 equal parts. The observer sat facing the scale at a distance of 3 meters, while the experimenter operated the audiometer in back of a large screen to which the scale was attached. Five tones were used: 256, 512, 1024, 2048, and 4096. They were led to a telephone receiver and were presented in haphazard order at five different positions in back of the vertical scale. The observers were allowed to know that the receiver was being placed at different points up and down the vertical scale in order that they might be put on their guard against making judgments on the basis of difference of pitch which had nothing to do with difference of pitch-location [16]. The instructions to the observers were simple. They were merely asked to indicate by one of the numbers on the scale the region from which the tone seemed to come.

Only at the outset did the observers experience any difficulty with the judgment. The tonal impression seemed at first to pervade the whole room, but as the attentional direction fell in line with the task imposed by the instructions this difficulty entirely vanished and the judgments were made easily and quickly, and with surprising consistency.

The results are clear-cut and unequivocal. High tones are phenomenologically higher in space than low ones. For every observer the tones were uniformly placed in the order, from top to bottom, 4096, 2048, 1024, 512, and 256. Not a single inversion of this order occurred in the average values (Table I), and only an occasional reversal appeared between the single items of a series. Hence one may say that of two tones of different pitch the one of greater frequency is called higher, not because of any extraneous associations with altitude, but simply because it is perceived as occupying a higher position in phenomenological space.

Table I. Positions on a vertical scale (numbered 1 to 15) at which tones of different pitch were localized.
Each figure represents the average of ten judgments.

                        Observer
Pitch      A      B      C      D      E      F
4096     12.4   10.4   13.6   13.4   10.0   14.4
2048      9.4    9.4   10.7   11.0    9.1   11.8
1024      7.8    8.3    8.8    8.0    8.4    9.7
 512      6.4    7.1    7.2    7.4    6.8    6.9
 256      4.6    6.2    6.4    5.4    5.8    1.9
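
The ordering claimed for these averages can be checked mechanically. What follows is a minimal Python sketch and is not part of the original report. It transcribes the averages as printed, counts the cases in which an observer placed a lower pitch at or above the next higher one, and computes a pooled mean across the six observers. The names TABLE_I, count_inversions, and mean_positions are illustrative, and the pooled means are an added summary that the original does not report.

```python
# Average localizations as printed in the table (scale positions 1-15).
# Keys are the five pitches; each list holds the values for observers A-F.
TABLE_I = {
    4096: [12.4, 10.4, 13.6, 13.4, 10.0, 14.4],
    2048: [9.4, 9.4, 10.7, 11.0, 9.1, 11.8],
    1024: [7.8, 8.3, 8.8, 8.0, 8.4, 9.7],
    512: [6.4, 7.1, 7.2, 7.4, 6.8, 6.9],
    256: [4.6, 6.2, 6.4, 5.4, 5.8, 1.9],
}
OBSERVERS = "ABCDEF"


def count_inversions(table):
    """Count observer/adjacent-pitch pairs where the lower pitch is NOT placed lower."""
    pitches = sorted(table)  # ascending pitch: 256 ... 4096
    inversions = 0
    for col in range(len(OBSERVERS)):
        for low, high in zip(pitches, pitches[1:]):
            if table[low][col] >= table[high][col]:
                inversions += 1
    return inversions


def mean_positions(table):
    """Pooled mean position per pitch across observers (illustrative, not in the original)."""
    return {pitch: round(sum(vals) / len(vals), 2) for pitch, vals in table.items()}


if __name__ == "__main__":
    print("inversions:", count_inversions(TABLE_I))
    for pitch, mean in sorted(mean_positions(TABLE_I).items(), reverse=True):
        print(pitch, mean)
```

On the values as printed, count_inversions(TABLE_I) returns 0, which agrees with the statement that no inversion of the order occurred in the average values.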

An objection is likely to be raised at this point. May it not be that the observers were not really making judgments of spatial location, but were merely assigning to clearly recognized differences in pitch the spatial characters which inevitably attach to the terms universally employed to designate these pitch-differences? In other words, an observer calls a tone high in space simply because he recognizes it as high in pitch. I can only reply that on the basis of the introspective conviction of the observers and myself I think emphatically that such is not the case. As soon as one has his attention directed to the spatial property of a tone the phenomenon of pitch-locality becomes very real and unmistakable. Moreover, the pitches used in the experiment all bore the octave-relation to each other. Now it is a well-known fact that confusions of the octave within which a pitch lies are very frequent—a judgment sometimes being off by as much as two or three octaves. One would then naturally suppose that if the observers were placing the notes on the basis of pitch-quality the error of octave-confusions (especially since these tones were practically pure) would introduce frequent reversals of localization. But the significant fact is that such reversals rarely occurred. A given pitch was placed always very close to the same point.

These results possess obvious implications and consequences. If they receive further verification, auditory theory must look for the physiological correlate which underlies the spatial difference in pitch. The fact that on any place-theory of hearing the lowest tones would fall at the apex and the highest tones at the base of the cochlea opposite the oval window no more means that we hear the world upside down than the inversion of the retinal image forces us to stand on our heads to see the world right side up [17].  The experiments were done, however, not so much with auditory theory in mind as with the query as to whether the results would throw any light on the moot question of the apparent auditory movement which is set up by tones of different pitch when presented in succession.

We usually think of movement as the change in spatial location of an object which during the change is recognized as the same object or quality. But in musical movement it appeared as though the qualities changed while the spatial location was constant. From the present results, however, one would have to argue that musical movement resembles any other kind of movement to the extent that when different pitches are presented successively they change their spatial location with respect to one another. In presenting successively, e.g., the notes of the diatonic scale from C3 to C4 one is aware of upward movement because each note is actually perceived as occupying a higher spatial position than the preceding one. This fact will surely be of importance in the analysis of the variegated movements produced by temporal shifts of tonal quality in music.

(Manuscript received November 15, 1929)


1.  Köhler has attached great weight to these cross-modality references in his attempt to account for the facility with which we ascribe to others experiences similar to our own. Cf. his Gestalt Psychology, 1928, 241 ff.

2.  C. Stumpf, Tonpsychologie, 1883, I, 192 ff.

3.  "Nun sind natürlich die Bezeichnungen 'tief' und 'hoch' für verschiedene Tonqualitäten bildliche Bezeichnungen, die jedenfalls zu einem wesentlichen Teile durch die Gefühle mitbestimmt werden, die sich schon mit den einfachen Tonempfindungen verbinden." W. Wundt, Grundzüge der physiologischen Psychologie, 1910, II, 78.

3a.  [Translation by Christopher Aruffo:] "Now, of course, the terms 'low' and 'high' for various musical sound qualities are affected, at least to some significant degree, from the emotional response connected to the simple sensation of the sound."

4.  The emphatic naivete of the physicist, when dealing with matters psychological, is well illustrated by the following quotation. "There is no reason, either physical or psychological, for regarding notes of different pitch as being one 'higher' and the other 'lower.' Considered from the point of view of the pulsation alone, there is no reason for regarding A1 as being 'higher' than A. The pulsation of the former note is, indeed, shorter by half than that of the latter. But this fact offers no possible excuse for calling it a 'higher' note... Custom should not blind us to the fact that there is no reason other than custom for speaking of one note as being 'higher' than another. Nothing actually happening, either in the air or within consciousness, furnishes any justification whatever for the practice." J. Redfield, Music: a Science and an Art, 1928, 42 ff.

5.   Stumpf, op. cit., 202 f.

6.  Ibid., 203.

7.  Ibid., 207 ff.

8.  Ibid., 211 f.

9.  Ibid., 202-226.

10.  Hugo Riemann, after raising objections to most of Stumpf's views on music, seems even to reject Stumpf's conclusion regarding pitch. "Wir verbinden erstens ganz bestimmt doch auch mit dem einzelnen Tone die Vorstellung oder Empfindung von dessen Stellung in dem keineswegs unendlichen Tonraume und zwar um so bestimmter, je weniger ihm etwas von der Begrenztheit des ihn hervorbringenden Organs anhörbar ist; und zweitens ist die Vorstellung der räumlichen Entfernung zweier einander folgenden Töne eine sehr genau bestimmte, wenn auch nicht nach Centimetern oder Metern, so doch nach Oktaven, Quinten, usw. gemessene. Gerade die offen zu Tage liegende Unterscheidung von Zeitmessung (Metrum, Rhythmus, Takt, Tempo) und Messung der Tonhöhen-Abstände in der Melodiebewegung und im akkordischen Lagenwesen ist ganz besonders geeignet, die Immanenz räumlicher Vorstellungen im Tonhöhenbewusstsein zu erweisen." H. Riemann, Die Elemente der musikalischen Aesthetik, 1900, 38.

10a.  [Translation by Christopher Aruffo:]  We combine principally with an individual tone the idea or perception of its position in the infinite spectrum, somewhat influenced by the timbre of the instrument which generated it; secondly, the idea of the spatial distance between two successive tones is very specific, albeit not measured by meters or centimeters but octaves, fifths, and the like.  Indeed, the method of distinguishing between time measurement (meter, rhythm, measure, tempo) and measurement of tone height distances in the melody, and movement in the tonality, proves particularly suited to the spatial movements of pitch awareness.

11.  E. B. Titchener, A Text-book of Psychology, 1910, 94f.

12.  Ibid., 94.

13.  A Beginner's Psychology, 1915, 52.

14.  In a recent book by D. W. Prall, Esthetic Judgment, 1929, the suggestion is made that high and low tones may be spatially high and low. "The words we use for these extremes (of pitch) are high and low, but we must remember that we use them somewhat figuratively, as ultimately we use all words. Height is spatial and so is depth, and it is somewhat of an accident, perhaps, that high and low are our most nearly literal description of variations of pitch... And perhaps it is also true that all high sounds seem to come from higher spatial regions than low ones. Thunder itself, though it originates in the clouds, comes to us from the trembling earth beneath and all about us, and it is not entirely figurative to say that sounds from heights are high and sounds from depths are low." (83 f. Italics are mine.) Just what position Prall is trying to defend here is difficult to say. In the italicized passages he expresses a view which has been generally rejected, but which the present article will attempt to defend. A few sentences earlier, however, he states that the words high and low are used figuratively or accidentally. It is a convenient hypothesis that works equally well in opposing directions! And surely no one will deny that sounds which come from heights are high ones. But how about their pitch? Many sounds, including thunder, have their pitch-salient so obscured that localization of their source (in the clouds or the trembling earth) would reveal nothing about the localization of their pitch.

15.  The idea for the present experiments was first suggested by my colleague, Dr. J. G. Beebe-Center, during a conversation in which the question arose as to whether the apparent movement in a musical phrase might not be due to actual differences in the spatial position of the pitches.

16.  The fact that accurate localization of the source of a sound in the median plane is almost impossible was ignored on the assumption that even if the observers happened to remember this fact they would probably not consider it critically in connection with their instructions, and that they would therefore be set rigidly to judge in terms of pitch-localization rather than pitch-quality. For three of the observers localization tended to vary directly (but very slightly) with the position of the receiver. For the other three observers the position of the receiver had no effect whatsoever.

17.  One observer stood on his head and found that the direction of high and low tones then seemed reversed.