Auditory Perception and Spatial (3D) Auditory Systems
Bill Kapralos
Michael R. M. Jenkin
Evangelos Milios
Technical Report CS-2003-07
July 20, 2003
Department of Computer Science
4700 Keele Street, North York, Ontario, M3J 1P3, Canada
Auditory Perception and Spatial (3D) Auditory Systems (4)

B. Kapralos (1,3), M. Jenkin (1,3) and E. Milios (2,3)

1 Dept. of Computer Science, York University, Toronto, ON, Canada. M3J 1P3
2 Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada. B3H 1W5
3 Centre for Vision Research, York University, Toronto, ON, Canada. M3J 1P3

{billk, jenkin}@cs.yorku.ca
eem@cs.dal.ca
Abstract
In order to enable the user of a virtual reality system to be fully immersed in the virtual
environment, the user must be presented with believable sensory input. Although the
majority of virtual environments place the emphasis on visual cues, replicating the complex
interactions of sound within an environment will benefit the level of immersion and hence
the user's sense of presence. Three-dimensional (spatial) sound systems allow a listener to
perceive the position of sound sources, and the effect of the interaction of sound sources
with the acoustic structure of the environment. This paper reviews the biological and
technical literature relevant to the generation of accurate acoustic displays for virtual
environments, beginning with an introduction to the process of auditory perception in
humans. It then critically examines common methods and techniques that have been used
in the past, as well as methods and techniques currently in use, to generate spatial sound.
In doing so, the limitations, advantages, and disadvantages associated with these
techniques are also presented.
(4) The financial support of NSERC (Natural Sciences and Engineering Research Council of Canada),
CRESTech (Centre for Research in Earth and Space Technology) and IRIS (Institute for Robotics and
Intelligent Systems) is gratefully acknowledged.
Contents
1 Introduction
  1.1 What Exactly is Sound?
    1.1.1 Measuring Sound
    1.1.2 Near Field vs. Far Field
    1.1.3 Coordinate System
  1.2 Sound Localization
    1.2.1 Duplex Theory
    1.2.2 Head Related Transfer Function (HRTF)
    1.2.3 Reverberation
    1.2.4 Precedence Effect
    1.2.5 Head Movements
    1.2.6 Auditory Distance Perception
2 Recording Techniques
  2.1 Listener Sweet Spot
  2.2 Microphones
  2.3 Monaural Systems
  2.4 Stereophonic Techniques
    2.4.1 Artificial Stereo
    2.4.2 Coincident Microphone Techniques
    2.4.3 Spaced Microphone Techniques
    2.4.4 Combining Coincident and Spaced Microphone Techniques
  2.5 Binaural Audio
    2.5.1 Binaural Recording Techniques
  2.6 Surround Sound
    2.6.1 Quadraphonic
    2.6.2 Ambisonics
    2.6.3 Dolby Stereo
    2.6.4 Dolby Pro Logic
    2.6.5 Dolby Digital
    2.6.6 Digital Theater Systems (DTS) Digital Surround
3 Simulating Audio in a Virtual Environment
  3.1 Modeling the ITD
  3.2 Binaural Synthesis
  3.3 HRTF Measurement
    3.3.1 Interpolation of HRTFs
    3.3.2 The Use of Non-individualized ("Generic") HRTFs
    3.3.3 Available HRTF Datasets
    3.3.4 Equalization of the HRTF Impulse Response
  3.4 Modeling of Reverberation and Room Acoustics
    3.4.1 Auralization
  3.5 Distance Simulation
    3.5.1 Loudness as a Distance Cue
    3.5.2 Reverberation as a Distance Cue
    3.5.3 Source Spectral Content as a Distance Cue
    3.5.4 Binaural Cues
    3.5.5 Sound Source Familiarity
4 Conveying Sound in a Virtual Environment
  4.1 Headphone Listening
    4.1.1 Headphones and Comfort
    4.1.2 Inside-the-Head Localization
  4.2 Loudspeaker Displays
    4.2.1 Transaural Audio
    4.2.2 Amplitude Panning
5 Discussion
Chapter 1
Introduction
The sounds we hear provide us with detailed information about our surroundings and can
assist us in determining both the distance and direction to objects, at times very accurately
[159]. This ability is extremely beneficial for both humans and a variety of other species
and, in many situations, is crucial for survival. We can hear a sound in the dark, where we
may not necessarily be able to make use of vision (sight), and in contrast to the limited
visual field of view, the auditory system is omni-directional, allowing us to hear sounds
reaching us from any position in three-dimensional space. Given this omni-directional
aspect, hearing serves to guide our visual senses, or, to quote Cohen and Wenzel [30], "the
function of the ears is to point the eyes". Hearing, or audition, also serves to guide the more
"finely tuned" visual attention system, thereby easing the burden of the visual system [137].
Although sound is a critical cue to perceiving our environment, it is often overlooked
in immersive virtual environments, where, historically, emphasis has been placed on the
visual senses [30, 25]. The spatial audio cues present in many virtual environments are
rather poor and do not necessarily reflect natural cues, despite the fact that natural (spatial)
sound cues can allow users to orient themselves in a virtual environment. In addition,
audio cues can add a "pleasing quality" to the simulation, provide a better sense of "presence"
or "immersion", and compensate for poor visual cues (graphics) [3, 137]. Furthermore,
the virtual environments that actually employ spatial audio typically assume a far-field
source acoustical model, emphasizing only the direction (azimuth and elevation) to a sound
source and offering little, if any, sound source distance information [138, 108]. Despite
the importance of distance discrimination in maintaining a sense of realism among the
virtual sound sources [16], accurate sound source distance is often ignored in virtual audio