Michael Cohen was born in Hartford, Connecticut, USA on 31 March 1959. He received an Sc.B. in EE from Brown University (Providence, Rhode Island) in 1980, an M.S. in CS from the University of Washington (Seattle) in 1988, and a Ph.D. in EECS from Northwestern University (Evanston, Illinois) in 1991. He has worked at the Air Force Geophysics Lab (Hanscom Field, Massachusetts), Weizmann Institute (Rehovot, Israel), Teradyne (Boston, Massachusetts), BBN (Cambridge, Massachusetts and Stuttgart, Germany), Bellcore (Morristown and Red Bank, New Jersey), the Human Interface Technology Lab (Seattle, Washington), and the Audio Media Research Group at the NTT Human Interface Lab (Musashino and Yokosuka, Japan). He is currently an Associate Professor in the Human Interface Lab at the University of Aizu (Aizu-Wakamatsu, Japan).
He has research interests in user interfaces for telecommunications, groupware and CSCW (computer-supported cooperative work), stereotelephonics, digital typography and electronic publishing, hypermedia, ubicomp (ubiquitous computing), and virtual reality. Besides teaching various undergraduate courses, including Information Theory, he co-teaches graduate-level courses in Acoustic Modeling and Computer Music.
He is a member of the ACM, IEEE, 3D-Forum, TUG (TeX Users Group), and VRSJ (Virtual Reality Society of Japan). Cohen is the author or coauthor of two patents, three book chapters, and over forty papers.
Current foci of spatial audio research in recent literature comprise sound localization; lateralization and binaural masking; echoes, precedence, and depth perception; motion perception; sound source segregation and free-field masking; physiology of spatial hearing; models of spatial hearing; (childhood) development of spatial hearing; and applications of binaural technology to auditory displays for human-computer interaction. This seminar cuts across these categories to outline the current state of the art in spatial auditory displays for a particular range of applications, emphasizing how well the technology can be expected to produce the specific user responses those applications require. In doing so, it considers the value of spatial audio technology in the creation and presentation of virtual environments. The shared synthetic worlds that networked computer users occupy constitute an alternative reality that has come to be termed "cyberspace." Auditory display technology that attempts to provide such users with satisfying experiences of virtual acoustical space is termed here "cyberspatial audio" technology.
| Scale | Setting | Audio display | Visual display |
|---|---|---|---|
| intimate personal | headset, wearable computers; chair | eartop (ex: headphones); nearphones | eyetop (ex: HMDs); laptop display, desktop monitor |
| inter-personal | couch or bench | transaural speakers; SDP (stereo dipole) | HDTV |
| multi-personal | automobiles | surround sound (ex: Ambisonics) | projection; spatially immersive displays (ex: Cave, Cabin) |
| social | clubs, theaters | speaker array (ex: VBAP [vector-based amplitude panning]) | large-screen displays (ex: IMAX) |
| public | stadia, concert arenas | public address | (ex: Jumbotron) |
Precursors to modern spatial audio systems, many of which are still sold with labels like "3D sound" or "multidimensional sound," include spatial enhancers and stereo spreaders. Spatial enhancers filter mono or stereo inputs to add a sense of depth and spaciousness to the signal, allowing for simple operation and backwards compatibility, but precluding placement of individual sounds. Stereo spreaders filter mono inputs, placing the sound along a linear range to extend the sound field. While such approaches allow for realtime user adjustments, the techniques do not include distance or elevation models.
The most direct approach to spatializing audio is to simply position sound sources relative to the listeners, as in antiphonal concerts. Directly spatialized audio has charm, as in attractions like Wisconsin's House on the Rock (www.thehouseontherock.com), a museum which features entire rooms lined with orchestral automata, but it is not practical for anything but special-purpose venues and LBE (location-based entertainment). Fully articulated spatial audio allows dynamic (runtime), arbitrary placement and movement of multiple separate sources in a soundscape, as well as extra dimensions encoding sound image size, orientation, and environmental characteristics.
Delivery mechanisms for virtual spatial audio can be organized along a continuum of scale. At one end of the spectrum, simple amplitude panning (balance), perhaps best deployed as a constant-power cross-fader, can be thought of as a "poor person's spatializer." In conjunction with exocentric visual cues (like a map of sources and sinks), even such a degenerately simple technique, capable of only lateral (left<->right) effects, can be effective for some applications. In conjunction with egocentric visual cues (like first-person perspective shifts in a large part of the visual field), lateral shifts in the auditory image can disambiguate frontward from rearward sound source incidence angles. Therefore, this simple manipulation can carry surprisingly useful information in applications allowing locomotion through a virtual environment (such as computer games that require tracking other players while exploring a 3D-model-based world).
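As an illustration of the constant-power cross-fader mentioned above, the following is a minimal sketch in Python, not a description of any particular system. It assumes the common sine/cosine pan law, in which the left and right gains trace a quarter circle so that their squares always sum to one. The function names `constant_power_gains` and `pan_mono`, and the convention that the pan position runs from -1.0 (full left) to +1.0 (full right), are hypothetical choices made for this example.

```python
import math


def constant_power_gains(pan):
    """Return (left_gain, right_gain) for a pan position in [-1.0, +1.0].

    Assumed convention: -1.0 is full left, 0.0 is center, +1.0 is full right.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must lie in [-1.0, +1.0]")
    # Map the pan position onto an angle between 0 and pi/2.
    theta = (pan + 1.0) * math.pi / 4.0
    # cos^2 + sin^2 = 1, so total power is the same at every pan position.
    return math.cos(theta), math.sin(theta)


def pan_mono(samples, pan):
    """Spread a sequence of mono samples into (left, right) sample pairs."""
    left_gain, right_gain = constant_power_gains(pan)
    return [(s * left_gain, s * right_gain) for s in samples]


if __name__ == "__main__":
    # Sweep the image from left to right and report the summed power.
    for pan in (-1.0, -0.5, 0.0, 0.5, 1.0):
        left, right = constant_power_gains(pan)
        power = left * left + right * right
        print(f"pan={pan:+.1f}  L={left:.3f}  R={right:.3f}  power={power:.3f}")
```

A simple linear cross-fade would instead give each channel a gain of one half at center, which is perceived as a dip in loudness as the image passes the middle. The sine/cosine law avoids this by holding the summed power constant. As the text notes, such a panner conveys only lateral position, so any front/back distinction has to come from accompanying visual cues.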