Patent classifications
H04S7/301
Method and system for generating an HRTF for a user
A method of obtaining a head-related transfer function for a user is provided. The method comprises generating an audio signal for output by a handheld device and outputting the generated audio signal at a plurality of locations by moving the handheld device to those locations. The audio output by the handheld device is detected at left-ear and right-ear microphones. A pose of the handheld device relative to the user's head is determined for at least some of the locations. One or more personalised HRTF features are then determined based on the detected audio and corresponding determined poses of the handheld device. The one or more personalised HRTF features are then mapped to a higher-quality HRTF for the user, wherein the higher-quality HRTF corresponds to an HRTF measured in an anechoic environment. This mapping may be learned using machine learning, for example. A corresponding system is also provided.
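As an illustration of the feature-extraction step, the sketch below derives two simple binaural cues from a pair of ear recordings and stores them per device pose. The abstract does not say which features are used; the choice of interaural time difference (ITD) and interaural level difference (ILD), the function name `interaural_features`, and the pose convention are all assumptions made for this example only.

```python
import math

def interaural_features(left, right, sample_rate, max_lag):
    """Estimate (itd_seconds, ild_db) from equal-length ear recordings.

    Assumption: ITD and ILD stand in for the 'personalised HRTF features';
    the source does not name the features it uses.
    A positive ITD means the right-ear signal lags the left-ear signal.
    """
    if len(left) != len(right):
        raise ValueError("ear recordings must have the same length")
    n = len(left)
    # Search for the lag that maximises the cross-correlation of the two ears.
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(left[i] * right[i + lag] for i in range(n) if 0 <= i + lag < n)
        if score > best_score:
            best_lag, best_score = lag, score
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    if energy_left == 0 or energy_right == 0:
        raise ValueError("both recordings must contain signal energy")
    ild_db = 10.0 * math.log10(energy_left / energy_right)
    return best_lag / sample_rate, ild_db

# Hypothetical measurement: pose is (azimuth_deg, elevation_deg) of the handheld
# device, with negative azimuth meaning the user's left (assumed convention).
measurements = {
    (-90, 0): ([0.0, 1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0, 0.0]),
}
features_by_pose = {
    pose: interaural_features(l, r, sample_rate=48000, max_lag=3)
    for pose, (l, r) in measurements.items()
}
```

A table like `features_by_pose` is the kind of input that a learned mapping to a higher-quality HRTF could consume; that mapping itself is not sketched here.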
Sum-difference arrays for audio playback devices
In some embodiments, a method comprises receiving audio content comprising left input channel signals and right input channel signals, and generating first and second input signals from the left and right input channel signals. The first input signal is based on a sum of the left and right input channel signals, and the second input signal is based on a difference of the left and right input channel signals. An array transfer function is applied to the first and second input signals to produce audio output signals, which can be provided to a plurality of audio transducers on one or more playback devices.
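The sum and difference step can be sketched in a few lines. In the example below the array transfer function is reduced to one pair of frequency-independent gains per transducer, which is a simplifying assumption; the abstract does not specify its form, and the names `sum_difference` and `apply_array_transfer` are hypothetical.

```python
def sum_difference(left, right):
    """Return (sum_signal, diff_signal) for equal-length lists of samples."""
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    sum_sig = [l + r for l, r in zip(left, right)]
    diff_sig = [l - r for l, r in zip(left, right)]
    return sum_sig, diff_sig

def apply_array_transfer(sum_sig, diff_sig, gains):
    """Mix the sum and difference signals into one output per transducer.

    Assumption: each transducer uses a constant (g_sum, g_diff) gain pair.
    A real array transfer function would generally be frequency dependent.
    """
    return [
        [g_sum * s + g_diff * d for s, d in zip(sum_sig, diff_sig)]
        for g_sum, g_diff in gains
    ]

left = [1.0, 0.5]
right = [0.0, 0.5]
sum_sig, diff_sig = sum_difference(left, right)
# With gains (0.5, 0.5) and (0.5, -0.5) the two outputs reproduce the
# original left and right channels, which is a handy sanity check.
outputs = apply_array_transfer(sum_sig, diff_sig, [(0.5, 0.5), (0.5, -0.5)])
```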
DEPTH SENSING VIA DEVICE CASE
Examples are disclosed that relate to displaying a hologram via an HMD. One disclosed example provides a method comprising obtaining depth data from a direct-measurement depth sensor included in the case for the HMD, the depth data comprising a depth map of a real-world environment. The method further comprises determining a distance from the HMD to an object in the real-world environment using the depth map, obtaining holographic imagery for display based at least upon the distance, and outputting the holographic imagery for display on the HMD.
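Because the depth sensor sits in the case rather than in the HMD, the distance step needs both a depth reading and the HMD's position relative to the sensor. The sketch below assumes a pinhole camera model and an HMD position already expressed in the sensor's coordinate frame; neither detail comes from the abstract, and the function names are hypothetical.

```python
import math

def backproject(u, v, depth, fx, fy, cx, cy):
    """Convert pixel (u, v) with depth in metres to a 3D point in the sensor frame.

    Assumption: a pinhole model with focal lengths (fx, fy) and principal
    point (cx, cy), all in pixels.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

def distance_hmd_to_object(depth_map, pixel, intrinsics, hmd_position):
    """Distance from the HMD to the surface seen at `pixel` of the depth map.

    depth_map is indexed as depth_map[row][column]; hmd_position is the HMD
    location in the sensor frame (assumed to be known from tracking).
    """
    u, v = pixel
    depth = depth_map[v][u]
    if depth <= 0:
        raise ValueError("no valid depth at this pixel")
    point = backproject(u, v, depth, *intrinsics)
    return math.dist(point, hmd_position)

depth_map = [[2.0, 2.0], [2.0, 2.0]]   # metres
intrinsics = (1.0, 1.0, 1.0, 1.0)      # fx, fy, cx, cy (toy values)
# HMD assumed to be one metre behind the sensor along its optical axis.
distance = distance_hmd_to_object(depth_map, (1, 1), intrinsics, (0.0, 0.0, -1.0))
```

The resulting distance could then drive the choice of holographic imagery, for example by scaling or placing content at that depth.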
Apparatus and method for processing volumetric audio
A method including receiving an audio scene including at least one source captured using at least one near field microphone and at least one far field microphone. The method includes determining at least one room-impulse-response (RIR) associated with the audio scene based on the at least one near field microphone and the at least one far field microphone, accessing a predetermined scene geometry corresponding to the audio scene, and identifying a best match to the predetermined scene geometry in a scene geometry database. The method also includes performing an RIR comparison based on the at least one RIR and at least one geometric RIR associated with the best matching geometry, and rendering a volumetric audio scene based on a result of the RIR comparison.
Selecting spatial locations for audio personalization
An audio system generates customized head-related transfer functions (HRTFs) for a user. The audio system receives an initial set of estimated HRTFs. The initial set of HRTFs may have been estimated using a trained machine learning and computer vision system and pictures of the user's ears. The audio system generates a set of test locations using the initial set of HRTFs. The audio system presents test sounds at each of the initial set of test locations using the initial set of HRTFs. The audio system monitors user responses to the test sounds. The audio system uses the monitored responses to generate a new set of estimated HRTFs and a new set of test locations. The process repeats until a threshold accuracy is achieved or until a set period of time expires. The audio system presents audio content to the user using the customized HRTFs.
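The refinement loop described above can be sketched as follows. The callbacks `pick_locations`, `present_and_score`, and `update` are hypothetical placeholders for the parts the abstract leaves open, and scoring each response as a number between 0 and 1 is an assumption.

```python
import time

def personalize(initial_hrtfs, pick_locations, present_and_score, update,
                target_accuracy, time_budget_s, clock=time.monotonic):
    """Refine HRTFs until the accuracy target is met or the time budget expires.

    Assumptions: present_and_score returns a score in [0, 1] for one test
    location, and accuracy is the mean score over the current locations.
    """
    hrtfs = initial_hrtfs
    start = clock()
    while True:
        locations = pick_locations(hrtfs)
        if not locations:
            raise ValueError("pick_locations must return at least one location")
        responses = [present_and_score(hrtfs, loc) for loc in locations]
        accuracy = sum(responses) / len(responses)
        if accuracy >= target_accuracy or clock() - start >= time_budget_s:
            return hrtfs, accuracy
        # Use the monitored responses to produce a new HRTF estimate; the next
        # pass through the loop derives new test locations from it.
        hrtfs = update(hrtfs, locations, responses)

# Toy stand-ins: the "HRTF set" is a single quality number that improves per round.
final_hrtfs, final_accuracy = personalize(
    initial_hrtfs=0.25,
    pick_locations=lambda h: [0, 1],
    present_and_score=lambda h, loc: min(1.0, h),
    update=lambda h, locs, resp: h + 0.25,
    target_accuracy=0.75,
    time_budget_s=60.0,
)
```

Passing the clock in as a parameter keeps the time limit testable without waiting in real time.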
SYSTEM AND METHOD FOR AUTOMATICALLY TUNING DIGITAL SIGNAL PROCESSING CONFIGURATIONS FOR AN AUDIO SYSTEM
Embodiments include a processing device communicatively coupled to a plurality of audio devices comprising at least one microphone and at least one speaker, and to a digital signal processing (DSP) component having a plurality of audio input channels for receiving audio signals captured by the at least one microphone, the processing device being configured to identify one or more of the audio devices based on a unique identifier associated with each of said one or more audio devices; obtain device information from each identified audio device; and adjust one or more settings of the DSP component based on the device information. A computer-implemented method of automatically configuring an audio conferencing system, comprising a digital signal processing (DSP) component and a plurality of audio devices including at least one speaker and at least one microphone, is also provided.
Orientation-based playback device microphone selection
Aspects of a multi-orientation playback device including at least one microphone array are discussed. A method may include determining an orientation of the playback device which includes at least one microphone array and determining at least one microphone training response for the playback device from a plurality of microphone training responses based on the orientation of the playback device. The at least one microphone array can detect a sound input, and the location information of a source of the sound input can be determined based on the at least one microphone training response and the detected sound input. Based on the location information of the source, the directional focus of the at least one microphone array can be adjusted, and the sound input can be captured based on the adjusted directional focus.
Audio response playback
A computing device is configured to perform functions comprising: receiving, via a network microphone device of a media playback system, a voice command detected by at least one microphone of the network microphone device, wherein the media playback system comprises a plurality of zones, and the network microphone device may be a member of a default playback zone. The computing device may be further configured to perform functions comprising: dynamically selecting an audio response zone from the plurality of zones to play an audio response to the voice input and forgoing selection of the default playback zone. The selected zone may comprise a playback device, and the dynamically selecting may comprise determining that the network microphone device is paired with the playback device. The computing device may cause the playback device of the selected zone to play the audio response.
Audio processing
A method for rendering a spatial audio signal that represents a sound field in a selectable viewpoint audio environment that includes one or more audio objects associated with respective audio content and a respective position in the audio environment. The method includes receiving an indication of a selected listening position and orientation in the audio environment; detecting an interaction concerning a first audio object on the basis of one or more predefined interaction criteria; modifying the first audio object and one or more further audio objects linked thereto; and deriving the spatial audio signal that includes at least audio content associated with the modified first audio object in a first spatial position of the sound field that corresponds to its position in the audio environment in relation to said selected listening position and orientation, and audio content associated with the modified one or more further audio objects.
Generating sound zones using variable span filters
The invention provides a method for generating, by means of a processor system, output filters for a plurality of loudspeakers at respective positions for playback of a plurality of different input signals in respective spatially different sound zones. The method comprises computing spatio-temporal correlation matrices in response to spatial information, e.g. measured transfer functions, and in response to desired sound pressures in the plurality of sound zones. A joint eigenvalue decomposition of the spatial correlation matrices, or at least an approximation thereof, is then computed to arrive at eigenvectors. Next, variable span filters are formed from a linear combination of the eigenvectors in response to a desired trade-off between acoustic contrast and acoustic errors in the sound zones. Finally, an output filter is generated for each of the plurality of loudspeakers, for each of the plurality of input signals, in accordance with the variable span filters. The method is also applicable to optimization in one zone, e.g. for room equalization.
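For orientation, one way to write such a filter is shown below. It follows a formulation known from the sound-zone literature and is an assumption here, not text from the patent: R_B and R_D denote the correlation matrices of a "bright" (target) zone and a "dark" (quiet) zone, r_B the cross-correlation with the desired sound pressure, V the number of eigenvectors used (the span), and μ ≥ 0 the trade-off weight. All of these symbols are introduced for this example.

```latex
% Joint diagonalisation of the two correlation matrices (assumed notation):
\[
  U^{T} R_B U = \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_N),
  \qquad
  U^{T} R_D U = I,
  \qquad
  \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_N .
\]
% Minimising  q^T R_B q - 2 q^T r_B + \mu\, q^T R_D q  over the span of the
% first V eigenvectors u_1, ..., u_V gives the variable span filter:
\[
  q(V, \mu) = \sum_{v=1}^{V} \frac{u_v^{T} r_B}{\lambda_v + \mu}\, u_v .
\]
```

In this form a small V with a large μ favours acoustic contrast between the zones, while V = N with a small μ favours low reproduction error in the target zone, which is the trade-off the abstract refers to.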