Patent classifications
H04R1/40
Apparatus, Method and Computer Program for Enabling Audio Zooming
Examples of the disclosure relate to apparatus, methods and computer programs for enabling audio zooming. The apparatus can include circuitry configured for determining, for an audio signal, whether sound energy in at least one first direction differs from sound energy in at least one second direction by at least a threshold amount. The circuitry may also be configured for controlling an amount of headroom provided based on whether or not the sound energy in the at least one first direction differs from the sound energy in the at least one second direction by at least the threshold amount.
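The headroom-control idea above can be sketched in a few lines: measure the energy in two directions, compare the difference against a threshold, and reserve more or less gain headroom accordingly. This is a minimal illustration; the function and parameter names (and the specific dB values) are assumptions, not taken from the patent.

```python
import math

def rms_energy(samples):
    """Root-mean-square energy of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def headroom_db(focus_block, other_block, threshold_db=6.0,
                large_headroom=12.0, small_headroom=3.0):
    """Choose how much headroom (dB) to reserve before boosting the
    zoomed-in direction, depending on whether the energy difference
    between the two directions exceeds the threshold.
    All numeric values here are illustrative assumptions."""
    e_focus = rms_energy(focus_block)
    e_other = rms_energy(other_block)
    diff_db = 20.0 * math.log10(max(e_focus, 1e-12) / max(e_other, 1e-12))
    # If one direction already dominates by the threshold, less
    # headroom is needed for the zoom gain; otherwise reserve more.
    return small_headroom if abs(diff_db) >= threshold_db else large_headroom
```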
DISPLAY SYSTEM AND METHOD
A system for obtaining content for display to a user of a head-mountable display device, HMD, the system comprising one or more audio detection units operable to capture audio in the environment of the user, a motion prediction unit operable to predict motion of the HMD in dependence upon the captured audio, and a content obtaining unit operable to obtain content for display in dependence upon the predicted motion of the HMD.
FACE DETECTION GUIDED SOUND SOURCE LOCALIZATION PAN ANGLE POST PROCESSING FOR SMART CAMERA TALKER TRACKING AND FRAMING
A videoconferencing system includes a camera acquiring image data and a microphone array acquiring audio data. Image data is used in conjunction with sound source localization (SSL) data to locate a talker depicted in the image data. SSL processes the audio data and determines SSL pan angle values indicative of an estimated direction of a sound. Columns of pixels in an image are associated with bins. A bin count is incremented for each SSL pan angle value of the audio data that falls within a given bin. A bounding box in the image data is determined that encompasses a face depicted in the image data. A range of pixels is determined for the bounding box, such as extending from a leftmost column to a rightmost column. The bin with the highest bin count that also overlaps the range of pixels for the bounding box is deemed to contain the talker.
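The bin-voting procedure described above is concrete enough to sketch: histogram pan angles into pixel-column bins, then pick the most-voted bin that overlaps the face bounding box. The mapping of pan angle to a normalized [0, 1] horizontal position is an assumption made here for illustration; the patent does not specify it.

```python
def locate_talker_bin(pan_angles, image_width, num_bins, face_box):
    """Return the index of the bin deemed to contain the talker.

    pan_angles: estimated sound directions, normalized to [0, 1]
                across the camera's horizontal field of view
                (an assumption for this sketch).
    face_box:   (left_col, right_col) pixel range of a detected face.
    """
    bin_width = image_width / num_bins
    counts = [0] * num_bins
    # Increment a bin count for each pan angle falling in that bin.
    for angle in pan_angles:
        col = min(int(angle * image_width), image_width - 1)
        counts[int(col // bin_width)] += 1

    left, right = face_box
    overlapping = [b for b in range(num_bins)
                   if b * bin_width <= right and (b + 1) * bin_width - 1 >= left]
    # Among bins overlapping the face's column range, pick the most voted.
    return max(overlapping, key=lambda b: counts[b]) if overlapping else None
```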
POSITION LOCATING SYSTEM, MARINE VESSEL, AND TRAILER FOR MARINE VESSEL
A position locating system that locates relative position information between a marine vessel and a trailer includes a wave signal generator located on a first object that is one of the marine vessel and the trailer to emit wave signals from at least three different positions having known relative positional relationships with each other, a wave signal receiver located on a second object that is the other of the marine vessel and the trailer to receive the wave signal emitted from each of the positions of the wave signal generator, and a position locator configured or programmed to locate relative position information between the marine vessel and the trailer that includes at least a direction of the second object as viewed from the first object based on the wave signal from each of the positions received by the wave signal receiver.
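With three emitter positions known and a distance measurable to each (e.g. from wave time-of-flight), the receiver's relative position and bearing follow from standard 2D trilateration. The sketch below solves the linearized system with Cramer's rule; it is a generic illustration of the geometry, not the patent's claimed position locator, and all names are assumptions.

```python
import math

def locate_and_bearing(emitters, distances):
    """Estimate the receiver's 2D position from distances to three
    non-collinear emitters at known positions, and return the bearing
    (radians) from the first emitter toward the receiver."""
    (x1, y1), (x2, y2), (x3, y3) = emitters
    d1, d2, d3 = distances

    def row(xi, yi, di):
        # Subtracting the first range equation from the i-th one
        # linearizes the circle intersection into a * x + b * y = c.
        a = 2.0 * (x1 - xi)
        b = 2.0 * (y1 - yi)
        c = (di * di - d1 * d1) + (x1 * x1 - xi * xi) + (y1 * y1 - yi * yi)
        return a, b, c

    a1, b1, c1 = row(x2, y2, d2)
    a2, b2, c2 = row(x3, y3, d3)
    det = a1 * b2 - a2 * b1          # nonzero for non-collinear emitters
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return (x, y), math.atan2(y - y1, x - x1)
```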
CALL ENVIRONMENT GENERATION METHOD, CALL ENVIRONMENT GENERATION APPARATUS, AND PROGRAM
Provided is a technique to generate a call environment that prevents call contents from being heard by a person other than the person speaking on the phone in a case where call voice is output from a speaker. Speakers installed in an automobile are denoted by SP_1, ..., SP_N, a first filter coefficient used to generate an input signal for a speaker SP_n is denoted by F_n(ω), and a second filter coefficient that is different from the first filter coefficient and is used to generate an input signal for the speaker SP_n is denoted by ~F_n(ω). A call environment generation method includes: an acoustic signal generation step of generating, when detecting a start signal of a call, a call-time acoustic signal that is obtained by adjusting volume of an acoustic signal to be reproduced during the call, by using a predetermined volume value; a first local signal generation step of generating a sound signal S_n as an input signal for the speaker SP_n from a voice signal of the call by using the first filter coefficient F_n(ω); and a second local signal generation step of generating an acoustic signal A_n as an input signal for the speaker SP_n from the call-time acoustic signal by using the second filter coefficient ~F_n(ω).
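The two filtering steps amount to shaping the call voice with one per-speaker filter bank F_n(ω) and the call-time (masking) audio with a second bank ~F_n(ω), then feeding each speaker the sum. A minimal frequency-domain sketch, assuming the spectra and filter banks are given as plain lists (the function and variable names are illustrative, not the patent's):

```python
def speaker_inputs(voice_spec, masker_spec, F, F_tilde):
    """Per-speaker input spectra for SP_1..SP_N.

    voice_spec, masker_spec: spectrum bins of the call voice and the
        call-time acoustic (masking) signal.
    F, F_tilde: one list of filter coefficients per speaker, each the
        same length as the spectra.
    """
    inputs = []
    for Fn, Fn_t in zip(F, F_tilde):
        s_n = [f * v for f, v in zip(Fn, voice_spec)]     # S_n = F_n(w) * voice
        a_n = [f * m for f, m in zip(Fn_t, masker_spec)]  # A_n = ~F_n(w) * masker
        # Each speaker SP_n reproduces the sum of its two local signals.
        inputs.append([s + a for s, a in zip(s_n, a_n)])
    return inputs
```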
ELECTRONIC DEVICE AND METHOD FOR PROCESSING SPEECH BY CLASSIFYING SPEECH TARGET
Various embodiments of the disclosure provide a method and a device that includes multiple cameras arranged at different positions, multiple microphones arranged at different positions, a memory, and a processor operatively connected to at least one of the multiple cameras, the multiple microphones, and the memory, wherein the processor is configured to: determine, using at least one of the multiple cameras, whether at least one of a user wearing the electronic device or a counterpart having a conversation with the user makes an utterance, configure directivity of at least one of the multiple microphones based on the determination, obtain audio from at least one of the multiple microphones based on the configured directivity, obtain an image including a mouth shape of the user or the counterpart from at least one of the multiple cameras, and process speech of an utterance target in a different manner based on the obtained audio and the image.
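The control flow implied above (camera detects who is talking, then microphone directivity is configured accordingly) can be sketched as a simple selector. The mode labels and function name are hypothetical; the disclosure does not name specific directivity modes.

```python
def configure_directivity(user_speaking, counterpart_speaking):
    """Pick a microphone directivity mode from camera-based detection
    of mouth movement: toward the wearer when the user talks, outward
    when the counterpart talks, wide when ambiguous. The returned
    label is what a downstream beamformer stage would consume."""
    if user_speaking and not counterpart_speaking:
        return "steer_inward"    # the wearer's own utterance
    if counterpart_speaking and not user_speaking:
        return "steer_outward"   # the conversation partner
    return "wide_pickup"         # both or neither: keep a wide pattern
```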
METHOD FOR GENERATING A DIGITAL MODEL-BASED REPRESENTATION OF A VEHICLE
A method for generating a digital model-based representation of a vehicle. The method includes: receiving sensor data of a plurality of acoustic sensors of a vehicle, wherein the sensor data describes sounds of the vehicle and/or sounds of an environment of the vehicle, and wherein the sensor data has been recorded for a plurality of trips of the vehicle; evaluating the sensor data and creating relations between the recorded sounds of the vehicle and/or the environment and the particular sound-causing statuses of the vehicle and/or the environment; and storing the determined relations in a model-based representation of the vehicle and/or the environment.
ELECTRONIC DEVICE
An electronic device includes a housing sidewall defining an opening and a display component, such as a display cover, disposed in the opening to form a gap between the housing sidewall and the display component. In at least one example, a cavity is defined by the sidewall and the display cover, with the cavity in fluid communication with an external environment through the gap. In at least one example, an epoxy component at least partially defines the cavity and can be in direct contact with the housing sidewall.
PRIVACY-PRESERVING SOCIAL INTERACTION MEASUREMENT
Various systems, devices, and methods for social interaction measurement that preserve privacy are presented. An audio signal can be captured using a microphone. The audio signal can be processed using an audio-based machine learning model that is trained to detect the presence of speech. The audio signal can be discarded such that content of the audio signal is not stored after the audio signal is processed using the machine learning model. An indication of whether speech is present within the audio signal can be output based at least in part on processing the audio signal using the audio-based machine learning model.
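The privacy property described above reduces to a simple pipeline: capture a block of audio, run the speech-presence model on it, discard the raw samples, and retain only the boolean indication. A minimal sketch, where `capture_block` and `speech_model` are hypothetical callables standing in for the microphone capture and the trained detector:

```python
def measure_social_interaction(capture_block, speech_model):
    """Detect speech presence without persisting audio content.

    capture_block: callable returning a fresh block of raw samples.
    speech_model:  callable returning a truthy value when speech is
                   detected in the samples.
    """
    samples = capture_block()                # raw audio from the microphone
    speech_present = speech_model(samples)   # audio-based ML detector
    del samples                              # content is never stored
    return bool(speech_present)              # only the indication survives
```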