G10L25/03

Microphone with adjustable signal processing

A microphone may comprise a microphone element for detecting sound, and a digital signal processor configured to process a first audio signal that is based on the sound in accordance with a selected one of a plurality of digital signal processing (DSP) modes. Each DSP mode may process the first audio signal in a different way. For example, the DSP modes may account for the distance of the person speaking (e.g., near versus far) and/or the desired tone (e.g., darker, neutral, or brighter). At least some of the modes may have an automatic level control setting to provide a more consistent volume as the user changes their distance from the microphone or changes their speaking level; this setting may be associated with particular default (and/or adjustable) values of the parameters attack, hold, decay, maximum gain, and/or target gain, each depending on which DSP mode is being applied.
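
The attack/hold/decay behavior described above can be sketched as a simple peak-tracking level-control loop. This is a minimal illustration, not the patent's implementation: the parameter names follow the abstract, but all values, the `AlcParams` structure, and the per-mode defaults ("near"/"far") are assumptions.

```python
from dataclasses import dataclass

@dataclass
class AlcParams:
    attack: float    # gain-reduction speed (per sample) when the signal is hot
    hold: int        # samples to wait before recovering gain
    decay: float     # gain-recovery speed (per sample) after hold expires
    max_gain: float  # upper bound on applied gain (linear)
    target: float    # desired peak level (linear, 0..1)

# Illustrative per-mode defaults ("near" vs. "far" speaker) -- not from the patent.
MODES = {
    "near": AlcParams(attack=0.5, hold=480, decay=0.001, max_gain=2.0, target=0.5),
    "far":  AlcParams(attack=0.5, hold=480, decay=0.002, max_gain=8.0, target=0.5),
}

def alc(samples, p):
    """Apply a simple peak-tracking automatic level control to a sample list."""
    gain, held, out = 1.0, 0, []
    for x in samples:
        if abs(x * gain) > p.target:     # too loud: attack (reduce gain quickly)
            gain = max(gain * (1 - p.attack), 1e-6)
            held = p.hold
        elif held > 0:                   # hold: freeze gain briefly
            held -= 1
        else:                            # decay: recover gain slowly, capped
            gain = min(gain * (1 + p.decay), p.max_gain)
        out.append(x * gain)
    return out
```

A quiet, distant talker is gradually boosted toward the target level (up to `max_gain`), while a sudden loud passage is pulled down within a few samples.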

Detecting deep-fake audio through vocal tract reconstruction

A method is provided for distinguishing synthetic “deep-fake” audio samples from organic audio samples. Methods may include: generating a model of a vocal tract using one or more organic audio samples from a user; identifying a set of bigram-feature pairs from the one or more audio samples; estimating the cross-sectional area of the user's vocal tract when speaking the set of bigram-feature pairs; receiving a candidate audio sample; identifying bigram-feature pairs of the candidate audio sample that are in the set of bigram-feature pairs; calculating a cross-sectional area of a theoretical vocal tract when speaking the identified bigram-feature pairs; and identifying the candidate audio sample as a deep-fake audio sample when the calculated cross-sectional area of the theoretical vocal tract fails to correspond, within a predetermined measure, to the estimated cross-sectional area of the user's vocal tract.
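
The final comparison step can be sketched as follows. The dictionary shapes, area values, and relative-error tolerance are illustrative assumptions; the patent only specifies that the candidate's cross-sectional areas must correspond to the baseline within a predetermined measure.

```python
def is_deepfake(baseline_areas, candidate_areas, tolerance=0.15):
    """Compare per-bigram vocal-tract cross-sectional areas against a baseline.

    baseline_areas / candidate_areas: dicts mapping bigram-feature pairs to
    estimated areas. Only bigrams present in both dicts are compared.
    Returns True when any shared bigram deviates from the baseline by more
    than `tolerance` (relative error), i.e. the implied anatomy is implausible.
    """
    shared = baseline_areas.keys() & candidate_areas.keys()
    for bigram in shared:
        base = baseline_areas[bigram]
        rel_err = abs(candidate_areas[bigram] - base) / base
        if rel_err > tolerance:
            return True   # anatomically implausible: likely synthetic
    return False
```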

Method and apparatus for audio data processing
11538471 · 2022-12-27

Embodiments of the disclosure provide methods and apparatuses for processing audio data. The method can include: acquiring audio data by an audio capturing device, determining feature information of an enclosure in which the audio capturing device is located, and adding reverberation to the audio data based on the feature information.
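
One common way to realize "reverberating" audio based on enclosure features is to derive an impulse response from the room characteristics and convolve it into the signal. The toy IR model below (direct sound plus exponentially decaying echoes) and its parameters are assumptions for illustration, not the patent's method.

```python
def room_impulse_response(delay_samples, decay, taps):
    """Toy room IR: direct sound plus `taps` exponentially decaying echoes."""
    ir = [0.0] * (delay_samples * taps + 1)
    ir[0] = 1.0                          # direct path
    for k in range(1, taps + 1):
        ir[k * delay_samples] = decay ** k   # k-th echo
    return ir

def convolve(signal, ir):
    """Direct-form convolution of the captured audio with the room IR."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(ir):
            out[i + j] += s * h
    return out
```

Convolving a unit impulse with the IR returns the IR itself, which is a convenient sanity check.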

DETECTION APPARATUS, METHOD AND PROGRAM FOR THE SAME

A detection device includes: a labeling acoustic feature calculation unit that calculates a labeling acoustic feature from voice data; a time information acquisition unit that acquires a label with time information corresponding to the voice data, using a labeling acoustic model that receives, as inputs, a label with no time information and the labeling acoustic feature and outputs a label with time information; an acoustic feature prediction unit that predicts an acoustic feature corresponding to the label with time information and acquires a predicted value, using an acoustic model that receives a label with time information as input and outputs an acoustic feature; an acoustic feature calculation unit that calculates an acoustic feature from the voice data; a difference calculation unit that determines the acoustic difference between the calculated acoustic feature and the predicted value; and a detection unit that detects a labeling error based on whether the difference exceeds a predetermined threshold value.
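
The last two units reduce to a distance-against-threshold test. The sketch below assumes the acoustic features are plain numeric vectors and uses Euclidean distance; the actual feature type and distance measure are not specified in the abstract.

```python
import math

def acoustic_difference(measured, predicted):
    """Euclidean distance between measured and predicted feature vectors."""
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)))

def detect_labeling_error(measured, predicted, threshold):
    """Flag a labeling error when the acoustic difference exceeds the threshold."""
    return acoustic_difference(measured, predicted) > threshold
```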

RENDERING VIRTUAL ARTICLES OF CLOTHING BASED ON AUDIO CHARACTERISTICS
20220406001 · 2022-12-22

Systems and methods for generating a virtual article of clothing at a display are described. Some examples may include obtaining video data and audio data, and analyzing the video data to determine one or more body joints of a target object appearing in the video data. A mesh may be generated based on the determined body joints. The audio data may be analyzed to determine audio characteristics associated with it. Texture rendering information associated with a virtual article of clothing may be determined based on the audio characteristics. A rendered video may be generated by rendering the virtual article of clothing onto the generated mesh using the texture rendering information.
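
The abstract does not say which audio characteristics drive the texture, so the mapping below is purely illustrative: RMS level drives brightness and a zero-crossing frequency estimate drives hue. Both the feature extraction and the color mapping are assumptions.

```python
import math
import colorsys

def audio_characteristics(samples, sample_rate):
    """Return (rms, zero-crossing-based frequency estimate) for an audio frame."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    freq = crossings * sample_rate / (2 * len(samples))
    return rms, freq

def texture_color(rms, freq, max_freq=4000.0):
    """Map loudness -> brightness and frequency -> hue; returns RGB in 0..1."""
    hue = min(freq / max_freq, 1.0)
    value = min(rms * 4.0, 1.0)
    return colorsys.hsv_to_rgb(hue, 1.0, value)
```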

COMPUTERIZED MONITORING OF DIGITAL AUDIO SIGNALS
20220399945 · 2022-12-15

A digital audio quality monitoring device uses a deep neural network (DNN) to provide accurate estimates of signal-to-noise ratio (SNR) from a limited set of features extracted from incoming audio. Some embodiments improve the SNR estimate accuracy by selecting a DNN model from a plurality of available models based on a codec used to compress/decompress the incoming audio. Each model has been trained on audio compressed/decompressed by a codec associated with the model, and the monitoring device selects the model associated with the codec used to compress/decompress the incoming audio. Other embodiments are also provided.
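
The per-codec model selection can be sketched as a simple registry that dispatches on the codec name and falls back to a default model. The class, codec names, and estimator stubs below are illustrative assumptions; the actual DNN models are not shown.

```python
class SnrMonitor:
    """Select an SNR-estimation model based on the codec of the incoming audio."""

    def __init__(self):
        self._models = {}      # codec name -> model (a callable on features)
        self._default = None   # fallback when the codec has no dedicated model

    def register(self, codec, model, default=False):
        self._models[codec] = model
        if default:
            self._default = model

    def estimate_snr(self, codec, features):
        """Pick the model trained on this codec; fall back to the default."""
        model = self._models.get(codec, self._default)
        if model is None:
            raise ValueError(f"no model registered for codec {codec!r}")
        return model(features)
```

In practice each registered model would be a trained DNN; lambdas stand in here to show the dispatch.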

Method and system for identifying recipients of a reward associated with a conversion

The present teaching relates to a method and system for evaluating a conversion. The method extracts meta-information including a conversion parameter and a reward; the meta-information corresponds to a conversion associated with an advertisement previously displayed by a plurality of entities. The method receives a plurality of claims for the conversion from one or more entities, selects a claim corresponding to an entity from the plurality of claims based on the conversion parameter and information included in the claims, and transmits information related to the selected claim.
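
The abstract leaves the selection rule open, so the sketch below assumes one plausible instantiation: the conversion parameter is an attribution window, and the winning claim is the most recent eligible ad display (last-touch). The field names and the rule itself are hypothetical.

```python
def select_claim(claims, conversion_time, window):
    """Select one claim for a conversion from competing entities.

    claims: list of dicts with 'entity' and 'display_time' keys.
    Keep claims whose ad display fell inside the attribution window before
    the conversion, then pick the most recent one (last-touch rule).
    Returns None when no claim is eligible.
    """
    eligible = [
        c for c in claims
        if 0 <= conversion_time - c["display_time"] <= window
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c["display_time"])
```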
