Patent classifications
G10L21/14
Digital audio workstation with audio processing recommendations
Presentation of a recommendation to a user for individual processing of audio tracks in a digital audio workstation. Training audio tracks are provided to a human sound mixer and responsive to the training audio tracks individually processed training audio tracks are received from the human sound mixer. The training audio tracks and the individually processed training audio tracks are input to a machine to train the machine. Audio processing operations are output from the trained machine and stored in a record of a database.
Method and system for learning and using latent-space representations of audio signals for audio content-based retrieval
A method and system are provided for extracting features from digital audio signals which exhibit variations in pitch, timbre, decay, reverberation, and other psychoacoustic attributes and learning, from the extracted features, an artificial neural network model for generating contextual latent-space representations of digital audio signals. A method and system are also provided for learning an artificial neural network model for generating consistent latent-space representations of digital audio signals in which the generated latent-space representations are comparable for the purposes of determining psychoacoustic similarity between digital audio signals. A method and system are also provided for extracting features from digital audio signals and learning, from the extracted features, an artificial neural network model for generating latent-space representations of digital audio signals which take care of selecting salient attributes of the signals that represent psychoacoustic differences between the signals.
Method and system for learning and using latent-space representations of audio signals for audio content-based retrieval
A method and system are provided for extracting features from digital audio signals which exhibit variations in pitch, timbre, decay, reverberation, and other psychoacoustic attributes and learning, from the extracted features, an artificial neural network model for generating contextual latent-space representations of digital audio signals. A method and system are also provided for learning an artificial neural network model for generating consistent latent-space representations of digital audio signals in which the generated latent-space representations are comparable for the purposes of determining psychoacoustic similarity between digital audio signals. A method and system are also provided for extracting features from digital audio signals and learning, from the extracted features, an artificial neural network model for generating latent-space representations of digital audio signals which take care of selecting salient attributes of the signals that represent psychoacoustic differences between the signals.
Methods and systems for manipulating audio properties of objects
In one implementation, a method of changing an audio property of an object is performed at a device including one or more processors coupled to non-transitory memory. The method includes displaying, using a display, a representation of a scene including a representation of an object associated with an audio property. The method includes displaying, using the display, in association with the representation of the object, a manipulator indicating a value of the audio property. The method includes receiving, using one or more input devices, a user input interacting with the manipulator. The method includes, in response to receiving the user input, changing the value of the audio property based on the user input and displaying, using the display, the manipulator indicating the changed value of the audio property.
Audio processing for detecting occurrences of loud sound characterized by brief audio bursts
A boundary of a highlight of audiovisual content depicting an event is identified. The audiovisual content may be a broadcast, such as a television broadcast of a sporting event. The highlight may be a segment of the audiovisual content deemed to be of particular interest. Audio data for the audiovisual content is stored, and the audio data is automatically analyzed to detect one or more audio events indicative of one or more occurrences to be included in the highlight. Each audio event may be a brief, high-energy audio burst such as the sound made by a tennis serve. A time index within the audiovisual content, before or after the audio event, may be designated as the boundary, which may be the beginning or end of the highlight.
Audio processing for detecting occurrences of loud sound characterized by brief audio bursts
A boundary of a highlight of audiovisual content depicting an event is identified. The audiovisual content may be a broadcast, such as a television broadcast of a sporting event. The highlight may be a segment of the audiovisual content deemed to be of particular interest. Audio data for the audiovisual content is stored, and the audio data is automatically analyzed to detect one or more audio events indicative of one or more occurrences to be included in the highlight. Each audio event may be a brief, high-energy audio burst such as the sound made by a tennis serve. A time index within the audiovisual content, before or after the audio event, may be designated as the boundary, which may be the beginning or end of the highlight.
METHODS AND SYSTEMS FOR COMPUTER-GENERATED VISUALIZATION OF SPEECH
Methods, systems and apparatuses for computer-generated visualization of speech are described herein. An example method of computer-generated visualization of speech including at least one segment includes: generating a graphical representation of an object corresponding to a segment of the speech; and displaying the graphical representation of the object on a screen of a computing device. Generating the graphical representation includes: representing a duration of the respective segment by a length of the object and representing intensity of the respective segment by a width of the object; and placing, in the graphical representation, a space between adjacent objects.
METHODS AND SYSTEMS FOR COMPUTER-GENERATED VISUALIZATION OF SPEECH
Methods, systems and apparatuses for computer-generated visualization of speech are described herein. An example method of computer-generated visualization of speech including at least one segment includes: generating a graphical representation of an object corresponding to a segment of the speech; and displaying the graphical representation of the object on a screen of a computing device. Generating the graphical representation includes: representing a duration of the respective segment by a length of the object and representing intensity of the respective segment by a width of the object; and placing, in the graphical representation, a space between adjacent objects.
HEADSET ENABLING EXTRAORDINARY HEARING
An apparatus includes a headphone driver and a processor in communication with the headphone driver. The processor is configured to receive an audio setting selection from among a plurality of audio setting selections. Each audio setting selection is associated with a frequency range that includes at least one frequency that is outside of a range of human hearing. The processor is further configured to receive an audio signal and to process the audio signal according to the selected audio setting selection to generate an output signal. The processor is configured to provide an output signal to the headphone driver.
HEADSET ENABLING EXTRAORDINARY HEARING
An apparatus includes a headphone driver and a processor in communication with the headphone driver. The processor is configured to receive an audio setting selection from among a plurality of audio setting selections. Each audio setting selection is associated with a frequency range that includes at least one frequency that is outside of a range of human hearing. The processor is further configured to receive an audio signal and to process the audio signal according to the selected audio setting selection to generate an output signal. The processor is configured to provide an output signal to the headphone driver.