Patent classifications
G10L21/0316
AI-BASED DJ SYSTEM AND METHOD FOR DECOMPOSING, MIXING AND PLAYING OF AUDIO DATA
The present invention relates to a method for processing and playing audio data comprising the steps of receiving mixed input data and playing recombined output data. Furthermore, the invention relates to a device 10 for processing and playing audio data, preferably DJ equipment, comprising an audio input unit for receiving a mixed input signal, a recombination unit 32 and a playing unit 34 for playing recombined output data. In addition, the present invention relates to a method and a device for representing audio data, e.g. on a display.
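The decompose-then-recombine flow in this abstract can be sketched in a few lines. This is an illustrative toy, not the patented method: `decompose` stands in for a real source-separation model (here it just splits the signal into equal-energy stems), and `recombine` applies per-stem gains the way a DJ fader would. All names are hypothetical.

```python
def decompose(mixed, n_stems=2):
    """Placeholder separator: splits the mixed signal into equal-energy stems.

    A real system would use a trained source-separation model here."""
    return [[s / n_stems for s in mixed] for _ in range(n_stems)]

def recombine(stems, gains):
    """Weighted sum of stems -> recombined output signal."""
    return [sum(g * stem[i] for g, stem in zip(gains, stems))
            for i in range(len(stems[0]))]

mixed_input = [0.5, -0.25, 1.0, 0.0]
stems = decompose(mixed_input)
# Keep stem 0 at full level, mute stem 1 (e.g. drop the vocals).
output = recombine(stems, gains=[1.0, 0.0])
```

The key design point is that mixing happens on the separated stems, not on the original signal, which is what lets the system mute or emphasize individual parts of an already-mixed track.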
COMPENSATION FOR FACE COVERINGS IN CAPTURED AUDIO
The technology disclosed herein enables compensation for attenuation caused by face coverings in captured audio. In a particular embodiment, a method includes determining that a face covering is positioned to cover the mouth of a user of a user system. The method further includes receiving audio that includes speech from the user and adjusting amplitudes of frequencies in the audio to compensate for the face covering.
SYSTEM FOR DELIVERABLES VERSIONING IN AUDIO MASTERING
Some implementations of the disclosure relate to using a model trained on mixing console data of sound mixes to automate the process of sound mix creation. In one implementation, a non-transitory computer-readable medium has executable instructions stored thereon that, when executed by a processor, cause the processor to perform operations comprising: obtaining a first version of a sound mix; extracting first audio features from the first version of the sound mix; obtaining mixing metadata; automatically calculating with a trained model, using at least the mixing metadata and the first audio features, mixing console features; and deriving a second version of the sound mix using at least the mixing console features calculated by the trained model.
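The pipeline in this abstract is: features from the first mix, plus metadata, go into a trained model that predicts console settings, which then render the second version. The sketch below stubs every stage with trivially simple stand-ins (the "model" is a hard-coded gain rule, the "feature" is just the peak level); all function names and the `target_peak` metadata key are invented for illustration.

```python
def extract_features(mix):
    """Toy feature extraction: just the peak level of the mix."""
    return {"peak": max(abs(s) for s in mix)}

def trained_model(features, metadata):
    """Stub for the trained model: maps audio features + mixing metadata
    to mixing console features (here, a single master fader gain)."""
    target = metadata.get("target_peak", 1.0)
    return {"fader_gain": target / features["peak"]}

def derive_second_version(mix, console):
    """Render the second version using the predicted console features."""
    return [s * console["fader_gain"] for s in mix]

first_mix = [0.2, -0.4, 0.8]
console = trained_model(extract_features(first_mix), {"target_peak": 1.0})
second_mix = derive_second_version(first_mix, console)
```

The point of the structure is that the model outputs *console* features (fader, EQ, dynamics settings) rather than audio directly, so the second version is reproducible and editable on a real console.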
ROOM SOUNDS MODES
Example techniques described herein involve a media playback system of one or more playback devices that are operable in a plurality of modes. Operating in a given mode may enhance a use case corresponding to the mode. For instance, the plurality of modes may include a foreground mode, which may enhance active listening to the playback device. The plurality of modes may also include a background mode, which may enhance passive listening to the playback device by facilitating other activities during passive listening. In some example implementations, the plurality of modes are mutually exclusive: when operating in one mode, the playback device does not operate in the other modes, and vice versa.
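The exclusive-mode behavior can be sketched as a device that holds exactly one active mode at a time, so selecting a mode implicitly deselects the others. The mode names come from the abstract; the settings attached to each mode are invented for illustration.

```python
# Hypothetical settings per mode; the abstract names the modes but not
# what each one configures.
MODES = {
    "foreground": {"volume": 0.8, "dynamics": "full"},        # active listening
    "background": {"volume": 0.3, "dynamics": "compressed"},  # passive listening
}

class PlaybackDevice:
    def __init__(self):
        self.mode = None

    def set_mode(self, name):
        if name not in MODES:
            raise ValueError(f"unknown mode: {name}")
        # Exclusive by construction: a single field holds the active mode,
        # so entering one mode leaves all others.
        self.mode = name
        return MODES[name]
```

Storing the active mode in a single field is the simplest way to guarantee the modes never run contemporaneously.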
Modulation of packetized audio signals
Modulating packetized audio signals in a voice-activated, data-packet-based computer network environment is provided. A system can receive audio signals detected by a microphone of a device. The system can parse the audio signal to identify a trigger keyword and a request, and generate a first action data structure. The system can identify a content item object based on the trigger keyword, and generate an output signal comprising a first portion corresponding to the first action data structure and a second portion corresponding to the content item object. The system can apply a modulation to the first or second portion of the output signal, and transmit the modulated output signal to the device.
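The parse/assemble/modulate sequence can be sketched on a transcribed input. This is a loose illustration, not the patented system: the trigger vocabulary, the use of a text transcript, and the amplitude-scaling "modulation" are all invented stand-ins (a real system operates on packetized audio and may modulate pitch, cadence, or other characteristics).

```python
TRIGGERS = {"ok", "go"}  # hypothetical trigger keywords

def parse(transcript):
    """Split an (already transcribed) utterance into trigger + request."""
    words = transcript.lower().split()
    trigger = next((w for w in words if w in TRIGGERS), None)
    request = " ".join(w for w in words if w != trigger)
    return trigger, request

def modulate(samples, factor=1.5):
    """Toy modulation: scale amplitude so this portion sounds distinct."""
    return [s * factor for s in samples]

trigger, request = parse("OK play some jazz")
first_portion = [0.1, 0.2]              # audio for the action data structure
second_portion = modulate([0.1, 0.2])   # audio for the content item, modulated
output = first_portion + second_portion
```

Modulating only one portion is what lets the listener tell the response to their request apart from the appended content item in a single output stream.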
Respiration monitoring devices, systems and processes for making the same
A device for directing respired air includes a frame having a top portion, a bottom portion opposite the top portion, and at least one shoulder disposed between the top portion and the bottom portion to receive a portion of a headset. The device further includes an attachment mechanism coupled to the frame for releasably securing the frame to the headset. In addition, the device also includes a wall surface downwardly depending from the bottom portion of the frame to form a curved baffle. The curved baffle directs air corresponding to respiration toward the bottom portion of the frame, and thus toward an input interface of the headset when the frame is releasably secured to the headset.
Separating speech by source in audio recordings by predicting isolated audio signals conditioned on speaker representations
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing speech separation. One of the methods includes obtaining a recording comprising speech from a plurality of speakers; processing the recording using a speaker neural network having speaker parameter values and configured to process the recording in accordance with the speaker parameter values to generate a plurality of per-recording speaker representations, each speaker representation representing features of a respective identified speaker in the recording; and processing the per-recording speaker representations and the recording using a separation neural network having separation parameter values and configured to process the recording and the speaker representations in accordance with the separation parameter values to generate, for each speaker representation, a respective predicted isolated audio signal that corresponds to speech of one of the speakers in the recording.
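The two-stage structure in this abstract (a speaker network producing per-recording speaker representations, then a separation network conditioned on each representation) can be sketched with stub networks. Both "networks" below are trivial placeholders for trained neural networks, and the scalar speaker representations are an invented simplification.

```python
def speaker_network(recording, n_speakers=2):
    """Stub: returns one representation per identified speaker.

    A real speaker network would infer these from the recording; here each
    representation is just a fixed scalar weight."""
    return [(i + 1) / n_speakers for i in range(n_speakers)]

def separation_network(recording, speaker_reprs):
    """Stub: emits one predicted isolated signal per speaker representation,
    conditioned on that representation (here, proportional splitting)."""
    total = sum(speaker_reprs)
    return [[s * (r / total) for s in recording] for r in speaker_reprs]

recording = [0.3, -0.6, 0.9]
reprs = speaker_network(recording)
isolated = separation_network(recording, reprs)
# Sanity property of this toy: the isolated signals sum back to the mix.
```

The design point the abstract emphasizes is the conditioning: separation is not done blind, but is guided by per-speaker representations extracted from the same recording, so each output signal is tied to a specific identified speaker.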