G10H2250/235

Media content identification on mobile devices

A mobile device responds in real time to media content presented on a media device, such as a television. The mobile device captures temporal fragments of audio-video content on its microphone, camera, or both and generates corresponding audio-video query fingerprints. The query fingerprints are transmitted to a search server located remotely or used with a search function on the mobile device for content search and identification. Audio features are extracted and audio signal global onset detection is used for input audio frame alignment. Additional audio feature signatures are generated from local audio frame onsets, audio frame frequency domain entropy, and maximum change in the spectral coefficients. Video frames are analyzed to find a television screen in the frames, and a detected active television quadrilateral is used to generate video fingerprints to be combined with audio fingerprints for more reliable content identification.
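
As an illustration only, the "audio frame frequency domain entropy" feature named above could be computed along the following lines. This is a minimal Python sketch, not the patented method: the abstract does not define the entropy formula, so the use of Shannon entropy over the normalized power spectrum, the naive DFT, and the function names `dft_magnitudes` and `spectral_entropy` are all assumptions.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of the first N/2 DFT bins of a real frame (naive O(N^2) DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags

def spectral_entropy(frame):
    """Shannon entropy in bits of the normalized power spectrum (assumed definition)."""
    power = [m * m for m in dft_magnitudes(frame)]
    total = sum(power)
    if total == 0:
        return 0.0  # a silent frame carries no spectral information
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A pure tone concentrates its energy in one bin (entropy near zero);
# an impulse spreads it evenly over all bins (maximum entropy).
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
impulse = [1.0] + [0.0] * 63
```

A signature built from such a value would separate tonal frames from noisy or transient ones; how the patent quantizes the value into a fingerprint is not stated in the abstract.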

System for creating, practicing and sharing of musical harmonies

Collaboratively creating musical harmonies includes receiving a user selection of a particular harmony. In response to this selection, there is displayed on a display screen of a computing device a plurality of musical note indicators or notes to specify a first harmony part of a musical piece to be performed. Real-time pitch detection is used to determine a pitch of each note which is voiced by a person, and a graphic indication of the actual pitch which is sung is displayed in conjunction with the musical note indicators.
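
The abstract does not say which pitch detection algorithm is used. As one possible illustration, the Python sketch below estimates the pitch of a sung note by autocorrelation and reports how far it lies from the displayed note, in cents. The function names, the 80 to 1000 Hz search range, and the test signal are assumptions made for this example.

```python
import math

def detect_pitch(samples, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental (Hz) as the lag with the highest autocorrelation."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None

def cents_off(sung_hz, target_hz):
    """Signed distance from the target note in cents (100 cents = one semitone)."""
    return 1200.0 * math.log2(sung_hz / target_hz)

# Hypothetical input: a 220 Hz sine standing in for a voiced note.
rate = 8000
voiced = [math.sin(2 * math.pi * 220 * t / rate) for t in range(1024)]
estimate = detect_pitch(voiced, rate)
deviation = cents_off(estimate, 220.0)
```

Because the lag is an integer number of samples, the estimate is quantized; a display that compares the sung pitch with the note indicators would typically tolerate a few tens of cents or interpolate around the peak.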

Singing voice separation with deep U-Net convolutional networks

A system, method and computer product for training a neural network system. The method comprises applying an audio signal to the neural network system, the audio signal including a vocal component and a non-vocal component. The method also comprises comparing an output of the neural network system to a target signal, and adjusting at least one parameter of the neural network system to reduce a result of the comparing, for training the neural network system to estimate one of the vocal component and the non-vocal component. In one example embodiment, the system comprises a U-Net architecture. After training, the system can estimate vocal or instrumental components of an audio signal, depending on which type of component the system is trained to estimate.
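
The training loop described above (apply a mixture, compare the output with a target, adjust parameters to reduce the difference) can be sketched with a deliberately tiny stand-in for the network. The Python example below is not a U-Net: it fits one gain per frequency bin by gradient descent on a squared error, and the data, learning rate, and function name `train_mask` are assumptions chosen only to make the three steps visible.

```python
def train_mask(mixtures, targets, steps=200, lr=0.05):
    """Fit one gain per bin so that gain * mixture approximates the target magnitudes."""
    n_bins = len(mixtures[0])
    gains = [0.5] * n_bins  # the trainable parameters (stand-in for network weights)
    for _ in range(steps):
        grads = [0.0] * n_bins
        for mix, tgt in zip(mixtures, targets):
            for k in range(n_bins):
                err = gains[k] * mix[k] - tgt[k]  # compare the output to the target
                grads[k] += 2.0 * err * mix[k]    # gradient of the squared error
        # adjust each parameter to reduce the result of the comparison
        gains = [g - lr * gr / len(mixtures) for g, gr in zip(gains, grads)]
    return gains

# Hypothetical magnitude frames: the vocal occupies all of bin 0,
# none of bin 1, and half of bin 2.
mixtures = [[1.0, 2.0, 1.0], [2.0, 1.0, 2.0]]
vocals = [[1.0, 0.0, 0.5], [2.0, 0.0, 1.0]]
mask = train_mask(mixtures, vocals)
```

Training against instrumental targets instead of vocal targets would make the same loop estimate the non-vocal component, which mirrors the last sentence of the abstract.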

Singing voice separation with deep U-Net convolutional networks

A system, method and computer product for training a neural network system. The method comprises applying an audio signal to the neural network system, the audio signal including a vocal component and a non-vocal component. The method also comprises comparing an output of the neural network system to a target signal, and adjusting at least one parameter of the neural network system to reduce a result of the comparing, for training the neural network system to estimate one of the vocal component and the non-vocal component. In one example embodiment, the system comprises a U-Net architecture. After training, the system can estimate vocal or instrumental components of an audio signal, depending on which type of component the system is trained to estimate.

TIMBRE CREATION SYSTEM

A timbre creation method, system, and computer program product include performing a timbre analysis of a sound from an input source to generate a digital fingerprint of the sound, performing deep learning to create a patch that matches the digital fingerprint, and generating a second patch for a synthesizer which reproduces a timbre that complements the digital fingerprint based on the patch.

Methods and apparatus to extract a pitch-independent timbre attribute from a media signal
10902831 · 2021-01-26

Methods and apparatus to classify media based on a pitch-independent timbre attribute from a media signal are disclosed. An example apparatus includes means for accessing a media signal; and means for: determining a spectrum of audio corresponding to the media signal; and determining a timbre-independent pitch attribute of audio of the media signal based on an inverse transform of a complex argument of a transform of the spectrum.

METHOD OF DISPLAYING LIGHT WITH THE RHYTHM OF MUSIC
20240008154 · 2024-01-04

A method of displaying light with the rhythm of music uses a host system to control display units so that they emit colored light along with the music. The control unit of each display unit controls its display elements to operate separately or synchronously. The processor of the host system analyzes the transitions between the intro, verse, and hook segments to play the rhythm of the melody. The host system transmits the display signal to the display units in conjunction with the music being played and changes between display lighting methods, so that the light is displayed with the rhythm of the music.

AUDIO SYNTHESIS METHOD, COMPUTER APPARATUS, AND STORAGE MEDIUM
20200410975 · 2020-12-31

The present disclosure relates to an audio synthesis method, a computer apparatus and storage medium for synthesizing the audio. The method includes: obtaining an original audio; identifying a rhythm point in the original audio, and labeling an audio effect area in the original audio according to the rhythm point; obtaining an audio effect audio corresponding to the audio effect area, and synthesizing an audio effect of the audio effect audio into the audio effect area of the original audio to obtain a synthesized audio.
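
The abstract does not specify how rhythm points are identified or how the effect is synthesized into the original. As a rough illustration, the Python sketch below marks a rhythm point wherever frame energy jumps sharply and then adds an effect clip at each such point. The energy-ratio rule, the additive mix, and the names `find_rhythm_points` and `mix_effect` are assumptions.

```python
def find_rhythm_points(samples, frame_size=4, ratio=2.0):
    """Return start indices of frames whose energy exceeds `ratio` times the previous frame's."""
    energies = [
        sum(x * x for x in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    points = []
    for f in range(1, len(energies)):
        if energies[f] > 1e-9 and energies[f] > ratio * energies[f - 1]:
            points.append(f * frame_size)
    return points

def mix_effect(original, effect, points):
    """Add the effect clip into a copy of the original at each rhythm point."""
    out = list(original)
    for start in points:  # each start marks an assumed "audio effect area"
        for j, e in enumerate(effect):
            if start + j < len(out):
                out[start + j] += e
    return out

# Hypothetical signal: silence, a loud beat, a quiet tail, a second loud beat.
original = [0.0] * 4 + [1.0] * 4 + [0.1] * 4 + [1.0] * 4
effect = [0.5, 0.25]
points = find_rhythm_points(original)
synthesized = mix_effect(original, effect, points)
```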

Algorithm-based audio optimization method, intelligent terminal and storage device

The invention discloses an algorithm-based audio optimization method, an intelligent terminal and a storage device. The method includes the steps of converting an original time-domain audio file into a frequency-domain audio file through a Fourier transform; extracting the frequency range and amplitude information of the audio signal and matching them against those of different types of existing standard test sound sources to determine the type of the audio signal; selecting the corresponding frequency mapping function from a function library and applying it to obtain a processed audio file; and obtaining an optimized audio file through an inverse Fourier transform. The invention finds a similar type of sound source by comparing the original audio with standard sound sources, determines the frequency mapping function to apply, maps the frequencies by compressing or expanding the relevant frequency ranges, and tunes the audio automatically to improve sound quality.
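
The pipeline described above (Fourier transform, type matching, frequency mapping, inverse transform) can be sketched as follows. This Python example is an assumption-laden illustration rather than the disclosed algorithm: the least-squares matching rule, the treatment of the "frequency mapping function" as a per-bin gain, the reference profiles, and all names are hypothetical.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (time domain to frequency domain)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]

def classify(spectrum, references):
    """Pick the reference type whose magnitude profile is closest (least squares)."""
    mags = [abs(c) for c in spectrum]
    return min(references,
               key=lambda name: sum((m - r) ** 2
                                    for m, r in zip(mags, references[name])))

def optimize(samples, references, mapping_functions):
    """Transform, match a source type, apply its mapping, and transform back."""
    spectrum = dft(samples)
    kind = classify(spectrum, references)
    gain = mapping_functions[kind]
    n = len(spectrum)
    # Use min(k, n - k) so mirrored bins get the same gain and the output stays real.
    shaped = [c * gain(min(k, n - k)) for k, c in enumerate(spectrum)]
    return kind, idft(shaped)

# Hypothetical "function library" and standard-source magnitude profiles.
references = {"low": [0, 4, 0, 0, 0, 0, 0, 4], "high": [0, 0, 0, 4, 0, 4, 0, 0]}
mapping_functions = {"low": lambda k: 2.0 if k == 1 else 1.0,
                     "high": lambda k: 1.0}
samples = [math.cos(2 * math.pi * t / 8) for t in range(8)]
kind, optimized = optimize(samples, references, mapping_functions)
```

Here the input matches the "low" profile, so its mapping doubles the gain of bin 1; a real mapping that compresses or expands frequency ranges would move energy between bins rather than only scale it.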

Humbucking pair building block circuit for vibrational sensors
20200365129 · 2020-11-19

This invention eliminates most mechanical switching in vibrational pickup circuits by using variable gains to combine signals of sensors in differential amplifiers as J-1 humbucking pairs for J>1 number of sensors, with the sensors matched to produce the same level and phase of unwanted hum from external sources. It can also combine J>1 number of matched sensors with K>1 number of dissimilar sensors which are matched only to each other in the same manner. This produces not only all the possible mechanically switched humbucking signals, but all the continuously-varying combinations of humbucking signals in between.
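
The hum-cancelling property of such pairs can be checked numerically. The Python sketch below models each matched sensor as its own signal plus an identical hum term and combines adjacent sensors as differences scaled by variable gains. Pairing adjacent sensors, the sample values, and the name `humbucking_output` are assumptions for this illustration; the abstract does not give the circuit topology.

```python
def humbucking_output(sensor_signals, gains):
    """Combine J sensor signals as J-1 differential pairs, each scaled by a gain."""
    if len(gains) != len(sensor_signals) - 1:
        raise ValueError("need exactly J-1 gains for J sensors")
    n = len(sensor_signals[0])
    out = [0.0] * n
    pairs = zip(sensor_signals, sensor_signals[1:])  # assumed adjacent pairing
    for g, (a, b) in zip(gains, pairs):
        for t in range(n):
            out[t] += g * (a[t] - b[t])  # matched hum cancels in each difference
    return out

# Hypothetical data: three sensors with different string signals and identical hum.
hum = [0.3, -0.3, 0.3, -0.3]
strings = [[1.0, 0.5, -0.5, -1.0],
           [0.2, 0.4, 0.6, 0.8],
           [-0.1, 0.0, 0.1, 0.0]]
sensors = [[s + h for s, h in zip(sig, hum)] for sig in strings]

gains = [0.7, -0.4]  # any values give a hum-free output
with_hum = humbucking_output(sensors, gains)
without_hum = humbucking_output(strings, gains)
```

Sweeping the gains continuously moves the output between the discrete combinations a mechanical switch would offer, while the hum term stays cancelled at every setting.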