Patent classifications
G10H2210/051
Media Content Identification on Mobile Devices
A mobile device responds in real time to media content presented on a media device, such as a television. The mobile device captures temporal fragments of audio-video content on its microphone, camera, or both and generates corresponding audio-video query fingerprints. The query fingerprints are transmitted to a search server located remotely or used with a search function on the mobile device for content search and identification. Audio features are extracted and audio signal global onset detection is used for input audio frame alignment. Additional audio feature signatures are generated from local audio frame onsets, audio frame frequency domain entropy, and maximum change in the spectral coefficients. Video frames are analyzed to find a television screen in the frames, and a detected active television quadrilateral is used to generate video fingerprints to be combined with audio fingerprints for more reliable content identification.
Media Content Identification on Mobile Devices
A mobile device responds in real time to media content presented on a media device, such as a television. The mobile device captures temporal fragments of audio-video content on its microphone, camera, or both and generates corresponding audio-video query fingerprints. The query fingerprints are transmitted to a search server located remotely or used with a search function on the mobile device for content search and identification. Audio features are extracted and audio signal global onset detection is used for input audio frame alignment. Additional audio feature signatures are generated from local audio frame onsets, audio frame frequency domain entropy, and maximum change in the spectral coefficients. Video frames are analyzed to find a television screen in the frames, and a detected active television quadrilateral is used to generate video fingerprints to be combined with audio fingerprints for more reliable content identification.
Media Content Identification on Mobile Devices
A mobile device responds in real time to media content presented on a media device, such as a television. The mobile device captures temporal fragments of audio-video content on its microphone, camera, or both and generates corresponding audio-video query fingerprints. The query fingerprints are transmitted to a search server located remotely or used with a search function on the mobile device for content search and identification. Audio features are extracted and audio signal global onset detection is used for input audio frame alignment. Additional audio feature signatures are generated from local audio frame onsets, audio frame frequency domain entropy, and maximum change in the spectral coefficients. Video frames are analyzed to find a television screen in the frames, and a detected active television quadrilateral is used to generate video fingerprints to be combined with audio fingerprints for more reliable content identification.
Media Content Identification on Mobile Devices
A mobile device responds in real time to media content presented on a media device, such as a television. The mobile device captures temporal fragments of audio-video content on its microphone, camera, or both and generates corresponding audio-video query fingerprints. The query fingerprints are transmitted to a search server located remotely or used with a search function on the mobile device for content search and identification. Audio features are extracted and audio signal global onset detection is used for input audio frame alignment. Additional audio feature signatures are generated from local audio frame onsets, audio frame frequency domain entropy, and maximum change in the spectral coefficients. Video frames are analyzed to find a television screen in the frames, and a detected active television quadrilateral is used to generate video fingerprints to be combined with audio fingerprints for more reliable content identification.
Media Content Identification on Mobile Devices
A mobile device responds in real time to media content presented on a media device, such as a television. The mobile device captures temporal fragments of audio-video content on its microphone, camera, or both and generates corresponding audio-video query fingerprints. The query fingerprints are transmitted to a search server located remotely or used with a search function on the mobile device for content search and identification. Audio features are extracted and audio signal global onset detection is used for input audio frame alignment. Additional audio feature signatures are generated from local audio frame onsets, audio frame frequency domain entropy, and maximum change in the spectral coefficients. Video frames are analyzed to find a television screen in the frames, and a detected active television quadrilateral is used to generate video fingerprints to be combined with audio fingerprints for more reliable content identification.
Media Content Identification on Mobile Devices
A mobile device responds in real time to media content presented on a media device, such as a television. The mobile device captures temporal fragments of audio-video content on its microphone, camera, or both and generates corresponding audio-video query fingerprints. The query fingerprints are transmitted to a search server located remotely or used with a search function on the mobile device for content search and identification. Audio features are extracted and audio signal global onset detection is used for input audio frame alignment. Additional audio feature signatures are generated from local audio frame onsets, audio frame frequency domain entropy, and maximum change in the spectral coefficients. Video frames are analyzed to find a television screen in the frames, and a detected active television quadrilateral is used to generate video fingerprints to be combined with audio fingerprints for more reliable content identification.
Media Content Identification on Mobile Devices
A mobile device responds in real time to media content presented on a media device, such as a television. The mobile device captures temporal fragments of audio-video content on its microphone, camera, or both and generates corresponding audio-video query fingerprints. The query fingerprints are transmitted to a search server located remotely or used with a search function on the mobile device for content search and identification. Audio features are extracted and audio signal global onset detection is used for input audio frame alignment. Additional audio feature signatures are generated from local audio frame onsets, audio frame frequency domain entropy, and maximum change in the spectral coefficients. Video frames are analyzed to find a television screen in the frames, and a detected active television quadrilateral is used to generate video fingerprints to be combined with audio fingerprints for more reliable content identification.
Evaluation device and evaluation method
An evaluation device according to an embodiment includes an acquisition unit acquiring an input sound; a feature value calculation unit calculating a feature value from the input sound acquired by the acquisition unit; a detection unit detecting a break position corresponding to a starting point of each sound included in the input sound acquired by the acquisition unit based on the feature value calculated by the feature value calculation unit; and an evaluation value calculation unit calculating, based on a plurality of break positions detected by the detection unit, an evaluation value regarding a degree of temporal regularity of the plurality of break positions.
Media-media augmentation system and method of composing a media product
A media-content augmentation system includes a database with a multiplicity of media files and associated metadata. Each media file is mapped to at least one contextual theme defined by beginning and end timings. A processing system couples to the database; and an input couples to the processing system. The input is in the form of temporally-varying events data. The processing system resolves the input into one or more categorized contextual themes, correlates the themes with metadata associated with selected media files relevant to the themes, and then splices or fades together selected media files to reflect the events as the input varies with time, thus generating as an output, a media product in which transitions between media are aligned with the temporally-varying events. The database may contain sections of digital media files. A method aligns sections in digital media files with temporally-varying events data to compose a media product.
Auditory augmentation system and method of composing a media product
An auditory augmentation system includes a database with a multiplicity of audio sections and associated metadata for digital audio files. Each audio section is mapped to a contextual theme, each contextual theme being mapped to an audio section having an entry point and an exit point. The entry and exit points support seamless splice or fade transitions between different audio sections. A processing system couples to the database along with an input; the input is in the form of temporally-varying events data that defines a temporal input. The processing system resolves the temporal input into one or more of a plurality of categorized contextual themes, correlates the categorized contextual themes with metadata associated with selected audio sections relevant to the one or more categorized contextual themes, and splices or fades together selected audio sections, and generates, as an output, a media product in which transitions between audio sections are seamless.