Patent classifications
G06F16/634
SOUND SIGNAL DATABASE GENERATION APPARATUS, SOUND SIGNAL SEARCH APPARATUS, SOUND SIGNAL DATABASE GENERATION METHOD, SOUND SIGNAL SEARCH METHOD, DATABASE GENERATION APPARATUS, DATA SEARCH APPARATUS, DATABASE GENERATION METHOD, DATA SEARCH METHOD, AND PROGRAM
To provide database generation techniques that can accurately and efficiently generate a database usable in text-based sound signal search. A sound signal database generation apparatus includes: a latent variable generation unit that generates, from a sound signal, a latent variable corresponding to the sound signal using a sound signal encoder; a data generation unit that generates a natural language representation corresponding to the sound signal from the latent variable and a condition concerning an index for the natural language representation, using a natural language representation decoder; and a sound signal database generation unit that generates a record pairing the sound signal with its corresponding natural language representation, and generates a sound signal database made up of such records.
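The encode-then-caption pipeline this abstract describes can be sketched as below. Everything here is an illustrative stand-in: `encode_sound` and `decode_to_text` are toy functions, not the patent's actual encoder and decoder models, and the record layout is assumed.

```python
from dataclasses import dataclass

def encode_sound(signal):
    # Stand-in sound-signal encoder: a toy "latent variable" summarising
    # the signal by its mean amplitude and sample count.
    return (sum(signal) / len(signal), len(signal))

def decode_to_text(latent, condition):
    # Stand-in natural-language-representation decoder, conditioned on an
    # index for the representation (here, a caption style).
    mean_amp, n = latent
    style = condition.get("style", "neutral")
    return f"{style} sound, {n} samples, mean amplitude {mean_amp:.2f}"

@dataclass
class Record:
    signal: list
    caption: str

def build_database(signals, condition):
    db = []
    for s in signals:
        latent = encode_sound(s)                     # latent variable generation unit
        caption = decode_to_text(latent, condition)  # data generation unit
        db.append(Record(signal=s, caption=caption)) # database generation unit
    return db
```

The resulting records pair each signal with a searchable text caption, which is what makes the database usable for text-based search.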
Media content identification and playback
Systems, devices, apparatuses, components, methods, and techniques for identifying and playing media content are provided. An example media-playback device for identifying and playing media content for a user traveling in a vehicle includes an audio identification engine and a media playback engine. Audio content is recorded, identified by comparison to media content databases, and immediately played on the same device. Additional media content is selected for playback based on the user's listening preferences.
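A minimal sketch of the identify-then-play flow, assuming a fingerprint-lookup approach. The fingerprint function here is a toy (real systems use spectral landmark hashes), and the database layout and preference model are assumptions for illustration only.

```python
def fingerprint(samples):
    # Toy fingerprint: the up/down pattern of successive samples.
    return tuple(1 if b > a else 0 for a, b in zip(samples, samples[1:]))

def identify(recorded, database):
    # `database` maps fingerprints of known media content to track metadata.
    return database.get(fingerprint(recorded))

def play_and_queue(recorded, database, preferred_genres):
    # Identify the recorded audio, then select additional content for
    # playback based on the user's listening preferences.
    track = identify(recorded, database)
    if track is None:
        return None, []
    related = [t for t in database.values()
               if t is not track and t["genre"] in preferred_genres]
    return track, related
```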
Systems and methods for identifying a media asset from an ambiguous audio indicator
Systems and methods are disclosed herein for identifying a media asset in response to an ambiguous input. The media guidance application may detect a portion of music provided by a user, e.g., a melody hummed by the user. The media guidance application may retrieve information about the user's location for a predetermined time period prior to detecting the portion of music. The media guidance application may then determine content accessible by the user at the location, e.g., a commercial played on a display screen at a train station while the user was waiting for the train, to identify the media asset corresponding to the user's humming.
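The location-scoped matching idea can be sketched as below: first narrow candidates to content the user could plausibly have heard at their recent locations, then pick the candidate whose melody contour best matches the hummed portion. All data structures and the contour heuristic are illustrative assumptions.

```python
def content_at_location(location_history, content_schedule):
    # Content accessible to the user at places visited in the lookback
    # window, e.g. a commercial shown on a station display.
    seen = []
    for place, start, end in location_history:
        for item in content_schedule.get(place, []):
            if item["start"] < end and item["end"] > start:  # time overlap
                seen.append(item)
    return seen

def match_hummed_melody(hummed, location_history, content_schedule):
    def contour(notes):
        # Up/down/flat shape of a note sequence; robust to off-key humming.
        return [1 if b > a else -1 if b < a else 0
                for a, b in zip(notes, notes[1:])]
    candidates = content_at_location(location_history, content_schedule)
    target = contour(hummed)
    return max(candidates,
               key=lambda c: sum(x == y for x, y in zip(contour(c["melody"]), target)),
               default=None)
```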
MUSIC STREAMING, PLAYLIST CREATION AND STREAMING ARCHITECTURE
A system and method for making categorized music tracks available to end user applications. The tracks may be categorized based on computer-derived rhythm, texture and pitch (RTP) scores derived from high-level acoustic attributes, which are in turn based on low-level data extracted from the tracks. RTP scores are stored in a universal database common to all of the music publishers, so that a track, once RTP scored, does not need to be re-scored by other music publishers. End user applications access an API server to import collections of tracks published by publishers, create playlists, and initiate music streaming. Each end user application is sponsored by a single music publisher, so that only tracks that the music publisher is able to stream are available to the sponsored end user application.
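The score-once semantics of the universal database amount to memoizing the scoring step across publishers. A minimal sketch, assuming a toy RTP scorer (the real scoring is derived from acoustic attributes the abstract does not specify):

```python
def toy_rtp_scorer(audio):
    # Stand-in for computer-derived rhythm/texture/pitch scoring.
    return (len(audio) % 10, sum(audio) % 10, max(audio) % 10)

class UniversalRTPStore:
    """Universal database common to all publishers: a track scored once
    is never re-scored, regardless of which publisher asks."""

    def __init__(self, scorer):
        self._scores = {}
        self._scorer = scorer
        self.score_calls = 0  # instrumentation to show deduplication

    def rtp_score(self, track_id, audio):
        if track_id not in self._scores:
            self.score_calls += 1
            self._scores[track_id] = self._scorer(audio)
        return self._scores[track_id]
```

Keying on a shared track identifier is what lets a second publisher reuse the first publisher's score.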
Intelligent automated assistant for media exploration
Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors and memory, receiving a first natural-language speech input indicative of a request for media, where the first natural-language speech input comprises a first search parameter; and providing, by a digital assistant, a first media item identified based on the first search parameter. The method further includes, while providing the first media item, receiving a second natural-language speech input and determining whether the second speech input corresponds to a user intent of refining the request for media. The method further includes, in accordance with a determination that the second speech input corresponds to a user intent of refining the request for media: identifying, based on the first search parameter and the second speech input, a second media item and providing the second media item.
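The refine-or-restart decision can be sketched as follows. The catalog, the cue-word intent check, and the parameter extraction are all toy assumptions; a real assistant would use a trained intent classifier rather than prefix matching.

```python
CATALOG = [
    {"title": "Song A", "artist": "X", "decade": "1990s"},
    {"title": "Song B", "artist": "X", "decade": "1980s"},
    {"title": "Song C", "artist": "Y", "decade": "1990s"},
]

def find_media(criteria):
    return [t for t in CATALOG
            if all(t.get(k) == v for k, v in criteria.items())]

def parse_criteria(utterance):
    # Toy search-parameter extraction: recognise a decade mention.
    for decade in ("1980s", "1990s"):
        if decade in utterance:
            return {"decade": decade}
    return {}

def is_refinement(utterance):
    # Toy intent check: refinements start with a narrowing cue word.
    return utterance.lower().startswith(("only", "just", "the one"))

def handle_second_input(first_criteria, utterance):
    # Refinement: combine the first search parameter with the new one.
    # Otherwise: treat the utterance as a fresh request.
    new_criteria = parse_criteria(utterance)
    if is_refinement(utterance):
        return find_media({**first_criteria, **new_criteria})
    return find_media(new_criteria)
```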
System and method for generating a playlist from a mood gradient
Systems and methods for generating and playing a sequence of media objects based on a mood gradient are disclosed. A mood gradient is a sequence of items, in which each item is a media object having known characteristics or a representative set of characteristics of a media object, created or used by a user for a specific purpose. Given a mood gradient, one or more new media objects are selected for each item in the mood gradient based on the characteristics associated with that item. In this way, a sequence of new media objects is created that exhibits a similar variation in media object characteristics. The mood gradient may be presented to a user, or created, via a display illustrating a three-dimensional space in which each dimension corresponds to a different characteristic. The mood gradient may be represented as a path through this three-dimensional space, with icons representing media objects located within the space based on their characteristics.
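Selecting a new media object per gradient item reduces, in this reading, to a nearest-neighbor lookup in the three-dimensional characteristic space. The axis names and library entries below are illustrative assumptions:

```python
import math

LIBRARY = [
    # features: a hypothetical (energy, tempo, brightness) triple.
    {"title": "Calm",   "features": (0.1, 0.2, 0.1)},
    {"title": "Upbeat", "features": (0.9, 0.8, 0.7)},
]

def nearest(point, library):
    # The media object whose characteristic vector is closest to the
    # gradient item's characteristics.
    return min(library, key=lambda obj: math.dist(obj["features"], point))

def playlist_from_gradient(gradient, library):
    # One new media object per gradient item; the resulting sequence
    # follows the gradient's variation in characteristics.
    return [nearest(item, library) for item in gradient]
```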
Accessibility management system for media content items
A system operates to manage accessibility of media content items based on a user's performance of a repetitive motion activity. The system can generate rule data based on a rule designed to permit access to certain media content items. The rule data can include information about various conditions to be satisfied to make the media content items accessible for playback. Such conditions can be associated with a user's performance or status of a repetitive motion activity.
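The rule data described here can be sketched as a set of threshold conditions evaluated against the user's activity status. The condition names (`steps`, `minutes_active`) are hypothetical examples of repetitive-motion metrics, not terms from the patent:

```python
def make_rule(conditions):
    """Rule data: conditions on a repetitive-motion activity that must
    all be satisfied before the media content items become accessible."""
    def accessible(status):
        # Every condition's threshold must be met by the reported status;
        # a missing metric counts as zero (condition not satisfied).
        return all(status.get(metric, 0) >= threshold
                   for metric, threshold in conditions.items())
    return accessible

# Example rule: unlock a set of tracks after a sufficiently long run.
unlock_rule = make_rule({"steps": 5000, "minutes_active": 30})
```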
METHODS AND APPARATUS TO IDENTIFY MEDIA THAT HAS BEEN PITCH SHIFTED, TIME SHIFTED, AND/OR RESAMPLED
Methods, apparatus, systems and articles of manufacture are disclosed to identify media that has been pitch shifted, time shifted, and/or resampled. An example apparatus includes: memory; instructions in the apparatus; and processor circuitry to execute the instructions to: transmit a fingerprint of an audio signal and adjusting instructions to a central facility to facilitate a query, the adjusting instructions identifying at least one of a pitch shift, a time shift, or a resample ratio; obtain a response including an identifier for the audio signal and information corresponding to how the audio signal was adjusted; and change the adjusting instructions based on the information.
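The query-and-adapt loop can be sketched as below. For simplicity the fingerprint is a single integer and pitch shifting just offsets it; real fingerprints and shift handling are far more involved, and the registry contents are invented for illustration:

```python
REGISTRY = {105: "track-1", 230: "track-2"}  # reference fingerprint -> media id

def query_central_facility(fp, adjustments):
    # The central facility tries each candidate pitch shift from the
    # client's adjusting instructions when matching the fingerprint.
    for shift in adjustments["pitch_shifts"]:
        if fp + shift in REGISTRY:
            return {"id": REGISTRY[fp + shift], "pitch_shift": shift}
    return None

def identify_and_adapt(fp, adjustments):
    response = query_central_facility(fp, adjustments)
    if response is not None:
        # Change the adjusting instructions based on how the audio was
        # adjusted: try the successful shift first on the next query.
        adjustments["pitch_shifts"].sort(
            key=lambda s: s != response["pitch_shift"])
    return response
```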
Information processing apparatus, information processing method and information processing program
The present invention provides an information processing apparatus which can direct a user to a playlist different from the playlist being reproduced. The information processing apparatus includes: a content storage unit storing a plurality of contents; a playlist storage unit storing a plurality of playlists, each related to at least some of the plurality of contents; a reproducing unit sequentially reproducing a plurality of contents belonging to a first playlist among the plurality of playlists; a candidate content extracting unit extracting, from the content storage unit, one or more candidate contents relating to a content being reproduced by the reproducing unit; a playlist extracting unit extracting, from the playlist storage unit, a second playlist to which the extracted candidate contents belong; and a playlist switching unit switching the playlist to be reproduced by the reproducing unit from the first playlist to the second playlist.
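The extract-candidates-then-switch logic can be sketched as follows. The relatedness heuristic (shared artist or genre) and the data layout are assumptions for illustration:

```python
def related(content, other):
    # Toy relatedness check between two contents.
    return (content["artist"] == other["artist"]
            or content["genre"] == other["genre"])

def next_playlist(now_playing, first_playlist, all_playlists, contents):
    # Candidate content extraction: contents related to the one playing.
    candidates = [c for c in contents
                  if c is not now_playing and related(now_playing, c)]
    # Playlist extraction: a second playlist any candidate belongs to.
    for name, playlist in all_playlists.items():
        if playlist is first_playlist:
            continue
        if any(c in playlist for c in candidates):
            return name  # the switching unit would move playback here
    return None
```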
SYSTEMS AND METHODS FOR IDENTIFYING DYNAMIC TYPES IN VOICE QUERIES
The system receives a voice query at an audio interface and converts the voice query to text. The system identifies entities included in the query based on comparison to an information graph, as well as dynamic types based on the structure and format of the query. The system can determine dynamic types by analyzing parts of speech, articles, part-of-speech combinations and order, and influential features, and by comparing these aspects to references. The system combines tags associated with the identified entities and tags associated with the dynamic types to generate query interpretations. The system compares the interpretations to reference templates, and selects among the query interpretations using predetermined criteria. A search query is generated based on the selected interpretation. The system retrieves content or associated identifiers, updates metadata, updates reference information, or a combination thereof. Accordingly, the system responds to queries that include non-static types.
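A minimal sketch of the two tagging passes, one over an entity graph and one over structural patterns for dynamic types. The graph entries, pattern table, and type names are all invented for illustration; a real system would use an information graph and linguistic features far richer than word lookups:

```python
ENTITY_GRAPH = {"beatles": "artist", "yesterday": "title"}

# Structural cues (articles plus influential words) mapped to dynamic types.
DYNAMIC_PATTERNS = {("the", "latest"): "recency", ("new",): "recency"}

def interpret(query):
    words = query.lower().split()
    tags = []
    # Pass 1: tag entities by comparison to the information graph.
    for w in words:
        if w in ENTITY_GRAPH:
            tags.append((w, ENTITY_GRAPH[w]))
    # Pass 2: tag dynamic types from the query's structure and format.
    for pattern, dtype in DYNAMIC_PATTERNS.items():
        for i in range(len(words) - len(pattern) + 1):
            if tuple(words[i:i + len(pattern)]) == pattern:
                tags.append((" ".join(pattern), dtype))
    return tags  # combined tags feed the query-interpretation step
```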