Patent classifications
G06F16/685
METHOD, APPARATUS, ELECTRONIC DEVICE, AND COMPUTER-READABLE MEDIUM FOR DISPLAYING LYRIC EFFECTS
The present disclosure provides a method, an apparatus, an electronic device, and a computer-readable medium for displaying lyric effects. The method includes: obtaining image data and music data to be displayed; obtaining target lyrics corresponding to a target time point in the lyrics, the target lyrics being displayed in a plurality of colors; staggering the target lyrics displayed in the plurality of colors; and superimposing the staggered target lyrics on a corresponding part of the image data for display, and playing audio data corresponding to the target lyrics.
METHOD AND APPARATUS FOR LYRIC VIDEO DISPLAY, ELECTRONIC DEVICE, AND COMPUTER-READABLE MEDIUM
Provided are a method and an apparatus for lyric video display, an electronic device, and a computer-readable medium. The method includes: acquiring multimedia data to be displayed, the multimedia data including audio data and lyrics; determining a target time point, and acquiring a target lyric fragment corresponding to the target time point in the lyrics; and displaying the target lyric fragment in combination with a preset background, and playing a part of the audio data corresponding to the target lyric fragment.
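The lookup step in this abstract (finding the lyric fragment that corresponds to a target time point) can be sketched as a binary search over start times. This is only an illustrative sketch, not the patented method: the `LYRICS` data, the `(start, text)` layout, and the function name `fragment_at` are assumptions introduced here.

```python
from bisect import bisect_right

# Hypothetical timed lyrics: (start time in seconds, text), sorted by start time.
LYRICS = [
    (0.0, "First line"),
    (4.5, "Second line"),
    (9.0, "Third line"),
]


def fragment_at(lyrics, t):
    """Return (start, end, text) for the fragment active at time t.

    Returns None if t falls before the first fragment. The end time is the
    start of the next fragment, or None for the last fragment.
    """
    starts = [start for start, _ in lyrics]
    # Index of the last fragment whose start time is <= t.
    i = bisect_right(starts, t) - 1
    if i < 0:
        return None
    start, text = lyrics[i]
    end = lyrics[i + 1][0] if i + 1 < len(lyrics) else None
    return start, end, text
```

The returned start and end times would also bound the part of the audio data to play alongside the fragment.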
AUDIO RECOGNITION METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM
A method includes obtaining a query content. The query content includes segment information representing a to-be-recognized audio. The method further includes selecting a preset quantity of candidate audios corresponding to the query content from a preset library. Each candidate audio includes a candidate audio segment matched with the segment information. The method further includes inputting the candidate audio segment into a trained detection model so as to obtain target segment information including the segment information and a target audio where the target segment information is located.
MULTI-FORMAT CONTENT REPOSITORY SEARCH
An audio file format of an audio portion of a natural language content is determined. Using a trained audio language identification model, a human language included in the audio portion is identified. Using a trained audio to text model trained on the human language, the audio portion is converted to a corresponding set of text data. The set of text data is indexed. Using the indexed set of text data responsive to a search query, a search result is generated, the search query specifying a search including a non-textual portion of the natural language content.
Automated clinical documentation system and method
A method, computer program product, and computing system for visual diarization of an encounter is executed on a computing device and includes obtaining encounter information of a patient encounter. The encounter information is processed to: associate a first portion of the encounter information with a first encounter participant, and associate at least a second portion of the encounter information with at least a second encounter participant. A visual representation of the encounter information is rendered. A first visual representation of the first portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. At least a second visual representation of the at least a second portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information.
BACKGROUND AUDIO IDENTIFICATION FOR SPEECH DISAMBIGUATION
Implementations relate to techniques for providing context-dependent search results. A computer-implemented method includes receiving an audio stream at a computing device during a time interval, the audio stream comprising user speech data and background audio, separating the audio stream into a first substream that includes the user speech data and a second substream that includes the background audio, identifying concepts related to the background audio, generating a set of terms related to the identified concepts, influencing a speech recognizer based on at least one of the terms related to the background audio, and obtaining a recognized version of the user speech data using the speech recognizer.
Spoken words analyzer
A lyrics analyzer generates tags and explicitness indicators for a set of tracks. These tags may indicate the genre, mood, occasion, or other features of each track. The lyrics analyzer does so by generating an n-dimensional vector relating to a set of topics extracted from the lyrics and then using those vectors to train a classifier to determine whether each tag applies to each track. The lyrics analyzer may also generate playlists for a user based on a single seed song by comparing the lyrics vector or the lyrics and acoustics vectors of the seed song to other songs to select songs that closely match the seed song. Such a playlist generator may also take into account the tags generated for each track.
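The playlist step described here (comparing the seed song's lyrics vector to those of other songs) can be illustrated with a small similarity ranking. The abstract does not name a similarity measure; cosine similarity is an assumption made for this sketch, and the `TRACKS` vectors and function names are hypothetical.

```python
import math

# Hypothetical topic vectors per track (assumed, for illustration only).
TRACKS = {
    "seed": [0.9, 0.1, 0.0],
    "a": [0.8, 0.2, 0.0],
    "b": [0.0, 0.1, 0.9],
    "c": [0.5, 0.5, 0.0],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def playlist(seed, vectors, k):
    """Return the k track names whose vectors are most similar to the seed's."""
    scored = [
        (cosine(vectors[seed], vec), name)
        for name, vec in vectors.items()
        if name != seed
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]
```

A tag-aware generator, as the abstract mentions, could filter the candidates by tag before ranking them.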
Grouping Events and Generating a Textual Content Reporting the Events
Systems, methods and non-transitory computer readable media for grouping events and generating a textual content reporting the events are provided. An indication of a plurality of events may be received. A group of two or more events of the plurality of events may be identified. The group of events may not include at least a particular event of the plurality of events. A quantity associated with the group of events may be determined. A description of the group of events that includes an indication of the quantity may be generated. Data associated with the particular event may be analyzed to generate a description of the particular event. A textual content that includes the description of the group of events and the description of the particular event may be generated. The generated textual content may be provided.
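The flow in this abstract (summarize a group of events by quantity, describe an ungrouped event individually, then combine both into one text) can be sketched as follows. Grouping by a shared `type` field, the event dictionary layout, and the function name `report` are assumptions made for illustration; the abstract does not specify how groups are identified.

```python
from collections import defaultdict


def report(events, min_group_size=2):
    """Build a textual report: grouped events are summarized by count,
    events outside any group are described individually."""
    by_type = defaultdict(list)
    for event in events:
        by_type[event["type"]].append(event)

    parts = []
    for kind, items in by_type.items():
        if len(items) >= min_group_size:
            # Description of the group, including its quantity.
            parts.append(f"{len(items)} {kind} events occurred.")
        else:
            # Description of each particular (ungrouped) event.
            for event in items:
                parts.append(f"One {kind} event: {event['detail']}.")
    return " ".join(parts)
```

For example, two hypothetical `login` events and one `outage` event with detail `database down` yield a count for the logins and a full description of the outage.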
Systems and methods for identifying and providing information about semantic entities in audio signals
Systems and methods for identifying semantic entities in audio signals are provided. A method can include obtaining, by a computing device comprising one or more processors and one or more memory devices, an audio signal concurrently heard by a user. The method can further include analyzing, by a machine-learned model stored on the computing device, at least a portion of the audio signal in a background of the computing device to determine one or more semantic entities. The method can further include displaying the one or more semantic entities on a display screen of the computing device.
Audio media playback user interface
The present disclosure generally relates to a media playback user interface. In some examples, the media playback user interface displays text corresponding to speech of audio content. In some examples, the media playback user interface facilitates management of bookmarks corresponding to the audio content. In some examples, the media playback user interface enables a search for text corresponding to speech of the audio content.