Patent classifications
G06F16/433
Remote Control Device with Environment Mapping
A remote control device for controlling devices in an environment can utilize an environment map and location information to accurately determine which of multiple devices in the environment a user intends to control. The environment mapping can be performed using the remote control device, which includes a plurality of sensors. A spatial map can be generated for an environment along with location information for controllable devices within the environment. The spatial map and location information can be stored on the remote control device. The mapping can allow the remote control device to quickly group devices or drag and drop content from one type of device to another type of device. The remote control device can perform search queries based on combinations of image and audio data in some examples.
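The intended-device determination described above can be illustrated with a minimal sketch. The device names, coordinates, and the cosine-based bearing match are all hypothetical; the patent does not specify a particular lookup algorithm.

```python
import math

# Hypothetical spatial map: device name -> (x, y) location in the environment.
DEVICE_MAP = {
    "tv": (4.0, 0.0),
    "speaker": (0.0, 3.0),
    "lamp": (-2.0, -1.0),
}

def intended_device(remote_pos, direction, device_map=DEVICE_MAP):
    """Return the device whose bearing best matches the pointing direction."""
    dx, dy = direction
    norm = math.hypot(dx, dy)
    best, best_score = None, -2.0
    for name, (x, y) in device_map.items():
        vx, vy = x - remote_pos[0], y - remote_pos[1]
        vnorm = math.hypot(vx, vy)
        if vnorm == 0 or norm == 0:
            continue
        # Cosine similarity between pointing direction and device bearing.
        score = (dx * vx + dy * vy) / (norm * vnorm)
        if score > best_score:
            best, best_score = name, score
    return best
```

Pointing the remote east from the origin would resolve to the TV under this toy map; a real implementation would draw on the sensor-derived spatial map stored on the device.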
METHODS, SYSTEMS AND COMPUTER PROGRAM PRODUCTS FOR MEDIA PROCESSING AND DISPLAY
The present disclosure relates generally to methods, systems and computer program products for classifying and identifying input data using neural networks and displaying results (e.g., images of vehicles, vehicle artifacts and geographical locations dating from the 1880s to present day and beyond). The results may be displayed on displays or in virtual environments such as on virtual reality, augmented reality and/or mixed-reality devices.
AUTOMATICALLY ENHANCING STREAMING MEDIA USING CONTENT TRANSFORMATION
A method includes receiving media content comprising audio data for distribution through a content distribution platform that requires the media content to include video content, transforming the audio data into textual content, determining, based on a search of a searchable database, that the textual content of the audio data matches characteristics of visual data in the searchable database, integrating the visual data having the matched characteristics with the media content to create an augmented content stream in response to the determination that the textual content of the audio data matches the characteristics of the visual data, and distributing the augmented content stream through the content distribution platform that requires the media content to include video content.
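The matching and integration steps can be sketched as follows. The database entries, keyword-overlap matching, and dictionary-shaped "augmented stream" are illustrative assumptions, not the patent's actual data model.

```python
# Hypothetical searchable database: visual assets tagged with characteristics.
VISUAL_DB = [
    {"asset": "beach.mp4", "keywords": {"ocean", "waves", "sand"}},
    {"asset": "city.mp4", "keywords": {"traffic", "skyline", "street"}},
]

def match_visuals(transcript, db=VISUAL_DB):
    """Return assets whose characteristics overlap the transcribed audio text."""
    words = set(transcript.lower().split())
    return [entry["asset"] for entry in db if entry["keywords"] & words]

def augment(audio_media, transcript, db=VISUAL_DB):
    """Integrate matched visual data with the audio to form an augmented stream."""
    return {"audio": audio_media, "video": match_visuals(transcript, db)}
```

A transcript mentioning "ocean" or "waves" would pull in the beach footage, satisfying a platform's video requirement for an otherwise audio-only item.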
Voice command system and voice command method
A voice command system according to a first disclosure comprises a gateway apparatus having an interface configured to receive a voice command, and a controller configured to perform a registration process of registering a speaker permitted to receive the voice command. The controller is configured to perform an authentication process of rejecting a reception of the voice command when a speaker of the voice command is not registered, and permitting a reception of the voice command when a speaker of the voice command is registered. The controller is configured to perform the authentication process for each voice command.
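The registration and per-command authentication flow can be sketched minimally. The class and method names are hypothetical; the disclosure describes the behavior, not an API.

```python
class VoiceGateway:
    """Sketch of a gateway that authenticates the speaker of each command."""

    def __init__(self):
        self._registered = set()

    def register(self, speaker_id):
        # Registration process: permit this speaker's voice commands.
        self._registered.add(speaker_id)

    def receive(self, speaker_id, command):
        # Authentication is performed for each voice command.
        if speaker_id not in self._registered:
            return None  # reception rejected: speaker not registered
        return command   # reception permitted
```

Because the check runs on every `receive` call, removing a speaker from the registry immediately rejects their subsequent commands.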
Three-dimensional room analysis with audio input
Systems and methods are provided that generate a three-dimensional model from a physical space. While a user is scanning and/or recording the physical space with a user computing device, user speech describing the physical space is recorded. A transcript is generated from the audio captured during the scan and/or image recording of the physical space. Keywords from the transcript are used to improve computer-vision object identification, which is incorporated in the three-dimensional model.
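One plausible way transcript keywords could improve object identification is by re-weighting detection confidences, sketched below. The boost value and the `(label, confidence)` representation are assumptions for illustration.

```python
def boost_with_transcript(detections, transcript, boost=0.2):
    """Re-rank computer-vision hypotheses using words spoken during the scan.

    detections: list of (label, confidence) pairs from an object detector.
    Returns the list re-scored and sorted by descending confidence.
    """
    words = set(transcript.lower().split())
    rescored = [
        # Labels the user mentioned while scanning get a confidence boost.
        (label, min(1.0, conf + boost) if label in words else conf)
        for label, conf in detections
    ]
    return sorted(rescored, key=lambda lc: lc[1], reverse=True)
```

If the detector is torn between "couch" and "bench" but the user said "couch" during the scan, the spoken description tips the identification.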
CONTENT-BASED MULTIMEDIA RETRIEVAL WITH ATTENTION-ENABLED LOCAL FOCUS
Examples of the present disclosure describe systems and methods for content-based multimedia retrieval with attention-enabled local focus. In aspects, a search query comprising multimedia content may be received by a search system. A first semantic embedding representation of the multimedia content may be generated. The first semantic embedding representation may be compared to a stored set of candidate semantic embedding representations of other multimedia content. Based on the comparison, one or more candidate representations that are visually similar to the first semantic embedding representation may be selected from the stored set of candidate semantic embedding representations. The candidate representations may be ranked, and top ‘N’ candidate representations (or corresponding multimedia items) may be retrieved and provided as search results for the search query.
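The compare-and-rank step over semantic embeddings can be sketched with toy vectors and cosine similarity. A real system would use a learned embedding model; the vectors and function names below are illustrative only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_n(query_vec, candidates, n=2):
    """Rank stored candidate embeddings against the query; return the top n.

    candidates: dict mapping item identifier -> embedding vector.
    """
    ranked = sorted(candidates,
                    key=lambda k: cosine(query_vec, candidates[k]),
                    reverse=True)
    return ranked[:n]
```

The top 'N' identifiers returned here stand in for the candidate representations (or corresponding multimedia items) provided as search results.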
Playlist trailers for media content playback during travel
Systems, devices, apparatuses, components, methods, and techniques for media content playback during travel are provided. An example method of generating a playlist trailer for a media playback device comprises: identifying a plurality of media content items from a playlist for inclusion in a playlist trailer; identifying snippets from the identified plurality of media content items; and combining the identified snippets to generate the playlist trailer.
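The three steps of the example method can be sketched as below. Representing items as `(title, duration)` tuples, the five-second snippet length, and the item cap are all assumptions for illustration.

```python
def make_trailer(playlist, snippet_len=5, max_items=3):
    """Build a playlist trailer: pick items, cut a snippet from each, combine.

    playlist: list of (title, duration_seconds) tuples.
    Returns the combined list of (title, snippet_seconds) snippets.
    """
    # Step 1: identify media content items from the playlist for inclusion.
    chosen = playlist[:max_items]
    # Steps 2-3: identify a snippet per item (clipped to the item's own
    # duration) and combine them into the trailer.
    return [(title, min(snippet_len, dur)) for title, dur in chosen]
```

Items shorter than the snippet length simply contribute their full duration, so the trailer never over-runs a track.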
DISTRIBUTED NETWORK RECORDING SYSTEM WITH SINGLE USER CONTROL
A master recording session at a server computer corresponds to a video content stored in memory accessible by the server computer. A first device and a second device are provided access to the master recording session and the master recording session is updated responsive to receipt of an update from the first device, where the update reflects initiation of playback of the video content at a time stamp corresponding to the timeline of the video content and includes an audio input configuration for the second device. The update is provided to the second device and an audio recording is received from the second device corresponding to a portion of the video content from the time stamp, where the audio recording is recorded by the second device using the implemented audio input configuration for the second device.
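The session-update flow can be sketched as a small state object. The class shape, field names, and byte-string stand-in for recorded audio are hypothetical.

```python
class MasterSession:
    """Sketch of a master recording session held at the server."""

    def __init__(self, video_id):
        self.video_id = video_id      # video content the session corresponds to
        self.timestamp = 0.0          # playback position on the video timeline
        self.audio_config = {}        # audio input configuration for device 2
        self.recordings = []

    def apply_update(self, timestamp, audio_config):
        # Update from the first (controlling) device: playback was initiated
        # at this timestamp, with this audio input configuration.
        self.timestamp = timestamp
        self.audio_config = audio_config
        # The update is what gets provided to the second device.
        return {"timestamp": timestamp, "audio_config": audio_config}

    def receive_recording(self, audio_bytes):
        # Audio recorded by the second device from the updated timestamp.
        self.recordings.append((self.timestamp, audio_bytes))
```

The controlling device never touches the recorder directly; both only exchange state through the master session, which is the "single user control" point.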
Systems and methods for displaying a context image for a multimedia asset
Systems and methods for displaying a context image for a multimedia asset are disclosed. In one embodiment, a system includes a programmable processor, and a display device. In some embodiments, the programmable processor is configured to identify a first multimedia asset being broadcast in a region, determine and retrieve a first context image associated with the first multimedia asset, and direct the display device to display the first context image during the broadcast of the first multimedia asset.
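The identify-retrieve-display sequence can be sketched with a simple lookup; the asset-to-image mapping and the list standing in for a display device are illustrative assumptions.

```python
# Hypothetical mapping from broadcast asset identifiers to context images.
CONTEXT_IMAGES = {"asset-1": "poster-1.png", "asset-2": "poster-2.png"}

def display_context_image(asset_id, display, images=CONTEXT_IMAGES):
    """Determine and retrieve the context image for an asset, then show it."""
    image = images.get(asset_id)
    if image is not None:
        display.append(image)  # stand-in for directing the display device
    return image
```

If no context image is associated with the broadcast asset, nothing is sent to the display and the function returns `None`.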
Methods, systems, and apparatuses to respond to voice requests to play desired video clips in streamed media based on matched closed-caption and subtitle text
Methods, systems, and apparatuses are described to implement voice search in media content for requesting media content of a video clip of a scene contained in the media content streamed to the client device; for capturing the voice request for the media content of the video clip to display at the client device, wherein the streamed media content is a selected video streamed from a video source; for applying an NLP solution to convert the voice request to text for matching to a set of one or more words contained in at least closed-caption text of the selected video; for associating matched words in the closed-caption text with a start index and an end index of the video clip contained in the selected video; and for streaming the video clip to the client device based on the start index and the end index associated with the matched closed-caption text.
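The caption-matching step, after the voice request has already been converted to text, can be sketched as below. The timed caption entries and the require-all-words matching rule are assumptions; the patent only specifies matching a set of words to closed-caption text.

```python
# Hypothetical closed-caption track: (start_index, end_index, caption text).
CAPTIONS = [
    (0.0, 4.0, "welcome back to the show"),
    (4.0, 9.0, "the chase scene begins downtown"),
    (9.0, 15.0, "credits roll"),
]

def find_clip(query_text, captions=CAPTIONS):
    """Return the (start, end) indices of the first caption containing
    every word of the converted voice request, or None if no match."""
    words = set(query_text.lower().split())
    for start, end, text in captions:
        if words <= set(text.split()):
            return start, end
    return None
```

The returned start and end indices are what the server would use to stream just that clip back to the client device.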