Patent classifications
H04N21/439
Systems, apparatus, and methods to improve watermark detection in acoustic environments
Methods, apparatus, systems, and articles of manufacture to improve watermark detection in acoustic environments are disclosed. An example apparatus includes at least one memory, instructions in the apparatus, and processor circuitry to execute and/or instantiate the instructions to encode a first symbol in a media file at a first symbol position on a first encoding layer of a multilayered watermark, and encode a second symbol in the media file at a second symbol position on a second encoding layer of the multilayered watermark, the first encoding layer and the second encoding layer including a plurality of symbol positions, one or more of the plurality of the symbol positions on at least one of the first encoding layer or the second encoding layer to be empty.
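As a rough illustration of the layered structure this abstract describes, the following Python sketch models a multilayered watermark as a list of layers, each holding a fixed number of symbol positions that may remain empty. The names `make_watermark`, `encode_symbol`, and `empty_positions`, the layer and position counts, and the integer symbol values are all assumptions made for illustration; the abstract does not specify how symbols are represented or embedded in the audio.

```python
from typing import List, Optional

# One layer is a fixed-length list of symbol positions; None marks an empty position.
Layer = List[Optional[int]]


def make_watermark(num_layers: int, num_positions: int) -> List[Layer]:
    """Create a watermark in which every symbol position on every layer is empty."""
    return [[None] * num_positions for _ in range(num_layers)]


def encode_symbol(watermark: List[Layer], layer: int, position: int, symbol: int) -> None:
    """Place a symbol at one position on one layer (hypothetical helper)."""
    if watermark[layer][position] is not None:
        raise ValueError("symbol position already occupied")
    watermark[layer][position] = symbol


def empty_positions(watermark: List[Layer], layer: int) -> List[int]:
    """Return the indices of the positions left empty on a layer."""
    return [i for i, s in enumerate(watermark[layer]) if s is None]


# Two layers with four positions each; one symbol is encoded per layer,
# so most positions on both layers stay empty.
watermark = make_watermark(num_layers=2, num_positions=4)
encode_symbol(watermark, layer=0, position=0, symbol=7)
encode_symbol(watermark, layer=1, position=2, symbol=3)
```

This sketch covers only the bookkeeping of which positions carry symbols. How a symbol is actually embedded in the media file is outside what the abstract states.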
Display device
Provided is a display device including a display unit, a storage unit configured to store information on a web page, a microphone configured to receive a user's voice command, a network interface unit configured to perform communication with a natural language processing (NLP) server, and a controller configured to transmit text data of the voice command to the NLP server, to receive intention analysis result information corresponding to the voice command from the NLP server, to select, as a final candidate address, one of a plurality of candidate addresses related to a search word included in the received intention analysis result information if the search word is not stored in the storage unit, and to access a website corresponding to the selected final candidate address.
Media content identification on mobile devices
A mobile device responds in real time to media content presented on a media device, such as a television. The mobile device captures temporal fragments of audio-video content on its microphone, camera, or both and generates corresponding audio-video query fingerprints. The query fingerprints are transmitted to a search server located remotely or used with a search function on the mobile device for content search and identification. Audio features are extracted and audio signal global onset detection is used for input audio frame alignment. Additional audio feature signatures are generated from local audio frame onsets, audio frame frequency domain entropy, and maximum change in the spectral coefficients. Video frames are analyzed to find a television screen in the frames, and a detected active television quadrilateral is used to generate video fingerprints to be combined with audio fingerprints for more reliable content identification.
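The abstract names audio frame frequency-domain entropy as one of the signature features but does not define it. One common reading is the Shannon entropy of the frame's normalized power spectrum, and the Python sketch below assumes that reading. It uses a naive discrete Fourier transform to stay dependency-free; the function names and the frame length are hypothetical.

```python
import cmath
import math
from typing import List


def magnitude_spectrum(frame: List[float]) -> List[float]:
    """Naive DFT magnitudes for the non-negative frequency bins of a real frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_entropy(frame: List[float]) -> float:
    """Shannon entropy (in bits) of the normalized power spectrum of one frame."""
    power = [m * m for m in magnitude_spectrum(frame)]
    total = sum(power)
    if total == 0:
        return 0.0  # a silent frame carries no spectral information
    probs = [p / total for p in power if p > 0]
    return -sum(p * math.log2(p) for p in probs)


# A pure tone concentrates energy in one bin (low entropy); an impulse
# spreads energy evenly over all bins (high entropy).
tone = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
impulse = [1.0] + [0.0] * 31
tone_entropy = spectral_entropy(tone)
impulse_entropy = spectral_entropy(impulse)
```

A production implementation would more likely use an FFT library and windowed frames. The naive transform here is only meant to keep the example self-contained.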
Guide voice output control system and guide voice output control method
A guide voice output control system includes a voice output control unit having a function of outputting a guide voice in response to a trigger and a function of executing interaction related processing having a reception stage for receiving voice, a recognition stage for recognizing voice, and an output stage for outputting voice based on a recognition result, in which the voice output control unit controls the output of the guide voice according to the processing stage of the interaction related processing when the trigger is generated during the execution of the processing, and dynamically controls the output of the guide voice according to whether or not the processing stage is a stage that does not affect the accuracy of voice recognition or listening difficulty of a user even if the guide voice is output.
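The stage-dependent control described here can be sketched as a small decision function. The abstract does not say which stages are safe for guide voice output, so the mapping below is an assumption: the recognition stage, in which the system is neither listening nor speaking, is treated as safe, while the reception and output stages defer the guide voice. The `Stage` enum, the `IDLE` state, and `handle_trigger` are hypothetical names.

```python
from enum import Enum, auto


class Stage(Enum):
    IDLE = auto()         # no interaction in progress (assumed extra state)
    RECEPTION = auto()    # system is receiving the user's voice
    RECOGNITION = auto()  # system is recognizing the received voice
    OUTPUT = auto()       # system is speaking its response


# Assumption: only these stages tolerate a guide voice without hurting
# recognition accuracy or making the response hard to hear.
SAFE_STAGES = {Stage.IDLE, Stage.RECOGNITION}


def handle_trigger(stage: Stage) -> str:
    """Decide what to do with a guide voice trigger at the given stage."""
    return "output_now" if stage in SAFE_STAGES else "defer"


decisions = {stage.name: handle_trigger(stage) for stage in Stage}
```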
METHODS AND APPARATUS TO IDENTIFY USER PRESENCE TO A METER
Methods, apparatus, systems, and articles of manufacture are disclosed to identify user presence to a meter. An example apparatus includes memory, instructions, and at least one hardware processor to execute the instructions to at least: obtain presence information from a configuration device, the configuration device separate from the apparatus, the presence information indicating that a user is present at the apparatus; verify the presence information matches user information, the user information stored in a memory of the apparatus; cause a confirmation prompt to be displayed on the configuration device, the confirmation prompt to indicate the presence information was obtained by the apparatus; and store the presence information in memory.
METHODS AND SYSTEMS FOR SELECTIVE PLAYBACK AND ATTENUATION OF AUDIO BASED ON USER PREFERENCE
Systems and methods are presented for filtering unwanted sounds from a media asset. Voice profiles of a first character and a second character are generated based on a first voice signal and a second voice signal received from the media device during a presentation. The user provides a selection to avoid a certain sound or voice associated with the second character. During a presentation of the media asset, a second audio segment is analyzed to determine, based on the voice profile of the second character, whether the second voice signal includes the voice of the second character. If so, the output characteristics of the second voice signal are adjusted to reduce the sound.
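A minimal sketch of the match-then-attenuate step follows. It assumes each audio segment arrives with a speaker embedding vector and that a voice profile is a vector of the same kind, compared by cosine similarity. None of this is stated in the abstract, and the names `cosine_similarity` and `attenuate_matching`, the threshold, and the gain are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def attenuate_matching(
    segments: List[Tuple[Sequence[float], Sequence[float]]],
    blocked_profile: Sequence[float],
    threshold: float = 0.9,
    gain: float = 0.25,
) -> List[List[float]]:
    """Scale down the samples of every segment whose embedding matches the blocked profile."""
    result = []
    for embedding, samples in segments:
        if cosine_similarity(embedding, blocked_profile) >= threshold:
            result.append([s * gain for s in samples])
        else:
            result.append(list(samples))
    return result


# Each segment is (speaker embedding, audio samples); the second segment
# matches the profile the user asked to avoid.
segments = [([1.0, 0.0], [0.5, -0.5]), ([0.0, 1.0], [0.5, -0.5])]
filtered = attenuate_matching(segments, blocked_profile=[0.0, 1.0])
```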
COMMENTARY VIDEO GENERATION METHOD AND APPARATUS, SERVER, AND STORAGE MEDIUM
A commentary video generation method and apparatus, a server, and a storage medium are provided. The method includes: obtaining a game instruction frame, the game instruction frame including at least one game operation instruction, and the game operation instruction being used for controlling a virtual object; generating a commentary data stream based on the game instruction frame, the commentary data stream including at least one piece of commentary audio describing a game event, and the game event being triggered while the virtual object performs an in-game behavior; rendering a game screen based on the game instruction frame to generate a game video stream, the game video stream including at least one game video frame; and combining the commentary data stream with the game video stream to generate a commentary video stream, the game video frame and the commentary audio corresponding to the same game event being aligned in time in the commentary video stream.
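The time-alignment step at the end of this method can be illustrated with a short sketch. It assumes that each rendered video frame and each commentary clip carries an identifier for the game event it belongs to, and that a clip should start at the timestamp of the first frame showing that event. The abstract does not state how correspondence is established, so the classes `VideoFrame` and `Commentary` and the function `align` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class VideoFrame:
    timestamp: float          # seconds from the start of the game video stream
    event_id: Optional[str]   # game event visible in this frame, if any


@dataclass
class Commentary:
    event_id: str             # game event this commentary describes
    text: str                 # stands in for the commentary audio clip


def align(frames: List[VideoFrame], clips: List[Commentary]) -> List[Tuple[float, str]]:
    """Schedule each clip at the first frame timestamp showing its event."""
    first_seen: Dict[str, float] = {}
    for frame in frames:
        if frame.event_id is not None and frame.event_id not in first_seen:
            first_seen[frame.event_id] = frame.timestamp
    # Clips whose event never appears in the video are dropped.
    schedule = [(first_seen[c.event_id], c.text) for c in clips if c.event_id in first_seen]
    return sorted(schedule)


frames = [
    VideoFrame(0.0, None),
    VideoFrame(0.5, "kill_1"),
    VideoFrame(1.0, "kill_1"),
    VideoFrame(1.5, "tower_1"),
]
clips = [Commentary("tower_1", "The tower falls."), Commentary("kill_1", "First blood!")]
schedule = align(frames, clips)
```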
System for the automated, context sensitive, and non-intrusive insertion of consumer-adaptive content in video
Described herein are a method and system for automated, context-sensitive, and non-intrusive insertion of consumer-adaptive content in video. The method and system assess ‘context’ in the video that a consumer is viewing through multiple modalities and metadata about the video. They analyze relevance for a consumer based on multiple factors, such as the end user's profile information, content history, social media and consumer interests, and professional or educational background, through patterns from multiple sources. The system also implements local context through search techniques that localize sufficiently large, homogeneous regions in the image that do not obscure protagonists or objects in focus but are viable candidate regions for inserting the intended content. This makes relevant, curated content available to a user with little effort and without hampering the viewing experience of the main video.