Patent classifications
G06F16/7844
Assigning case identifiers to video streams
A process mining system performs process mining using visual logs generated from video streams of worker devices. Specifically, for a given worker device, the process mining system obtains a series of images capturing the screen of the worker device while it processes one or more tasks related to an operational process. The process mining system determines activity labels for a plurality of the images. An activity label for an image may indicate the activity performed on the worker device when the image was captured. The activity label is determined by extracting information from the pixels of the image and inferring the activity of the worker device from the extracted information. The process mining system generates event logs from the visual logs of the worker devices and uses the event logs for process mining.
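As a rough, hypothetical sketch of the pipeline this abstract describes, the Python below turns screen captures into a sorted event log. The names `extract_text` and `infer_activity` are illustrative stand-ins; the rule-based inference here is a toy, not the patented method.

```python
from collections import namedtuple
from datetime import datetime

Event = namedtuple("Event", ["case_id", "activity", "timestamp"])

def extract_text(screen_capture):
    # Placeholder for pixel-level information extraction (e.g. OCR).
    # For illustration we pretend the capture is already text.
    return screen_capture

def infer_activity(extracted_text):
    # Toy rule-based inference; a real system might use a trained classifier.
    text = extracted_text.lower()
    if "invoice" in text:
        return "Review invoice"
    if "inbox" in text:
        return "Read email"
    return "Unknown activity"

def build_event_log(visual_log):
    # visual_log: iterable of (case_id, screen_capture, capture_time) tuples.
    events = [
        Event(case_id, infer_activity(extract_text(capture)), ts)
        for case_id, capture, ts in visual_log
    ]
    return sorted(events, key=lambda e: e.timestamp)

log = build_event_log([
    ("case-1", "ERP screen: Invoice #1042", datetime(2024, 1, 1, 9, 0)),
    ("case-1", "Mail client: Inbox", datetime(2024, 1, 1, 9, 5)),
])
print(log)
```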
Content Analysis to Enhance Voice Search
Methods and apparatus for improving speech recognition accuracy in media content searches are described. An advertisement for a media content item is analyzed to identify keywords that may describe the media content item. The identified keywords are associated with the media content item for use during a voice search to locate the media content item. A user may speak one or more of the keywords as a search input and be provided with the media content item as a result of the search.
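A minimal sketch of the keyword-association idea: keywords are pulled from ad copy with a simple frequency heuristic (the abstract does not specify the extraction method) and stored in an inverted index so a recognized utterance can resolve to the media item. All names are illustrative.

```python
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "for", "in", "on", "to", "is", "about"}

def extract_keywords(ad_text: str, top_n: int = 5) -> list[str]:
    # Naive keyword extraction: most frequent non-stopword terms.
    words = [w.strip(".,!?").lower() for w in ad_text.split()]
    counts = Counter(w for w in words if w and w not in STOPWORDS)
    return [w for w, _ in counts.most_common(top_n)]

def index_media(index: dict, media_id: str, ad_text: str) -> None:
    # Associate each extracted keyword with the media item.
    for kw in extract_keywords(ad_text):
        index.setdefault(kw, set()).add(media_id)

def voice_search(index: dict, recognized_utterance: str) -> set:
    # Match any spoken word against the keyword index.
    hits = set()
    for word in recognized_utterance.lower().split():
        hits |= index.get(word, set())
    return hits

index: dict = {}
index_media(index, "movie-42", "An epic space adventure about a lone astronaut")
print(voice_search(index, "space astronaut"))  # -> {'movie-42'}
```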
Electronic apparatus, document displaying method thereof and non-transitory computer readable recording medium
The disclosure relates to an artificial intelligence (AI) system using a machine learning algorithm such as deep learning, and an application thereof. In particular, an electronic apparatus, a document displaying method thereof, and a non-transitory computer readable recording medium are provided. An electronic apparatus according to an embodiment of the disclosure includes a display unit that displays a document, a microphone that receives a user's voice, and a processor configured to acquire at least one topic from the contents of the plurality of pages constituting the document, recognize a voice input through the microphone, match the recognized voice with one of the acquired topics, and control the display unit to display a page that includes the matched topic.
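A hypothetical sketch of the topic-matching step: one topic keyword set per page (topic acquisition itself is stubbed out), with the recognized voice matched to the page whose topics overlap it most. The overlap measure is an assumption for illustration.

```python
def match_page(recognized_voice: str, page_topics: dict) -> int:
    # Pick the page whose topic keywords share the most words with the utterance.
    spoken = set(recognized_voice.lower().split())
    return max(page_topics, key=lambda page: len(spoken & page_topics[page]))

page_topics = {
    1: {"introduction", "overview"},
    2: {"deep", "learning", "training"},
    3: {"results", "evaluation"},
}
print(match_page("show me the deep learning part", page_topics))  # -> 2
```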
Video Title Generation Method, Device, Electronic Device and Storage Medium
Provided are a video title generation method, an electronic device and a storage medium, which relate to the technical field of video and, in particular, to the technical field of short video. The method includes: obtaining a plurality of pieces of optional text information for a first video file; determining central text information from the plurality of pieces of optional text information, the central text information being the optional text information with the highest similarity to the content of the first video file; and determining the central text information as the title of the first video file. Further, an interest point in an original video file can be determined according to a user's interactive behavior data on the original video file, and the original video file can be clipped based on the interest point to obtain a plurality of clipped video files, namely, short videos.
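A sketch of the "highest similarity" selection, using a simple Jaccard word overlap as the similarity measure; the abstract does not fix a particular measure, so this choice is an assumption.

```python
def jaccard(a: str, b: str) -> float:
    # Word-set overlap between two texts, in [0, 1].
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def central_text(optional_texts: list, video_content_text: str) -> str:
    # The candidate most similar to the video content becomes the title.
    return max(optional_texts, key=lambda t: jaccard(t, video_content_text))

title = central_text(
    ["Cute cat compilation", "Dog tricks", "Cat chases laser pointer"],
    "a cat chases a red laser pointer across the floor",
)
print(title)  # -> "Cat chases laser pointer"
```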
GENERATING VERIFIED CONTENT PROFILES FOR USER GENERATED CONTENT
Systems and methods for searching, identifying, scoring, and providing access to companion media assets for a primary media asset are disclosed. In response to a request for companion content, metadata within a predefined time period of the play position at which the request was made is downloaded. A dynamic search template containing search parameters based on the downloaded metadata is generated. In response to a search conducted using the search template, a plurality of companion media assets are identified and then verified. A trust score for each companion media asset is accessed. The trust score may be analyzed and modified based on its contextual relationship to the play position of the primary media asset. If the trust score is within a rating range, a link to access the companion media asset, or a specific segment or play position within the companion media asset, is provided.
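A hypothetical sketch of the trust-score gate: the stored score is adjusted by a contextual-relevance factor and a link is returned only when the adjusted score falls inside the acceptable rating range. The adjustment rule, the range, and the URL are all illustrative assumptions.

```python
def companion_link(asset_id: str, base_trust: float, contextual_relevance: float,
                   rating_range=(0.6, 1.0)):
    # Modify the trust score by its contextual relationship to the play position.
    adjusted = base_trust * contextual_relevance
    low, high = rating_range
    if low <= adjusted <= high:
        # Link may point at a specific play position within the companion asset.
        return f"https://example.com/watch/{asset_id}?t=42"
    return None  # score outside the rating range: no link provided

print(companion_link("bts-clip", base_trust=0.9, contextual_relevance=0.8))
```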
VIDEO GENERATION
A video generation method is provided. The method includes: obtaining global semantic information and local semantic information of a text, where the local semantic information corresponds to a text fragment in the text; searching, based on the global semantic information, a database to obtain at least one piece of first data corresponding to the global semantic information; searching, based on the local semantic information, the database to obtain at least one piece of second data corresponding to the local semantic information; obtaining, based on the at least one piece of first data and the at least one piece of second data, a candidate data set; matching, based on the relevancy between each of at least one text fragment and the corresponding candidate data in the candidate data set, target data for the at least one text fragment; and generating, based on the target data matched with each of the at least one text fragment, a video.
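A hypothetical sketch of the retrieval-and-matching flow: candidates retrieved for the global and local semantics are pooled, then each text fragment is matched to the candidate with the highest relevancy. `search` and `relevancy` are stand-ins for the database search and scoring steps, and the tag-based scoring is a toy assumption.

```python
def search(database, semantic_info):
    # Stand-in for the database search step: match on tags.
    return [d for d in database if semantic_info in d["tags"]]

def relevancy(fragment: str, data) -> float:
    # Toy relevancy: fraction of fragment words appearing in the data's tags.
    frag = set(fragment.lower().split())
    return len(frag & set(data["tags"])) / max(len(frag), 1)

def assemble_video(database, global_info, fragments_with_local_info):
    candidates = list(search(database, global_info))           # first data
    for _, local_info in fragments_with_local_info:
        candidates += search(database, local_info)             # second data
    return [
        max(candidates, key=lambda d: relevancy(fragment, d))  # target data
        for fragment, _ in fragments_with_local_info
    ]

db = [
    {"id": "clip-1", "tags": ["cooking", "pasta", "boil"]},
    {"id": "clip-2", "tags": ["cooking", "sauce", "tomato"]},
]
print(assemble_video(db, "cooking", [("boil the pasta", "pasta"),
                                     ("add tomato sauce", "sauce")]))
```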
Relevance-based image selection
A system, computer-readable storage medium, and computer-implemented method present video search results responsive to a user keyword query. The video hosting system uses a machine learning process to learn a feature-keyword model that associates features of media content from a labeled training dataset with keywords descriptive of that content. The system uses the learned model to provide video search results relevant to a keyword query based on features found in the videos. Furthermore, the system determines and presents one or more thumbnail images representative of the video using the learned model.
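A sketch of how a learned feature-keyword model might be applied to pick a thumbnail: frames are scored against a query keyword's learned feature weights, and the highest-scoring frame wins. The weights below are made up; in the abstract they come from machine learning over a labeled training dataset.

```python
def frame_score(frame_features: dict, keyword_weights: dict) -> float:
    # Dot product between frame feature values and learned keyword weights.
    return sum(v * keyword_weights.get(f, 0.0) for f, v in frame_features.items())

def pick_thumbnail(frames: dict, keyword_weights: dict) -> str:
    # Frame most relevant to the keyword becomes the thumbnail.
    return max(frames, key=lambda fid: frame_score(frames[fid], keyword_weights))

weights_for_surfing = {"water": 0.9, "board": 0.8, "crowd": -0.2}  # hypothetical
frames = {
    "frame-10": {"water": 0.7, "crowd": 0.9},
    "frame-55": {"water": 0.8, "board": 0.9},
}
print(pick_thumbnail(frames, weights_for_surfing))  # -> "frame-55"
```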
SYSTEMS AND METHODS FOR INSERTING EMOTICONS WITHIN A MEDIA ASSET
Systems and methods are described herein for inserting emoticons within a media asset based on an audio portion of the media asset. Each audio portion of the media asset is associated with a respective part of speech, and an emotion corresponding to the audio portion of the media asset is determined. A corresponding emoticon is identified based on the emotion determined for the audio portion and is caused to be presented at a corresponding location within the media asset.
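A hypothetical sketch of the emotion-to-emoticon mapping: each audio portion carries a detected emotion and a timestamp, and the matching emoticon is scheduled for presentation at that location in the asset. Emotion detection itself is out of scope here, and the mapping table is an assumption.

```python
EMOTICONS = {"joy": "😀", "sadness": "😢", "surprise": "😮"}

def insert_emoticons(audio_portions):
    # audio_portions: list of (start_seconds, detected_emotion) pairs.
    # Returns (timestamp, emoticon) placements for the media asset.
    return [
        (start, EMOTICONS[emotion])
        for start, emotion in audio_portions
        if emotion in EMOTICONS
    ]

print(insert_emoticons([(12.5, "joy"), (40.0, "surprise"), (55.0, "neutral")]))
```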
Facilitating contextual video searching using user interactions with interactive computing environments
A method includes detecting control of an active content creation tool of an interactive computing system in response to a user input received at a user interface of the interactive computing system. The method also includes automatically updating a video search query based on the detected control of the active content creation tool to include context information about the active content creation tool. Further, the method includes performing a video search of video captions from a video database using the video search query and providing search results of the video search to the user interface of the interactive computing system.
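A sketch of the query update under illustrative assumptions: when the user activates a tool, context about that tool is folded into the search query, and video captions are searched with the combined terms. The caption "database" is a plain list here.

```python
def update_query(base_query: str, active_tool: str) -> str:
    # Fold context about the active content creation tool into the query.
    return f"{base_query} {active_tool}".strip()

def search_captions(captions, query: str):
    # Return captions containing every query term.
    terms = query.lower().split()
    return [c for c in captions if all(t in c.lower() for t in terms)]

captions = [
    "How to use the clone stamp tool for retouching",
    "Cropping basics for beginners",
]
query = update_query("retouching", active_tool="clone stamp")
print(search_captions(captions, query))
```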
VIDEO PROCESSING OPTIMIZATION AND CONTENT SEARCHING
Techniques are disclosed for automatic scene detection and character extraction. In one example, audiovisual content comprising video frames, an audio recording, and timing information is received. A score based on visual characteristics is determined for a first frame and for subsequent frames. The first frame's score and the subsequent frames' scores are compared to determine whether the difference between the scores is above a threshold. When the difference in scores is above the threshold, the subsequent frame is classified as the start of a new scene. The audiovisual content is segmented into scenes, and textual characters are identified in at least one frame from each scene. The characters are stored and indexed in a searchable database along with the timing information for the scene in which the characters were identified. The audio recording is transcribed, and the transcribed words are stored and indexed in the searchable database with timing information.
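A sketch of the thresholded scene cut: each frame gets a score from its visual characteristics (the fake values below stand in for whatever the scoring produces), and a new scene starts whenever the score jumps by more than the threshold. The threshold value is an assumption.

```python
def detect_scenes(frame_scores, threshold: float = 0.3):
    # Return indices of frames that open a new scene.
    cuts = [0]  # the first frame always opens a scene
    for i in range(1, len(frame_scores)):
        if abs(frame_scores[i] - frame_scores[i - 1]) > threshold:
            cuts.append(i)  # score jump above threshold: new scene
    return cuts

scores = [0.10, 0.12, 0.11, 0.80, 0.82, 0.20]
print(detect_scenes(scores))  # -> [0, 3, 5]
```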