G06F16/7328

EFFICIENT EXPLORER FOR RECORDED MEETINGS
20230029278 · 2023-01-26

One example method includes generating a searchable video library. Video files are processed to extract text corresponding to the speech and to the images. The extracted text is semantically searched such that specific portions or locations of video files can be identified and returned in response to a query.

Automated programming of a remote control
11699341 · 2023-07-11

An electronic device that obtains a set of remote-control commands is described. During operation, the electronic device may receive an image associated with a second electronic device, where a brand and a model of the second electronic device are initially unknown to the electronic device. Then, the electronic device may perform image analysis on the image to determine at least the brand of the second electronic device. Moreover, the electronic device may access, based at least in part on the determined brand, the set of remote-control commands that are associated with the second electronic device. Next, the electronic device may store the set of remote-control commands in memory. Subsequently, when the electronic device receives user-interface activity information associated with a portable electronic device that specifies selection of the second electronic device, the electronic device may provide the set of remote-control commands to the second electronic device.

Method for Constructing Positioning DB Using Clustering of Local Features and Apparatus for Constructing Positioning DB
20220414150 · 2022-12-29

A method of constructing a positioning DB, performed by an apparatus for constructing the positioning DB, may comprise: extracting a plurality of local features from a plurality of keyframes capturing a predetermined region; determining an individual 3D keypoint including information on a 3-dimensional position of each of the plurality of local features; clustering the plurality of local features into a plurality of clusters based on the individual 3D keypoint; determining representative position information indicating a position of each of the plurality of clusters by using the individual 3D keypoints of the local features included in each of the plurality of clusters; and storing, for each of the plurality of keyframes, a cluster identification for identifying each of the plurality of clusters and the representative position information of each of the plurality of clusters in the positioning DB.
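
The clustering and storage steps of this abstract can be illustrated with a minimal sketch. It assumes a fixed-size voxel grid as the clustering rule and the centroid as the representative position; both are stand-ins chosen for illustration, since the abstract does not name a clustering algorithm. The function name `build_positioning_db` and the `cell` parameter are hypothetical.

```python
from collections import defaultdict

def build_positioning_db(features, cell=1.0):
    """features: list of (keyframe_id, (x, y, z)) pairs, one per local feature.

    Assumption: features whose 3D keypoints fall in the same grid cell of
    side `cell` form one cluster (a simple stand-in for the clustering step).
    """
    clusters = defaultdict(list)
    for keyframe_id, point in features:
        cell_key = tuple(int(c // cell) for c in point)
        clusters[cell_key].append((keyframe_id, point))

    db = defaultdict(list)
    for cluster_id, (_, members) in enumerate(sorted(clusters.items())):
        n = len(members)
        # Representative position: centroid of the member keypoints (assumed).
        centroid = tuple(sum(p[i] for _, p in members) / n for i in range(3))
        # Store the cluster id and representative position per keyframe.
        for keyframe_id in sorted({kf for kf, _ in members}):
            db[keyframe_id].append((cluster_id, centroid))
    return dict(db)

features = [
    ("k1", (0.25, 0.25, 0.25)),
    ("k2", (0.75, 0.75, 0.75)),
    ("k2", (5.0, 5.0, 5.0)),
]
db = build_positioning_db(features)
```

Here keyframes k1 and k2 share the first cluster, so both store the same cluster identification and representative position, which is the compaction the abstract appears to aim for.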

METHODS, SYSTEMS, AND MEDIA FOR IMAGE SEARCHING
20220405322 · 2022-12-22

Methods, systems, and media for image searching are described. Images comprising one query image and a plurality of candidate images are received. For each candidate image of the plurality of candidate images, a first model similarity measure is determined from an output of a first model configured for scene classification to perceive scenes in the images. Further, for each candidate image, a second model similarity measure is determined from the output of a second model configured for attribute classification to perceive attributes in the images. For each candidate image, a similarity agglomerate index is computed as a weighted aggregate of the first model similarity measure and the second model similarity measure. The plurality of candidate images are ranked based on the respective similarity agglomerate index of each candidate image, and first ranked candidate images corresponding to the searched images are generated.
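
As a rough illustration of the weighted aggregate, the sketch below ranks candidates by a weighted sum of two similarity measures. Cosine similarity, the equal default weights, and the names `rank_candidates`, `scene`, and `attr` are assumptions made for the example; the abstract does not specify how each similarity is measured or weighted.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_candidates(query, candidates, w_scene=0.5, w_attr=0.5):
    """query and each candidate hold hypothetical 'scene' and 'attr' model outputs."""
    scored = []
    for name, cand in candidates.items():
        s_scene = cosine(query["scene"], cand["scene"])  # first model similarity
        s_attr = cosine(query["attr"], cand["attr"])     # second model similarity
        scored.append((w_scene * s_scene + w_attr * s_attr, name))
    scored.sort(key=lambda t: t[0], reverse=True)        # highest index first
    return [name for _, name in scored]

query = {"scene": [1.0, 0.0], "attr": [1.0, 0.0]}
candidates = {
    "a": {"scene": [1.0, 0.0], "attr": [1.0, 0.0]},
    "b": {"scene": [0.0, 1.0], "attr": [1.0, 0.0]},
    "c": {"scene": [0.0, 1.0], "attr": [0.0, 1.0]},
}
ranking = rank_candidates(query, candidates)
```

Candidate a matches on both scene and attributes, b on attributes only, and c on neither, so the ranking places them in that order.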

Providing approximate top-k nearest neighbours using an inverted list

Various embodiments are provided for implementing an approximate nearest neighbour (ANN) search in a computing environment. An approximate nearest neighbour of a plurality of feature vectors in hyper-planes with dynamically variable subspaces may be retrieved by searching an inverted index.
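
One generic way to pair hyper-planes with an inverted list is sign hashing: each vector is keyed by the side of each hyper-plane it falls on, and a query inspects only nearby buckets. The sketch below shows that generic technique under assumed fixed hyper-planes; it is not the dynamically variable subspace scheme of the abstract, and all function names are hypothetical.

```python
from collections import defaultdict

def signature(vec, planes):
    """One bit per hyper-plane: 1 if vec lies on its non-negative side."""
    return tuple(
        1 if sum(v * p for v, p in zip(vec, plane)) >= 0 else 0
        for plane in planes
    )

def build_index(vectors, planes):
    """Inverted list: signature -> ids of the vectors that hash to it."""
    index = defaultdict(list)
    for vid, vec in vectors.items():
        index[signature(vec, planes)].append(vid)
    return index

def query_top_k(q, vectors, index, planes, k=2):
    sig = signature(q, planes)
    # Probe the query's own bucket plus every bucket one bit away.
    probes = [sig] + [sig[:i] + (1 - sig[i],) + sig[i + 1:] for i in range(len(sig))]
    candidates = {vid for p in probes for vid in index.get(p, [])}

    def dist(vid):
        return sum((a - b) ** 2 for a, b in zip(q, vectors[vid]))

    # Exact distances are computed only for the candidate set.
    return sorted(candidates, key=lambda vid: (dist(vid), vid))[:k]

planes = [(1.0, 0.0), (0.0, 1.0)]
vectors = {"a": (1.0, 1.0), "b": (2.0, 2.0), "c": (-1.0, -1.0), "d": (1.0, -1.0)}
index = build_index(vectors, planes)
top = query_top_k((1.2, 1.0), vectors, index, planes, k=2)
```

The result is approximate because vectors in unprobed buckets (here c) are never compared against the query.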

Computer-implemented method, computer program and apparatus for generating a video stream recommendation
11594114 · 2023-02-28

A computer-implemented method of generating a video stream recommendation comprises identifying a plurality of peripheral devices monitoring zones of a physical area, the peripheral devices comprising a plurality of video cameras providing video streams of at least some of the monitored zones. The method further comprises querying a knowledge graph representing the peripheral devices and the monitored zones as ontology entities connected by edges representing physical paths between the monitored zones, and by edges representing which monitored zones the peripheral devices monitor, in order to identify a set of one or more video camera(s) monitoring zones other than a selected monitored zone, as a result of the querying. The method then comprises generating a video stream recommendation based on the result of the querying.
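
The graph query can be sketched with plain edge lists. The example assumes the recommendation is the set of cameras monitoring zones one physical path away from the selected zone; the abstract only says the cameras monitor zones other than the selected one, so the one-hop rule, the function name, and the zone and camera names are illustrative assumptions.

```python
def recommend_cameras(paths, monitors, selected_zone):
    """paths: (zone, zone) edges for physical paths between zones.
    monitors: (camera, zone) edges for which zone each camera monitors.
    """
    adjacent = set()
    for a, b in paths:
        if a == selected_zone:
            adjacent.add(b)
        if b == selected_zone:
            adjacent.add(a)
    adjacent.discard(selected_zone)  # only zones other than the selected one
    return sorted(cam for cam, zone in monitors if zone in adjacent)

paths = [("lobby", "hall"), ("hall", "office"), ("lobby", "garage")]
monitors = [("cam1", "lobby"), ("cam2", "hall"), ("cam3", "office"), ("cam4", "garage")]
recommended = recommend_cameras(paths, monitors, "lobby")
```

With the lobby selected, the cameras covering the hall and the garage are recommended, while the office camera is two paths away and is left out.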

Adaptive search results for multimedia search queries

Certain embodiments involve adaptive search results for multimedia search queries to provide dynamic previews. For instance, a computing system receives a search query that includes a keyword. The computing system identifies, based on the search query, a video file having keyframes with content tags that match the search query. The computing system determines matching scores for respective keyframes of the identified video file. The computing system generates a dynamic preview from at least two keyframes having the highest matching scores.
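
A minimal sketch of the keyframe selection follows. It assumes the matching score is the number of query terms found among a keyframe's content tags, and that the preview plays the selected keyframes in chronological order; the abstract does not define the score, and the names `dynamic_preview`, `tags`, and `time` are hypothetical.

```python
def dynamic_preview(keyframes, query, n=2):
    """Return the timestamps of the n best-matching keyframes, in playback order."""
    terms = set(query.lower().split())
    scored = [
        (len(terms & {tag.lower() for tag in kf["tags"]}), kf["time"])
        for kf in keyframes
    ]
    # Highest score first; earlier keyframes win ties.
    top = sorted(scored, key=lambda s: (-s[0], s[1]))[:n]
    # Drop keyframes that match nothing, then restore chronological order.
    return sorted(time for score, time in top if score > 0)

keyframes = [
    {"time": 0, "tags": ["dog"]},
    {"time": 5, "tags": ["dog", "beach"]},
    {"time": 9, "tags": ["car"]},
    {"time": 12, "tags": ["beach", "dog", "ball"]},
]
preview = dynamic_preview(keyframes, "dog beach")
```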

Media item matching using search query analysis
11487806 · 2022-11-01

A system and method are disclosed for media item matching using search query analysis. In an implementation, the method includes identifying, by a processing device, a first media item that has been removed from a media hosting platform due to a removal request associated with a reference media item of a first media owner; identifying, by the processing device, a search query corresponding to the first media item based on a history of search queries, wherein a search result of the search query included the first media item; obtaining, by the processing device, one or more additional media items included in the search result of the search query; and providing the one or more additional media items to the first media owner to determine whether to initiate one or more actions regarding the one or more additional media items.

QUESTION ANSWERING APPARATUS AND METHOD

A question answering method that is performed by a question answering apparatus includes: receiving a data set including video content and question-answer pairs; generating input time-series sequences from the video content of the input data set and also generating a question-answer time-series sequence from the question-answer pair of the input data set; calculating weights by associating the input time-series sequence with the question-answer time-series sequence and also calculating first result values by performing operations on the calculated weights and the input time-series sequences; calculating second result values by paying attention to portions of the input time-series sequences that are directly related to characters appearing in questions and answers; and calculating third result values by concatenating the time-series sequences, the first result values, the second result values, and Boolean flags and selecting a final answer based on the third result values.

METHOD AND SYSTEM FOR RETRIEVING VIDEO SEGMENT BY A SEMANTIC QUERY

Provided is a method of detecting a semantic section in a video. The method includes extracting all video features by feeding an input video to a pre-trained first deep neural network algorithm, extracting a query sentence feature by feeding an input query sentence to a pre-trained second deep neural network algorithm, generating video-query relation integration feature information, in which all of the video features and the query sentence feature are integrated, by inputting all of the video features and the query sentence feature to a plurality of scaled dot-product attention layers, and estimating a video segment corresponding to the query sentence in the video based on the video-query relation integration feature information.
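
For reference, a single scaled dot-product attention layer computes softmax(QK^T / sqrt(d)) V. The sketch below implements that standard operation in plain Python, using a query sentence feature as Q and per-clip video features as K and V. How the abstract's layers are stacked and what their inputs look like is not stated, so the toy vectors and the choice of Q, K, and V are assumptions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(wi * v[j] for wi, v in zip(w, values)) for j in range(len(values[0]))]
        )
    return outputs, weights

sentence_feature = [[1.0, 0.0]]            # hypothetical query sentence feature
clip_features = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical per-clip video features
outputs, weights = scaled_dot_product_attention(
    sentence_feature, clip_features, clip_features
)
```

The attention weights sum to one, and the clip whose feature aligns with the sentence feature receives the larger weight.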