Patent classifications
G06F16/7847
Method and system for performing content-aware deduplication of video files
The invention relates to a method and system for performing content-aware deduplication of video files and content storage cost optimization. The method includes pre-processing video files into a plurality of groups of video files based on type of genre and run-time of a video. The genre of a plurality of video files is automatically detected using a sliding-window similarity index, which is utilized to improve accuracy of genre detection. After the pre-processing step, each group of the plurality of groups of video files is simultaneously fed into a plurality of machine learning (ML) instances and models, which measure a degree of similarity corresponding to each group of video files by detecting one or more conditions that exist in the video files. The one or more conditions are detected by performing deep inspection of content in the video files using hash-based active recognition of objects.
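The pre-processing step in this abstract (grouping by genre and run-time before any similarity model runs) can be illustrated with a minimal Python sketch. The record fields `id`, `genre`, and `runtime_s`, the 60-second bucket width, and both function names are hypothetical and do not come from the patent; the ML similarity stage itself is not modeled here.

```python
from collections import defaultdict
from itertools import combinations


def group_videos(videos, bucket_seconds=60):
    """Group video ids by (genre, run-time bucket).

    Assumption: each video is a dict with hypothetical keys
    "id", "genre", and "runtime_s" (run-time in whole seconds).
    """
    groups = defaultdict(list)
    for video in videos:
        key = (video["genre"], video["runtime_s"] // bucket_seconds)
        groups[key].append(video["id"])
    return dict(groups)


def candidate_pairs(groups):
    """List the id pairs that share a group; only these need a similarity check."""
    pairs = []
    for ids in groups.values():
        pairs.extend(combinations(sorted(ids), 2))
    return pairs
```

Grouping first means a similarity model only has to compare files that already agree on genre and approximate length, which is one plausible reading of why the abstract places pre-processing ahead of the ML stage.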
Media fingerprinting and identification system
The overall architecture and details of a scalable video fingerprinting and identification system that is robust with respect to many classes of video distortions are described. In this system, a fingerprint for a piece of multimedia content is composed of a number of compact signatures, along with traversal hash signatures and associated metadata. Numerical descriptors are generated for features found in a multimedia clip, signatures are generated from these descriptors, and a reference signature database is constructed from these signatures. Query signatures are also generated for a query multimedia clip. These query signatures are searched against the reference database using a fast similarity search procedure to produce a candidate list of matching signatures. This candidate list is further analyzed to find the most likely reference matches. Signature correlation is performed between the likely reference matches and the query clip to improve detection accuracy.
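The lookup flow in this abstract (bucket by traversal hash, collect candidates, then rank likely references) can be sketched in Python as follows. Everything concrete here is an assumption made for illustration: signatures are treated as 32-bit integers, the traversal hash is taken to be the top 8 bits, Hamming distance stands in for the similarity measure, and a simple vote count stands in for the signature correlation step. The abstract does not specify any of these choices.

```python
from collections import Counter, defaultdict

HASH_BITS = 8  # hypothetical: top bits of a 32-bit signature act as the traversal hash


def traversal_hash(signature):
    """Return the bucket key for a 32-bit signature (its top HASH_BITS bits)."""
    return signature >> (32 - HASH_BITS)


def build_reference_db(reference_clips):
    """Index every (clip_id, signature) pair under its traversal hash."""
    db = defaultdict(list)
    for clip_id, signatures in reference_clips.items():
        for sig in signatures:
            db[traversal_hash(sig)].append((clip_id, sig))
    return db


def hamming(a, b):
    """Count the bits in which two integer signatures differ."""
    return bin(a ^ b).count("1")


def identify(db, query_signatures, max_distance=3):
    """Return reference clip ids ranked by number of close signature matches."""
    votes = Counter()
    for query in query_signatures:
        # Only signatures in the same hash bucket are considered candidates.
        for clip_id, sig in db.get(traversal_hash(query), []):
            if hamming(query, sig) <= max_distance:
                votes[clip_id] += 1
    return [clip_id for clip_id, _ in votes.most_common()]
```

A limitation of this simplified bucketing is that a distortion flipping one of the top 8 bits moves the query into a different bucket, so a robust system would need a more tolerant index than this sketch provides.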
Surgical video retrieval based on preoperative images
A system includes a display, and a database including surgical videos, images of organs in a human body obtained from a medical imaging device, and images of disease in a human body obtained from the medical imaging device. A controller including a processor is coupled to memory, the database, and the display, and the memory stores information that, when executed by the processor, causes the system to perform operations. For example, the processor may determine first organ information from the images of the organs, and first disease information from the images of the disease. The processor may calculate a similarity score between (i) the first organ information and the first disease information and (ii) second organ information and second disease information indexed to the surgical videos. The processor selects one or more of the surgical videos based on the similarity score, and displays the surgical videos on the display.
Method and apparatus for multi-dimensional content search and video identification
A multi-dimensional database, its indexes, and operations on the multi-dimensional database are described, including video search applications and other similar sequence or structure searches. Traversal indexes utilize highly discriminative information about images and video sequences or about object shapes. Global and local signatures around keypoints are used for compact and robust retrieval and for the discriminative information content of images or video sequences of interest. For other objects or structures, relevant signatures of the pattern or structure are used for traversal indexes. Traversal indexes are stored in leaf nodes along with distance measures and occurrences of similar images in the database. During a sequence query, correlation scores are calculated for single frames, for frame sequences, and for video clips, or for other objects or structures.
Generating congruous metadata for multimedia
A method of generating congruous metadata is provided. The method includes receiving a similarity measure between at least two multimedia objects. Each multimedia object has associated metadata. If the at least two multimedia objects are similar based on the similarity measure and a similarity threshold, the associated metadata of each of the multimedia objects are compared. Then, based on the comparison of the associated metadata of each of the at least two multimedia objects, the method further includes generating congruous metadata. Metadata may be tags, for example.
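The threshold-then-compare flow in this abstract can be shown with a small Python sketch. The abstract does not say how congruous metadata is derived from the comparison, so the choice below (taking the union of both tag sets, so that both objects end up with the same tags) is purely an assumption, as are the function name and the default threshold of 0.8.

```python
def congruous_tags(tags_a, tags_b, similarity, threshold=0.8):
    """Return a shared tag set for two similar objects, or None if not similar.

    Assumption: "congruous" is modeled as the union of both tag sets;
    the source abstract leaves the actual rule unspecified.
    """
    if similarity < threshold:
        return None  # objects are not similar enough; metadata is left alone
    set_a, set_b = set(tags_a), set(tags_b)
    # The comparison step: tags present on one object but missing on the other
    # are exactly what the union adds back.
    return set_a | set_b
```

For example, `congruous_tags({"cat", "pet"}, {"cat", "kitten"}, 0.9)` yields a single set containing all three tags, while a similarity of 0.5 yields `None`.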
Temporal localization of mature content in long-form videos using only video-level labels
Techniques for temporal localization of mature content in long-form videos using only video-level labels are described. According to some embodiments, a computer-implemented method includes receiving a request to train a machine learning model on a training video file comprising at least one mature content label, training the machine learning model to generate a feature vector for each of a plurality of video frames of the training video file, generate a plurality of frame-level mature content classification scores of the training video file from the feature vectors of the training video file, and generate a video-level mature content classification score of the training video file from the plurality of frame-level mature content classification scores for the training video file based at least in part on the at least one mature content label of the training video file, receiving a request for an input video file, generating, by the machine learning model in response to the request, a feature vector for each of a plurality of video frames of the input video file, a plurality of frame-level mature content classification scores of the input video file from the feature vectors of the input video file, and a video-level mature content classification score of the input video file from the plurality of frame-level mature content classification scores for the input video file, and transmitting the plurality of frame-level mature content classification scores of the input video file or the video-level mature content classification score of the input video file to a client application or to a storage location.
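The relationship between frame-level and video-level scores in this abstract can be illustrated with a short Python sketch. The abstract does not state how frame scores are aggregated, so top-k mean pooling is used here as an assumed stand-in; the 0.5 threshold and both function names are likewise hypothetical. The model that produces the frame scores is not modeled.

```python
def video_level_score(frame_scores, top_k=3):
    """Aggregate frame-level scores into one video-level score.

    Assumption: the mean of the top_k highest frame scores is used;
    the source abstract does not name an aggregation function.
    """
    if not frame_scores:
        raise ValueError("frame_scores must not be empty")
    k = min(top_k, len(frame_scores))
    top = sorted(frame_scores, reverse=True)[:k]
    return sum(top) / k


def localize(frame_scores, threshold=0.5):
    """Return inclusive (start, end) frame index pairs for runs scoring >= threshold."""
    segments = []
    start = None
    for i, score in enumerate(frame_scores):
        if score >= threshold and start is None:
            start = i  # a new flagged run begins
        elif score < threshold and start is not None:
            segments.append((start, i - 1))  # the run ended on the previous frame
            start = None
    if start is not None:
        segments.append((start, len(frame_scores) - 1))
    return segments
```

Because the video-level score is computed from the frame scores, a video-level label can supervise training while the frame scores remain available afterwards for localization, which is the idea the title describes.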
Method and system for manufacturing operations workflow monitoring using structural similarity index based activity detection
The present invention discloses a method and a system for monitoring manufacturing operation workflow using Structural Similarity (SSIM) index based activity detection. The method comprises receiving video data corresponding to a manufacturing operation activity, extracting a plurality of video frames from the video data, measuring an SSIM index for each video frame of the plurality of video frames with respect to the next consecutive video frame of the plurality of video frames, comparing the SSIM index of each video frame with the SSIM index of the next consecutive video frame of the plurality of video frames to identify one or more local maxima, and determining at least one manufacturing operation activity based on the one or more local maxima using a machine learning technique.
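The first stages of this method (SSIM between consecutive frames, then local maxima in the resulting series) can be sketched in plain Python. This is a simplified illustration, not the patented method: it computes a single global SSIM per frame pair rather than the usual windowed average, treats each frame as a flat list of grayscale values, uses the conventional constants 0.01 and 0.03, and omits the final machine learning step. All function names are hypothetical.

```python
def ssim(x, y, dynamic_range=255.0):
    """Global (single-window) SSIM between two equal-length grayscale frames."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("frames must be non-empty and the same length")
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def consecutive_ssim(frames):
    """SSIM of each frame against the next one; length is len(frames) - 1."""
    return [ssim(a, b) for a, b in zip(frames, frames[1:])]


def local_maxima(values):
    """Indices whose value is strictly greater than both neighbors."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    ]
```

Identical frames give an SSIM of 1, and an abrupt change between frames drives it toward 0, so peaks in the series mark the steadiest stretches of the video.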
Systems and methods for generating bookmark video fingerprint
Systems and methods for replacing original media bookmarks of at least a portion of a digital media file with replacement bookmarks are described. A media fingerprint engine detects the location of the original fingerprints associated with the portion of the digital media file, and a region analysis algorithm characterizes regions of the media file spanning the location of the original bookmarks by data class types. The replacement bookmarks are associated with the data class types and are overwritten or otherwise substituted for the original bookmarks. The replacement bookmarks are then subjected to a fingerprint matching algorithm that incorporates media timeline and media-related metadata.