Patent classifications
G06F16/73
Multi-detector probabilistic reasoning for natural language queries
Systems and methods for solving queries on image data are provided. The system includes a processor device coupled to a memory device. The system includes a detector manager with a detector application programming interface (API) to allow external detectors to be inserted into the system by exposing capabilities of the external detectors and providing a predetermined way to execute the external detectors. An ontology manager exposes knowledge bases regarding ontologies to a reasoning engine. A query parser transforms a natural query into a query directed acyclic graph (DAG). The reasoning engine uses the query DAG, the ontology manager, and the detector API to plan an execution list of detectors. The reasoning engine then uses the query DAG, a scene representation DAG produced by the external detectors, and the ontology manager to answer the natural query.
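The planning step described in this abstract can be illustrated with a minimal sketch. It assumes the query DAG is a mapping from each concept to the concepts it depends on, and that each detector exposes a set of concepts it can detect. The detector names, the capability registry, and `plan_detectors` are hypothetical and are not taken from the patent; the ontology manager is left out for brevity.

```python
from graphlib import TopologicalSorter

# Hypothetical capability registry, standing in for what a detector API might expose.
DETECTOR_CAPABILITIES = {
    "object_detector": {"person", "ball"},
    "color_detector": {"red"},
    "relation_detector": {"holding"},
}

# Hypothetical query DAG for "Is the person holding a red ball?".
# Each key is a concept; its value is the set of concepts that must be resolved first.
QUERY_DAG = {
    "person": set(),
    "ball": set(),
    "red": {"ball"},
    "holding": {"person", "ball"},
}


def plan_detectors(query_dag, capabilities):
    """Return detector names in an order that respects the query DAG's dependencies."""
    plan = []
    # static_order() yields every concept after all of its prerequisites.
    for concept in TopologicalSorter(query_dag).static_order():
        detector = next(
            (name for name, concepts in capabilities.items() if concept in concepts),
            None,
        )
        if detector is None:
            raise LookupError(f"no detector exposes the concept {concept!r}")
        if detector not in plan:  # run each detector at most once
            plan.append(detector)
    return plan


plan = plan_detectors(QUERY_DAG, DETECTOR_CAPABILITIES)
print(plan)
```

In this sketch the object detector is always scheduled first, because every other concept depends on an object it finds. A concept that no registered detector covers raises an error instead of being skipped silently.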
Server and operating method thereof
An operating method of a server, comprising: mediating a video call session between a first terminal and a second terminal; receiving satisfaction information on the video call session of a user of the first terminal from the first terminal; preparing a combination of a specific type of personal information about a user of the first terminal and facial characteristic information about a user of the second terminal; calculating a correlation between the combination and the satisfaction information corresponding to the combination; receiving a video call mediating request from a third terminal and from a plurality of candidate terminals, and predicting satisfaction information of a user of the third terminal with respect to each of the candidate terminals; and choosing a fourth terminal among the candidate terminals.
METHOD OF PROCESSING VIDEO, METHOD OF QUERYING VIDEO, AND METHOD OF TRAINING MODEL
The present application provides a method of processing a video, a method of querying a video, and a method of training a video processing model. A specific implementation solution of the method of processing the video includes: extracting, for a video to be processed, a plurality of video features under a plurality of receptive fields; extracting a local feature of the video to be processed according to a video feature under a target receptive field in the plurality of receptive fields; obtaining a global feature of the video to be processed according to a video feature under a largest receptive field in the plurality of receptive fields; and merging the local feature and the global feature to obtain a target feature of the video to be processed.
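The feature-merging steps in this abstract can be sketched in a few lines. The abstract does not say how the receptive fields, the local feature, or the merge are implemented, so everything concrete here is an assumption made for illustration: a receptive field is modeled as an average-pooling window over per-frame vectors, the local feature is an element-wise maximum, the global feature is a mean, and merging is concatenation. All function names are hypothetical.

```python
def pool(frames, window):
    """Average non-overlapping windows of frame vectors (window = assumed receptive field)."""
    pooled = []
    for start in range(0, len(frames), window):
        chunk = frames[start:start + window]
        pooled.append([sum(column) / len(chunk) for column in zip(*chunk)])
    return pooled


def elementwise_max(vectors):
    return [max(column) for column in zip(*vectors)]


def elementwise_mean(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def target_feature(frames, receptive_fields, target_field):
    """Merge a local feature (target field) with a global feature (largest field)."""
    # One set of video features per receptive field.
    features = {field: pool(frames, field) for field in receptive_fields}
    local = elementwise_max(features[target_field])
    global_ = elementwise_mean(features[max(receptive_fields)])
    return local + global_  # concatenation, assumed here as the merge operation


# Four frames, each described by a 2-dimensional vector (made-up values).
frames = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
merged = target_feature(frames, receptive_fields=(1, 2, 4), target_field=2)
print(merged)  # [2.0, 3.0, 1.0, 1.5]
```

The first two values come from the target receptive field and the last two from the largest one, so the merged vector carries both a finer-grained and a whole-video summary.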
Systems and methods for determining playback points in media assets
Systems and methods are described for determining playback points in media assets based on both a keyword and a context of a current playback point in a media asset. For example, in response to user input of a keyword (e.g., “Matt Damon”) while the user is consuming a media asset, a current playback point in the media asset is determined. Context of the media asset at the current playback point is then determined (e.g., the current playback point involves a car chase). Playback points in the media asset are determined that match both the context and the keyword and are presented to the user (e.g., playback points with Matt Damon in a car chase).
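The matching logic in this abstract can be sketched as follows, assuming the media asset has already been segmented into scenes annotated with the people on screen and with context labels. The `Scene` structure, the annotations, and `matching_playback_points` are hypothetical; the abstract does not describe how context is actually determined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scene:
    start_seconds: float
    end_seconds: float
    people: frozenset
    context: frozenset


# Hypothetical scene annotations for one media asset.
SCENES = [
    Scene(0.0, 60.0, frozenset({"Matt Damon"}), frozenset({"dialogue"})),
    Scene(60.0, 120.0, frozenset({"Other Actor"}), frozenset({"car chase"})),
    Scene(120.0, 180.0, frozenset({"Matt Damon"}), frozenset({"car chase"})),
    Scene(180.0, 240.0, frozenset({"Matt Damon"}), frozenset({"car chase", "rooftop"})),
]


def scene_at(scenes, position_seconds):
    """Return the scene that contains the current playback point, if any."""
    for scene in scenes:
        if scene.start_seconds <= position_seconds < scene.end_seconds:
            return scene
    return None


def matching_playback_points(scenes, keyword, position_seconds):
    """Start times of scenes matching both the keyword and the current context."""
    current = scene_at(scenes, position_seconds)
    if current is None:
        return []
    return [
        scene.start_seconds
        for scene in scenes
        if keyword in scene.people and scene.context & current.context
    ]


# The user types "Matt Damon" while watching the car chase at 90 seconds.
print(matching_playback_points(SCENES, "Matt Damon", 90.0))  # [120.0, 180.0]
```

The opening scene is excluded even though it features the keyword, because its context does not overlap with the context at the current playback point.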
MEDIA STORAGE
A user of a storage system can upload files for a media asset, which can include a high quality media file and various related files. As part of the upload process, the storage system can extract metadata that describes the media asset. The user can specify one or more lifecycle policies to be applied for storage of the asset, and a rules engine can ensure the application of the one or more policies. The rules engine can also enable the use of simple media processing workflows. A filename hashing approach can be used to ensure that the segments and files for the asset are stored in a relatively random and even distribution across the partitions of the storage system. As part of the lifecycle for the asset, the high quality media file can be moved to less expensive storage once transcoding of the asset or another such action occurs.
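The filename hashing approach mentioned in this abstract can be illustrated with a short sketch. The abstract names no hash function or partition count, so SHA-256, 16 partitions, and the segment naming scheme below are all assumptions chosen for illustration.

```python
import hashlib
from collections import Counter


def partition_for(filename: str, num_partitions: int) -> int:
    """Map a filename to a partition index using a hash of the name."""
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    # Take the first 8 bytes as an integer and reduce it to a partition index.
    return int.from_bytes(digest[:8], "big") % num_partitions


# Hypothetical segment names that differ only by a sequential counter.
segments = [f"asset-001/segment-{i:05d}.ts" for i in range(10_000)]
counts = Counter(partition_for(name, 16) for name in segments)

for partition in sorted(counts):
    print(partition, counts[partition])
```

Sequential names would cluster if they were placed by prefix. Hashing the full name first spreads them across partitions roughly evenly, and the same name always maps to the same partition.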