Patent classifications
G06V20/41
METHOD FOR WAREHOUSE STORAGE-LOCATION MONITORING, COMPUTER DEVICE, AND NON-VOLATILE STORAGE MEDIUM
The disclosure relates to a method for warehouse storage-location monitoring, a computer device, and a storage medium. The method proceeds as follows. Video data of a warehouse storage-location area is obtained, and a target image corresponding to the warehouse storage-location area is obtained based on the video data, where the warehouse storage-location area includes an area of a storage-location and an area around the storage-location. The target image is detected based on a category detection model to determine a category of each object appearing in the target image, where the category includes at least one of: human, vehicle, or goods. A detection result is obtained by detecting a status of each object based on the category of each object, where the detection result includes at least one of: whether the human enters the warehouse storage-location area, vehicle status information, or storage-location inventory information. The detection result is transmitted to a warehouse scheduling system, which uses the detection result to monitor the warehouse storage-location area.
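The category-dependent status check described in this abstract can be illustrated with a minimal sketch. It assumes the category detection model has already produced labeled bounding boxes. The `Detection` type, the rectangular area representation, and the keys of the result dictionary are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One object reported by a (hypothetical) category detection model."""
    category: str  # "human", "vehicle", or "goods"
    box: tuple     # (x_min, y_min, x_max, y_max) in pixels


def box_center(box):
    """Return the center point of an axis-aligned box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def inside(point, area):
    """Check whether a point lies within a rectangular area."""
    x, y = point
    x_min, y_min, x_max, y_max = area
    return x_min <= x <= x_max and y_min <= y <= y_max


def build_detection_result(detections, monitored_area, storage_location):
    """Apply a different status check to each object depending on its category.

    Assumption: both areas are rectangles, and an object counts as being in
    an area when the center of its bounding box falls inside that area.
    """
    result = {
        "human_in_area": False,
        "vehicles_in_area": 0,
        "storage_location_occupied": False,
    }
    for det in detections:
        center = box_center(det.box)
        if det.category == "human" and inside(center, monitored_area):
            result["human_in_area"] = True
        elif det.category == "vehicle" and inside(center, monitored_area):
            result["vehicles_in_area"] += 1
        elif det.category == "goods" and inside(center, storage_location):
            result["storage_location_occupied"] = True
    return result
```

In a real deployment the resulting dictionary would be the payload sent to the warehouse scheduling system. The transport is left out here because the abstract does not specify it.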
DYNAMIC SYMBOL-BASED SYSTEM FOR OBJECTS-OF-INTEREST VIDEO ANALYTICS DETECTION
Disclosed herein are system, method, and computer program product embodiments for a dynamic symbol-based system for objects-of-interest (OOI) video analytics detection. Some embodiments include instantiating one or more symbolic objects associated with one or more real world rules, defining an area of interest, associating one or more CV functions with the one or more symbolic objects, and identifying one or more video sources for which to apply the one or more CV functions. Some embodiments further include executing the one or more CV functions associated with the one or more symbolic objects to process the one or more video sources.
Display device for generating multimedia content, and operation method of the display device
A display apparatus for generating multimedia content and an operation method thereof are provided. The display apparatus includes a display, a memory storing one or more instructions, and a processor configured to execute the one or more instructions stored in the memory. The processor is configured to obtain plot information of the multimedia content, and generate sequence information including one or more sequences of the multimedia content corresponding to the plot information by using a first artificial intelligence (AI) model, generate scene information based on the sequence information by using a second AI model, generate the multimedia content based on the scene information, and control the display to output the multimedia content.
Quantum computing-based video alert system
A quantum computing-based video alert system converts captured video and audio signals, in real time, into a sequence of video qubits and a sequence of audio qubits. An entanglement score is generated based on a comparison of the video qubits to historical video qubits that are verified to show malicious activity. A second entanglement score is generated based on a comparison of the audio qubits to historical audio qubits that are verified to show malicious activity. A probability score is generated for each segment of the video qubit sequence and for each segment of the audio qubit sequence. If the probability score for the video qubit sequence, the probability score for the audio qubit sequence, or a combination of both probability scores meets a threshold, then an alert is generated to identify possible malicious activity at the location of a CCTV camera capturing the real-time data.
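Only the final alert decision in this abstract lends itself to a short classical sketch. The example below assumes the per-segment probability scores are already available as numbers between 0 and 1. The abstract does not say how the video and audio scores are combined, so the noisy-OR rule used here is an assumption, as are the function names.

```python
def combine_noisy_or(video_score, audio_score):
    """Combine two probabilities with a noisy-OR rule (an assumed choice)."""
    return 1.0 - (1.0 - video_score) * (1.0 - audio_score)


def should_alert(video_score, audio_score, threshold):
    """Return True if the video score, the audio score, or the combined
    score meets the threshold."""
    combined = combine_noisy_or(video_score, audio_score)
    return (
        video_score >= threshold
        or audio_score >= threshold
        or combined >= threshold
    )
```

With this combination rule, two moderately suspicious signals can trigger an alert together even when neither reaches the threshold on its own.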
PRIVACY PROTECTION FOR ELECTRONIC DEVICES IN PUBLIC SETTINGS
A system for protecting privacy of electronic material is provided. The system comprises an electronic device comprising a processor and memory, a display screen, a front-facing camera, and an application executing on the processor. The application captures initial video content of objects facing the device and establishes a safe environment based on the captured initial content. The application also continually captures video content after establishment of the safe environment. The application also detects a deviation from the safe environment based on the continually captured video content. The application also determines that the deviation is a threat and performs an action to address the threat based on the determination. The safe environment includes at least a user of the device and persons known to the user. Deviations are detected by tracking human faces facing the display screen. The tracked human faces are of persons physically behind the user and proximate to the user.
Pedestrian re-identification method and apparatus based on local feature attention
Disclosed are a pedestrian re-identification method and apparatus based on local feature attention. The method includes the following steps: S1: obtaining an original surveillance video image data set, and dividing the original surveillance video image data set into a training set and a test set in proportion; and S2: performing image enhancement on the original surveillance video image training set to obtain enhanced images, and converting the enhanced images into sequence data. The pedestrian re-identification technology based on local feature attention uses a multi-head attention mechanism neural network to capture and extract video image feature sequences and to replace convolution kernels in a convolutional neural network, uses fully connected layers and an activation function to combine local pedestrian feature sequences into complete pedestrian feature sequences through a weight matrix, performs prediction on the obtained pedestrian feature sequences, outputs position coordinates of pedestrians in the images, and selects pedestrians to realize pedestrian re-identification.
AUTOMATIC VISUAL MEDIA TRANSMISSION ERROR ASSESSMENT
A method or system is disclosed to assess transmission errors in a visual media input. Domain knowledge is obtained from the visual media input by content analysis, codec analysis, distortion analysis, and human visual system modeling. The visual media input is divided into partitions, which are passed into deep neural networks (DNNs). The DNN outputs of all partitions are combined with the guidance of domain knowledge to produce an assessment of the transmission error. In one or more illustrative examples, transmission error assessment at a plurality of monitoring points in a visual media communication system is collected and assessed, followed by quality control processes and statistical performance assessment on the stability of the visual communication system.
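The partition-then-combine structure in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the frame is a plain 2D list, `scorer` is a hypothetical stand-in for the per-partition DNN, and the domain-knowledge guidance is reduced to one non-negative weight per partition.

```python
def partition(frame, tile):
    """Split a 2D frame (list of rows) into square tiles of side `tile`."""
    rows, cols = len(frame), len(frame[0])
    tiles = []
    for r in range(0, rows, tile):
        for c in range(0, cols, tile):
            tiles.append([row[c:c + tile] for row in frame[r:r + tile]])
    return tiles


def combine_scores(scores, weights):
    """Weighted average of per-partition scores.

    Assumption: domain knowledge is expressed as one weight per partition.
    """
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per partition score")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(s * w for s, w in zip(scores, weights)) / total


def assess_transmission_error(frame, tile, scorer, weights):
    """Score each partition with `scorer`, then combine the scores."""
    scores = [scorer(t) for t in partition(frame, tile)]
    return combine_scores(scores, weights)
```

A weight of zero effectively masks a partition, which is one simple way that content or saliency analysis could steer the final assessment.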
Systems and methods for video retrieval and grounding
Methods and systems are described for performing video retrieval together with video grounding. A word-based query for a video is received and encoded into a query representation using a trained query encoder. One or more video representations that are similar to the query representation are identified from a plurality of video representations. Each similar video representation represents a respective relevant video. A grounding is generated for each relevant video by forward propagating each respective similar video representation together with the query representation through a trained grounding module. The relevant videos or identifiers of the relevant videos are outputted together with the grounding generated for each relevant video.
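The retrieval step in this abstract, identifying video representations similar to the query representation, can be sketched as a nearest-neighbor search. The example assumes the representations are fixed-length numeric vectors and uses cosine similarity, which the abstract does not specify. The function names and the `top_k` parameter are hypothetical, and the trained query encoder and grounding module are not modeled.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_similar(query_repr, video_reprs, top_k=2):
    """Return (video_id, similarity) pairs for the top_k most similar videos.

    `video_reprs` maps a video identifier to its precomputed representation.
    """
    scored = [
        (video_id, cosine_similarity(query_repr, rep))
        for video_id, rep in video_reprs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Each returned identifier would then be passed, together with the query representation, to the grounding module described in the abstract.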
Computer/human generation, validation and use of a ground truth map to enforce data capture and transmission compliance in real and near real time video of a local scene
A hybrid computer/human method for generating, validating and using a ground truth map (GTM) provides for enforcement of data capture and transmission compliance of real and near real time video. Computer-implemented processes are used to identify and classify as allowed or disallowed objects in a local scene based on attributes of a video session. A human interface is available to validate either the object identification or classification. The GTM is then used, preferably in conjunction with motion sense, to enforce data capture and transmission compliance of real and near real time video within the local scene.
Systems and methods for digital analysis, test, and improvement of customer experience
Disclosed are systems and methods for digitally capturing, labeling, and analyzing data representing shared experiences between a service provider and a customer. The shared experience data is used to identify, test, and implement value-added improvements, enhancements, and augmentations to the shared experience and to monitor and ensure the quality of customer service. The improvements can be implemented as customer service process modifications, precision learning and targeted coaching for agents rendering customer service, process compliance monitoring, and knowledge curation for a knowledge bot software application that facilitates automation of tasks and provides a natural language interface for accessing historical knowledge bases and solutions.