G06V30/2276

Alignment of video and textual sequences for metadata analysis

Systems, methods and computer program products related to aligning heterogeneous sequential data are disclosed. Video data in a media presentation and textual data corresponding to content of the media presentation are received. An action related to aligning the video data and the textual data is determined using an alignment neural network, such that the video data and the textual data are at least partially aligned following the action. The alignment neural network includes a first fully connected layer that receives as input the video data, the textual data, and data relating to a previously determined action by the alignment neural network related to aligning the video data and the textual data. The determined action related to aligning the video data and the textual data is performed.
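
As an illustration of the abstract above, one decision step of such an alignment network might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the action names, feature sizes, hidden width, and randomly initialized (untrained) weights are all hypothetical. It shows only the structural point the abstract makes, that the first fully connected layer receives the video data, the textual data, and an encoding of the previously determined action together.

```python
import random

# Hypothetical action set; the abstract does not name the actions.
ACTIONS = ["match", "skip_video", "skip_text"]


def one_hot(index, size):
    """Encode an integer index as a one-hot list of the given size."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def init_layer(n_in, n_out, rng):
    """Small random weights; a real network would learn these by training."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def choose_action(video_feat, text_feat, prev_action, params):
    """Pick an alignment action from video, text, and previous-action inputs."""
    # The first fully connected layer sees all three inputs concatenated.
    joint = video_feat + text_feat + one_hot(prev_action, len(ACTIONS))
    hidden = relu(dense(joint, *params["fc1"]))
    scores = dense(hidden, *params["out"])
    return max(range(len(ACTIONS)), key=lambda i: scores[i])


rng = random.Random(0)
VIDEO_DIM, TEXT_DIM, HIDDEN = 4, 3, 8  # assumed sizes, for illustration only
params = {
    "fc1": init_layer(VIDEO_DIM + TEXT_DIM + len(ACTIONS), HIDDEN, rng),
    "out": init_layer(HIDDEN, len(ACTIONS), rng),
}

video_clip = [0.2, 0.5, 0.1, 0.9]  # placeholder clip features
sentence = [0.7, 0.3, 0.4]         # placeholder sentence features
action = choose_action(video_clip, sentence, prev_action=0, params=params)
print(ACTIONS[action])
```

Because the weights are untrained, the selected action is arbitrary; the sketch is meant only to show how the three inputs meet at the first layer.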

METHODS AND SYSTEMS FOR MONITORING POTENTIAL LOSSES IN A RETAIL ENVIRONMENT
20210042509 · 2021-02-11

Examples described herein generally relate to a system for monitoring customers in a retail environment. The system includes a plurality of cameras located in different regions of the retail environment, each camera configured to capture a video feed of a respective region. The system includes a computer system comprising a memory and a processor. The system provides the video feed of at least one region of the retail environment to a plurality of machine learning classifiers, each machine learning classifier trained on labeled videos to classify a sequence of images of a customer into a probability certainty of a respective activity being performed by the customer. The system applies the probability certainties of the respective activities of a customer to a set of business rules to determine whether customer activities identified by the probability certainties indicate suspicious behavior. The system provides a notification of the suspicious behavior to a worker.
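
The step in which classifier outputs are applied to business rules could look roughly like the sketch below. The activity labels, thresholds, rule names, and the `notify_worker` message format are hypothetical and chosen only for illustration; the abstract does not specify any of them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Maps an activity label to the probability reported by its classifier.
Probabilities = Dict[str, float]


@dataclass
class Rule:
    """A named business rule evaluated against per-activity probabilities."""
    name: str
    predicate: Callable[[Probabilities], bool]


def conceal_without_checkout(p: Probabilities) -> bool:
    # Hypothetical rule: likely concealment and unlikely checkout visit.
    return p.get("conceal_item", 0.0) >= 0.8 and p.get("visit_checkout", 0.0) < 0.2


def shelf_sweep(p: Probabilities) -> bool:
    # Hypothetical rule: very likely sweeping many items off a shelf.
    return p.get("sweep_shelf", 0.0) >= 0.9


RULES: List[Rule] = [
    Rule("conceal_without_checkout", conceal_without_checkout),
    Rule("shelf_sweep", shelf_sweep),
]


def triggered_rules(probs: Probabilities, rules: List[Rule] = RULES) -> List[str]:
    """Return the names of all rules whose predicate holds for these outputs."""
    return [rule.name for rule in rules if rule.predicate(probs)]


def notify_worker(customer_id: str, rule_names: List[str]) -> str:
    """Build the notification text; delivery to a worker is out of scope here."""
    return f"Customer {customer_id}: possible suspicious behavior ({', '.join(rule_names)})"


# Example classifier outputs for one tracked customer (made-up values).
outputs = {"conceal_item": 0.91, "visit_checkout": 0.05, "sweep_shelf": 0.10}
hits = triggered_rules(outputs)
if hits:
    print(notify_worker("c-17", hits))
```

Keeping the rules as data separate from the classifiers means thresholds can be tuned per store without retraining any model, which is one plausible reason to split the two stages.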

Methods and systems for monitoring potential losses in a retail environment
11055518 · 2021-07-06

Examples described herein generally relate to a system for monitoring customers in a retail environment. The system includes a plurality of cameras located in different regions of the retail environment, each camera configured to capture a video feed of a respective region. The system includes a computer system comprising a memory and a processor. The system provides the video feed of at least one region of the retail environment to a plurality of machine learning classifiers, each machine learning classifier trained on labeled videos to classify a sequence of images of a customer into a probability certainty of a respective activity being performed by the customer. The system applies the probability certainties of the respective activities of a customer to a set of business rules to determine whether customer activities identified by the probability certainties indicate suspicious behavior. The system provides a notification of the suspicious behavior to a worker.

Writing recognition using wearable pressure sensing device

Writing recognition using a wearable pressure sensing device includes receiving pressure measurement data from a pressure sensor disposed upon a body part of a user. The pressure measurement data is indicative of a change in pressure of the body part due to an interaction of the body part with a medium indicative of a writing gesture by the user. A start boundary and end boundary for each of a plurality of writing symbols is detected based upon the pressure measurement data. At least one feature of the pressure measurement data associated with the plurality of writing symbols is extracted. A symbol pattern is detected based upon the extracted features, and at least one letter is detected based upon the symbol pattern. A word is detected based upon the detected at least one letter.
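
The boundary detection and feature extraction steps described above can be sketched with a simple threshold rule. This is an assumption made for illustration: the abstract does not say how boundaries are detected or which features are used, and the threshold value and sample data below are invented.

```python
def find_symbol_boundaries(samples, threshold):
    """Return (start, end) index pairs where pressure is at or above threshold.

    The end index is exclusive, so samples[start:end] is the symbol segment.
    """
    boundaries = []
    start = None
    for i, value in enumerate(samples):
        if value >= threshold and start is None:
            start = i                       # pressure rose: symbol begins
        elif value < threshold and start is not None:
            boundaries.append((start, i))   # pressure fell: symbol ends
            start = None
    if start is not None:                   # recording ended mid-symbol
        boundaries.append((start, len(samples)))
    return boundaries


def extract_features(samples, start, end):
    """Compute a few simple per-symbol features from one pressure segment."""
    segment = samples[start:end]
    return {
        "duration": end - start,
        "peak": max(segment),
        "mean": sum(segment) / len(segment),
    }


# Made-up pressure trace containing two bursts of writing pressure.
pressure = [0.0, 0.1, 0.6, 0.8, 0.7, 0.1, 0.0, 0.5, 0.9, 0.2]
segments = find_symbol_boundaries(pressure, threshold=0.4)
features = [extract_features(pressure, s, e) for s, e in segments]
print(segments)
```

The resulting feature dictionaries would then feed whatever model maps symbol patterns to letters and words; that stage is not sketched here.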

HANDWRITING INPUT APPARATUS, HANDWRITING INPUT METHOD, PROGRAM, AND INPUT SYSTEM
20200327307 · 2020-10-15

A handwriting input apparatus that displays stroke data handwritten based on a position of an input unit contacting a touch panel, includes circuitry configured to implement a handwriting recognition control unit for recognizing stroke data and converting the stroke data into text data, and an authentication control unit for authenticating a user based on the stroke data, and a display unit for displaying a display component for receiving a signature together with the text data when the authentication control unit determines that a user has been successfully authenticated.

METHOD AND DEVICE FOR DISPLAYING HANDWRITING-BASED ENTRY

A system and method for detecting, identifying, and displaying handwriting-based entry is provided. The system and method include features for detecting entry of at least one first letter based on handwriting, identifying a style of the at least one first letter, and displaying at least one second letter associated with the at least one first letter based on, and in the form of, the identified style of the at least one first letter.

ALIGNMENT OF VIDEO AND TEXTUAL SEQUENCES FOR METADATA ANALYSIS
20200175232 · 2020-06-04

Systems, methods and computer program products related to aligning heterogeneous sequential data are disclosed. Video data in a media presentation and textual data corresponding to content of the media presentation are received. An action related to aligning the video data and the textual data is determined using an alignment neural network, such that the video data and the textual data are at least partially aligned following the action. The alignment neural network includes a first fully connected layer that receives as input the video data, the textual data, and data relating to a previously determined action by the alignment neural network related to aligning the video data and the textual data. The determined action related to aligning the video data and the textual data is performed.

Apparatuses, methods, and systems for 3-channel dynamic contextual script recognition using neural network image analytics and 4-tuple machine learning with enhanced templates and context data

In some embodiments, a method includes training a first machine learning model based on multiple documents and multiple templates associated with the multiple documents. The method further includes executing the first machine learning model to generate multiple relevancy masks, the multiple relevancy masks to remove a visual structure of the multiple templates from a visual structure of the multiple documents. The method further includes generating multiple multichannel field images to include the multiple relevancy masks and at least one of the multiple documents or the multiple templates. The method further includes training a second machine learning model based on the multiple multichannel field images and multiple non-native texts associated with the multiple documents. The method further includes executing the second machine learning model to generate multiple non-native texts from the multiple multichannel field images.

Alignment of video and textual sequences for metadata analysis

Systems, methods and computer program products related to aligning heterogeneous sequential data are disclosed. A first sequential data stream and a second sequential data stream are received. An action related to aligning the first sequential data stream and the second sequential data stream is determined using an alignment neural network. The alignment neural network includes a fully connected layer that receives as input: data from the first sequential data stream, data from the second sequential data stream, and data relating to a previously determined action by the alignment neural network related to aligning the first sequential data stream and the second sequential data stream.

ALIGNMENT OF VIDEO AND TEXTUAL SEQUENCES FOR METADATA ANALYSIS
20200012725 · 2020-01-09

Systems, methods and computer program products related to aligning heterogeneous sequential data are disclosed. A first sequential data stream and a second sequential data stream are received. An action related to aligning the first sequential data stream and the second sequential data stream is determined using an alignment neural network. The alignment neural network includes a fully connected layer that receives as input: data from the first sequential data stream, data from the second sequential data stream, and data relating to a previously determined action by the alignment neural network related to aligning the first sequential data stream and the second sequential data stream.