G06V10/806

Spoofing detection apparatus, spoofing detection method, and computer-readable recording medium
11694475 · 2023-07-04

A spoofing detection apparatus comprises obtaining, from an image capture apparatus, a first image frame that includes the face of a subject person, obtained while a light-emitting apparatus is emitting light, and a second image frame that includes the face of the subject person, obtained while the light-emitting apparatus is turned off; extracting, from the first image frame, information specifying a face portion of the subject person, and extracting, from the second image frame, information specifying a face portion of the subject person; extracting, from the first image frame, a portion that includes a bright point formed by reflection in an iris region of an eye of the subject person; extracting, from the second image frame, a portion corresponding to the portion that includes the bright point; calculating a feature that is independent of the position of the bright point; and determining authenticity of the subject person based on the feature.
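As a minimal sketch of the idea in this abstract (not the patented implementation): a specular glint appears somewhere in the iris of a real eye when the light source is on, so a feature such as the change in peak patch brightness is independent of where the glint lands. The function names, patch sizes, pixel values, and threshold below are all illustrative assumptions.

```python
import numpy as np

def liveness_feature(lit_patch, unlit_patch):
    """Position-independent feature: the brightest pixel in the iris patch
    should jump when the light source is on (specular glint on a real cornea),
    regardless of where in the patch the glint lands."""
    return float(lit_patch.max() - unlit_patch.max())

def is_live(lit_patch, unlit_patch, threshold=50.0):
    return liveness_feature(lit_patch, unlit_patch) > threshold

# Toy 8x8 iris patches (grayscale 0-255).
unlit = np.full((8, 8), 40.0)
lit_real = unlit.copy()
lit_real[2, 5] = 230.0          # specular bright point on a real eye
lit_spoof = unlit + 5.0         # printed photo: no glint, only diffuse gain

print(is_live(lit_real, unlit))    # True
print(is_live(lit_spoof, unlit))   # False
```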

Learning model architecture for image data semantic segmentation

A learning model may provide a hierarchy of convolutional layers configured to perform convolutions upon image features, each layer other than a topmost layer passing the image features, convoluted to a lower resolution, to a higher layer, and each layer other than a bottommost layer returning the image features to a lower layer. Each layer fuses the lower-resolution image features received from a higher layer with same-resolution image features convoluted at that layer, so as to combine large-scale and small-scale features of images. The number of layers in the hierarchy may be substantially equal to the number of lateral convolutions at the bottommost convolutional layer. The bottommost convolutional layer ultimately passes the fused features to an attention mapping module, which utilizes two attention mapping pathways in combination to detect non-local dependencies and interactions between large-scale and small-scale features of images without de-emphasizing local interactions.
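The down/up/fuse cycle between adjacent layers can be sketched with plain array operations. This toy version (average-pool down, nearest-neighbour up, additive fusion) only illustrates the resolution bookkeeping between two levels of such a hierarchy; it is an assumption-laden stand-in, not the patented architecture or its attention module.

```python
import numpy as np

def downsample(x):
    """2x2 average pooling: features convoluted to a lower resolution."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Nearest-neighbour upsampling: return features to the finer layer."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def fuse(fine, coarse):
    """Combine same-resolution features: large-scale + small-scale."""
    return fine + upsample(coarse)

feat = np.arange(16, dtype=float).reshape(4, 4)   # lower-layer features
coarse = downsample(feat)                          # passed up one level
fused = fuse(feat, coarse)                         # returned and fused below
print(fused.shape)   # (4, 4)
```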

Method for detecting Ophiocephalus argus cantor under intra-class occlusion based on cross-scale layered feature fusion

Disclosed is a method for detecting Ophiocephalus argus cantor under intra-class occlusion based on cross-scale layered feature fusion, including image collection, image processing and a network model, where collected images are labeled, image sizes are adjusted to obtain input images, and the input images are input into an object detection network, integrated by convolution and inserted into cross-scale layered feature fusion modules. The method includes dividing all features input into the cross-scale layered feature fusion modules into n layers composed of s feature mapping subsets, fusing the features of each feature mapping subset with those of the other feature mapping subsets, and connecting them; carrying out a convolution operation and outputting a training result; adjusting network parameters by a loss function to obtain parameters for a network model; and inputting final output candidate boxes into a non-maximum suppression module to screen correct prediction boxes, so that a prediction result is obtained.
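The subset-wise fusion step reads like a Res2Net-style cascade. The hedged sketch below splits channels into s subsets and fuses each subset with the running result of the earlier ones before concatenating everything back; the function name and the additive fusion rule are illustrative assumptions, not the claimed method.

```python
import numpy as np

def cross_scale_fuse(features, s=4):
    """Split channels into s feature mapping subsets; fuse each subset with
    the cumulative result of the earlier subsets (a Res2Net-style cascade),
    then concatenate all fused subsets back along the channel axis."""
    subsets = np.array_split(features, s, axis=0)
    fused, carry = [], None
    for sub in subsets:
        carry = sub if carry is None else sub + carry  # fuse with prior scales
        fused.append(carry)
    return np.concatenate(fused, axis=0)

x = np.ones((8, 4, 4))          # 8 channels of 4x4 feature maps
y = cross_scale_fuse(x, s=4)
print(y.shape)                   # (8, 4, 4)
```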

Systems and methods for generating document numerical representations

Described embodiments relate to a method comprising: determining a candidate document comprising image data and character data and extracting the image data and the character data from the candidate document. The method comprises providing, to an image-based numerical representation generation model, the image data, and generating, by the image-based numerical representation generation model, an image-based numerical representation of the image data. The method comprises providing, to a character-based numerical representation generation model, the character data; and generating, by the character-based numerical representation generation model, a character-based numerical representation of the character data. The method comprises providing, to a consolidated image-character based numerical representation generation model, the image-based numerical representation and the character-based numerical representation; and generating, by the consolidated image-character based numerical representation generation model, a combined image-character based numerical representation of the candidate document.
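A toy version of the three-model pipeline can be written with stand-in models: a fixed random projection in place of the image-based representation model, a hashed bag of characters in place of the character-based model, and concatenation in place of the consolidated model. All three stand-ins, and every name below, are illustrative assumptions.

```python
import numpy as np

def image_embed(image, dim=8):
    """Stand-in for the image-based model: project flattened pixels with a
    fixed random matrix (a real system would use a trained vision model)."""
    w = np.random.default_rng(0).standard_normal((image.size, dim))
    return image.ravel() @ w

def char_embed(text, dim=8):
    """Stand-in for the character-based model: hashed bag of characters."""
    v = np.zeros(dim)
    for ch in text:
        v[ord(ch) % dim] += 1.0
    return v

def combined_embed(image, text):
    """Stand-in for the consolidated model: concatenate the two vectors."""
    return np.concatenate([image_embed(image), char_embed(text)])

doc_image = np.ones((4, 4))
doc_text = "INVOICE #1234"
rep = combined_embed(doc_image, doc_text)
print(rep.shape)   # (16,)
```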

Multi-spectrum visual object recognition

Aspects of the present disclosure relate to multi-spectrum visual object recognition. A first image corresponding to visible light and a second image corresponding to invisible light with respect to an object can be obtained. A first contour of the object can be identified based on the first image. A second contour of the object can be identified based on the second image. The first contour of the object and the second contour of the object can be integrated to generate a multi-spectrum contour of the object. The object can be recognized using the multi-spectrum contour of the object.
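A minimal sketch of the integration step, assuming a binary object mask per spectrum: extract boundary pixels from each mask and take their union as the multi-spectrum contour. The 4-neighbour boundary rule and the union-based integration are illustrative choices, not the patented method.

```python
import numpy as np

def contour(mask):
    """Boundary pixels of a binary mask: object pixels with at least one
    4-neighbour outside the object."""
    padded = np.pad(mask, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    return mask & ~interior

visible = np.zeros((6, 6), dtype=bool)
visible[1:4, 1:4] = True            # part of the object seen in visible light
infrared = np.zeros((6, 6), dtype=bool)
infrared[2:5, 2:5] = True           # part seen in invisible (IR) light

multi_spectrum = contour(visible) | contour(infrared)   # integrated contour
print(int(multi_spectrum.sum()))    # 14
```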

SYSTEM AND METHOD OF IMAGE PROCESSING BASED EMOTION RECOGNITION

A system of image-processing-based emotion recognition is disclosed. The system principally comprises a camera and a main processor. Particularly, a plurality of function units are provided in the main processor, including: a face detection unit, a feature processing module, a feature combination unit, a conversion module, a facial action judging unit, and an emotion recognition unit. According to the present invention, the emotion recognition unit is configured to utilize a facial emotion recognition (FER) model to evaluate or distinguish an emotion state of a user based on at least one facial action, at least one emotional dimension, and a plurality of emotional scores. As a result, the accuracy of the emotion recognition conducted by the emotion recognition unit is significantly enhanced because the basis of the emotion recognition comprises basic emotions, emotional dimension(s), and the user's facial action.

LUMBAR SPINE ANATOMICAL ANNOTATION BASED ON MAGNETIC RESONANCE IMAGES USING ARTIFICIAL INTELLIGENCE

A system for automated comprehensive assessment of clinical lumbar MRIs includes an MRI standardization component that reads MRI data from raw lumbar MRI files and uses an artificial intelligence (AI) model to convert the raw MRI data into a standardized format. A core assessment component automatically generates MRI assessment results, including multi-tissue anatomical annotation, multi-pathology detection and multi-pathology progression prediction, based on the structured MRI data package. The core assessment component contains a semantic segmentation module that utilizes a deep learning AI model to generate MRI assessment results that contain multi-tissue anatomical annotation, a pathology detection module to generate multi-pathology detection, and a pathology progression prediction module to generate multi-pathology progression prediction. A model optimization component archives clinical MRI data and MRI assessment results based on comments provided by a specialist, and periodically optimizes the deep learning AI model of the core assessment component.

Answering questions during video playback
11544590 · 2023-01-03

In implementations of answering questions during video playback, a video system can receive a question related to a video at a timepoint of the video during playback of the video, and determine audio sentences of the video that occur within a segment of the video that includes the timepoint. The video system can generate a classification vector from words of the question and the audio sentences, and determine an answer to the question utilizing the classification vector. The video system can obtain answer candidates, and the answer to the question can be selected as one of the answer candidates based on matching the classification vector to one of answer vectors generated from the answer candidates.
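A bag-of-words stand-in illustrates the matching step: build one vector from the question plus the sentences falling in the segment around the timepoint, then pick the candidate whose vector is most similar. The real system's classification vector comes from a learned model; the sentences, candidates, and similarity rule below are illustrative assumptions.

```python
import numpy as np

def bow_vector(words, vocab):
    v = np.zeros(len(vocab))
    for w in words:
        if w in vocab:
            v[vocab[w]] += 1.0
    return v

# Sentences whose timestamps fall in the segment around the asked timepoint.
segment_sentences = ["the valve releases pressure",
                     "the valve is open",
                     "the pump starts"]
question = "what releases pressure"
candidates = ["the valve", "the pump", "the tank"]

vocab = {w: i for i, w in enumerate(
    sorted({w for s in segment_sentences + [question] + candidates
            for w in s.split()}))}

# Classification vector from the question words plus the segment's sentences.
cls = bow_vector(question.split()
                 + [w for s in segment_sentences for w in s.split()], vocab)

def score(cand):
    """Cosine similarity between the classification vector and a candidate."""
    v = bow_vector(cand.split(), vocab)
    return float(cls @ v / (np.linalg.norm(cls) * np.linalg.norm(v) + 1e-9))

answer = max(candidates, key=score)
print(answer)   # the valve
```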

THREE-DIMENSIONAL HUMAN POSE ESTIMATION METHOD AND RELATED APPARATUS
20220415076 · 2022-12-29

This application discloses a three-dimensional human pose estimation method performed by a computer device. An initialization pose estimation result of a single video frame in a video frame sequence of n views is extracted based on a neural network model. Single-frame and single-view human pose estimation is performed on the initialization pose estimation result for each video frame, to obtain n single-view pose estimation sequences respectively corresponding to the n views. Single-frame and multi-view human pose estimation is performed according to single-view pose estimation results with the same timestamp in the n single-view pose estimation sequences, to obtain a multi-view pose estimation sequence. Multi-frame and multi-view human pose estimation is performed on a multi-view pose estimation result in the multi-view pose estimation sequence, to obtain a multi-view and multi-frame pose estimation result. Therefore, accuracy of human pose estimation is improved.
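The three-stage pipeline can be sketched with stand-ins for each stage: noisy per-view estimates, a cross-view fuse of single-view results sharing a timestamp (here a plain mean), and a multi-frame refinement (here a moving average over time). These fusion rules, shapes, and noise levels are illustrative assumptions, not the claimed method.

```python
import numpy as np

rng = np.random.default_rng(1)
n_views, n_frames, n_joints = 3, 5, 4

# Stage 1 stand-in: per-view, per-frame 3D joint estimates (noisy).
true_pose = rng.standard_normal((n_frames, n_joints, 3))
single_view = true_pose[None] + 0.1 * rng.standard_normal(
    (n_views, n_frames, n_joints, 3))

# Stage 2: fuse single-view results with the same timestamp across n views.
multi_view = single_view.mean(axis=0)              # (n_frames, n_joints, 3)

# Stage 3: multi-frame refinement, here a 3-frame moving average over time.
kernel = np.ones(3) / 3
smoothed = np.apply_along_axis(
    lambda t: np.convolve(t, kernel, mode="same"), 0,
    multi_view.reshape(n_frames, -1)).reshape(n_frames, n_joints, 3)

print(smoothed.shape)   # (5, 4, 3)
```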

IMAGE PROCESSING METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM
20220414896 · 2022-12-29

Disclosed are an image processing method, an electronic device and a storage medium. The method includes obtaining feature information of a first region in a current image frame, wherein the first region includes a region determined by performing optical-flow-based motion estimation on the current image frame and a previous image frame; obtaining feature information of a second region in the current image frame, wherein the second region includes a region corresponding to those pixel points, among first pixel points of the current image frame, whose association with pixel points among second pixel points of the previous image frame satisfies a condition; and, based on the feature information of the first region and that of the second region, fusing the previous and current image frames to obtain a processed current image frame, which is used as the previous image frame for a next image frame.
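A hedged sketch of the final fusion step, assuming per-pixel weights derived from the two regions' feature information: pixels whose flow-based association with the previous frame was judged reliable borrow more from it. The weighting scheme and all values below are illustrative assumptions.

```python
import numpy as np

def fuse_frames(prev, curr, region_weight):
    """Blend the previous frame into the current one with per-pixel weights:
    a weight near 1 trusts the previous frame, 0 keeps the current pixel."""
    return region_weight * prev + (1.0 - region_weight) * curr

prev_frame = np.full((4, 4), 100.0)
curr_frame = np.full((4, 4), 140.0)

# Stand-in for the region features: moderate weight where the flow-based
# association satisfied the condition (top half), zero elsewhere.
weight = np.zeros((4, 4))
weight[:2] = 0.5

out = fuse_frames(prev_frame, curr_frame, weight)
print(out[0, 0], out[3, 3])   # 120.0 140.0
```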