G06V10/422

Teaching data correction method for training image, teaching data correction device and program

A teaching data correction device sets, for teaching data indicating an object area where an object of interest exists in a training image used for learning, a correction candidate area, that is, a candidate replacement for the object area. The device generates an output machine based on the correction candidate area; the output machine is trained to output, for an input image, an identification result or a regression result relating to the object of interest. The device then updates the teaching data with the correction candidate area based on the accuracy of the output machine, which is calculated from the identification result or regression result that the output machine outputs.
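
The correction loop described in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the box encoding `(x_min, y_min, x_max, y_max)`, the edge-shifting candidate generator `candidate_boxes`, and the callback `train_and_score`, which stands in for "train an output machine on the candidate and measure its accuracy". In the usage below, an IoU against a hidden reference box serves as a toy substitute for that accuracy.

```python
from typing import Callable, List, Tuple

# Hypothetical encoding of an object area: (x_min, y_min, x_max, y_max).
Box = Tuple[int, int, int, int]


def candidate_boxes(box: Box, step: int = 2) -> List[Box]:
    """Propose correction candidates by moving one edge of the box by +/- step."""
    candidates = []
    for i in range(4):
        for delta in (-step, step):
            edges = list(box)
            edges[i] += delta
            x0, y0, x1, y1 = edges
            if x0 < x1 and y0 < y1:  # keep only non-degenerate boxes
                candidates.append((x0, y0, x1, y1))
    return candidates


def correct_teaching_data(
    box: Box, train_and_score: Callable[[Box], float], step: int = 2
) -> Tuple[Box, float]:
    """Return the candidate with the best score, or the original box if none beats it.

    train_and_score is an assumed callback: it should train a model on the
    given area and return that model's accuracy.
    """
    best_box, best_acc = box, train_and_score(box)
    for cand in candidate_boxes(box, step):
        acc = train_and_score(cand)
        if acc > best_acc:
            best_box, best_acc = cand, acc
    return best_box, best_acc


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; used below as a toy accuracy."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


if __name__ == "__main__":
    reference = (10, 10, 20, 20)  # hidden "true" area, for the toy scorer only
    noisy_label = (12, 10, 20, 20)
    print(correct_teaching_data(noisy_label, lambda b: iou(b, reference)))
```

A real system would replace the toy scorer with actual training and validation, which is far more expensive per candidate, so the number of candidates per update would likely need to stay small.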

IMAGE GROUNDING WITH MODULARIZED GRAPH ATTENTIVE NETWORKS
20230368510 · 2023-11-16

A system may include a memory and a processor in communication with the memory. The processor may be configured to perform operations. The operations may include receiving an input, extracting features from the input, and mining object relations using the features. The operations may include determining feature vectors using the object relations and generating, using the feature vectors, an output indicating a target region, wherein the target region corresponds to the input.

System and method for the autonomous identification of physical abuse

Embodiments of the present systems and methods may provide the capability to autonomously and automatically identify cases of suspected abuse, which may then be subject to follow-up review and action. For example, in an embodiment, a system for identifying potential physical abuse in a location may comprise at least one video camera provided at the location and a computer system comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor to cause the computer system to perform: receiving video data from the at least one video camera, identifying potential physical abuse from the video data, the identification being to within a predetermined confidence level, and transmitting information indicating that a potential human threat has been identified.

Image encoding and decoding, video encoding and decoding: methods, systems and training methods

Lossy or lossless compression and transmission, comprising the steps of: (i) receiving an input image; (ii) encoding it using an encoder trained neural network, to produce a y latent representation; (iii) encoding the y latent representation using a hyperencoder trained neural network, to produce a z hyperlatent representation; (iv) quantizing the z hyperlatent representation using a predetermined entropy parameter to produce a quantized z hyperlatent representation; (v) entropy encoding the quantized z hyperlatent representation into a first bitstream, using predetermined entropy parameters; (vi) processing the quantized z hyperlatent representation using a hyperdecoder trained neural network to obtain a location entropy parameter μ_y, an entropy scale parameter σ_y, and a context matrix A_y of the y latent representation; (vii) processing the y latent representation, the location entropy parameter μ_y and the context matrix A_y, to obtain quantized latent residuals; (viii) entropy encoding the quantized latent residuals into a second bitstream, using the entropy scale parameter σ_y; and (ix) transmitting the bitstreams.
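
Step (vii) can be sketched as follows, under a loudly stated assumption: the abstract does not say how the context matrix enters the computation, so this sketch treats it as a strictly lower-triangular matrix that predicts each latent element from previously reconstructed elements. The function names, the rounding quantizer, and the prediction rule are all hypothetical. Entropy coding (steps (v) and (viii)) and the neural networks are omitted.

```python
from typing import List, Sequence, Tuple


def encode_residuals(
    y: Sequence[float], mu: Sequence[float], A: Sequence[Sequence[float]]
) -> Tuple[List[int], List[float]]:
    """Quantize residuals of y against a prediction built from mu and A.

    Assumption: only entries A[i][j] with j < i are used, so element i is
    predicted from already reconstructed elements.
    """
    residuals: List[int] = []
    y_hat: List[float] = []
    for i in range(len(y)):
        pred = mu[i] + sum(A[i][j] * y_hat[j] for j in range(i))
        r = round(y[i] - pred)  # integer residual that would be entropy coded
        residuals.append(r)
        y_hat.append(pred + r)  # reconstruction the decoder will also obtain
    return residuals, y_hat


def decode_latent(
    residuals: Sequence[int], mu: Sequence[float], A: Sequence[Sequence[float]]
) -> List[float]:
    """Rebuild the quantized latent from residuals, mu and A, in the same order."""
    y_hat: List[float] = []
    for i, r in enumerate(residuals):
        pred = mu[i] + sum(A[i][j] * y_hat[j] for j in range(i))
        y_hat.append(pred + r)
    return y_hat


if __name__ == "__main__":
    y = [0.3, 2.7, -1.2]
    mu = [0.1, 1.0, 0.0]
    A = [[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [0.2, -0.3, 0.0]]
    residuals, y_hat = encode_residuals(y, mu, A)
    print(residuals, decode_latent(residuals, mu, A) == y_hat)
```

Because the decoder repeats the same arithmetic in the same order, it recovers exactly the reconstruction the encoder computed, which is the property any such residual scheme needs.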

EFFICIENT CALCULATION OF A ROBUST SIGNATURE OF A MEDIA UNIT
20220343620 · 2022-10-27

Systems, methods, and computer-readable media that store instructions for calculating signatures, utilizing signatures, and the like, wherein, for a low-power calculation of a signature, the method comprises: receiving or generating a media unit of multiple objects; processing the media unit by performing multiple iterations; determining a relevancy of the spanning elements of the iteration; completing the dimension expansion process by relevant spanning elements of the iteration and reducing a power consumption of irrelevant spanning elements; determining identifiers that are associated with significant portions of an output of the multiple iterations; and providing a signature that comprises the identifiers and represents the multiple objects.
