G06V10/20

MULTI-DOMAIN CONVOLUTIONAL NEURAL NETWORK

In one embodiment, an apparatus comprises a memory and a processor. The memory is to store visual data associated with a visual representation captured by one or more sensors. The processor is to: obtain the visual data associated with the visual representation captured by the one or more sensors, wherein the visual data comprises uncompressed visual data or compressed visual data; process the visual data using a convolutional neural network (CNN), wherein the CNN comprises a plurality of layers, wherein the plurality of layers comprises a plurality of filters, and wherein the plurality of filters comprises one or more pixel-domain filters to perform processing associated with uncompressed data and one or more compressed-domain filters to perform processing associated with compressed data; and classify the visual data based on an output of the CNN.

IMAGE CAPTURING DEVICE AND VEHICLE CONTROL SYSTEM
20230234503 · 2023-07-27

Fabrication processing is executed in a chip of an image sensor. An image capturing device includes an image capturing unit (11) mounted on a vehicle and configured to generate image data by performing image capturing of a peripheral region of the vehicle, a scene recognition unit (214) configured to recognize a scene of the peripheral region based on the image data, and a drive control unit (12) configured to control drive of the image capturing unit based on the scene recognized by the scene recognition unit.

OBJECT DETECTION APPARATUS USING AN IMAGE PREPROCESSING ARTIFICIAL NEURAL NETWORK MODEL
20230237792 · 2023-07-27

An apparatus for recognizing an object in an image includes a preprocessing module configured to receive an image including an object and to output a preprocessed image by performing image enhancement processing on the received image to improve a recognition rate of the object included in the received image; and an object recognition module configured to recognize the object included in the image by inputting the preprocessed image to an input layer of an artificial neural network for object recognition.

AUTOMATICALLY DETECTING USER-REQUESTED OBJECTS IN DIGITAL IMAGES
20230237088 · 2023-07-27

The present disclosure relates to an object selection system that accurately detects and optionally automatically selects user-requested objects (e.g., query objects) in digital images. For example, the object selection system builds and utilizes an object selection pipeline to determine which object detection neural network to utilize to detect a query object based on analyzing the object class of a query object. In particular, the object selection system can identify both known object classes as well as objects corresponding to unknown object classes.

METHOD AND SYSTEM OF CONTROLLING DEVICE USING REAL-TIME INDOOR IMAGE
20230006856 · 2023-01-05

A device and a method for controlling a device using a real-time image are provided. The method includes: receiving an image captured by an image capturing device connected to a network to display the image in real-time; searching for the device that is connected to the network and is controllable; designating, within the image, a setting zone corresponding to the device; receiving a user input; and controlling the device selected according to the user input. A location of the setting zone within the image may be updated according to a change in the image. The user may receive immediate visual feedback on how the devices are being controlled. The user may control a device displayed on the screen on which the real-time indoor image is displayed without having to navigate through different sub-menus for different devices.

Utilizing interactive deep learning to select objects in digital visual media
11568627 · 2023-01-31

Systems and methods are disclosed for selecting target objects within digital images utilizing a multi-modal object selection neural network trained to accommodate multiple input modalities. In particular, in one or more embodiments, the disclosed systems and methods generate a trained neural network based on training digital images and training indicators corresponding to various input modalities. Moreover, one or more embodiments of the disclosed systems and methods utilize a trained neural network and iterative user inputs corresponding to different input modalities to select target objects in digital images. Specifically, the disclosed systems and methods can transform user inputs into distance maps that can be utilized in conjunction with color channels and a trained neural network to identify pixels that reflect the target object.
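
The last step of this abstract, turning user inputs into distance maps that sit alongside the color channels, can be illustrated with a minimal sketch. It is not the patented method: the function names `distance_map` and `stack_channels`, the split into positive and negative clicks, the Euclidean metric, and the truncation value of 255 are all assumptions made for illustration.

```python
import math

def distance_map(height, width, clicks, cap=255.0):
    """Per-pixel Euclidean distance to the nearest click, truncated at cap.

    clicks is a list of (row, col) pixel coordinates (hypothetical input format).
    With no clicks, every pixel gets the cap value.
    """
    if not clicks:
        return [[cap] * width for _ in range(height)]
    return [
        [min(cap, min(math.hypot(r - cr, c - cc) for cr, cc in clicks))
         for c in range(width)]
        for r in range(height)
    ]

def stack_channels(rgb, positive_clicks, negative_clicks):
    """Append two distance-map channels to each RGB pixel.

    rgb is a height x width grid of (R, G, B) tuples. The result is a grid of
    5-tuples (R, G, B, d_pos, d_neg) that a network could take as input.
    """
    height, width = len(rgb), len(rgb[0])
    pos = distance_map(height, width, positive_clicks)
    neg = distance_map(height, width, negative_clicks)
    return [
        [tuple(rgb[r][c]) + (pos[r][c], neg[r][c]) for c in range(width)]
        for r in range(height)
    ]
```

Encoding clicks as dense maps gives every pixel a value for each user input, so the extra channels have the same spatial shape as the image and can be concatenated with it directly.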

Character recognizing apparatus and non-transitory computer readable medium
11568659 · 2023-01-31

A character recognizing apparatus includes an acquiring unit, an identifying unit, and a character recognizing unit. The acquiring unit acquires a string image that is an image of a string generated in accordance with one of multiple string generation schemes. The identifying unit identifies a range specified for a result of character recognition in each of the multiple string generation schemes. The character recognizing unit performs first character recognition on the string image, and if a result of the first character recognition has a feature of a particular string generation scheme of the multiple string generation schemes, the character recognizing unit performs second character recognition on the string image within the range specified for a result of character recognition in the particular string generation scheme.

Method and systems for anatomy/view classification in x-ray imaging

Various methods and systems are provided for x-ray imaging. In one embodiment, a method for an image pasting examination comprises acquiring, via an optical camera and/or depth camera, image data of a subject, controlling an x-ray source and an x-ray detector according to the image data to acquire a plurality of x-ray images of the subject, and stitching the plurality of x-ray images into a single x-ray image. In this way, optimal exposure techniques may be used for individual acquisitions in an image pasting examination such that the optimal dose is utilized, stitching quality is improved, and registration failures are avoided.

Systems and methods for improving visual search using summarization feature
11715294 · 2023-08-01

Systems that search databases of videos or images to identify similar products in a given video or image of a product are disclosed. The content of the given video is represented by a feature vector used to measure the given video's similarity to either a video or an image. When the system is deployed to recognize particular fashion items in videos, some such videos are taken in uncontrolled settings, and as a result, may have low resolution, poor contrast, minimal focus, motion blur, or low lighting. By recognizing and removing poor quality video frames from the image recognition pipeline, associating products across video frames to form tracklets of each product, and enriching the feature representation of each item for best retrieval result by fusing information from multiple video frames depicting the item, the system addresses the aforementioned shortcomings.
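
The pipeline described here (drop poor-quality frames, then fuse the remaining frames of a tracklet into one feature vector for retrieval) can be sketched roughly as follows. This is an illustration under stated assumptions, not the system's actual implementation: the per-frame `quality` score, the `min_quality` threshold, mean pooling as the fusion step, and cosine similarity as the retrieval measure are all hypothetical choices.

```python
import math

def fuse_tracklet(frames, min_quality=0.5):
    """Fuse one product's tracklet into a single unit-length feature vector.

    frames is a list of dicts with a 'quality' score in [0, 1] and a 'feature'
    list of floats (hypothetical format). Frames below min_quality are dropped,
    the rest are averaged, and the mean is L2-normalized.
    Returns None if no frame passes the quality filter.
    """
    kept = [f["feature"] for f in frames if f["quality"] >= min_quality]
    if not kept:
        return None
    dim = len(kept[0])
    mean = [sum(v[i] for v in kept) / len(kept) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean))
    return [x / norm for x in mean] if norm > 0 else mean

def cosine_similarity(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0
```

A fused tracklet vector could then be compared against catalog feature vectors with `cosine_similarity` and the highest-scoring items returned.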