Patent classifications
G06T7/11
Medical image segmentation method based on U-Net
A medical image segmentation method based on U-Net includes: sending a real segmentation image and an original image to a generative adversarial network for data augmentation to generate a labeled composite image; adding the composite image to the original data set to obtain an expanded data set; and sending the expanded data set to an improved multi-feature fusion segmentation network for training. A dilated convolution module is added between the shallow and deep feature skip connections of the segmentation network to obtain receptive fields of different sizes, which enhances the fusion of detail information and deep semantics, improves adaptability to the size of the segmentation target, and improves medical image segmentation accuracy. Training on the data set expanded by the generative adversarial network alleviates the over-fitting that occurs when training the segmentation network.
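The effect of dilation on receptive field size can be illustrated with a small sketch. This is not the patented module: it is a 1-D, pure-Python stand-in for what would be a 2-D convolution in practice, and the function names, the dilation rates (1, 2, 4), and the choice to fuse branches by summation are all assumptions made for illustration.

```python
def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation):
    """1-D dilated cross-correlation with zero padding (output length = input length)."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def dilated_module(signal, kernel, dilations=(1, 2, 4)):
    """Run parallel branches with different dilation rates and sum them.

    Summation is an assumed fusion rule; the source does not specify one.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(values) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
    for d in (1, 2, 4):
        print(d, receptive_field(3, d), dilated_conv1d(impulse, [1, 1, 1], d))
    print(dilated_module(impulse, [1, 1, 1]))
```

With a 3-tap kernel, the three branches cover 3, 5, and 9 input samples respectively, which is the sense in which a single module yields receptive fields of different sizes.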
Image processing apparatus, image processing method, and storage medium
An image processing apparatus includes a memory and a processor coupled to the memory. The processor is configured to: generate a trained machine learning model by training a machine learning model on a first set of image data; output an inference result by inputting a second set of image data to the trained machine learning model; and process a region of interest at the time of inference for image data in the second set for which the inference result is correct.
Video visual relation detection methods and systems
Methods and systems for detecting visual relations in a video are disclosed. A method comprises: decomposing a video sequence into a plurality of segments; for each segment, detecting objects in frames of the segment, tracking the detected objects over the segment to form a set of object tracklets for the segment, extracting object features for the detected objects, extracting, for pairs of object tracklets of the set of object tracklets, relativity features indicative of a relation between the objects corresponding to the pair of object tracklets, forming relation feature vectors for the pairs of object tracklets using the object features of the objects corresponding to the respective pairs and the relativity features of the respective pairs, and generating a set of segment relation prediction results from the relation feature vectors; generating a set of visual relation instances for the video sequence by merging the segment relation prediction results from different segments; and generating a set of visual relation detection results from the set of visual relation instances.
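The final merging step can be sketched as follows. This is a simplified illustration, not the claimed method: it assumes that tracklets already carry consistent identities across segments (so that matching on the subject, predicate, object triplet is enough), that only consecutive segments are merged, and that an instance score is the mean of its segment scores. The function name and tuple layout are hypothetical.

```python
def merge_segment_predictions(predictions):
    """Merge per-segment relation predictions into video-level instances.

    predictions: iterable of (segment_index, subject, predicate, object, score).
    Returns a list of (triplet, start_segment, end_segment, mean_score),
    sorted by descending score.
    """
    instances = []
    latest = {}  # triplet -> most recent instance with that triplet
    for seg, subj, pred, obj, score in sorted(predictions, key=lambda p: p[0]):
        key = (subj, pred, obj)
        inst = latest.get(key)
        if inst is not None and inst["end"] == seg - 1:
            # Same relation seen in the previous segment: extend the instance.
            inst["end"] = seg
            inst["scores"].append(score)
        else:
            # First sighting, or a gap: start a new instance.
            inst = {"triplet": key, "start": seg, "end": seg, "scores": [score]}
            latest[key] = inst
            instances.append(inst)
    results = [
        (i["triplet"], i["start"], i["end"], sum(i["scores"]) / len(i["scores"]))
        for i in instances
    ]
    return sorted(results, key=lambda r: r[3], reverse=True)


if __name__ == "__main__":
    preds = [
        (0, "person", "ride", "bike", 0.75),
        (1, "person", "ride", "bike", 0.5),
        (1, "dog", "follow", "person", 0.875),
        (3, "person", "ride", "bike", 0.25),
    ]
    for result in merge_segment_predictions(preds):
        print(result)
```

Sorting the merged instances by score gives a ranked list that could serve as the set of detection results; a real system would more likely associate tracklets by spatial overlap rather than by label alone.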
Imaging display device and wearable device
An imaging display device includes an imaging unit, a processing unit, a display unit, and a pupil detection unit. The imaging unit includes a plurality of photoelectric conversion elements and is configured to acquire first image information. The processing unit is configured to process a signal from the imaging unit and generate second image information. The display unit is configured to display an image that is based on the signal from the processing unit. The pupil detection unit is configured to detect vector information of a pupil. The processing unit generates the second image information by processing the first image information based on the vector information of the pupil.
Viewpoint dependent brick selection for fast volumetric reconstruction
A method of culling parts of a 3D reconstruction volume is provided. The method makes fresh, accurate and comprehensive 3D reconstruction data available to a wide variety of mobile XR applications with low usage of computational resources and storage space. The method includes culling parts of the 3D reconstruction volume against a depth image. The depth image has a plurality of pixels, each of which represents a distance to a surface in a scene. In some embodiments, the method includes culling parts of the 3D reconstruction volume against a frustum. The frustum is derived from the field of view of the image sensor from which the image data used to create the 3D reconstruction is obtained.
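The two culling tests can be sketched as below. This is a rough illustration under several assumptions that are not in the abstract: a pinhole camera model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`; bricks given as a center and half-size in camera coordinates with z pointing forward; a frustum test that checks only the brick center against the image bounds; and a depth test that keeps bricks lying within an assumed band around the measured surface.

```python
def project(point, fx, fy, cx, cy):
    """Pinhole projection of a camera-space point (z > 0) to pixel coordinates."""
    x, y, z = point
    return fx * x / z + cx, fy * y / z + cy


def select_bricks(bricks, depth, fx, fy, cx, cy, near, far, band):
    """Return the bricks that survive frustum and depth-image culling.

    bricks: list of ((x, y, z), half_size) in camera coordinates.
    depth:  2-D list of per-pixel distances to the nearest surface (0 = invalid).
    band:   assumed tolerance around the surface within which a brick is kept.
    """
    height, width = len(depth), len(depth[0])
    kept = []
    for center, half in bricks:
        z = center[2]
        # Frustum test, part 1: brick must overlap the [near, far] depth range.
        if z <= 0 or z + half < near or z - half > far:
            continue
        # Frustum test, part 2: brick center must project inside the image.
        u, v = project(center, fx, fy, cx, cy)
        col, row = int(round(u)), int(round(v))
        if not (0 <= col < width and 0 <= row < height):
            continue
        # Depth-image test: skip invalid pixels and bricks far from the surface.
        surface = depth[row][col]
        if surface <= 0 or abs(z - surface) > half + band:
            continue
        kept.append((center, half))
    return kept


if __name__ == "__main__":
    depth = [[2.0] * 5 for _ in range(5)]
    bricks = [
        ((0.0, 0.0, 2.0), 0.1),   # on the surface: kept
        ((0.0, 0.0, 1.0), 0.1),   # free space in front of the surface: culled
        ((10.0, 0.0, 2.0), 0.1),  # outside the field of view: culled
        ((0.0, 0.0, 20.0), 0.1),  # beyond the far plane: culled
    ]
    print(select_bricks(bricks, depth, 2.0, 2.0, 2.0, 2.0, 0.1, 5.0, 0.2))
```

A more conservative variant would test all eight brick corners rather than the center, so that bricks straddling the frustum boundary are not dropped.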