G06V20/70

SEMANTIC IMAGE EXTRAPOLATION METHOD AND APPARATUS
20230051832 · 2023-02-16

Disclosed are a semantic image extrapolation method and a semantic image extrapolation apparatus. The present invention provides a technique for filling an empty image-extension region in an image by using an extrapolated segmentation map and an inpainting technique. Because the empty image-extension region contains no information, the method first generates an extrapolated segmentation map on the basis of a segmentation map obtained from the input image, and then fills the empty region with content on the basis of the extrapolated segmentation map and the input image.
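
A rough sketch of the described two-stage pipeline (all names hypothetical; edge-replication and per-class mean color stand in for the learned segmentation-extrapolation and inpainting networks):

```python
import numpy as np

def extrapolate_segmentation(seg, pad):
    # Replicate edge labels outward as a stand-in for a learned
    # segmentation-extrapolation network.
    return np.pad(seg, pad, mode="edge")

def semantic_extrapolate(image, seg, pad):
    """Fill the empty image-extension border guided by the extrapolated map."""
    seg_ext = extrapolate_segmentation(seg, pad)
    h, w = image.shape[:2]
    out = np.zeros((h + 2 * pad, w + 2 * pad, 3), dtype=image.dtype)
    out[pad:pad + h, pad:pad + w] = image
    # Per-class mean color from the known region stands in for inpainting.
    border = np.ones(seg_ext.shape, bool)
    border[pad:pad + h, pad:pad + w] = False
    for label in np.unique(seg):
        mean = image[seg == label].mean(axis=0)
        out[border & (seg_ext == label)] = mean
    return out, seg_ext
```

The point of the sketch is the ordering: the label map is extended first, and the pixel fill is then conditioned on both the extended map and the original image.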

LEARNING APPARATUS, LEARNING METHOD, AND RECORDING MEDIUM
20230052101 · 2023-02-16

In a learning apparatus, an acquisition unit acquires image data and label data corresponding to the image data. An object candidate extraction unit extracts each object candidate rectangle from the image data. A correct answer data generation unit generates a background object label corresponding to each background object included in each object candidate rectangle as correct answer data corresponding to the object candidate rectangle by using the label data. A prediction unit predicts a classification using each object candidate rectangle and outputs a prediction result. An optimization unit optimizes the object candidate extraction unit and the prediction unit using the prediction result and the correct answer data.
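
A toy sketch of two of the units described above, assuming a grid `label_map` of background-object labels (hypothetical); a fixed sliding window stands in for the learned object candidate extraction unit, and the prediction and optimization units are elided:

```python
def extract_candidates(image, win=2, step=2):
    """Object candidate extraction unit: slide a fixed window over the
    image as a stand-in for a learned region-proposal mechanism."""
    h, w = len(image), len(image[0])
    return [(y, x, y + win, x + win)
            for y in range(0, h - win + 1, step)
            for x in range(0, w - win + 1, step)]

def generate_correct_answers(boxes, label_map):
    """Correct answer data generation unit: for each candidate rectangle,
    collect the background-object labels appearing inside it."""
    return [{label_map[r][c] for r in range(y0, y1) for c in range(x0, x1)}
            for y0, x0, y1, x1 in boxes]
```

The generated label sets would then serve as targets when the prediction unit's classification outputs are optimized.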

SEMANTIC ANNOTATION OF SENSOR DATA USING UNRELIABLE MAP ANNOTATION INPUTS

Provided are methods for semantic annotation of sensor data using unreliable map annotation inputs, which can include training a machine learning model to accept inputs including images representing sensor data for a geographic area and unreliable semantic annotations for the geographic area. The machine learning model can be trained against validated semantic annotations for the geographic area, such that subsequent to training, additional images representing sensor data and additional unreliable semantic annotations can be passed through the trained model to provide predicted semantic annotations for the additional images. Systems and computer program products are also provided.
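
The training setup could be sketched as follows (hypothetical names; a per-pixel logistic model stands in for the full network): the sensor image is channel-stacked with the unreliable annotation as input, and the validated annotation is the target, so the model learns how far to trust the map input.

```python
import numpy as np

def build_inputs(images, noisy_maps):
    # Channel-stack each image with its unreliable map annotation so the
    # model can learn when to trust the annotation input.
    return [np.concatenate([img, noisy[..., None]], axis=-1)
            for img, noisy in zip(images, noisy_maps)]

def train_pixel_model(inputs, validated, lr=0.5, steps=200):
    """Per-pixel logistic model trained against validated annotations,
    standing in for the full network described above."""
    d = inputs[0].shape[-1]
    X = np.concatenate([x.reshape(-1, d) for x in inputs])
    y = np.concatenate([v.reshape(-1) for v in validated])
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid predictions
        g = p - y                               # logistic-loss gradient
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b
```

After training, new (image, unreliable annotation) pairs are passed through the same model to obtain predicted annotations.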

APPARATUS AND METHOD WITH OBJECT DETECTION

Disclosed is an apparatus and method with object detection. The method may include updating a pre-trained model based on sensing data of an image sensor, performing pseudo labeling using an interim model provided with a respective training set, determining plural confidence thresholds based on an evaluation of the interim model, performing multiple trainings using the interim model and the generated pseudo-labeled data by applying the determined plural confidence thresholds to the multiple trainings, respectively, and generating an object detection model dependent on the performance of the multiple trainings, including generating an initial candidate object detection model when the interim model is the updated model.
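
The threshold-per-training idea could be sketched as follows (hypothetical names; `train_fn` stands in for an actual training routine): each confidence threshold filters the interim model's detections into its own pseudo-labeled set, and one candidate model is trained per set.

```python
def pseudo_label(detections, threshold):
    """Keep interim-model detections whose confidence clears the threshold.

    `detections` is a list of (box, label, confidence) tuples.
    """
    return [(box, label) for box, label, conf in detections if conf >= threshold]

def multi_threshold_training(detections, thresholds, train_fn):
    """Run one training per confidence threshold, yielding one
    candidate object detection model per threshold."""
    return [train_fn(pseudo_label(detections, t), t) for t in thresholds]
```

A final selection step among the candidates would then produce the object detection model, dependent on the performance of the multiple trainings.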

Analyzing Objects Data to Generate a Textual Content Reporting Events
20230052442 · 2023-02-16

Systems, methods and non-transitory computer readable media for analyzing objects data to generate a textual content reporting events are provided. An indication of an event may be received. An indication of a group of one or more objects associated with the event may be received. For each object of the group of one or more objects, data associated with the object may be received. The data associated with the group of one or more objects may be analyzed to select an adjective. A particular description of the event may be generated. The particular description may be based on the group of one or more objects. The particular description may include the selected adjective. A textual content may be generated. The textual content may include the particular description. The generated textual content may be provided.
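
The analyze-select-generate flow could look roughly like this (all names and the speed-based rule are hypothetical stand-ins for the analysis step):

```python
def select_adjective(objects_data):
    """Analyze the objects' data to select an adjective; a simple
    average-speed rule stands in for the analysis described above."""
    avg_speed = sum(o["speed"] for o in objects_data) / len(objects_data)
    return "fast-paced" if avg_speed > 10 else "slow"

def describe_event(event, objects_data):
    """Generate a description of the event that is based on the object
    group and includes the selected adjective."""
    adjective = select_adjective(objects_data)
    names = ", ".join(o["name"] for o in objects_data)
    return f"A {adjective} {event} involving {names}."
```

The returned description would then be embedded in the generated textual content and provided to the consumer.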

Search results within segmented communication session content

Methods and systems provide for search results within segmented communication session content. In one embodiment, the system receives a transcript and video content of a communication session between participants, the transcript including timestamps for a number of utterances associated with speaking participants; processes the video content to extract textual content visible within the frames of the video content; segments frames of the video content into a number of contiguous topic segments; determines a title for each topic segment; assigns a category label for each topic segment; receives a request from a user to search for specified text within the video content; determines one or more titles or category labels for which a prediction of relatedness with the specified text is present; and presents content from at least one topic segment associated with the one or more titles or category labels for which a prediction of relatedness is present.
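
The final matching step could be sketched as follows (hypothetical names; a token-overlap score stands in for the learned prediction of relatedness between the search text and each segment's title or category label):

```python
def relatedness(query, text):
    """Token-overlap score standing in for a learned relatedness model."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / max(len(q), 1)

def search_segments(segments, query, min_score=0.5):
    """Return topic segments whose title or category label is predicted
    related to the user's specified text."""
    hits = []
    for seg in segments:
        score = max(relatedness(query, seg["title"]),
                    relatedness(query, seg["category"]))
        if score >= min_score:
            hits.append(seg)
    return hits
```

Content from the returned segments (e.g. their video frames and transcript spans) would then be presented as the search results.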

On demand visual recall of objects/places

Aspects of the subject disclosure may include, for example, observing a plurality of objects viewed through a smart lens, wherein the plurality of objects are in a frame of an image viewed by the smart lens, determining an identification for an object of the plurality of objects, assigning tag information for the object based on the identification, storing the tag information for the object and the frame in which the object was observed, receiving a recall request for the object, retrieving the tag information for the object and the frame responsive to the receiving the recall request, and displaying the tag information and the frame. Other embodiments are disclosed.
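
The observe/tag/recall loop could be sketched as a small store (hypothetical names; the injected `identify` callable stands in for the object-identification step on the smart-lens frames):

```python
class VisualRecall:
    """Observe objects per frame, tag them, and recall them on demand."""

    def __init__(self, identify):
        self.identify = identify  # maps an observed object to an identification
        self.store = {}           # tag -> (tag information, frame id)

    def observe(self, frame_id, objects):
        # Assign tag information based on the identification and store it
        # together with the frame in which the object was observed.
        for obj in objects:
            tag = self.identify(obj)
            self.store[tag] = ({"tag": tag, "object": obj}, frame_id)

    def recall(self, tag):
        # Retrieve the tag information and frame for a recall request.
        return self.store.get(tag)
```

On a recall request, the returned tag information and frame would be displayed to the wearer.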