G06V10/768

SYSTEMS AND METHODS FOR COMMUNICATING WITH VISION AND HEARING IMPAIRED VEHICLE OCCUPANTS
20230067615 · 2023-03-02

Systems and methods associated with a vehicle are provided. The systems and methods include an occupant output system including an output device, a camera or other perception device, and a processor in operable communication with the occupant output system and the camera or other perception device. The processor is configured to execute program instructions to cause the processor to: receive image or other perception data from the camera or other perception device, the image or other perception data including at least part of a head and/or body of an occupant of the vehicle; analyze the image or other perception data to determine whether the occupant is hearing and vision impaired; when the occupant is determined to be vision and hearing impaired, decide on an output modality to assist the occupant; and generate an output for the occupant on the output device in the output modality.
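The modality decision described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `OccupantStatus` type, the modality names, and the selection rules are hypothetical and are not taken from the disclosure, and the perception analysis itself is assumed to have already produced the two flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OccupantStatus:
    """Hypothetical result of analyzing the image or other perception data."""
    vision_impaired: bool
    hearing_impaired: bool


def choose_output_modality(status: OccupantStatus) -> str:
    """Pick a modality the occupant can perceive (illustrative rules only)."""
    if status.vision_impaired and status.hearing_impaired:
        return "haptic"            # neither screen nor speaker is usable
    if status.vision_impaired:
        return "audio"             # spoken output instead of a display
    if status.hearing_impaired:
        return "visual_text"       # on-screen text instead of sound
    return "audio_and_visual"      # default when no impairment is detected


print(choose_output_modality(OccupantStatus(vision_impaired=True, hearing_impaired=True)))
```

A lookup of this kind keeps the modality choice separate from the perception step, so either part could be replaced without touching the other.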

Classifying terms from source texts using implicit and explicit class-recognition-machine-learning models

This disclosure relates to methods, non-transitory computer readable media, and systems that can classify term sequences within a source text based on textual features analyzed by both an implicit-class-recognition model and an explicit-class-recognition model. For example, by applying machine-learning models for both implicit and explicit class recognition, the disclosed systems can determine a class corresponding to a particular term sequence within a source text and identify the particular term sequence reflecting the class. The dual-model architecture can equip the disclosed systems to apply (i) the implicit-class-recognition model to recognize implicit references to a class in source texts and (ii) the explicit-class-recognition model to recognize explicit references to the same class in source texts.

Methods and Systems for Generating Animated Images for Presentation by a Dynamic Keyboard Interface

The present disclosure is directed to generating animated images for presentation by a dynamic keyboard interface. In particular, the methods and systems of the present disclosure can: receive data describing advertisement content, and data describing a first context in which to present the advertisement content; generate data describing a first animated image including at least a portion of the advertisement content; determine a second context in which to present the advertisement content; generate data describing a second animated image including at least a portion of the advertisement content; and communicate, to one or more user devices on which one or more applications are executed, data indicating a plurality of different animated images for presentation by a dynamic keyboard interface in association with the one or more applications, the plurality of different animated images comprising the first animated image and the second animated image.

SYSTEMS AND METHODS FOR SYNTHETIC DATABASE QUERY GENERATION

A system for returning synthetic database query results is provided. The system may include a memory unit for storing instructions, and a processor configured to execute the instructions to perform operations comprising: receiving a query input by a user at a user interface; determining, based on natural language processing, a type of the query input; determining, based on the received query input and a database language interpreter, an output data format; returning, based on a generation model and the output data format, a result of the query input; providing, to a plurality of training models and based on the determined query type, the query input and the result; and training the training models, based on the query input and the result.

EXTRACTION OF GENEALOGY DATA FROM OBITUARIES

Systems, methods, and other techniques for extracting data from obituaries are provided. In some embodiments, an obituary containing a plurality of words is received. Using a machine learning model, an entity tag from a set of entity tags may be assigned to each of one or more words of the plurality of words. Each particular tag from the set of entity tags may include a relationship component and a category component. The relationship component may indicate a relationship between a particular word and the deceased individual. The category component may indicate a categorization of the particular word to a particular category from a set of categories. The extracted data may be stored in a genealogical database.
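The two-part entity tag can be illustrated with a small data structure. This is a minimal sketch, assuming a hypothetical `RELATIONSHIP-CATEGORY` label format and an `"O"` label for untagged words; the labels are hard-coded here, whereas in the described system they would come from the machine learning model.

```python
from typing import NamedTuple


class EntityTag(NamedTuple):
    relationship: str  # relationship of the word to the deceased, e.g. "SPOUSE"
    category: str      # kind of information the word carries, e.g. "NAME"


def parse_tag(label: str) -> EntityTag:
    """Split a hypothetical "RELATIONSHIP-CATEGORY" label into its components."""
    relationship, category = label.split("-", 1)
    return EntityTag(relationship, category)


def group_tagged_words(words, labels):
    """Join consecutive words that share a tag; "O" marks untagged words."""
    record = {}
    for word, label in zip(words, labels):
        if label == "O":
            continue
        record.setdefault(parse_tag(label), []).append(word)
    return {tag: " ".join(parts) for tag, parts in record.items()}


words = ["survived", "by", "wife", "Mary", "Smith"]
labels = ["O", "O", "O", "SPOUSE-NAME", "SPOUSE-NAME"]  # assumed model output
print(group_tagged_words(words, labels))
```

Keeping the relationship and category as separate fields lets a genealogical record be queried by either one, for example all names or all facts about a spouse.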

Contextual grounding of natural language phrases in images
11620814 · 2023-04-04

Aspects of the present disclosure describe systems, methods and structures providing contextual grounding, a higher-order interaction technique to capture corresponding context between text entities and visual objects.

Method and system for generating a vector representation of an image

There is described a computer-implemented method for generating a vector representation of an image, the computer-implemented method comprising: receiving a given image and semantic information about the given image; generating a first vector representation of the given image using an image embedding method; generating a second vector representation of the semantic information using a word embedding method; combining the first vector representation of the given image and the second vector representation of the semantic information, thereby obtaining a modified vector representation for the given image; and outputting the modified vector representation.
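The combining step can be sketched as follows. The abstract does not say how the two vectors are combined, so concatenation after L2 normalization is an assumption chosen for illustration, and the input vectors stand in for the outputs of whatever image and word embedding methods are used.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def combine_embeddings(image_vec, semantic_vec):
    """Concatenate the normalized image and semantic vectors (assumed method)."""
    return l2_normalize(image_vec) + l2_normalize(semantic_vec)


image_vec = [3.0, 4.0]     # placeholder for an image embedding
semantic_vec = [0.0, 2.0]  # placeholder for a word embedding of the semantic information
modified = combine_embeddings(image_vec, semantic_vec)
print(len(modified))
```

Normalizing each part first keeps one embedding from dominating the other when their raw magnitudes differ.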

OBJECT IDENTIFICATION BASED ON ADAPTIVE LEARNING

Disclosed herein are systems, methods, and devices for using adaptive learning to identify objects. An object-identifying device performs a first object identification based on one or more features of a first modality of an object retrieved from an image frame including the object and a first database including first modality identification features. A second object identification is performed based on one or more features of a second modality of the object retrieved from the image frame and a second database including second modality identification features. The second database is updated by adaptively learning a new second modality identification feature according to a first identification result of the first object identification. The second object identification is trained with the updated second database and determines a final identification result by integrating a first identification result of the first object identification and a second identification result of the second object identification.
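The flow above, in which a confident first-modality result teaches the second-modality database, can be sketched with a toy example. Everything here is an assumption made for illustration: features are plain string sets, matching uses Jaccard overlap, and a dictionary lookup stands in for the trained second identification described in the abstract.

```python
def best_match(features, database):
    """Return (identity, score) for the entry with the highest Jaccard overlap."""
    best_id, best_score = None, 0.0
    for identity, known in database.items():
        union = features | known
        score = len(features & known) / len(union) if union else 0.0
        if score > best_score:
            best_id, best_score = identity, score
    return best_id, best_score


def identify_object(first_features, second_features, first_db, second_db,
                    learn_threshold=0.8):
    """Identify with both modalities and let the first teach the second."""
    first_id, first_score = best_match(first_features, first_db)
    if first_id is not None and first_score >= learn_threshold:
        # Adaptive step: store the second-modality features under the
        # identity found by the first identification.
        second_db.setdefault(first_id, set()).update(second_features)
    second_id, second_score = best_match(second_features, second_db)
    # Integration (assumed rule): keep the higher-scoring result.
    return first_id if first_score >= second_score else second_id


first_db = {"alice": {"f1", "f2"}}   # e.g. face features (hypothetical)
second_db = {}                       # e.g. clothing features (hypothetical)
print(identify_object({"f1", "f2"}, {"red_coat"}, first_db, second_db))
print(identify_object(set(), {"red_coat"}, first_db, second_db))
```

In the second call the first modality yields nothing, yet the object is still identified from the second-modality feature learned during the first call.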

AUTOMATED COLLISION AVOIDANCE IN MEDICAL ENVIRONMENTS

An apparatus for automated collision avoidance includes a sensor configured to detect an object of interest. The apparatus predicts a representation of the object of interest at a future point in time, calculates an indication of a possibility of a collision with the object of interest based on the representation of the object of interest at the future point in time, and executes a collision avoidance action based on the indication.
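The predict, score, and act sequence can be sketched as follows. This is a simplified illustration, not the disclosed method: it assumes a constant-velocity motion model, a point representation of the object, a distance-based score, and hypothetical action names and threshold.

```python
import math


def predict_position(position, velocity, dt):
    """Constant-velocity prediction of the object's position after dt seconds."""
    return tuple(p + v * dt for p, v in zip(position, velocity))


def collision_indication(device_pos, object_pos, safety_radius):
    """Score in [0, 1]: 1.0 at zero distance, 0.0 at or beyond safety_radius."""
    distance = math.dist(device_pos, object_pos)
    return max(0.0, 1.0 - distance / safety_radius)


def choose_action(indication, threshold=0.5):
    """Pick a collision avoidance action from the indication (assumed rule)."""
    return "stop" if indication >= threshold else "continue"


future = predict_position((0.0, 0.0), (1.0, 0.0), dt=2.0)
indication = collision_indication((2.0, 0.0), future, safety_radius=1.0)
print(future, indication, choose_action(indication))
```

A real system would likely predict a full body or volume rather than a point, but the same three stages apply.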

Contextually disambiguating queries

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for contextually disambiguating queries are disclosed. In an aspect, a method includes: receiving an image being presented on a display of a computing device and a transcription of an utterance spoken by a user of the computing device; identifying a particular sub-image that is included in the image; and, based on performing image recognition on the particular sub-image, determining one or more first labels that indicate a context of the particular sub-image. The method also includes: based on performing text recognition on a portion of the image other than the particular sub-image, determining one or more second labels that indicate the context of the particular sub-image; based on the transcription, the first labels, and the second labels, generating a search query; and providing, for output, the search query.
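The query generation step can be sketched as follows. The substitution rule is an assumption made for illustration: vague references in the transcription are replaced by the top image-derived label, and text-derived labels are appended. The label lists are supplied by hand here, where the described method would obtain them from image recognition and text recognition.

```python
DEICTIC_WORDS = {"this", "that", "it"}  # vague references to resolve (assumed list)


def generate_search_query(transcription, first_labels, second_labels):
    """Build a query from the transcription and both sets of labels."""
    subject = first_labels[0] if first_labels else ""
    words = []
    for word in transcription.lower().split():
        # Swap a vague reference for the top image label when one exists.
        words.append(subject if word in DEICTIC_WORDS and subject else word)
    joined = " ".join(words).lower()
    # Append text-derived labels that are not already part of the query.
    extra = [label for label in second_labels if label.lower() not in joined]
    return " ".join(words + extra)


print(generate_search_query("How tall is this", ["Eiffel Tower"], ["Paris"]))
# prints "how tall is Eiffel Tower Paris"
```

The first labels answer what the user is pointing at, and the second labels add surrounding context that narrows the search.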