G06F18/40

ASSESSING MACHINE LEARNING BIAS USING MODEL TRAINING METADATA
20230229734 · 2023-07-20

In one embodiment, a device receives a request for a machine learning model to make an inference about input data included in the request. The device retrieves metadata regarding training data used to train the machine learning model from a ledger associated with the machine learning model. The device assesses bias of the machine learning model by comparing the input data in the request to the metadata from the ledger. The device provides an indication of the bias of the machine learning model for display.
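The comparison step described above can be sketched as follows. This is an illustrative toy, not the patented method: the ledger is mocked as a dictionary, and all names (`LEDGER`, `assess_bias`, the feature specs) are invented for the example. The idea shown is flagging request features that fall outside the training-data distribution recorded in the ledger metadata.

```python
# Hypothetical ledger entry: metadata about the training data,
# recorded when the model was trained.
LEDGER = {
    "model-a": {
        "age": {"min": 25, "max": 60},
        "region": {"values": {"us", "eu"}},
    }
}

def assess_bias(model_id, request):
    """Flag request features that fall outside the training distribution."""
    meta = LEDGER[model_id]
    warnings = []
    for feature, value in request.items():
        spec = meta.get(feature)
        if spec is None:
            continue  # no metadata recorded for this feature
        if "values" in spec and value not in spec["values"]:
            warnings.append(f"{feature}={value!r} unseen in training data")
        elif "min" in spec and not (spec["min"] <= value <= spec["max"]):
            warnings.append(f"{feature}={value} outside training range")
    return warnings

print(assess_bias("model-a", {"age": 72, "region": "us"}))
# → ["age=72 outside training range"]
```

A non-empty warning list is the "indication of bias" that would be provided for display.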

SANITIZING PERSONALLY IDENTIFIABLE INFORMATION (PII) IN AUDIO AND VISUAL DATA
20230229803 · 2023-07-20

Techniques for sanitizing personally identifiable information (PII) from audio and visual data are provided. For instance, in a scenario where the data comprises an audio signal with speech uttered by a person P, these techniques can include removing/obfuscating/transforming speech-related PII in the audio signal such as pitch and acoustic cues associated with P's vocal tract shape and/or vocal actuators (e.g., lips, nasal air bypass, teeth, tongue, etc.) while allowing the content of the speech to remain recognizable. Further, in a scenario where the data comprises a still image or video in which a person P appears, these techniques can include removing/obfuscating/transforming visual PII in the image or video such as P's biological features and indicators of P's location/belongings/data while allowing the general nature of the image or video to remain discernable. Through this PII sanitization process, the privacy of individuals portrayed in the audio or visual data can be preserved.
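As a rough illustration of the audio side, a crude pitch transformation can be sketched with linear-interpolation resampling. This is only a sketch of the idea of transforming a speech cue while leaving content broadly intact; real speech-PII removal of vocal-tract cues would need vocoder-style processing, and the sample buffer here is a stand-in for audio.

```python
# Illustrative only: crude pitch obfuscation by resampling a mono
# sample buffer with linear interpolation. factor > 1 raises the
# perceived pitch on playback (and shortens the buffer).
def shift_pitch(samples, factor):
    n = int(len(samples) / factor)
    out = []
    for i in range(n):
        pos = i * factor
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        # blend the two nearest input samples
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out

tone = [float(i % 10) for i in range(100)]  # stand-in for audio samples
higher = shift_pitch(tone, 1.25)
print(len(higher))  # → 80
```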

Machine learning system and method for determining or inferring user action and intent based on screen image analysis
11704898 · 2023-07-18

System(s) and method(s) that analyze image data associated with a computing screen operated by a user and learn from that image data (e.g., using pattern recognition, historical information analysis, user implicit and explicit training data, optical character recognition (OCR), video information, 360°/panoramic recordings, and so on) to concurrently glean information regarding multiple states of user interaction (e.g., analyzing data associated with multiple applications open on a desktop, mobile phone, or tablet). A machine learning model is trained on analysis of graphical image data associated with a screen display to determine or infer user intent. An input component receives image data regarding a screen display associated with user interaction with a computing device. An analysis component employs the model to determine or infer user intent based on the image data analysis, and an action component provisions services to the user as a function of the determined or inferred user intent. In an implementation, a gaming component gamifies interaction with the user in connection with explicitly training the model.
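The pipeline shape (screen image → OCR text → inferred intent → service action) can be sketched with a toy rule-based scorer standing in for the trained model. Every intent name and keyword set below is invented for illustration and is not from the patent.

```python
# Toy intent inference over OCR'd screen text. A trained model would
# replace this keyword scorer; the pipeline shape is what matters.
INTENT_KEYWORDS = {
    "compose_email": {"to:", "subject:", "send"},
    "book_travel": {"flight", "hotel", "depart"},
}

def infer_intent(ocr_text):
    """Return the intent whose keyword set best overlaps the screen text."""
    words = set(ocr_text.lower().split())
    scores = {intent: len(words & kws) for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(infer_intent("To: alice Subject: trip Send"))  # → compose_email
```

An action component would then map the inferred intent to a provisioned service (e.g., surfacing an email-related assistant).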

Techniques for image content extraction

Embodiments are directed to techniques for image content extraction. Some embodiments include extracting contextually structured data from document images, such as by automatically identifying document layout, document data, document metadata, and/or correlations therebetween in a document image, for instance. Some embodiments utilize breakpoints to enable the system to match different documents with internal variations to a common template. Several embodiments include extracting contextually structured data from table images, such as gridded and non-gridded tables. Many embodiments are directed to generating and utilizing a document template database for automatically extracting document image contents into a contextually structured format. Several embodiments are directed to automatically identifying and associating document metadata with corresponding document data in a document image to generate a machine-facilitated annotation of the document image. In some embodiments, the machine-facilitated annotation may be used to generate a template for the template database.
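One assumed design for matching extracted document lines against a template database can be sketched as below. The template patterns, field names, and the interpretation of "breakpoints" (a variable-length region between required template lines) are this sketch's assumptions, not details from the patent.

```python
import re

# Hypothetical template database: each template is an ordered list of
# line patterns; lines between patterns may vary freely ("breakpoints").
TEMPLATES = {
    "invoice": [r"INVOICE", r"No\.\s*(?P<number>\d+)", r"Total:\s*(?P<total>[\d.]+)"],
}

def match_template(lines):
    """Return (template_name, fields) for the first template whose
    patterns all appear, in order, somewhere in the document lines."""
    for name, patterns in TEMPLATES.items():
        fields, idx = {}, 0
        for pat in patterns:
            while idx < len(lines):
                m = re.search(pat, lines[idx])
                idx += 1
                if m:
                    fields.update(m.groupdict())
                    break
            else:
                break  # pattern never matched -> try next template
        else:
            return name, fields
    return None, {}

doc = ["ACME Corp", "INVOICE", "No. 42", "Ship to: ...", "Total: 99.50"]
print(match_template(doc))  # → ('invoice', {'number': '42', 'total': '99.50'})
```

Skipping the unmatched lines ("ACME Corp", "Ship to: ...") is what lets documents with internal variations still match a common template.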

Identification of Effect Pigments in a Target Coating

Described herein is a computer-implemented method. The method includes: providing digital images and respective formulas for coating compositions with known pigments and/or pigment classes associated with the respective digital images; classifying each pixel of each digital image, using an image annotation tool, by visually reviewing the respective digital image pixel-wise; providing, for each digital image, an associated pixel-wise annotated image; training a first neural network with the provided digital images as input and the associated pixel-wise annotated images as output; making the trained first neural network available for application to at least one unknown input image of a target coating and for assigning a pigment label and/or a pigment class label to each pixel in the at least one unknown input image; and determining and/or outputting, for each unknown input image, a statistic of the corresponding identified pigments and/or pigment classes.
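The final step, computing a statistic over the per-pixel labels, can be sketched as simple label frequencies. The label map and pigment names here are invented stand-ins for the output of the (trained) network.

```python
from collections import Counter

def pigment_statistics(label_map):
    """label_map: 2-D list of pigment/pigment-class labels, one per pixel.
    Returns the fraction of pixels carrying each label."""
    counts = Counter(lbl for row in label_map for lbl in row)
    total = sum(counts.values())
    return {lbl: n / total for lbl, n in counts.items()}

labels = [["mica", "mica", "aluminum"],
          ["mica", "background", "aluminum"]]
print(pigment_statistics(labels))
# → {'mica': 0.5, 'aluminum': 0.333..., 'background': 0.166...}
```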

Data model generation using generative adversarial networks

Methods for generating data models using a generative adversarial network can begin with a model optimizer receiving a data model generation request from an interface. The model optimizer can provision computing resources with a data model. A synthetic dataset for training the data model can then be generated using the generative network of a generative adversarial network, that network having been trained to generate output data differing by at least a predetermined amount from a reference dataset according to a similarity metric. The computing resources can train the data model using the synthetic dataset. The model optimizer can evaluate performance criteria of the data model and, based on that evaluation, store the data model and its metadata in a model storage. The data model can then be used to process production data.
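The similarity constraint on the synthetic data can be sketched with rejection sampling. Random noise stands in for a real GAN generator here, and the Euclidean distance and threshold are assumed choices: candidates closer than `min_dist` to any reference record are rejected, so every emitted synthetic record differs from the reference dataset by at least that amount.

```python
import random

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def generate_synthetic(reference, n, min_dist, dim=2, seed=0):
    """Emit n synthetic records, each at least min_dist from every
    reference record under the (assumed) Euclidean similarity metric."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        cand = [rng.uniform(0, 10) for _ in range(dim)]  # stand-in generator
        if all(euclidean(cand, ref) >= min_dist for ref in reference):
            out.append(cand)
    return out

reference = [[1.0, 1.0], [9.0, 9.0]]
synthetic = generate_synthetic(reference, n=5, min_dist=2.0)
assert all(min(euclidean(s, r) for r in reference) >= 2.0 for s in synthetic)
```

In the actual scheme this constraint is learned by the generative network rather than enforced by rejection; the sketch only shows the property the output must satisfy.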

SELECTION METHOD OF LEARNING DATA AND COMPUTER SYSTEM
20230019364 · 2023-01-19

A computer system selects learning data so as to improve the prediction accuracy of a predictor. The system is connected to a database that stores a plurality of pieces of learning data and information for managing a plurality of predictors generated under different learning conditions. A target predictor is selected. For each of a plurality of pieces of test data, an influence degree is calculated that represents the strength of the influence of the learning data on the prediction accuracy of the target predictor for that test data. An influence score of the learning data is then calculated across the plurality of predictors based on the plurality of influence degrees associated with each predictor, and the learning data to be used is selected from the plurality of pieces of learning data on the basis of the influence scores of each of the plurality of pieces of learning data.
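The aggregation-and-selection step can be sketched as follows. The influence degrees below are mock numbers (in practice they would come from influence-function-style estimates per test point), and the averaging aggregation is this sketch's assumption; the shape shown is per-example scores across predictors and test data, then top-k selection.

```python
# influence[predictor][example] -> influence degrees over the test data
# (mock values; a real system would estimate these).
influence = {
    "predictor_a": {"x1": [0.9, 0.7], "x2": [-0.2, 0.1], "x3": [0.4, 0.5]},
    "predictor_b": {"x1": [0.8, 0.6], "x2": [0.0, -0.1], "x3": [0.3, 0.2]},
}

def influence_score(example):
    """Average influence of one learning example across predictors and test data."""
    degrees = [d for per_pred in influence.values() for d in per_pred[example]]
    return sum(degrees) / len(degrees)

def select_learning_data(k):
    """Pick the k learning examples with the highest influence scores."""
    examples = next(iter(influence.values())).keys()
    ranked = sorted(examples, key=influence_score, reverse=True)
    return ranked[:k]

print(select_learning_data(2))  # → ['x1', 'x3']
```

Here `x2`, whose influence is near zero or negative across both predictors, is the example the selection would drop.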