G06N3/0475

INFORMATION PROCESSING DEVICE AND MACHINE LEARNING METHOD

Accuracy of a model extracting a graph structure as an intermediate representation from input data is improved. An encoding unit (100) extracts a feature amount of each of a plurality of vertices included in a graph structure (Tr) from input data (10), and calculates a likelihood that an edge is connected to the vertex. A sampling unit (130) determines the graph structure (Tr) based on a conversion result of a Gumbel-Softmax function for the likelihood. A learning unit (150) optimizes a decoding unit (140) and the encoding unit (100) by back propagation using a loss function including an error (L_P) between output data (20) generated from the graph structure (Tr) and correct data.
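The sampling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation in the application: the function names `gumbel_softmax` and `sample_edges`, the `temperature` parameter, and the layout of `edge_logits` (one row of unnormalized edge likelihoods per vertex) are all hypothetical. In a trainable model the relaxed probabilities would be kept differentiable so that back propagation can reach the encoding unit; the plain-Python version below only shows the arithmetic.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample over the given unnormalized log-likelihoods."""
    noisy = []
    for logit in logits:
        # Clip u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))  # Gumbel(0, 1) sample
        noisy.append((logit + gumbel_noise) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def sample_edges(edge_logits, temperature=1.0, rng=random):
    """Pick one edge target per vertex (assumed layout: edge_logits[v][t])."""
    edges = []
    for vertex, row in enumerate(edge_logits):
        probs = gumbel_softmax(row, temperature, rng)
        target = max(range(len(probs)), key=probs.__getitem__)
        edges.append((vertex, target))
    return edges
```

Lower values of `temperature` push the relaxed sample toward a one-hot vector, while higher values give smoother probabilities.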

Method and System for Scene-Aware Audio-Video Representation

Embodiments disclose a method and system for a scene-aware audio-video representation of a scene. The scene-aware audio-video representation corresponds to a graph of nodes connected by edges. A node in the graph is indicative of the video features of an object in the scene. An edge in the graph connecting two nodes indicates an interaction of the corresponding two objects in the scene. In the graph, one or more edges are associated with audio features of a sound generated by the interaction of the corresponding two objects. The graph of the audio-video representation of the scene may be used to perform a variety of different tasks. Examples of the tasks include one or a combination of an action recognition, an anomaly detection, a sound localization and enhancement, a noisy-background sound removal, and a system control.
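One possible data structure for such a graph is sketched below. The class and field names (`Node`, `Edge`, `SceneGraph`, `video_features`, `audio_features`, `interaction`) are hypothetical and chosen only for illustration; the abstract does not specify how features are stored.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str                   # object identity, e.g. "hammer"
    video_features: List[float]  # visual descriptor of the object


@dataclass
class Edge:
    source: int                  # index of the first object node
    target: int                  # index of the second object node
    interaction: str             # e.g. "strikes"
    audio_features: Optional[List[float]] = None  # set only if the interaction makes a sound


@dataclass
class SceneGraph:
    nodes: List[Node] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)

    def add_node(self, label, video_features):
        """Append a node and return its index."""
        self.nodes.append(Node(label, list(video_features)))
        return len(self.nodes) - 1

    def add_edge(self, source, target, interaction, audio_features=None):
        """Connect two existing nodes, optionally attaching audio features."""
        if not (0 <= source < len(self.nodes) and 0 <= target < len(self.nodes)):
            raise IndexError("edge endpoints must refer to existing nodes")
        self.edges.append(Edge(source, target, interaction, audio_features))

    def audible_edges(self):
        """Edges whose interaction carries audio features."""
        return [e for e in self.edges if e.audio_features is not None]
```

Keeping `audio_features` optional reflects the statement that only some edges are associated with a sound.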

METHODS AND SYSTEMS FOR HIGH DEFINITION IMAGE MANIPULATION WITH NEURAL NETWORKS
20230019851 · 2023-01-19 ·

Methods and systems for high-resolution image manipulation are disclosed. An original high-resolution image to be manipulated is obtained, as well as a driving signal indicating a manipulation result. The original high-resolution image is down-sampled to obtain a low-resolution image to be manipulated. Using a trained manipulation generator, a low-resolution manipulated image and a motion field are generated from the low-resolution image. The motion field represents pixel displacements of the low-resolution image to obtain the manipulation indicated by the driving signal. A high-frequency residual image is computed from the original high-resolution image. A high-frequency manipulated residual image is generated using the motion field. A high-resolution manipulated image is outputted by combining the high-frequency manipulated residual image and a low-frequency manipulated image generated from the low-resolution manipulated image by up-sampling.
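The data flow of this pipeline can be sketched on a tiny grayscale image stored as nested lists. Several details below are assumptions rather than part of the abstract: average pooling for down-sampling, nearest-neighbour up-sampling, a residual defined as the original minus its up-sampled low-resolution copy, integer displacements applied by backward warping, and a `generator` callable standing in for the trained manipulation generator (which would also receive the driving signal).

```python
def downsample(img, f):
    """Average-pool f x f blocks (assumed down-sampling method)."""
    h, w = len(img), len(img[0])
    return [[sum(img[y * f + dy][x * f + dx] for dy in range(f) for dx in range(f)) / (f * f)
             for x in range(w // f)] for y in range(h // f)]


def upsample(img, f):
    """Nearest-neighbour up-sampling by an integer factor f."""
    return [[img[y // f][x // f] for x in range(len(img[0]) * f)]
            for y in range(len(img) * f)]


def warp(img, flow):
    """Backward warp: output pixel (y, x) reads from (y + dy, x + dx), clamped to the image."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dy, dx = flow[y][x]
            row.append(img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)])
        out.append(row)
    return out


def manipulate_high_res(original, generator, f=2):
    """Combine a low-resolution manipulation with warped high-frequency detail."""
    low = downsample(original, f)
    low_manip, low_flow = generator(low)  # hypothetical stand-in for the trained generator
    # High-frequency residual: what the low-resolution copy cannot represent.
    residual = [[o - u for o, u in zip(o_row, u_row)]
                for o_row, u_row in zip(original, upsample(low, f))]
    # Bring the motion field to full resolution and scale displacements by f.
    high_flow = [[(dy * f, dx * f) for dy, dx in row] for row in upsample(low_flow, f)]
    residual_manip = warp(residual, high_flow)
    low_freq = upsample(low_manip, f)
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(low_freq, residual_manip)]
```

With an identity generator (unchanged low-resolution image and an all-zero motion field), the output reproduces the original image, which is a quick sanity check that the low-frequency and high-frequency parts recombine correctly.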

Apparatus and Method for End-to-End Adversarial Blind Bandwidth Extension with one or more Convolutional and/or Recurrent Networks

An apparatus for processing a narrowband speech input signal by conducting bandwidth extension of the narrowband speech input signal to obtain a wideband speech output signal according to an embodiment is provided. The apparatus includes a signal envelope extrapolator including a first neural network, wherein the first neural network is configured to receive as input values of the first neural network a plurality of samples of a signal envelope of the narrowband speech input signal, and configured to determine as output values of the first neural network a plurality of extrapolated signal envelope samples. Moreover, the apparatus includes an excitation signal extrapolator configured to receive a plurality of samples of an excitation signal of the narrowband speech input signal, and configured to determine a plurality of extrapolated excitation signal samples. Furthermore, the apparatus includes a combiner configured to generate the wideband speech output signal such that the wideband speech output signal is bandwidth extended with respect to the narrowband speech input signal depending on the plurality of extrapolated signal envelope samples and depending on the plurality of extrapolated excitation signal samples.

MACHINE LEARNING TECHNIQUES FOR FUTURE OCCURRENCE CODE PREDICTION
20230017734 · 2023-01-19 ·

Various embodiments of the present invention provide methods, apparatus, systems, computing devices, computing entities, and/or the like for performing predictive structural analysis. Certain embodiments of the present invention utilize systems, methods, and computer program products that perform predictive structural analysis using at least one of techniques using time bound code transition likelihood data objects, techniques using cross-code relationship values, techniques using augmented entity-code occurrence data objects, techniques using per-pathway text representations of inferred occurrence pathways of one or more individual historic code occurrences, techniques using polygenic risk score (PRS) measures, and/or the like.

SYSTEMS AND METHODS OF NEURAL NETWORK TRAINING
20230019874 · 2023-01-19 ·

A computer system is provided for training a neural network that converts images. Input images are applied to the neural network and a difference in image values is determined between predicted image data and target image data. A Fast Fourier Transform is taken of the difference. The neural network is trained based on the L1 norm of the resulting frequency data.
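The loss described above can be sketched as follows. This is an illustrative sketch with two labeled assumptions: a naive discrete Fourier transform stands in for the Fast Fourier Transform (an FFT computes the same transform, only faster), and the L1 norm of the complex frequency data is taken as the sum of coefficient magnitudes. The function names are hypothetical, and a real training loop would compute this with a differentiable FFT so that gradients reach the network.

```python
import cmath


def dft_1d(values):
    """Naive O(n^2) discrete Fourier transform of a 1-D sequence."""
    n = len(values)
    return [sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def dft_2d(image):
    """2-D DFT: transform every row, then every column."""
    rows = [dft_1d(row) for row in image]
    cols = [dft_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to [ky][kx]


def frequency_l1_loss(predicted, target):
    """L1 norm of the Fourier transform of (predicted - target)."""
    diff = [[p - t for p, t in zip(p_row, t_row)]
            for p_row, t_row in zip(predicted, target)]
    spectrum = dft_2d(diff)
    # Assumption: L1 norm = sum of the magnitudes of the complex coefficients.
    return sum(abs(c) for row in spectrum for c in row)
```

For example, a prediction that is uniformly brighter than the target by 1 on a 2 x 2 image gives a loss of about 4, all of it in the zero-frequency term, whereas identical images give a loss of 0.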

DATA CATALOG SYSTEM FOR GENERATING SYNTHETIC DATASETS

A data catalog system is configured to automatically generate synthetic datasets based upon original datasets cataloged by the data catalog system, wherein each synthetic dataset comprises synthetic data that is generated using one or more data generation techniques. The data catalog system may access an original dataset and harvest associated metadata information and generate catalog information for the original dataset. The data catalog system may then generate a synthetic dataset based upon the original dataset and its harvested metadata information. The data catalog system may also generate catalog information for the generated synthetic dataset. The catalog information generated for the original dataset may be updated to refer to the newly generated synthetic dataset and its catalog information. The catalog information generated for the synthetic dataset may include references to the original dataset and its catalog information to inform a user of the original dataset about the synthetic dataset.

SYSTEMS AND METHODS FOR SYNTHESIZING CROSS DOMAIN COLLECTIVE INTELLIGENCE
20230018116 · 2023-01-19 ·

In some implementations, a collaborative knowledge system may receive a first set and a second set of privatized embeddings. The first set of privatized embeddings may be generated by a local model based on a first set of private documents associated with a first knowledge domain. The second set of privatized embeddings may be generated by a local model based on a second set of private documents associated with a second, different knowledge domain. The collaborative knowledge system may train, based on the first and second sets of privatized embeddings, a centralized model. The collaborative knowledge system may receive a query associated with the first knowledge domain or the second knowledge domain. The collaborative knowledge system may generate a response to the query based on processing the query with the centralized model. The collaborative knowledge system may provide the response to the query to a user device.

Inserting three-dimensional objects into digital images with consistent lighting via global and local lighting information

This disclosure describes methods, non-transitory computer readable storage media, and systems that generate realistic shading for three-dimensional objects inserted into digital images. The disclosed system utilizes a light encoder neural network to generate a representation embedding of lighting in a digital image. Additionally, the disclosed system determines points of the three-dimensional object visible within a camera view. The disclosed system generates a self-occlusion map for the digital three-dimensional object by determining whether fixed sets of rays uniformly sampled from the points intersect with the digital three-dimensional object. The disclosed system utilizes a generator neural network to determine a shading map for the digital three-dimensional object based on the representation embedding of lighting in the digital image and the self-occlusion map. Additionally, the disclosed system generates a modified digital image with the three-dimensional object inserted into the digital image with consistent lighting of the three-dimensional object and the digital image.

GENERATING AUDIO WAVEFORMS USING ENCODER AND DECODER NEURAL NETWORKS

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing an input audio waveform using a generator neural network to generate an output audio waveform. In one aspect, a method comprises: receiving an input audio waveform; processing the input audio waveform using an encoder neural network to generate a set of feature vectors representing the input audio waveform; and processing the set of feature vectors representing the input audio waveform using a decoder neural network to generate an output audio waveform that comprises a respective output audio sample for each of a plurality of output time steps.