G06F40/151

TECHNIQUES FOR DOCUMENT CREATION BASED ON IMAGE SECTIONS
20230222282 · 2023-07-13

In an embodiment, an image reception system is communicatively coupled to an image analysis system and is configured to receive a digital image and analyze the pixels of the digital image to determine one or more regions in the digital image. For each region in the one or more regions in the digital image, the image analysis system recognizes the content in the region. A document creation system communicatively coupled to the image analysis system is configured to create a digital document based on the recognized content for the one or more regions. In some embodiments, the image analysis system is further configured to analyze the digital image to detect one or more of the following: region markers, tables, headers.

METHODS AND APPARATUS FOR RETRIEVING INFORMATION VIA AN INTERMEDIATE REPRESENTATION
20230222120 · 2023-07-13

The disclosed subject matter relates to a system and method for providing an automated assistant that retrieves information from a knowledge base in response to a user's natural language question. A user's spoken natural language question is transformed into an intermediate representation. A cypher query is then generated from the intermediate representation and may be used to query the database. The query results are provided to the user in response. The transformation into the intermediate representation is database independent, while the cypher query depends on the database queried.
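
The split between a database-independent intermediate representation and a database-dependent query can be sketched as follows. This is only an illustration under stated assumptions: the `Lookup` class, its fields, and the `to_cypher` function are hypothetical names not taken from the abstract, and the step that turns the user's question into the intermediate representation is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lookup:
    """Hypothetical database-independent intermediate representation."""
    entity_type: str   # kind of thing asked about, e.g. "Person"
    filter_key: str    # property used to select the entity
    filter_value: str  # value that property must have
    wanted: str        # property the user wants returned


def to_cypher(ir: Lookup) -> tuple[str, dict]:
    """Render the IR as a parameterized Cypher query (the database-dependent step).

    Labels and property names are interpolated directly, so in this sketch
    they are assumed to come from a trusted schema, not from user input.
    """
    query = (
        f"MATCH (n:{ir.entity_type}) "
        f"WHERE n.{ir.filter_key} = $value "
        f"RETURN n.{ir.wanted}"
    )
    return query, {"value": ir.filter_value}


# A question such as "Where does Ada work?" might be mapped to this IR
# by an upstream parser (assumed, not shown).
ir = Lookup("Person", "name", "Ada", "employer")
query, params = to_cypher(ir)
print(query)
print(params)
```

Supporting a different database would, in this sketch, mean writing another renderer for the same `Lookup` value while leaving the question-to-IR step unchanged.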

CHARACTER STRING TRANSMISSION METHOD AND DEVICE, COMPUTER, AND READABLE STORAGE MEDIUM
20230214577 · 2023-07-06

Disclosed are a character string transmission method and device, a computer, and a readable storage medium. The method includes the following steps: obtaining a target character string, and adding an escape character before each special character in the target character string, the special character being a character that is incapable of being transmitted accurately to a target script; converting the special character into a transcoded character in an American Standard Code for Information Interchange (ASCII) code form to obtain a transcoded character string; transmitting the transcoded character string to the target script by means of a shell; and calling the target script to decode the transcoded character string to obtain the target character string. Compared with existing complex escape, the method is easier to implement, and the special character may be effectively prevented from being specially processed by the shell.
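
A minimal sketch of the escape-and-transcode idea is shown below. The abstract does not name the escape character or the set of special characters, so the `%` marker, the `SAFE` alphabet, and the function names are assumptions made for illustration. The shell invocation itself is omitted; the point is that the encoded string contains only characters a POSIX shell does not treat specially.

```python
ESCAPE = "%"  # hypothetical escape character; the source does not name one
# Characters assumed to pass through a shell unchanged (an assumption).
SAFE = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")


def encode(text: str) -> str:
    """Replace every non-safe byte with ESCAPE followed by its two-digit hex code."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in SAFE:
            out.append(ch)
        else:
            # The escape character itself is not in SAFE, so it is encoded too.
            out.append(f"{ESCAPE}{byte:02X}")
    return "".join(out)


def decode(encoded: str) -> str:
    """Inverse of encode(); this is the step the target script would run."""
    raw = bytearray()
    i = 0
    while i < len(encoded):
        if encoded[i] == ESCAPE:
            raw.append(int(encoded[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(encoded[i]))
            i += 1
    return raw.decode("utf-8")


original = 'say "hi" && echo $HOME'
wire = encode(original)
print(wire)
print(decode(wire) == original)
```

Because quotes, ampersands, and dollar signs never reach the shell in literal form, no shell-specific quoting rules are needed in this sketch.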

Cross-lingual unsupervised classification with multi-view transfer learning
11694042 · 2023-07-04

Presented herein are embodiments of an unsupervised cross-lingual sentiment classification model (which may be referred to as multi-view encoder-classifier (MVEC)) that leverages an unsupervised machine translation (UMT) system and a language discriminator. Unlike previous language model (LM)-based fine-tuning approaches that adjust parameters solely based on the classification error on training data, embodiments employ the encoder-decoder framework of a UMT system as a regularization component on the shared network parameters. In one or more embodiments, the cross-lingual encoder learns a shared representation, which is effective both for reconstructing input sentences of two languages and for generating more representative views from the input for classification. Experiments on five language pairs verify that an MVEC embodiment significantly outperforms other models on 8 of 11 sentiment classification tasks.

Adversarial anonymization and preservation of content
11544460 · 2023-01-03

Systems and methods for anonymizing content suggestive of a particular characteristic while preserving relevant content are disclosed. An example method may be performed by one or more processors of a protection system and include defining an anonymization loss indicative of an accuracy at which a trained discriminator model can predict a particular characteristic, defining a content loss indicative of a difference between latent representations of versions of a document, defining a combined objective function incorporating the anonymization and content losses, extracting and anonymizing suggestive content from training documents while preserving relevant content, and adversarially training, using the associated accuracies and differences in the combined objective function, a transformation model to transform a given document representative of credentials of a given person possessing the particular characteristic into an anonymized document maximizing a predicted uncertainty of the trained discriminator model while simultaneously maximizing an amount of relevant information about the person preserved.

Caption modification and augmentation systems and methods for use by hearing assisted user

A system and method for facilitating communication between an assisted user (AU) and a hearing user (HU) includes receiving an HU voice signal as the AU and HU participate in a call using AU and HU communication devices, transcribing HU voice signal segments into verbatim caption segments, processing each verbatim caption segment to identify an intended communication (IC) intended by the HU upon uttering an associated one of the HU voice signal segments, for at least a portion of the HU voice signal segments (i) using an associated IC to generate an enhanced caption different than the associated verbatim caption, (ii) for each of a first subset of the HU voice signal segments, presenting the verbatim captions via the AU communication device display for consumption, and (iii) for each of a second subset of the HU voice signal segments, presenting enhanced captions via the AU communication device display for consumption.

Systems and methods for creating enhanced documents for perfect automated parsing

The disclosed enhanced document creation and parsing systems deal with enhanced documents that allow for the presentation of document content in a preferred visual manner, while ensuring that the document content can be captured accurately by an automated parser with nothing being discarded or misrepresented. The enhanced document creation system may create an enhanced document by encoding document content in accordance with a defined schema, optionally encrypting the resulting structured data into an encrypted byte string, and embedding the encrypted byte string as non-visible metadata in a rendered document. The resulting enhanced document can be completely and accurately parsed by an enhanced document parsing system that is capable of extracting, decrypting and decoding the embedded document metadata.
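
The create-and-parse round trip described above might look like the following sketch. Everything concrete here is an assumption for illustration: the `REQUIRED_FIELDS` schema, the JSON and Base64 encoding, the HTML comment used as the non-visible carrier, and the function names. The optional encryption step is left out.

```python
import base64
import json
import re

# Hypothetical schema: the fields every document payload must carry.
REQUIRED_FIELDS = {"name", "skills"}
MARKER = "enhanced-doc"  # hypothetical tag identifying the embedded metadata


def create_enhanced_document(content: dict, visible_html: str) -> str:
    """Encode content per the schema and embed it as a non-visible HTML comment."""
    missing = REQUIRED_FIELDS - content.keys()
    if missing:
        raise ValueError(f"content is missing fields: {sorted(missing)}")
    payload = json.dumps(content, sort_keys=True).encode("utf-8")
    # Base64 keeps the payload free of characters that could end the comment.
    blob = base64.b64encode(payload).decode("ascii")
    return f"{visible_html}\n<!--{MARKER}:{blob}-->"


def parse_enhanced_document(document: str) -> dict:
    """Extract and decode the embedded metadata, ignoring the visible rendering."""
    match = re.search(rf"<!--{MARKER}:([A-Za-z0-9+/=]+)-->", document)
    if match is None:
        raise ValueError("no embedded metadata found")
    return json.loads(base64.b64decode(match.group(1)))


content = {"name": "Ada Lovelace", "skills": ["mathematics", "programming"]}
doc = create_enhanced_document(content, "<h1>Ada Lovelace</h1>")
print(parse_enhanced_document(doc) == content)
```

The parser never inspects the visible markup, so the layout can change freely without affecting what is extracted.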

ENCODING VARIABLE LENGTH CHARACTERS USING SIMULTANEOUS PROCESSING
20220405460 · 2022-12-22

Embodiments are directed to managing character encoding. A plurality of characters that are each encoded as code units based on a character code may be provided such that the code units for each character represent a code point of a character encoding scheme. An encoding model may be determined based on the character code, one or more processor features, and a target character code. The processor features may be employed to transform the code units into target code units based on the encoding model such that the target code units are based on the target character code and such that the target code units encode the code point for each character. The plurality of target characters may be provided to a target stream such that each target character may be encoded as the target code units.
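
As a concrete instance of transforming code units of one character code into code units of another, the sketch below converts UTF-8 code units to UTF-16 code units by way of code points. The choice of UTF-8 and UTF-16 is an assumption, as are the function names. This is a plain scalar loop that assumes well-formed input; it does not use the processor features or simultaneous processing the abstract refers to.

```python
def utf8_to_code_points(data: bytes) -> list[int]:
    """Decode well-formed UTF-8 code units (bytes) into code points."""
    points = []
    i = 0
    while i < len(data):
        lead = data[i]
        # The lead byte determines how many code units the character uses.
        if lead < 0x80:
            length, cp = 1, lead
        elif lead >> 5 == 0b110:
            length, cp = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:
            length, cp = 3, lead & 0x0F
        elif lead >> 3 == 0b11110:
            length, cp = 4, lead & 0x07
        else:
            raise ValueError(f"invalid lead byte at offset {i}")
        # Each continuation byte contributes its low six bits.
        for cont in data[i + 1:i + length]:
            cp = (cp << 6) | (cont & 0x3F)
        points.append(cp)
        i += length
    return points


def code_points_to_utf16(points: list[int]) -> list[int]:
    """Encode code points as 16-bit UTF-16 code units."""
    units = []
    for cp in points:
        if cp < 0x10000:
            units.append(cp)
        else:
            # Code points above the BMP become a high/low surrogate pair.
            cp -= 0x10000
            units.append(0xD800 | (cp >> 10))
            units.append(0xDC00 | (cp & 0x3FF))
    return units


text = "a\u00e9\u20ac\U0001F600"  # 1-, 2-, 3-, and 4-byte UTF-8 characters
units = code_points_to_utf16(utf8_to_code_points(text.encode("utf-8")))
print([hex(u) for u in units])
```

Each character keeps the same code point throughout; only the number and width of the code units that represent it change.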