G06V30/196

REGULAR EXPRESSION GENERATION USING LONGEST COMMON SUBSEQUENCE ALGORITHM ON COMBINATIONS OF REGULAR EXPRESSION CODES

Disclosed herein are techniques related to automated generation of regular expressions. In some embodiments, a regular expression generator may receive input data comprising one or more character sequences. The regular expression generator may convert the character sequences into sets of regular expression codes and/or span data structures. The regular expression generator may identify a longest common subsequence shared by the sets of regular expression codes and/or spans, and may generate a regular expression based upon the longest common subsequence.
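The pipeline described in this abstract (character sequences to codes, longest common subsequence, regular expression) can be sketched as follows. This is a minimal illustration for two input strings, not the patented method: the code alphabet (`N`, `L`, `S`), the fragment table, the function names, and the use of a lazy wildcard to bridge gaps are all assumptions made for the example, and only ASCII letters and digits are classified.

```python
import re
import string

# Hypothetical code alphabet: one regex fragment per character class.
FRAGMENTS = {"N": r"\d+", "L": r"[A-Za-z]+", "S": r"\s+"}


def to_codes(text):
    """Convert a string into a list of codes, collapsing runs of a class."""
    codes = []
    for ch in text:
        if ch in string.digits:
            code = "N"
        elif ch in string.ascii_letters:
            code = "L"
        elif ch in string.whitespace:
            code = "S"
        else:
            code = ch  # any other character is kept as a literal code
        # Collapse repeated class codes only; literals stay one per character.
        if code in FRAGMENTS and codes and codes[-1] == code:
            continue
        codes.append(code)
    return codes


def lcs_with_gaps(a, b):
    """Return the LCS of a and b as (code, gap_before) pairs, plus a
    flag telling whether anything was skipped after the last match."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    common, gap = [], False
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            common.append((a[i], gap))
            gap = False
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
            gap = True
        else:
            j += 1
            gap = True
    trailing_gap = gap or i < m or j < n
    return common, trailing_gap


def build_regex(first, second):
    """Build a pattern string from the codes shared by both inputs."""
    common, trailing_gap = lcs_with_gaps(to_codes(first), to_codes(second))
    parts = []
    for code, gap_before in common:
        if gap_before:
            parts.append(".*?")  # bridge codes present in only one input
        parts.append(FRAGMENTS.get(code, re.escape(code)))
    if trailing_gap:
        parts.append(".*")
    return "".join(parts)


pattern = build_regex("INV-2023-001", "PO-77-A12")
print(pattern)
print(bool(re.fullmatch(pattern, "INV-2023-001")))
print(bool(re.fullmatch(pattern, "PO-77-A12")))
```

The wildcard fragments do not match newlines unless the pattern is compiled with `re.DOTALL`. Extending the sketch to more than two inputs would require a multi-sequence LCS or a pairwise fold, neither of which is shown here.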

Method for testing medical data

A method for testing medical data is provided. Each medical datum includes a plurality of information units and a plurality of separators, and the method includes the following steps: a. matching the medical data against a standard library including a plurality of patterns, a matching expression being: [\s\S][number/sequence/relation]&[\b|\B] (S101); and b. determining, based on a matching result of the step a, whether the medical datum is qualified (S102). A standardized standard library is first established, a matching result is obtained by matching the medical datum and the standard library for a non-initial boundary, an initial boundary, an information quantity, information sequences, a semantic relationship quantity, a character boundary, and a non-character boundary, and whether the medical datum meets a requirement is further determined according to the matching result.

Article reading device

An article reading device according to an embodiment includes a display device and an image capturing device that generates an image of an article. A processor extracts, from the image, first feature data for recognizing the article and second feature data for determining whether to recognize the article based on the first feature data. The processor determines whether to recognize the article. If it is determined to recognize the article, the processor recognizes the article based on the extracted first feature data, and controls the display device to display a recognition result. If it is determined not to recognize the article, the processor extracts a barcode from the image, identifies the article based on the extracted barcode, and controls the display device to display an identification result. The processor performs a transaction settlement with respect to the recognition result, if any, and the identification result, if any.

Machine evaluation of contract terms

The present disclosure provides for a method of machine representation and tracking of contract terms over the lifetime of a contract including a step of defining an object model having object model components. Object model components are associated with other object model components where the object model components have object model component types. Further, words of object model components are evaluated to identify whether the words contain one or more core attributes pertaining to details of the contract terms. From the object model components, and the terms they contain, prevailing terms of the contract are evaluated, stored and updated as changes are made to the object model components.

Techniques for sentiment analysis of data using a convolutional neural network and a co-occurrence network

Techniques are provided for performing sentiment analysis on words in a first data set. An example embodiment includes generating a word embedding model including a first plurality of features. A value indicating sentiment for the words in the first data set can be determined using a convolutional neural network (CNN). A second plurality of features are generated based on bigrams identified in the data set. The bigrams can be generated using a co-occurrence graph. The model is updated to include the second plurality of features, and sentiment analysis can be performed on a second data set using the updated model.

Systems and methods for using image analysis to automatically determine vehicle information

The present disclosure is directed to systems and methods for analyzing digital images to determine alphanumeric strings depicted in the digital images. An electronic device may generate a set of filtered images using a received digital image. The electronic device may also perform an optical character recognition (OCR) technique on the set of filtered images, and may filter out any of the set of filtered images according to a set of rules. The electronic device may further identify a set of common elements representative of the alphanumeric string depicted in the digital image, and determine a machine-encoded alphanumeric string based on the set of common elements.
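One way to picture the last two steps (rule-based filtering of OCR outputs, then deriving a string from their common elements) is a per-position majority vote. The sketch below is only an illustration under stated assumptions: the abstract does not say which rules or which combination scheme are used, so the length and alphanumeric checks, the voting scheme, and the function names are hypothetical. The OCR results are represented as plain strings, one per filtered image.

```python
from collections import Counter


def filter_candidates(candidates, expected_length):
    """Drop OCR outputs that break the (assumed) rules: wrong length
    or characters that are not alphanumeric."""
    return [c for c in candidates if len(c) == expected_length and c.isalnum()]


def consensus_string(candidates):
    """Pick the most frequent character at each position across candidates."""
    if not candidates:
        return ""
    return "".join(
        Counter(chars).most_common(1)[0][0] for chars in zip(*candidates)
    )


# Hypothetical OCR outputs from five filtered versions of one image.
ocr_outputs = ["ABC123", "A8C123", "ABC1Z3", "AB-123", "ABC12"]
kept = filter_candidates(ocr_outputs, expected_length=6)
print(kept)
print(consensus_string(kept))
```

Ties at a position are resolved in favor of the character seen first, which is an arbitrary choice in this sketch.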

Systems and methods to identify breaking application program interface changes

Systems and methods for managing Application Programming Interfaces (APIs) are disclosed. For example, the system may include one or more memory units storing instructions and one or more processors configured to execute the instructions to perform operations. The operations may include sending a first call to a first node-testing model associated with a first API and receiving a first model output comprising a first model result and a first model-result category. The operations may include identifying a second node-testing model associated with a second API and sending a second call to the second node-testing model. The operations may include receiving a second model output comprising a second model result and a second model-result category. The operations may include performing at least one of sending a notification, generating an updated first node-testing model, generating an updated second node-testing model, generating an updated first call, or generating an updated second call.

Systems and methods for censoring text inline

Systems and methods for censoring text-based data are provided. In some embodiments, a censoring system may include at least one processor and at least one non-transitory memory storing application programming interface instructions. The censoring system may be configured to perform operations comprising storing a target pattern type and a computer-based model for identifying a target data pattern corresponding to the target pattern type within text-based data. The censoring system may also be configured to receive text-based data by a server, and to retrieve the stored target pattern type to be censored in the text-based data. The censoring system may be configured to identify, within the received text-based data, a target data pattern corresponding to the retrieved target pattern type. The censoring system may be configured to censor target characters within the identified target data pattern, and transmit the censored text-based data to a receiving party.
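The store, identify, and censor steps can be sketched with regular expressions standing in for the "computer-based model". That substitution is an assumption of this example, as are the pattern names, the two patterns themselves, and the policy of masking every digit except the last four; the abstract does not specify any of these.

```python
import re

# Hypothetical stored pattern types, each paired with a regex "model".
TARGET_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}


def censor(text, pattern_type, keep_last=4, mask="*"):
    """Mask the digits of every match except the last `keep_last` digits."""
    pattern = TARGET_PATTERNS[pattern_type]

    def mask_match(match):
        found = match.group(0)
        digit_count = sum(ch.isdigit() for ch in found)
        remaining = max(digit_count - keep_last, 0)
        out = []
        for ch in found:
            if ch.isdigit() and remaining > 0:
                out.append(mask)  # censor this target character
                remaining -= 1
            else:
                out.append(ch)  # keep separators and trailing digits
        return "".join(out)

    return pattern.sub(mask_match, text)


print(censor("SSN 123-45-6789 on file", "ssn"))
print(censor("Card 4111 1111 1111 1111 charged", "card"))
```

Text that contains no match for the requested pattern type is returned unchanged. Transmission to a receiving party is outside the scope of this sketch.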

System, method, and computer-accessible medium for evaluating multi-dimensional synthetic data using integrated variants analysis

An exemplary system, method, and computer-accessible medium can include, for example, receiving an original dataset(s), receiving a synthetic dataset(s), training a model(s) using the original dataset(s) and the synthetic dataset(s), and evaluating the synthetic dataset(s) based on the training of the model(s). The model(s) can include a first model and a second model, and the first model can be trained using the original dataset(s) and the second model can be trained using the synthetic dataset(s). The synthetic dataset(s) can be evaluated by comparing first results from the training of the first model to second results from the training of the second model.

Apparatus and method for recognizing image-based content presented in a structured layout

A method for extracting information from a table includes steps as follows. Characters of a table are extracted. The characters are merged into n-gram characters. The n-gram characters are merged into words and text lines through a two-stage GNN mode. The two-stage GNN mode comprises the following sub-steps: spatial features, semantic features, and CNN image features are extracted from a target source; a first GNN stage is processed to output graph embedding spatial features from the spatial features; and a second GNN stage is processed to output graph embedding semantic features and graph embedding CNN image features from the semantic features and the CNN image features, respectively. The text lines are merged into cells. The cells are grouped into rows, columns, and key-value pairs based on one or more adjacency matrices, a row relationship among the cells, a column relationship among the cells, and a key-value relationship among the cells.
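The final grouping step can be illustrated by treating an adjacency matrix as a graph and collecting its connected components. This is a hedged sketch: the abstract does not state how the adjacency matrices are used, so reading "same row" or "same column" as a connected-component relation, the function name, and the example matrices are assumptions.

```python
def group_by_adjacency(adjacency):
    """Group cell indices into connected components of a 0/1 adjacency
    matrix, where adjacency[i][j] == 1 means cells i and j are related."""
    n = len(adjacency)
    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, group = [start], []
        while stack:
            node = stack.pop()
            group.append(node)
            for other in range(n):
                linked = adjacency[node][other] or adjacency[other][node]
                if linked and not seen[other]:
                    seen[other] = True
                    stack.append(other)
        groups.append(sorted(group))
    return groups


# Hypothetical 2x2 table with cells indexed 0..3 in reading order.
row_adjacency = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
column_adjacency = [
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
print(group_by_adjacency(row_adjacency))
print(group_by_adjacency(column_adjacency))
```

The same function could be applied to a key-value adjacency matrix, in which case each component would pair a key cell with its value cells.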