G06F40/129

Method and device for sorting Chinese characters, searching Chinese characters and constructing dictionary
20230004707 · 2023-01-05 ·

The invention discloses a method and a device for sorting Chinese characters, searching for Chinese characters and constructing a dictionary, and relates to the technical field of computers. A specific implementation of the method includes: obtaining the first basic character-forming component of a Chinese character according to the stroke order as the First Character, and encoding the First Character to obtain the First Character code, where the First Character includes the first character-forming component and the first main stroke component of a Chinese character; obtaining the number of strokes included in each Chinese character, and obtaining the corresponding stroke string of each Chinese character; using the First Character code as the first and highest priority sorting field, the number of strokes as the second sorting field, and the stroke string as the third and the lowest priority sorting field to sort Chinese characters. This embodiment can solve the problem of difficulty in sorting and searching of Chinese characters caused by the unfixed definition and position of radicals.
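The three-level ordering described above can be sketched as a composite sort key. This is a minimal illustration, not the patent's implementation: the `CharRecord` type, the numeric First Character codes, and the digit-per-stroke encoding are all assumptions made for the example, and the records use placeholder labels rather than real characters.

```python
from typing import NamedTuple


class CharRecord(NamedTuple):
    char: str           # placeholder label standing in for a Chinese character
    first_code: int     # hypothetical First Character code
    stroke_count: int   # number of strokes in the character
    stroke_string: str  # hypothetical encoding: one digit per stroke, in stroke order


def sort_key(rec: CharRecord):
    # Highest priority first: First Character code, then stroke count,
    # then the stroke string as the final tie-breaker.
    return (rec.first_code, rec.stroke_count, rec.stroke_string)


def sort_characters(records):
    return sorted(records, key=sort_key)


# Illustrative records only; the codes and stroke strings are made up.
records = [
    CharRecord("C", 2, 4, "1234"),
    CharRecord("A", 1, 5, "12511"),
    CharRecord("B", 1, 3, "251"),
    CharRecord("D", 1, 3, "121"),
]
ordered = [r.char for r in sort_characters(records)]
```

Because Python compares tuples element by element, the stroke string only affects the order of records whose First Character code and stroke count are both equal.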

Chinese Character Input Method, System and Keyboard
20230004730 · 2023-01-05 ·

The invention discloses a Chinese character input method, system and keyboard, and relates to the technical field of computers. A specific implementation of the method includes: recognizing a received key signal; where the recognition result indicates a Chinese character Category Code and/or a phrase Category Code, determining the Chinese character and/or phrase represented by that code, where a Chinese character Category Code is a combination of component Category Codes, or of component Category Codes and stroke Category Codes, used to represent a Chinese character, and a phrase Category Code is a combination of component Category Codes used to indicate a phrase; and displaying the determined Chinese characters and/or phrases. This implementation solves the problem of messy character splitting, conforms to character theory, is easy to remember and use, and requires no special training. The input process is natural and has few rules, which reduces the learning difficulty, and it places no special requirements on equipment.

INTENT CLASSIFICATION USING NON-CORRELATED FEATURES
20220405472 · 2022-12-22 ·

A system for classifying a language sample intent by receiving a language sample including a set of features, identifying language sample features, determining a tokenization score for the language sample according to the language sample features, eliminating duplicate features according to the tokenization score, determining a term frequency (tf) according to the identified features and the tokenization score, determining an inverse document frequency (idf) according to the identified features and the tokenization score, and generating a term frequency-inverse document frequency (tf-idf) matrix for the identified features.
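As a rough illustration of the final step, the sketch below builds a tf-idf matrix from tokenized language samples. It uses the textbook definitions (relative term frequency and a logarithmic inverse document frequency) and deliberately leaves out the tokenization score and the duplicate-feature elimination, since the abstract does not say how those are computed. The function name and the sample data are assumptions.

```python
import math
from collections import Counter


def tfidf_matrix(samples):
    """Return (vocabulary, rows) where rows[i][j] is the tf-idf weight
    of vocabulary[j] in samples[i]. Each sample is a list of tokens."""
    vocab = sorted({tok for sample in samples for tok in sample})
    n_samples = len(samples)
    # Document frequency: number of samples that contain each token.
    df = Counter(tok for sample in samples for tok in set(sample))
    idf = {tok: math.log(n_samples / df[tok]) for tok in vocab}
    rows = []
    for sample in samples:
        counts = Counter(sample)
        total = len(sample)
        rows.append(
            [counts[tok] / total * idf[tok] if total else 0.0 for tok in vocab]
        )
    return vocab, rows


# Made-up intent samples for illustration.
samples = [["book", "flight"], ["cancel", "flight"]]
vocab, rows = tfidf_matrix(samples)
```

A token that occurs in every sample, such as "flight" here, gets an idf of zero and so contributes nothing to the matrix, which is the usual reason tf-idf is used to down-weight uninformative features.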

Display device, display method, and computer-readable recording medium
11514696 · 2022-11-29 · ·

A display device includes circuitry configured to search an image transformation dictionary part for a plurality of image candidates based on handwritten data, and a display configured to display the plurality of image candidates obtained by the search. At least a portion of the plurality of image candidates displayed on the display represents a different person or an object.

USER INTERFACE PRESENTATION METHOD AND APPARATUS, COMPUTER-READABLE MEDIUM AND ELECTRONIC DEVICE
20220365644 · 2022-11-17 ·

The disclosure relates to a user interface presentation method and apparatus, a computer-readable medium and an electronic device. The method comprises: performing, according to the axis of symmetry of a to-be-flipped user interface, mirror image flipping on said user interface, so as to obtain a first interface; determining a first target element in the first interface; performing mirror image flipping on the first target element in the first interface according to the axis of symmetry of the first target element to form a second target element, so as to obtain a second interface; and presenting the second interface. Thus, the flipping of a user interface is realized by means of mirror image flipping, so as to adapt to reading habits of a user.
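One way to read the two-stage flip is sketched below, under stated assumptions: the interface has a vertical axis of symmetry at half its width, each element is a horizontal span with a `mirrored` flag describing how its content is drawn, and the target elements are those whose content should read normally again (for example a logo). The `Element` type and both function names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Element:
    name: str
    x: float          # left edge, in interface coordinates
    width: float
    mirrored: bool = False  # whether the element's own content is drawn mirrored


def flip_interface(elements, ui_width):
    """First interface: mirror every element about the interface's vertical axis."""
    return [
        replace(e, x=ui_width - (e.x + e.width), mirrored=not e.mirrored)
        for e in elements
    ]


def flip_targets(elements, target_names):
    """Second interface: mirror each target about its own axis of symmetry,
    which keeps its new position but reverses its content again."""
    return [
        replace(e, mirrored=not e.mirrored) if e.name in target_names else e
        for e in elements
    ]


layout = [Element("back_button", 0, 40), Element("logo", 300, 60)]
first = flip_interface(layout, ui_width=360)
second = flip_targets(first, {"logo"})
```

In this reading the second flip leaves the logo at its mirrored position while undoing the mirroring of its content, so only layout direction changes for the target element.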

Parallel unicode tokenization in a distributed network environment

Unicode data can be protected in a distributed tokenization environment. Data to be tokenized can be accessed or received by a security server, which instantiates a number of tokenization pipelines for parallel tokenization of the data. Unicode token tables are accessed by the security server, and each tokenization pipeline uses the accessed token tables to tokenize a portion of the data. Each tokenization pipeline performs a set of encoding or tokenization operations in parallel and based at least in part on a value received from another tokenization pipeline. The outputs of the tokenization pipelines are combined, producing tokenized data, which can be provided to a remote computing system for storage or processing.
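The split, tokenize, and combine flow can be sketched with a thread pool. This is a simplified illustration only: the token table is a toy rotation over a small alphabet, which offers no real protection, and the sketch omits the cross-pipeline dependency in which one pipeline's operations depend on a value received from another. All function names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def build_token_table(alphabet, shift):
    # Toy token table: maps each character to another one in the same alphabet.
    # A real token table would be randomly generated and kept secret.
    size = len(alphabet)
    return {ch: alphabet[(i + shift) % size] for i, ch in enumerate(alphabet)}


def tokenize_chunk(chunk, table):
    # Characters outside the table pass through unchanged.
    return "".join(table.get(ch, ch) for ch in chunk)


def tokenize_parallel(data, table, pipelines=4):
    # Split the data into at most `pipelines` contiguous portions.
    size = max(1, -(-len(data) // pipelines))  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=pipelines) as pool:
        # map() yields results in input order, so the portions recombine correctly.
        parts = pool.map(lambda chunk: tokenize_chunk(chunk, table), chunks)
        return "".join(parts)


table = build_token_table("αβγδ", shift=1)
tokenized = tokenize_parallel("αβγδα", table, pipelines=2)
```

Because the table maps Unicode characters to characters from the same alphabet, the tokenized output keeps the length and script of the input.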

Information presentation device, and information presentation method
11495209 · 2022-11-08 · ·

There is provided an information presentation device that is configured to present information, to a plurality of users that differ in level, in such a manner that each of the users can easily understand the information, and an information presentation method. The information presentation device includes: an identification unit that identifies respective levels of one or more users; an obtaining unit that obtains presentation information to be presented to the users; a conversion unit that appropriately converts the obtained presentation information according to the level of each user; and a presentation unit that presents the appropriately converted presentation information to each user. The present technology can be applied to, for example, a robot, a signage device, a car navigation device, and the like.

TAXPAYER INDUSTRY CLASSIFICATION METHOD BASED ON LABEL-NOISE LEARNING
20230031738 · 2023-02-02 ·

Disclosed is a taxpayer industry classification method based on label-noise learning, which comprises the following steps: extracting text information to be mined from taxpayer industry information for text embedding, and performing feature processing on the embedded information; extracting non-text information from the taxpayer industry information for encoding; constructing a BERT-CNN deep network structure, including the number of neurons and the input and output dimensionality of each layer and the number of target categories; pre-training the constructed network through contrastive learning, nearest neighbor semantic clustering and self-labeling learning in turn; adding a noise modeling layer on top of the constructed deep network, modeling the label noise distribution through network self-trust and noisy label information, and performing model training; and taking the deep network before the noise modeling layer as the classification model for classifying taxpayer industries.
