Patent classifications
H03M7/705
COMPUTER ARCHITECTURE FOR STRING SEARCHING
An embodiment of the present invention is a prime representation data structure in a computer architecture. The prime representation data structure has a plurality of records, where each record contains a prime representation and the prime representation is a product of two or more selected prime factors. Each of the selected prime factors is associated with an n-gram of a domain representation of a domain string. The domain representation of the domain string is a domain string of ordered, contiguous domain characters. The n-gram is a subset of n of the ordered, contiguous domain characters in the domain string. The computer architecture performs string searching and includes one or more central processing units (CPUs) with one or more operating systems, one or more input/output device interfaces, one or more memories, and one or more input/output devices. The architecture further includes the prime representation data structure, one or more prime target query data structures, and a search process performed by one or more of the CPUs. The CPUs can be organized in a hierarchical structure. The prime target query data structure has one or more target prime queries. Each target prime query is the product of one or more target selected prime factors. Each target selected prime factor is associated with a target n-gram of a target domain representation of a target domain string. The search process, performed by one or more of the CPUs, determines whether one or more of the target selected prime factors is common with one of the selected prime factors. By performing this efficient testing, the computer system can determine whether one or more small strings are included in one or more large strings.
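The prime-factor test described in this abstract can be illustrated with a small Python sketch. Everything named here (`prime_stream`, `NGramPrimes`, `may_contain`, `shares_ngram`) is hypothetical, and the choice of bigrams, of assigning primes in first-seen order, and of multiplying only the distinct n-grams are assumptions made for the example, not details taken from the patent. Note that divisibility only shows that every target n-gram occurs somewhere in the record, which is a necessary condition for a substring match, not a sufficient one.

```python
from itertools import count
from math import gcd


def prime_stream():
    """Yield 2, 3, 5, ... using trial division (adequate for a small demo)."""
    found = []
    for candidate in count(2):
        if all(candidate % p for p in found):
            found.append(candidate)
            yield candidate


class NGramPrimes:
    """Assigns a distinct prime to each n-gram the first time it is seen."""

    def __init__(self, n=2):
        self.n = n
        self._primes = prime_stream()
        self._table = {}

    def prime_for(self, gram):
        if gram not in self._table:
            self._table[gram] = next(self._primes)
        return self._table[gram]

    def represent(self, text):
        """Return the product of the primes of the distinct n-grams in text."""
        grams = {text[i:i + self.n] for i in range(len(text) - self.n + 1)}
        product = 1
        for gram in sorted(grams):  # sorted only to make prime assignment repeatable
            product *= self.prime_for(gram)
        return product


def may_contain(record, query):
    """True when every prime factor of the query also divides the record."""
    return record % query == 0


def shares_ngram(record, query):
    """True when the record and the query have at least one prime factor in common."""
    return gcd(record, query) > 1


index = NGramPrimes(n=2)
record = index.represent("banana")      # bigrams: an, ba, na
print(may_contain(record, index.represent("nan")))   # both bigrams of "nan" occur in "banana"
print(shares_ngram(record, index.represent("xyz")))  # no bigram in common
```

A record that passes `may_contain` would still need a conventional substring check before being reported as a match; the arithmetic test only serves to discard most non-matching records cheaply.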
Hybrid comparison for Unicode text strings consisting primarily of ASCII characters
A method compares text strings having Unicode encoding. The method receives a first string S = s_1 s_2 ... s_n and a second string T = t_1 t_2 ... t_m, where s_1, s_2, ..., s_n and t_1, t_2, ..., t_m are Unicode characters. The method computes a first string weight for the first string S according to a weight function ƒ. When S consists of ASCII characters, ƒ(S) = S. When S includes one or more non-replaceable non-ASCII characters, the first string weight ƒ(S) is a concatenation of an ASCII weight prefix ƒ_A(S) and a Unicode weight suffix ƒ_U(S). The method also computes a second string weight for the second text string T. Equality of the strings is tested using the string weights.
Hybrid Comparison for Unicode Text Strings Consisting Primarily of ASCII Characters
A method compares text strings having Unicode encoding. The method receives a first string S = s_1 s_2 ... s_n and a second string T = t_1 t_2 ... t_m, where s_1, s_2, ..., s_n and t_1, t_2, ..., t_m are Unicode characters. The method computes a first string weight for the first string S according to a weight function ƒ. When S consists of ASCII characters, ƒ(S) = S. When S includes one or more non-replaceable non-ASCII characters, the first string weight ƒ(S) is a concatenation of an ASCII weight prefix ƒ_A(S) and a Unicode weight suffix ƒ_U(S). The method also computes a second string weight for the second text string T. Equality of the strings is tested using the string weights.
Unicode conversion with minimal downtime
Prior to performing a Unicode conversion of a productive system and during an uptime processing stage of the productive system, files in the productive system are mapped to a cluster file system. Prior to the Unicode conversion and during an uptime processing stage of the productive system, a clone system of the productive system is generated using the cluster file system. Prior to the Unicode conversion and during an uptime processing stage of the productive system, the clone system is tested. During a downtime processing stage of the productive system, the Unicode conversion is performed. The clone system is activated, including making the clone system the productive system.
Hybrid comparison for Unicode text strings consisting primarily of ASCII characters
A method compares text strings having Unicode encoding. The method receives a first string S = s_1 s_2 ... s_n and a second string T = t_1 t_2 ... t_m, where s_1, s_2, ..., s_n and t_1, t_2, ..., t_m are Unicode characters. The method computes a first string weight for the first string S according to a weight function ƒ. When S consists of ASCII characters, ƒ(S) = S. When S consists of ASCII characters and some accented ASCII characters that are replaceable by ASCII characters, ƒ(S) = g(s_1) g(s_2) ... g(s_n), where g(s_i) = s_i when s_i is an ASCII character and g(s_i) = s'_i when s_i is an accented ASCII character that is replaceable by the corresponding ASCII character s'_i. The method also computes a second string weight for the second text string T. Equality of the strings is tested using the string weights.
Reducing the domain of a subquery by retrieving constraints from the outer query
A database engine receives a human-readable database query that includes a subquery, and parses the database query to build an operator tree. The operator tree includes a subtree corresponding to the subquery. The database engine estimates the number of rows that will be accessed when the subtree is executed and estimates the fraction of those rows that will be filtered out by subsequent operations in the operator tree. In accordance with a determination that the estimated fraction exceeds a first threshold, the database engine inserts a domain constraint into the subtree that restricts the rows retrieved by execution of the subtree, thereby forming a modified operator tree. The database engine executes the modified operator tree to form a final result set corresponding to the database query and returns the final result set.
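The decision rule in this abstract can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the engine's method: the subquery is assumed to be a per-key row count, the outer query is assumed to join on that key, the estimate is taken from a sample of the subquery input, and the names `estimate_filtered_fraction`, `execute`, and the default `threshold` of 0.5 are all hypothetical.

```python
def estimate_filtered_fraction(sub_rows, outer_keys, key, sample_size=100):
    """Estimate the share of subquery rows that the outer join would discard."""
    sample = sub_rows[:sample_size]
    if not sample:
        return 0.0
    dropped = sum(1 for row in sample if row[key] not in outer_keys)
    return dropped / len(sample)


def execute(outer_rows, sub_rows, key, threshold=0.5):
    """Join outer_rows with a per-key row count computed from sub_rows.

    Returns (result rows, whether the constraint was inserted, groups built).
    """
    outer_keys = {row[key] for row in outer_rows}
    fraction = estimate_filtered_fraction(sub_rows, outer_keys, key)
    constrained = fraction > threshold  # insert the domain constraint only when it pays off

    counts = {}
    for row in sub_rows:
        # Domain constraint: skip rows whose key can never survive the outer join.
        if constrained and row[key] not in outer_keys:
            continue
        counts[row[key]] = counts.get(row[key], 0) + 1

    result = [dict(row, n=counts[row[key]]) for row in outer_rows if row[key] in counts]
    return result, constrained, len(counts)


outer = [{"id": 1}, {"id": 2}]
sub = [{"id": i % 10} for i in range(100)]
rows, constrained, groups = execute(outer, sub, "id")
print(rows, constrained, groups)
```

In this example eight of the ten key values never reach the final result, so the constrained plan builds two groups instead of ten while returning the same rows.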
Hybrid Comparison for Unicode Text Strings Consisting Primarily of ASCII Characters
A method compares text strings having Unicode encoding. The method receives a first string S = s_1 s_2 ... s_n and a second string T = t_1 t_2 ... t_m, where s_1, s_2, ..., s_n and t_1, t_2, ..., t_m are Unicode characters. The method computes a first string weight for the first string S according to a weight function ƒ. When S consists of ASCII characters, ƒ(S) = S. When S consists of ASCII characters and some accented ASCII characters that are replaceable by ASCII characters, ƒ(S) = g(s_1) g(s_2) ... g(s_n), where g(s_i) = s_i when s_i is an ASCII character and g(s_i) = s'_i when s_i is an accented ASCII character that is replaceable by the corresponding ASCII character s'_i. The method also computes a second string weight for the second text string T. Equality of the strings is tested using the string weights.
Hybrid comparison for Unicode text strings consisting primarily of ASCII characters
A method compares text strings having Unicode encoding. The method receives a first string S = s_1 s_2 ... s_n and a second string T = t_1 t_2 ... t_m, where s_1, s_2, ..., s_n and t_1, t_2, ..., t_m are Unicode characters. The method computes a first string weight for the first string S according to a weight function ƒ. When S consists of ASCII characters, ƒ(S) = S. When S consists of ASCII characters and some accented ASCII characters that are replaceable by ASCII characters, ƒ(S) = g(s_1) g(s_2) ... g(s_n), where g(s_i) = s_i when s_i is an ASCII character and g(s_i) = s'_i when s_i is an accented ASCII character that is replaceable by the corresponding ASCII character s'_i. When S includes one or more non-replaceable non-ASCII characters, the first string weight concatenates an ASCII weight prefix ƒ_A(S) and a Unicode weight suffix ƒ_U(S). The method also computes a second string weight for the second text string T. Equality of the strings is tested using the string weights.
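The three cases of the weight function can be sketched in Python as follows. The abstract does not say which characters are replaceable, where the prefix ends, or how the Unicode suffix is weighted, so the `REPLACEABLE` table, the split at the first non-replaceable character, and the use of NFC normalization for the suffix are all assumptions made for illustration. The names `g`, `weight`, and `equal_strings` are likewise hypothetical.

```python
import unicodedata

# Hypothetical replacement table: accented character -> corresponding ASCII character.
REPLACEABLE = {
    "\u00e9": "e",  # e with acute accent
    "\u00e8": "e",  # e with grave accent
    "\u00fc": "u",  # u with diaeresis
    "\u00f1": "n",  # n with tilde
}


def g(ch):
    """Map an ASCII character to itself and a replaceable character to its ASCII form."""
    return ch if ch.isascii() else REPLACEABLE[ch]


def is_simple(ch):
    return ch.isascii() or ch in REPLACEABLE


def weight(s):
    """Compute a string weight following the three cases in the abstract."""
    if s.isascii():
        return s  # case 1: the weight is the string itself
    if all(is_simple(ch) for ch in s):
        return "".join(g(ch) for ch in s)  # case 2: per-character replacement
    # Case 3: ASCII weight prefix followed by a Unicode weight suffix.
    # Assumption: split at the first non-replaceable character and use
    # NFC normalization as a stand-in for a real collation weight.
    cut = next(i for i, ch in enumerate(s) if not is_simple(ch))
    prefix = "".join(g(ch) for ch in s[:cut])
    suffix = unicodedata.normalize("NFC", s[cut:])
    return prefix + suffix


def equal_strings(s, t):
    """Test equality of two strings by comparing their weights."""
    return weight(s) == weight(t)


print(equal_strings("caf\u00e9", "cafe"))
print(equal_strings("ab\u65e5\u672c", "ab"))
```

The point of the hybrid design is that strings made up only of ASCII characters, which the title says are the common case, take the first branch and need no Unicode processing at all.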
Hybrid Comparison for Unicode Text Strings Consisting Primarily of ASCII Characters
A method compares text strings having Unicode encoding. The method receives a first string S = s_1 s_2 ... s_n and a second string T = t_1 t_2 ... t_m, where s_1, s_2, ..., s_n and t_1, t_2, ..., t_m are Unicode characters. The method computes a first string weight for the first string S according to a weight function ƒ. When S consists of ASCII characters, ƒ(S) = S. When S consists of ASCII characters and some accented ASCII characters that are replaceable by ASCII characters, ƒ(S) = g(s_1) g(s_2) ... g(s_n), where g(s_i) = s_i when s_i is an ASCII character and g(s_i) = s'_i when s_i is an accented ASCII character that is replaceable by the corresponding ASCII character s'_i. When S includes one or more non-replaceable non-ASCII characters, the first string weight concatenates an ASCII weight prefix ƒ_A(S) and a Unicode weight suffix ƒ_U(S). The method also computes a second string weight for the second text string T. Equality of the strings is tested using the string weights.
Space compression for file size reduction
A computer-implemented method according to one embodiment includes receiving a text document for storage within a storage device. The text document includes a plurality of words which are separated by spaces. Further, each word includes a last letter. The computer-implemented method also includes replacing the last letter of each word in the text document with a replacement symbol and removing the space after each word so as to reduce the file size of the text document to create a reduced file size text document. The computer-implemented method further includes storing the reduced file size text document within the storage device.
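A minimal Python sketch of the idea follows. The abstract does not say what the replacement symbols are, so this example assumes lowercase ASCII words separated by single spaces and uses the uppercase form of the last letter as its replacement symbol. It also leaves the final word unchanged because no space follows it, which is a simplification chosen here to keep the round trip exact. The function names `compress` and `decompress` are hypothetical.

```python
import re


def compress(text):
    """Replace each 'letter + space' pair with the uppercase letter, dropping the space."""
    return re.sub(r"([a-z]) ", lambda m: m.group(1).upper(), text)


def decompress(data):
    """Expand each uppercase letter back to the lowercase letter plus a space."""
    return re.sub(r"[A-Z]", lambda m: m.group(0).lower() + " ", data)


original = "the cat sat"
packed = compress(original)
print(packed, len(original), len(packed))
print(decompress(packed) == original)
```

Each removed space saves one character, so a document with k spaces shrinks by k characters under these assumptions. Text that already contains uppercase letters would need a different set of replacement symbols.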