SYNTHETIC DATA STORAGE SYSTEM BASED ON ATTRIBUTES OF ARECACEAE
20220410113 · 2022-12-29
Inventors
Cpc classification
B01J19/0046
PERFORMING OPERATIONS; TRANSPORTING
G16B35/00
PHYSICS
C09K5/20
CHEMISTRY; METALLURGY
International classification
B01J19/00
PERFORMING OPERATIONS; TRANSPORTING
C09K5/20
CHEMISTRY; METALLURGY
Abstract
A data storage medium includes a substrate. The data storage medium also includes an antifreeze layer coated on at least one surface of the substrate. The data storage medium further includes multiple storage containers located on the substrate. The multiple storage containers store different combinations of plant-based molecules representing data.
Claims
1. A data storage system comprising: a freeze-tolerant data storage medium, the freeze-tolerant data storage medium having a grid comprising a plurality of grid elements, with each grid element of the plurality of grid elements comprising a storage container; a molecule generator configured to generate a plurality of different plant-based molecules representing data; and a molecule depositor, coupled to the molecule generator, configured to deposit different combinations of the plurality of different plant-based molecules into different storage containers of the plurality of storage containers.
2. The data storage system of claim 1 and further comprising a reader configured to identify the different combinations of the plurality of different plant-based molecules deposited into the different storage containers of the plurality of storage containers.
3. The data storage system of claim 2 and wherein the reader is configured to identify the different combinations of the plurality of different plant-based molecules deposited into the different storage containers of the plurality of storage containers by ultraviolet spectroscopy, by molecular weight, or by a combination thereof.
4. The data storage system of claim 1 and wherein the freeze-tolerant data storage medium comprises a glass plate, with at least one surface of the glass plate being coated with an antifreeze layer.
5. The data storage system of claim 4 and wherein the antifreeze layer comprises an antifreeze protein.
6. The data storage system of claim 1 and further comprising a molecule library that comprises a mapping of the plurality of different plant-based molecules to corresponding alphanumeric characters represented in binary.
7. The data storage system of claim 6 and further comprising a controller configured to receive binary data, and configured to determine plant-based molecules of the plurality of different plant-based molecules corresponding to the received binary data from the molecule library.
8. The data storage system of claim 7 and wherein the controller is further configured to communicate the determined plant-based molecules to the molecule generator for generation by a chemical reaction.
9. A method comprising: providing a substrate; coating at least one surface of the substrate with an antifreeze layer; and providing a plurality of storage containers on the substrate, the plurality of storage containers configured to store different combinations of a plurality of different plant-based molecules representing data.
10. The method of claim 9 and wherein the substrate comprises a glass plate.
11. The method of claim 9 and wherein the antifreeze layer comprises an antifreeze protein.
12. The method of claim 9 and further comprising arranging the plurality of storage containers within a grid on the substrate, with each grid element comprising a different storage container of the plurality of storage containers.
13. A data storage medium comprising: a substrate; an antifreeze layer coated on at least one surface of the substrate; and a plurality of storage containers on the substrate, the plurality of storage containers configured to store different combinations of a plurality of different plant-based molecules representing data.
14. The data storage medium of claim 13 and wherein the substrate comprises a glass plate.
15. The data storage medium of claim 13 and wherein the antifreeze layer comprises an antifreeze protein.
16. The data storage medium of claim 13 and wherein the plurality of storage containers is arranged within a grid on the substrate, with each grid element comprising a different storage container of the plurality of storage containers.
17. A data storage system comprising the data storage medium of claim 13, the data storage system further comprising: a molecule generator configured to generate the plurality of different plant-based molecules representing data; and a molecule depositor, coupled to the molecule generator, configured to deposit different combinations of the plurality of different plant-based molecules into different storage containers of the plurality of storage containers.
18. The data storage system of claim 17 and further comprising a reader configured to identify the different combinations of the plurality of different plant-based molecules deposited into the different storage containers of the plurality of storage containers.
19. The data storage system of claim 18 and wherein the reader is configured to identify the different combinations of the plurality of different plant-based molecules deposited into the different storage containers of the plurality of storage containers by ultraviolet spectroscopy, by molecular weight, or by a combination thereof.
20. The data storage system of claim 17 and further comprising a molecule library that comprises a mapping of the plurality of different plant-based molecules to corresponding alphanumeric characters represented in binary.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0006]
[0007]
[0008]
[0009]
[0010]
[0011]
[0012]
[0013]
DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
[0014] Embodiments of the disclosure provide synthetic data storage systems that are based in part on physical and chemical properties of Arecaceae (a family of perennial flowering plants).
[0015] In some embodiments, molecules that are similar to molecules found in Arecaceae are generated and utilized to represent data, and media having antifreeze properties of Arecaceae are developed and utilized to store the generated molecules that represent the data. Prior to providing additional detail regarding different embodiments of the disclosure, an example general embodiment is described below in connection with
[0016] It should be noted that like reference numerals are used in different figures for same or similar elements. It should also be understood that the terminology used herein is for the purpose of describing embodiments, and the terminology is not intended to be limiting. Unless indicated otherwise, ordinal numbers (e.g., first, second, third, etc.) are used to distinguish or identify different elements or steps in a group of elements or steps, and do not supply a serial or numerical limitation on the elements or steps of the embodiments thereof. For example, “first,” “second,” and “third” elements or steps need not necessarily appear in that order, and the embodiments thereof need not necessarily be limited to three elements or steps. It should also be understood that, unless indicated otherwise, any labels such as “left,” “right,” “front,” “back,” “top,” “bottom,” “forward,” “reverse,” “clockwise,” “counter clockwise,” “up,” “down,” or other similar terms such as “upper,” “lower,” “aft,” “fore,” “vertical,” “horizontal,” “proximal,” “distal,” “intermediate” and the like are used for convenience and are not intended to imply, for example, any particular fixed location, orientation, or direction. Instead, such labels are used to reflect, for example, relative location, orientation, or directions. It should also be understood that the singular forms of “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
[0017] It will be understood that, when an element is referred to as being “connected,” “coupled,” or “attached” to another element, it can be directly connected, coupled or attached to the other element, or it can be indirectly connected, coupled, or attached to the other element where intervening or intermediate elements may be present. In contrast, if an element is referred to as being “directly connected,” “directly coupled” or “directly attached” to another element, there are no intervening elements present. Drawings illustrating direct connections, couplings or attachments between elements also include embodiments, in which the elements are indirectly connected, coupled or attached to each other.
[0018]
[0019] In data storage system 100, molecule generator 104 and molecule depositor 106 together act as a writer to store information as molecule combinations on data storage medium 102. Reader 108 identifies the molecule combinations on the data storage medium 102, and thereby retrieves the stored information from the data storage medium 102. Controller 110 may include one or more processors and one or more memories that store program code having instructions that are executable by the processor(s). Controller 110, which is coupled to molecule generator 104, molecule depositor 106 and reader 108, may control operations of those components.
[0020] In embodiments of the disclosure, the molecules generated by molecule generator 104 and deposited on data storage medium 102 by molecule depositor 106 are molecules that are formed using amino acids found in Arecaceae, and the data storage medium 102 has antifreeze or freeze-tolerant properties of (or similar to) Arecaceae. Details regarding antifreeze properties of Arecaceae are provided further below.
[0021] In data storage system 100, freeze-tolerant data storage medium 102 has a grid 111 including multiple grid elements 112. In some embodiments, each of the grid elements 112 includes a storage container 114 that is configured to store molecules deposited by depositor 106. In one embodiment, each storage container 114 may include an adhesive that holds the molecules in place when deposited into the container 114. In other embodiments, no containers 114 are used, and the molecules may be directly deposited on the grid elements 112. In such embodiments, the adhesive that is configured to hold the molecules in place when deposited may be applied directly on the grid elements 112. Details regarding how the molecules are generated are provided further below.
[0022] Synthetic data storage system 100 may include a library of characters 115 that maps characteristics of different molecules and/or molecule combinations to different combinations of bits. Thus, when binary data arrives at the synthetic data storage system 100 from a host 116 for storage, molecule types corresponding to the binary data are first identified from the library 115 by controller 110. Then, based on the molecule types identified by controller 110, molecule generator 104 generates the molecules by one or more chemical reactions and provides them to molecule depositor 106. Molecule depositor 106 deposits the molecules in storage containers 114. To carry out a read operation for the stored data (for example, in response to a read command received from host 116), controller 110 directs reader 108 to the container(s) 114 that include molecules that represent the requested information. The reader 108 identifies the molecules, and returns the identified information to the controller 110. The controller 110 converts the identified information into binary data, and returns the data to the host 116.
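The write and read flow described above can be sketched in software as follows. This is a minimal illustration only: the library contents, the molecule labels such as "mol-084", and the function names are hypothetical placeholders, and the chemical generation and identification steps are reduced to dictionary lookups.

```python
from typing import Dict, List, Optional


def build_library() -> Dict[str, str]:
    """Hypothetical library 115: each 8-bit pattern maps to a placeholder molecule label."""
    return {format(b, "08b"): f"mol-{b:03d}" for b in range(256)}


def write(data: bytes, library: Dict[str, str],
          containers: List[Optional[str]]) -> List[int]:
    """Controller, generator and depositor: look up each byte and deposit its molecule."""
    used = []
    for byte in data:
        molecule = library[format(byte, "08b")]  # controller identifies the molecule type
        slot = containers.index(None)            # next empty container (ValueError if full)
        containers[slot] = molecule              # depositor places the generated molecule
        used.append(slot)
    return used


def read(slots: List[int], library: Dict[str, str],
         containers: List[Optional[str]]) -> bytes:
    """Reader and controller: identify each molecule and convert it back to binary data."""
    reverse = {molecule: bits for bits, molecule in library.items()}
    return bytes(int(reverse[containers[s]], 2) for s in slots)


library = build_library()
containers: List[Optional[str]] = [None] * 16  # a small medium with 16 containers
slots = write(b"The", library, containers)
restored = read(slots, library, containers)
print(slots, restored)
```

The returned slot numbers play the role of the container locations to which the controller later directs the reader.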
[0023] In the embodiment shown in
[0024] In one embodiment, the data storage medium 102 includes a glass plate, with one or more surfaces of the glass plate coated with an antifreeze layer (e.g., an antifreeze protein (AFP)) whose chemical properties are similar to those of Arecaceae. Physical and chemical properties of Arecaceae are listed below in Tables 1 and 2, respectively.
TABLE 1 Physical properties
Fiber              Density (g/cm^3)   Elongation (%)   Tensile strength (MPa)   Young's modulus (GPa)
Palm leaf stalk    1-1.2              2-4.50           97-196                   2.50-5.40
Palm leaf sheath   1.20-1.30          2.84             220                      4.8
Palm petiole       0.7-1.55           25               248                      3.24
Palm fruit         1.09               28               423                      6-8
Coir               1.15-1.2           30               175                      4-6
Pineapple leaf     0.80-1.60          14.5             144                      400-627
[0025] Young's modulus is a mechanical property that measures stiffness of a solid material. It defines a relationship between stress (force per unit area) and strain (proportional deformation) in a material in a linear elasticity regime of a uniaxial deformation. The international system (SI) unit of Young's modulus is pascal (GPa in Table 1 is gigapascal).
[0026] Tensile strength is a resistance of a material to breaking under tension. In other words, the tensile strength of a material is the maximum amount of tensile stress that it can take before failure, for example breaking. The SI unit of tensile strength is pascal (MPa in Table 1 is megapascal).
[0027] Elongation is a state, act, or process of lengthening.
[0028] Density of a substance is its mass per unit volume (gram(s)/cubic centimeter (cm^3) in Table 1).
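The definitions in paragraphs [0025] and [0026] can be written compactly in standard notation. The symbols F (applied force), A (cross-sectional area), L_0 (original length) and \Delta L (change in length) are introduced here for illustration only and do not appear in the disclosure.

```latex
E = \frac{\sigma}{\varepsilon} = \frac{F/A}{\Delta L / L_0},
\qquad
\sigma_{\max} = \frac{F_{\max}}{A}
```

Here E is Young's modulus, valid in the linear elastic regime, and \sigma_{\max} is the tensile strength, with F_{\max} the largest tensile force carried before failure.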
TABLE 2 Chemical properties
Fiber              Cellulose (%)   Hemi-cellulose (%)   Lignin (%)   Wax (%)
Palm leaf stalk    40-52           42-43                -            -
Palm leaf sheath   28              25                   45           -
Palm petiole       30              14                   28           -
Palm fruit         53              12                   21           0.8
Coir               32-43           0.15-0.25            40-45        -
Pineapple leaf     70-83           -                    5-12.7       -
[0029] Cellulose content, which is responsible for long fiber chains, ranges from 28% to 53% for palm fibers.
[0030] Hemi-cellulose leads to disintegration of cellulose microfibrils, which decreases the fiber strength. For palm, hemi-cellulose ranges from 12% to 43%. Microfibril is a general term used to describe the structure of protein fiber, e.g., hair.
[0031] Lignins help in the formation of cell walls, especially in wood and bark, because they lend rigidity and do not rot easily.
[0032] Plants secrete waxes into and on the surface of their cuticles as a way to control evaporation, wettability and hydration.
[0033] Plants such as Arecaceae have their own deoxyribonucleic acid (DNA), and properties such as density and durability are some of the premises that were considered for the disclosure. Regarding density, research papers indicate that one gram of DNA can potentially store 215 petabytes of data. An average hard disk drive in a laptop can house just one millionth of that amount. Thus, as indicated above, embodiments of the disclosure encode data at a molecule level (using synthesized plant-based molecules) and store the molecules representing the large amounts of data in a medium that extends the data shelf life. Regarding durability, the natural physical and chemical properties of palm leaves, if preserved under the right conditions, can extend the shelf life of the medium to 600 years or more.
[0034] Palm leaves include AFPs, also known as thermal hysteresis proteins, which are compound proteins (e.g., a protein complex combining amino acids with other substances, usually sugar) and which have the following properties:
[0035] A high concentration of the amino acid threonine.
[0036] A naturally formed AFP is not hazardous or toxic, unlike the chemical antifreeze used in cars (ethylene glycol).
[0037] A high cellulose content and a relatively low lignin content in palm fibers ensure substantial mechanical strength.
[0038] Tensile values of palm fibers are significantly higher than those of any other natural fibers, and palm fibers can be used for reinforcement in a polymer matrix.
[0039] Palm fruit fiber reinforced polyester composite has an appreciable tensile strength (e.g., 100-200 MPa). The tensile strength is based on the size of a medium (e.g., 50*50 by 2 mm thickness or 20*60 by 2 mm thickness). In Table 1 above (physical properties table), tensile strength varies for different fibers (e.g., variations in tensile strength from palm leaf stalk to coir).
[0040] Antifreeze prevents growth of crystals when a data storage medium (e.g., 102 of
[0041] Unlike polymers of nucleotides, where an alphabet is formed from the 4 nucleotide bases A, T, G and C, polymers of amino acids (known as proteins) can draw on 22 different amino acids. Each synthesized small plant-based molecule is created using amino acids. Amino acids utilized to generate the plant-based molecules in some embodiments are listed below in Table 3.
TABLE 3
Amino Acid Name   Three Letter Code   One Letter Code   Molecular weight in Dalton (Da)
alanine           ala                 A                 89
arginine          arg                 R                 174
asparagine        asn                 N                 132
aspartic acid     asp                 D                 133
cysteine          cys                 C                 121
glutamine         gln                 Q                 146
glutamic acid     glu                 E                 147
glycine           gly                 G                 75
histidine         his                 H                 155
isoleucine        ile                 I                 131
leucine           leu                 L                 131
lysine            lys                 K                 146
methionine        met                 M                 149
phenylalanine     phe                 F                 165
proline           pro                 P                 115
serine            ser                 S                 105
threonine         thr                 T                 119
tryptophan        trp                 W                 204
tyrosine          tyr                 Y                 181
valine            val                 V                 117
Dalton (Da) is an alternate name for an atomic mass unit, and a kilodalton (kDa) is 1,000 daltons. Thus, a protein with a mass of 64 kDa has a molecular weight of 64,000 grams per mole.
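The unit conversion above, and the reason a reader might combine molecular weight with another measurement, can be illustrated with a short sketch. It covers only the individual amino acids of Table 3, not the synthesized molecules, whose weights are not given here. The function names are hypothetical, and glycine is entered at the commonly tabulated value of 75 Da.

```python
from collections import defaultdict
from typing import Dict, List

# Molecular weights in daltons keyed by one-letter code, following Table 3.
# Assumption: glycine is taken as 75 Da, the commonly tabulated value.
WEIGHTS_DA: Dict[str, int] = {
    "A": 89, "R": 174, "N": 132, "D": 133, "C": 121, "Q": 146, "E": 147,
    "G": 75, "H": 155, "I": 131, "L": 131, "K": 146, "M": 149, "F": 165,
    "P": 115, "S": 105, "T": 119, "W": 204, "Y": 181, "V": 117,
}


def kda_to_g_per_mol(kda: float) -> float:
    """1 Da corresponds to 1 g/mol, so 1 kDa corresponds to 1,000 g/mol."""
    return kda * 1000.0


def ambiguous_by_weight(weights: Dict[str, int]) -> Dict[int, List[str]]:
    """Group one-letter codes that share the same integer molecular weight."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for code, weight in weights.items():
        groups[weight].append(code)
    return {w: codes for w, codes in groups.items() if len(codes) > 1}


print(kda_to_g_per_mol(64))             # 64000.0
print(ambiguous_by_weight(WEIGHTS_DA))  # codes that weight alone cannot tell apart
```

Amino acids that share an integer weight cannot be separated by weight at this resolution, which is one way to read the option of identifying molecules by a combination of ultraviolet spectroscopy and molecular weight.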
[0042] Each amino acid has a spectral color, and when a molecule is created with amino acid combinations, that molecule is associated with a color. In some embodiments, each molecule is created using a multi-component reaction (MCR) process, and each molecule represents a letter of an alphabet. Accordingly, a word library may be created from molecules.
[0043] Some reasons for choosing an MCR process for generating molecules are as follows:
1) MCRs are convergent reactions in which three or more components react to form a product.
2) MCRs involve a single operational step, and may be carried out in a single pot/container.
3) An MCR reduces waste generation due to its convergent nature and its single operational step.
4) An MCR does not generate a by-product, and is considered to be an eco-friendly reaction system.
5) MCRs are substantially atomically balanced (e.g., a number of atoms of any given element does not change in any reaction, which maximizes the incorporation of all materials used in the process).
6) Due to weak negative reactions, MCRs are usually safe processes.
7) MCRs save time, and are energy efficient.
[0044]
Accordingly, as indicated above, to improve yield and efficiency, chemical properties of different MCR reactions/categories are taken into consideration before building a final product.
[0050]
[0051] MCR methodology is considered a sought-after tool for creating combinatorial libraries. Since MCR can generate diverse libraries, it can be considered as a tool to formulate unique molecules. As indicated above, in embodiments of the disclosure, MCR reactions are used to combine amino acids to make unique molecules. Each unique molecule represents a unique character and a group of molecules forms a word. Every different generated character can be associated with a unique color. For example, 26 letters = 26 unique molecules, and each unique molecule is associated with a unique color. Thus, a library of words may be created based on the generated molecules. Examples for creating data from amino acids are provided below.
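As a rough illustration of how a small set of amino acids yields enough distinct combinations for a character library, the sketch below counts the unordered three-component combinations of the 20 amino acids of Table 3 and assigns the first 26 to the letters A to Z. The assignment is purely hypothetical and is not the mapping used in the examples of this disclosure; whether a given combination reacts to form a usable molecule is a chemical question that the code does not address.

```python
from itertools import combinations
from math import comb
import string

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # one-letter codes from Table 3

# Number of distinct unordered three-component combinations.
n_triples = comb(len(AMINO_ACIDS), 3)

# Hypothetical assignment: the first 26 combinations stand in for A-Z.
triples = combinations(AMINO_ACIDS, 3)
library = {
    letter: "+".join(triple)
    for letter, triple in zip(string.ascii_uppercase, triples)
}

print(n_triples)                     # 1140
print(library["A"])                  # A+R+N
print(len(set(library.values())))    # 26 distinct combinations
```

With 1140 candidate combinations against 26 letters, there is ample room to pick combinations whose products are easy to distinguish by color or weight.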
[0052] An amino acid has both a basic amine group and an acidic carboxylic acid group. The 20 amino acids in Table 3 are structured as polar and non-polar amino acids. Distinct letters may be formed from polar or non-polar amino acids.
[0053] Consider the word “The,” which has the following binary equivalents of ASCII from an ASCII-binary character table:
T is 01010100
[0054] h is 01101000
e is 01100101
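The ASCII-to-binary conversion used for the word "The" can be reproduced with a few lines of code. The function name is a hypothetical helper introduced for illustration.

```python
from typing import Dict


def to_ascii_bits(word: str) -> Dict[str, str]:
    """Map each character to its 8-bit ASCII code written in binary."""
    return {ch: format(ord(ch), "08b") for ch in word}


bits = to_ascii_bits("The")
print(bits)  # {'T': '01010100', 'h': '01101000', 'e': '01100101'}
```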
As can be seen in
To form a molecule (T) from R+N+D, 2 R, 2 N and 1 D with an additional standard component are utilized. A simple demonstration is as follows:
[0055] Convert R, N, D into binary equivalents
[0056] R is 01010010
[0057] N is 01001110
[0058] D is 01000100
[0059] As noted above, T is 01010100
R*R=01101001000100
(R*R)/N=01010110 (integer part of the quotient)
01010110+N=010100100
010100100−D=01100000
01100000−00001100=01010100 (which is T)
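The sequence of operations in this demonstration (multiply R by R, divide by N, add N, subtract D, then subtract a standard component) can be evaluated directly with integer arithmetic. The sketch below assumes integer division for the quotient, since R*R is not an exact multiple of N, and it derives the final standard component from the target value instead of hard-coding it. The variable names are illustrative.

```python
# ASCII codes of the one-letter amino acid codes and of the target letter.
R, N, D = (ord(c) for c in "RND")
T = ord("T")

step1 = R * R          # 2 R
step2 = step1 // N     # first N (integer division, remainder dropped)
step3 = step2 + N      # second N
step4 = step3 - D      # 1 D
standard_component = step4 - T   # whatever remains to reach T

for name, value in [("R*R", step1), ("(R*R)/N", step2), ("+N", step3),
                    ("-D", step4), ("standard component", standard_component)]:
    print(f"{name:>20}: {value:5d}  {value:b}")

# Subtracting the derived component must land exactly on T.
result = step4 - standard_component
print(format(result, "08b"), chr(result))
```

Deriving the last operand from the target makes the check self-consistent for any target letter, not only T.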
[0060]
[0061] Data stored in the form of molecules on a data storage medium such as 102 of
[0062] The above examples describe storage of single words as molecules, and the retrieval of stored single words by identifying the molecules. In general, data storage systems in accordance with embodiments of the disclosure may be employed for storing any amount of data. For example, if there are 500-1000 grid elements in total in the data storage medium, then 500-1000 molecules can be created by using 22 amino acids in different combinations. In general, molecule generation may be carried out once and may be dependent on a number of grid elements. As indicated in an example provided above, a mixture of three amino acids (R+N+D) forms one molecule, which may store one byte of information. A larger combination of amino acids can stretch up to larger data sets (e.g., 2^6). It should be noted that the grid elements on a data storage medium are labelled/numbered to enable tracking of molecules for storage and retrieval of information.
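The labelling of grid elements and the resulting capacity can be sketched as follows. The 20 by 25 layout, the row-major numbering and the function names are assumptions made for illustration, as is the simplification that each grid element holds one molecule representing one byte.

```python
from typing import Tuple

ROWS, COLS = 20, 25   # hypothetical layout giving 500 grid elements


def label_to_cell(label: int) -> Tuple[int, int]:
    """Row-major numbering: label 0 is row 0, column 0."""
    if not 0 <= label < ROWS * COLS:
        raise ValueError("label outside the grid")
    return divmod(label, COLS)


def cell_to_label(row: int, col: int) -> int:
    """Inverse of label_to_cell for a cell inside the grid."""
    if not (0 <= row < ROWS and 0 <= col < COLS):
        raise ValueError("cell outside the grid")
    return row * COLS + col


# One molecule (one byte) per grid element under the stated assumption.
capacity_bytes = ROWS * COLS
print(capacity_bytes, label_to_cell(137), cell_to_label(5, 12))
```

A numbering of this kind lets a controller record only the label of each stored molecule and still direct a reader to the correct physical position.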
[0063] The illustrations of the embodiments described herein are intended to provide a general understanding of the structure of the various embodiments. The illustrations are not intended to serve as a complete description of all of the elements and features of apparatus and systems that utilize the structures or methods described herein. Many other embodiments may be apparent to those of skill in the art upon reviewing the disclosure. Other embodiments may be utilized and derived from the disclosure, such that structural and logical substitutions and changes may be made without departing from the scope of the disclosure. Additionally, the illustrations are merely representational and may not be drawn to scale. Certain proportions within the illustrations may be exaggerated, while other proportions may be reduced. Accordingly, the disclosure and the figures are to be regarded as illustrative rather than restrictive.
[0064] One or more embodiments of the disclosure may be referred to herein, individually and/or collectively, by the term “invention” merely for convenience and without intending to limit the scope of this application to any particular invention or inventive concept. Moreover, although specific embodiments have been illustrated and described herein, it should be appreciated that any subsequent arrangement designed to achieve the same or similar purpose may be substituted for the specific embodiments shown. This disclosure is intended to cover any and all subsequent adaptations or variations of various embodiments. Combinations of the above embodiments, and other embodiments not specifically described herein, will be apparent to those of skill in the art upon reviewing the description.
[0065] The Abstract of the Disclosure is provided to comply with 37 C.F.R. § 1.72(b) and is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, various features may be grouped together or described in a single embodiment for the purpose of streamlining the disclosure. This disclosure is not to be interpreted as reflecting an intention that the claimed embodiments employ more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter may be directed to less than all of the features of any of the disclosed embodiments.
[0066] The above-disclosed subject matter is to be considered illustrative, and not restrictive, and the appended claims are intended to cover all such modifications, enhancements, and other embodiments, which fall within the true spirit and scope of the present disclosure. Thus, to the maximum extent allowed by law, the scope of the present disclosure is to be determined by the broadest permissible interpretation of the following claims and their equivalents, and shall not be restricted or limited by the foregoing detailed description.