Searchable and secure decentralized P2P filesystem

Abstract

A first user saving a file f is detected. The file f is intercepted. A filename and one or more text strings entered as searchable metadata are also intercepted. An identifier id(f) is generated for the file f. The filename, the one or more text strings, and id(f) are stored together on a content address server. A symmetric encryption key KF is generated. File f is divided into n segments. An identifier id(s.sub.i) is generated for each segment s.sub.i. Each s.sub.i is encrypted using KF, producing n encrypted segments es.sub.i=ES.sub.KF(S.sub.i). Each es.sub.i is stored with its id(s.sub.i) on a peer device. For each es.sub.i, id(s.sub.i) is stored on the content address server with id(f). A public key KU2 of a second user is retrieved, KF is encrypted with KU2, producing wrapped key KW2 EA.sub.KU2(KF), and KW2 is stored on the content address server with id(f).

Claims

1. A computer-implemented method for sharing a file with searchable metadata among peer devices in a distributed cryptographically secured peer-to-peer filesystem, comprising: detecting that a first user is saving a file f on a first peer device; intercepting before storage in a non-volatile memory of the first peer device the file f, a filename fn of the file f, and one or more text strings entered by the first user as searchable metadata for the file f using the first peer device; generating a unique identifier id(f) for the file f using the first peer device; storing together the filename fn, the one or more text strings, and the identifier id(f) on a content address server using the first peer device, wherein the storing together the filename fn, the one or more text strings, and the identifier id(f) comprises searching a content address server for previous versions of the file f using the filename fn and the identifier id(f) and generating versioning information for the file f from the search using the first peer device, and storing together the filename fn, the one or more text strings, the versioning information, and the identifier id(f) on the content address server using the first peer device; generating a symmetric encryption key KF for the file f on the first peer device; dividing the file f into n segments on the first peer device, where in n>1; generating a unique identifier id(s.sub.i) for each segment s.sub.i of the n segments on the first peer device; encrypting each segment s.sub.i of the n segments using the symmetric key KF using a symmetric encryption algorithm on the first peer device, producing n encrypted segments es.sub.i=ES.sub.KF(s.sub.i; storing each encrypted segment es.sub.i of the n segments with its identifier id(s.sub.i) on at least one peer device identified by a unique peer identifier l.sub.i using the first peer device; for each encrypted segment es.sub.i, storing the identifier id(s.sub.i) on the content address server with the identifier id(f) using the first peer device; retrieving from a public key server a public key KU2 of a second user who can share the file f, encrypting the symmetric key KF with the public key KU2 using an asymmetric encryption algorithm, producing wrapped key KW2=EA.sub.KU2(KF), and storing the wrapped key KW2 on the content address server with the identifier id(f) using the first peer device.

2. The method of claim 1, wherein the versioning information comprises a date and time of the storage of the file f, a userid of an author of the file f, and an ordinal sequence number of the version.

3. The method of claim 1, wherein the content address server comprises a distributed hash table.

4. The method of claim 3, wherein the identifier id(f) comprises a cryptographic hash value h(f) and is generated using a cryptographic hashing algorithm.

5. The method of claim 3, wherein the unique identifier id(s.sub.i) for each segment s.sub.i of the n segments comprises a cryptographic hash value h(s.sub.i) and is generated using a cryptographic hashing algorithm.

6. The method of claim 1, further comprising sending the identifier id(f) to a second peer device used by the second user using the first peer device.

7. The method of claim 1, wherein storing each encrypted segment es.sub.i of the n segments with its identifier id(s.sub.i) on at least one peer device identified by a unique peer identifier l.sub.i comprises storing each encrypted segment es.sub.i of the n segments with its identifier id(s.sub.i) on the first peer device identified by a unique peer identifier l.sub.1.

8. The method of claim 1, wherein storing each segment es.sub.i of the n segments with its identifier id(s.sub.i) on at least one peer device identified by a unique peer identifier l.sub.i comprises storing each segment es.sub.i of the n segments with its identifier id(s.sub.i) on a different peer device identified by a unique peer identifier l.sub.i.

9. The method of claim 1, wherein dividing the file f into n segments on the first peer device further comprises using erasure coding so that the encrypted file ef can be reconstructed from m segments of the n segments where 0<m<n.

10. The method of claim 1, further comprising retrieving from the public key server a public key KU1 of the first user, encrypting the symmetric key KF with the public key KU1 using the asymmetric encryption algorithm, producing wrapped key KW1=EA.sub.KU1(KF), and storing the wrapped key KW1 on the content address server with the identifier id(f) using the first peer device in order to make the file f accessible to the first user.

11. A system for sharing a file with searchable metadata among peer devices in a distributed cryptographically secured peer-to-peer filesystem, comprising: a public key server that stores one or more public keys of an asymmetric cryptographic algorithm for one or more users; a content address server that stores information about one or more files; a first peer device that detects that a first user is saving a file f on the first peer device, intercepts before storage in a non-volatile memory of the first peer device the file f, a filename fn of the file f, and one or more text strings entered by the first user as searchable metadata for the file f, generates a unique identifier id(f) for the file f, stores together the filename fn, the one or more text strings, and the identifier id(f) on a content address server, wherein the stores together the filename fn, the one or more text strings, and the identifier id(f) comprises searching a content address server for previous versions of the file f using the filename fn and the identifier id(f) and generating versioning information for the file f from the search using the first peer device, and storing together the filename fn, the one or more text strings, the versioning information, and the identifier id(f) on the content address server using the first peer device, generates a symmetric encryption key KF for the file f, divides the file f into n segments on the first peer device, wherein n>1, generates a unique identifier id(s) for each segment s.sub.i of the n segments on the first peer device; encrypts each segment s.sub.i of the n segments using the symmetric key KF using a symmetric encryption algorithm on the first peer device, producing n encrypted segments es.sub.i=ES.sub.KF(s.sub.i), stores each encrypted segment es.sub.i of the n segments with its identifier id(s.sub.i) on at least one peer device identified by a unique peer identifier l.sub.i, for each encrypted segment es.sub.i, stores the identifier id(s.sub.i) on the content address server with the identifier id(f), and retrieves from the public key server a public key KU2 of a second user who can share the file f, encrypts the symmetric key KF with the public key KU2 using an asymmetric encryption algorithm, producing wrapped key KW2=EA.sub.KU2(KF), and stores the wrapped key KW2 on the content address server with the identifier id(f).

12. The system of claim 11, wherein the versioning information comprises a date and time of the storage of the file f, a userid of an author of the file f, and an ordinal sequence number of the version.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) The skilled artisan will understand that the drawings, described below, are for illustration purposes only. The drawings are not intended to limit the scope of the present teachings in any way.

(2) FIG. 1 is a block diagram that illustrates a computer system, upon which embodiments of the present teachings may be implemented.

(3) FIG. 2 is an exemplary diagram showing how a file is publicly shared by a BitTorrent client application.

(4) FIG. 3 is an exemplary diagram showing how a file is securely stored in Tahoe-LAFS.

(5) FIG. 4 is an exemplary diagram showing how a file is securely stored in JigDFS.

(6) FIG. 5 is an exemplary diagram showing a system for sharing a file with searchable metadata among peer devices in a distributed cryptographically secured P2P filesystem, in accordance with various embodiments.

(7) FIG. 6 is an exemplary diagram showing a system for finding a file with metadata matching a search term in a distributed cryptographically secured P2P filesystem

(8) FIG. 7 is a flowchart showing a computer-implemented method for sharing a file with searchable metadata among peer devices in a distributed cryptographically secured P2P filesystem, in accordance with various embodiments.

(9) FIG. 8 is a flowchart showing a computer-implemented method for finding a file with metadata matching a search term in a distributed cryptographically secured P2P filesystem, in accordance with various embodiments.

(10) Before one or more embodiments of the present teachings are described in detail, one skilled in the art will appreciate that the present teachings are not limited in their application to the details of construction, the arrangements of components, and the arrangement of steps set forth in the following detailed description or illustrated in the drawings. Also, it is to be understood that the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting.

DESCRIPTION OF VARIOUS EMBODIMENTS

(11) Computer-Implemented System

(12) FIG. 1 is a block diagram that illustrates a computer system 100, upon which embodiments of the present teachings may be implemented. Computer system 100 includes a bus 102 or other communication mechanism for communicating information, and a processor 104 coupled with bus 102 for processing information. Computer system 100 also includes a memory 106, which can be a random-access memory (RAM) or other dynamic storage device, coupled to bus 102 for storing instructions to be executed by processor 104. Memory 106 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 104. Computer system 100 further includes a read only memory (ROM) 108 or other static storage device coupled to bus 102 for storing static information and instructions for processor 104. A storage device 110, such as a magnetic disk or optical disk, is provided and coupled to bus 102 for storing information and instructions.

(13) Computer system 100 may be coupled via bus 102 to a display 112, such as a cathode ray tube (CRT) or liquid crystal display (LCD), for displaying information to a computer user. An input device 114, including alphanumeric and other keys, is coupled to bus 102 for communicating information and command selections to processor 104. Another type of user input device is cursor control 116, such as a mouse, a trackball or cursor direction keys for communicating direction information and command selections to processor 104 and for controlling cursor movement on display 112.

(14) A computer system 100 can perform the present teachings. Consistent with certain implementations of the present teachings, results are provided by computer system 100 in response to processor 104 executing one or more sequences of one or more instructions contained in memory 106. Such instructions may be read into memory 106 from another computer-readable medium, such as storage device 110. Execution of the sequences of instructions contained in memory 106 causes processor 104 to perform the process described herein. Alternatively, hard-wired circuitry may be used in place of or in combination with software instructions to implement the present teachings. Thus, implementations of the present teachings are not limited to any specific combination of hardware circuitry and software.

(15) The term “computer-readable medium” or “computer program product” as used herein refers to any media that participates in providing instructions to processor 104 for execution. The terms “computer-readable medium” and “computer program product” are used interchangeably throughout this written description. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and precursor ion mass selection media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 110. Volatile media includes dynamic memory, such as memory 106.

(16) Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, digital video disc (DVD), a Blu-ray Disc, any other optical medium, a thumb drive, a memory card, a RAM, PROM, and EPROM, a FLASH-EPROM, any other memory chip or cartridge, or any other tangible medium from which a computer can read.

(17) Various forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to processor 104 for execution. For example, the instructions may initially be carried on the magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 100 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector coupled to bus 102 can receive the data carried in the infra-red signal and place the data on bus 102. Bus 102 carries the data to memory 106, from which processor 104 retrieves and executes the instructions. The instructions received by memory 106 may optionally be stored on storage device 110 either before or after execution by processor 104.

(18) In accordance with various embodiments, instructions configured to be executed by a processor to perform a method are stored on a computer-readable medium. The computer-readable medium can be a device that stores digital information. For example, a computer-readable medium includes a compact disc read-only memory (CD-ROM) as is known in the art for storing software. The computer-readable medium is accessed by a processor suitable for executing instructions configured to be executed.

(19) The following descriptions of various implementations of the present teachings have been presented for purposes of illustration and description. It is not exhaustive and does not limit the present teachings to the precise form disclosed. Modifications and variations are possible in light of the above teachings or may be acquired from practicing of the present teachings. Additionally, the described implementation includes software but the present teachings may be implemented as a combination of hardware and software or in hardware alone. The present teachings may be implemented with both object-oriented and non-object-oriented programming systems.

(20) Storing and Searching Metadata in a Distributed, Encrypted P2P System

(21) As described above, distributed, decentralized, or peer-to-peer (P2P) file systems offer many advantages to an enterprise. Unfortunately, decentralized file systems also present challenges, not the least of which is data discovery or knowledge management.

(22) Data is an organization's greatest asset, and the organization's ability to manage knowledge and derive wisdom across the span of its data holdings is critical to success. These barriers are magnified when the contents of the distributed or decentralized file system are cryptographically secured, as described below. Searching and discovery across many devices are difficult due to security and latency issues associated with the need to traverse device and network boundaries, as well as data opacity resulting from encryption. Searching and discovery can be entirely frustrated when cryptographic security measures are combined with distributed or decentralized file systems, rendering content completely opaque to keyword search mechanisms.

(23) As a result, there exists an unmet need for systems and methods that allow searching and discovery when cryptographic security measures are combined with distributed, decentralized, or P2P file systems.

(24) In various embodiments, systems and methods described herein allow searching and discovery when cryptographic security measures are combined with distributed, decentralized, or P2P file systems. This is qualitatively different from how other distributed, decentralized, or P2P file systems have stored and searched files. In various embodiments, the files described herein consist of an encrypted payload, e.g., the file content, and searchable metadata that is moved around by the content server, enabling distributed search on encrypted data.

(25) System for Sharing a Searchable File

(26) FIG. 5 is an exemplary diagram 500 showing a system for sharing a file with searchable metadata among peer devices in a distributed cryptographically secured P2P filesystem, in accordance with various embodiments. The system of FIG. 5 includes public-key server 501, content address server 502, and first peer device 510.

(27) Public key server 501 stores one or more public keys of an asymmetric cryptographic algorithm for one or more users. Public key server 501 can be a computer service residing on any type of computer system including, but not limited to, the computer system of FIG. 1.

(28) Content address server 502 stores information about one or more files. Like public key server 501, content address server 502 can be a computer service residing on any type of computer system including, but not limited to, the computer system of FIG. 1.

(29) First peer device 510 is a computer system of the distributed P2P filesystem that is used for sharing files or segments of files. A peer device is typically a client computing device, such as a laptop computer, desktop computer, tablet computer, or smartphone, but can be any type of computing device that executes a client application for sharing files in the distributed P2P file system. A peer device, for example, can be the computer system of FIG. 1.

(30) When, for example, a first user 511 saves a file f 512, first peer device 510 performs several steps using a client application (not shown) of the distributed P2P filesystem that resides on first peer device 510. The client application is, for example, a file monitoring agent. First peer device 510 detects that first user 511 is saving file f 512 on first peer device 510. First peer device 510 intercepts file f 512 before storage in a non-volatile memory of first peer device 510. First peer device 510 also intercepts a filename n of file f 512 and one or more text strings (t.sub.1, . . . , t.sub.n) entered by first user 511 as searchable metadata for file f 512.

(31) First peer device 510 generates a unique identifier id(f) for file f 512. First peer device 510 stores together the filename n, the one or more text strings, and the identifier id(f) on content address server 502.

(32) First peer device 510 divides file f 512 into n segments. First peer device 510 generates a unique identifier id(s.sub.i) for each segment s.sub.i of the n segments. First peer device 510 generates a symmetric encryption key KF for encrypting the n segments. First peer device 510 encrypts each of the n segments using the symmetric key KF using a symmetric encryption algorithm, producing n encrypted segments. First peer device 510 stores each encrypted segment es.sub.i of the n segments with its identifier id(s.sub.i) on at least one peer device identified by a unique peer identifier l.sub.i. In various embodiments and as shown in FIG. 5, the n encrypted segments are stored on first peer device 510, which is location l.sub.1. For each encrypted segment es.sub.i, first peer device 510 stores the identifier id(s.sub.i) on content address server 502 with the identifier id(f). Note that the storage of the filename n, the one or more text strings, and the identifier id(f) on content address server 502 and the storage of the n encrypted segments on a peer device provides a unique file format in which there is metadata+payload but where metadata and payload can be geographically distributed from one another at any given point in time.

(33) Finally, first peer device 510 retrieves from public key server 501 a public key KU2 of second user 521 who can share file f 512, encrypts the symmetric key KF with the public key KU2 using an asymmetric encryption algorithm, producing wrapped key KW2=EA.sub.KU2(KF), and stores the wrapped key KW2 on content address server 502 with the identifier id(f), for example. First peer device 510 can also or alternatively, for example, store the wrapped key KW2 on first pear device 510. Essentially, the information stored on content address server 502 for the identifier id(f) makes file f 512 sharable by second user 521 using second peer device 520.

(34) In various embodiments, first peer device 510 stores versioning information with the filename n, the one or more text strings, and the identifier id(f) on content address server 502. Specifically, first peer device 510 searches content address server 502 for previous versions of file f 512 using the filename n and the identifier id(f). First peer device 510 generates versioning information for the file f from the search. First peer device 510 then stores together the filename n, the one or more text strings, the versioning information, and the identifier id(f) on content address server 502.

(35) In various embodiments, the versioning information includes a date and time of the storage of the file f 512, a userid of an author of file f 512, and an ordinal sequence number v of the version.

(36) In various embodiments, content address server 502 can be a distributed hash table. The identifier id(f) is then a cryptographic hash value h(f) and is generated using a cryptographic hashing algorithm. Also, the unique identifier id(s.sub.i) for each segment s.sub.i of the n segments is a cryptographic hash value h(s.sub.i) and is generated using the cryptographic hashing algorithm.

(37) In various embodiments, first peer device 510 further notifies second peer device 520 of file f 512. For example, first peer device 510 sends the identifier id(f) to second peer device 520 used by second user 521 to let second user 521 know that file f 512 is available.

(38) In various embodiments, first peer device 510 sends the identifier id(f) to second peer device 520 that the P2P filesystem application of second peer device 520 stores in the form of a symbolic link. Similarly, first peer device 510 can also send the wrapped key KW2 for second user 521 with identifier id(f) to second peer device 520. The P2P filesystem application of second peer device 520 then also stores the wrapped key KW2 for second user 521 in the symbolic link. In other words, to second user 521 of second peer device 520, file f 512 appears as any other file in the filesystem of second peer device 520 even though none of the segments of file f 512 may currently be stored on second peer device 520.

(39) In addition or alternately, if content address server 502 is a distributed hash table, when first peer device 510 stores the identifier id(f) on content address server 502 second peer device 520 is automatically notified of file f 512. For example, content address server 502 automatically publishes an update to the distributed hash table accessible to second peer device 520 when file f 512 is added. Second peer device 520 can also be automatically notified of the wrapped key KW2 for second user 521 for file f 512.

(40) In various embodiments, first peer device 510 does not initially store the n encrypted segments, es.sub.i, of file f 512 on any other peers. All of the n encrypted segments are only stored on another peer after that peer has received the identifier id(f) of file f 512 and opened the file. Opening the file on another peer, for example, causes the n encrypted segments, es.sub.i, of file f 512 to be transmitted to and received by that peer from one or more other peers.

(41) In various embodiments, first peer device 510 stores segments across different peer devices. Specifically, first peer device 510 stores each segments, of then segments with its identifier id(s.sub.i) on a different peer device identified by a unique peer identifier l.sub.i.

(42) In various embodiments, redundancy is provided in segments stored across different peer devices using erasure coding. Specifically, first peer device 510 divides file f 512 using erasure coding so that file f 512 can be reconstructed from m segments of the n segments where m<n.

(43) In various embodiments, content address server 502 stores a wrapped key with a file identifier for each user that can share the file. For example, a wrapped key for first user 511 also needs to be stored with identifier id(f) so that first user 511 can reconstruct file f 512. Specifically, first peer device 510 further retrieves from public key server 501 a public key KU1 of first user 511, encrypts the symmetric key KF with the public key KU1 using the asymmetric encryption algorithm, producing wrapped key KW1=EA.sub.KU1(KF), and stores the wrapped key KW1 on content address server 502 with the identifier id(f).

(44) In various embodiments, content address server 502 additionally stores a user identifier (userid) with each wrapped key. Additionally or alternatively, each wrapped key can be stored on a peer device along with the file identifier.

(45) System for Finding a File Using a Search Term

(46) FIG. 6 is an exemplary diagram 600 showing a system for finding a file with metadata matching a search term in a distributed cryptographically secured P2P filesystem, in accordance with various embodiments. The system of FIG. 6 includes public-key server 501, content address server 502, and second peer device 520. These devices are also shown in FIG. 5.

(47) Note that in FIGS. 5 and 6 and as described herein, file identifiers, segment identifiers, and wrapped keys are shown and described as being stored on one content address server 502. In various alternative embodiments, file identifiers, segment identifiers, and wrapped keys can be stored separately or in any combination on separate servers. For example, file and segment identifiers may be stored in a DHT and wrapped keys with a file identifier may be stored in a separate key server.

(48) Second peer device 520 intercepts one or more search terms entered by a second user 521 of second peer device 520 into a file search agent (not shown) of an operating system of second peer device 520. Second peer device 520 queries content address server 502 with the one or more search terms and a userid u.sub.2 of second user 521.

(49) Second peer device 520 receives from content address server 502, for at least one search term of the one or more search terms and the userid u.sub.2, a filename n of file f 512. The filename n of a file f 512 is found if the at least one search term matches the filename n and the filename n is associated with the userid u.sub.2. Alternatively, the filename n of a file f 512 is found if at least one text string of searchable metadata that is associated with the filename n and a unique identifier id(f) of file f 512 matches the at least one search term and the filename n is associated with the userid u.sub.2. In other words, the at least one search term is compared to the filename n, the metadata text strings, or both. In various embodiments, the userid u.sub.2 is sent along with the at least one search term to ensure that only files accessible to second user 521 are returned. In addition to receiving the filename n from content address server 502, second peer device 520 also receives the identifier id(f), at least one wrapped key KW2 associated with the userid u.sub.2 for the identifier id(f), and a unique identifier id(s.sub.i) and at least one peer identifier l.sub.i for each encrypted segment es.sub.i of n encrypted segments of file f 512.

(50) In various embodiments, the metadata text strings are stored with the filename n. Note, however, that the filename n can have different versions and, therefore, more than one identifier id(f). As a result, in various alternative embodiments, metadata text strings are stored with each identifier id(f) of file f 512.

(51) In various embodiments, second peer device 520 stores the filename n, the identifier id(f), the at least one wrapped key KW2 associated with the userid u.sub.2 for the identifier id(f), and a unique identifier id(s.sub.i) and at least one peer identifier l.sub.i for each encrypted segment es.sub.i of n encrypted segments of file f 512 in the form of a symbolic link on second peer device 520.

(52) In various embodiments, the filename n of a file f 512 is found if versioning information associated with the filename n and a unique identifier id(f) of file f 512 matches the at least one search term and the filename n is associated with the userid u.sub.2. In other words, the at least one search term is compared to versioning information in addition to the filename n and the metadata text strings.

(53) In various embodiments, the versioning information includes a date and time of the storage of the file f 512, a userid of an author of file f 512, and an ordinal sequence number v of the version. In various embodiments, a userid of an author or editor of file f 512 can also be stored in the metadata text strings or alternatively can be stored in the metadata text strings.

(54) In various embodiments, second peer device 520 receives a request from second user 521 to open the file f 512. In various embodiments and as shown in FIG. 6, the request is invoked by selecting the symbolic link. Second peer device 520 requests each encrypted segment es.sub.i identified by each received identifier id(s.sub.i) of at least m identifiers of the n encrypted segments, where m<n, using a location of a peer device identified by the at least one peer identifier l.sub.i of each encrypted segment es.sub.i.

(55) As shown in FIG. 6, the file f 512 has only just been created by first user 511, for example, and, therefore, the n encrypted segments only reside on first peer device 510, which has peer identifier l.sub.i. As a result, second peer device 520 receives from content address server 502 the list of n encrypted segments all associated with the peer identifier l.sub.i. As a result, the n segments are requested from first peer device 510 in this example.

(56) Second peer device 520 receives the requested m segments from the request. Second peer device 520 reconstructs or recreates file f 512 from the received m encrypted segments.

(57) In various embodiments and as shown in FIG. 6, m segments can equal n segments. As a result, second peer device 520 reconstructs file f 512 from all n encrypted segments.

(58) Second peer device 520 decrypts the received at least one wrapped key KW2 with a private key KPr2 of second user 521 using an asymmetric encryption algorithm. Symmetric key KF=EA.sub.KPr2.sup.−1(KW2) is produced. Second peer device 520 decrypts each of the n encrypted segments using the symmetric key KF. In various embodiments, there is an error checking component at each stage. For example, if a decryption fails, the process is aborted and the encrypted segment is obtained from another peer.

(59) Finally, second peer device 520 reconstructs file f 512 from the n decrypted segments. Reconstruction has an error checking component as well—if the identifier of each decrypted segment does not match the contents of the segment, there is an attempt to retrieve the segment from a different peer, if the retrieval fails, the process is aborted. Similarly, the assembled file is error checked and either approved or the process aborted.

(60) Again, as described above and in various embodiments, content address server 502 can be a distributed hash table. The identifier id(f) is then a cryptographic hash value h(f) and is generated using a cryptographic hashing algorithm. In various embodiments, content address server 502 further generates a hash value for the file fusing the cryptographic hashing algorithm and compares the hash value to h(f) to verify the file f 512.

(61) Also, the unique identifier id(s.sub.i) for each segments, of the n segments is a cryptographic hash value h(s.sub.i) and is generated using the cryptographic hashing algorithm. In various embodiments, content address server 502 further after decrypting each encrypted segment es.sub.i of the m received encrypted segments, generates a hash value for each decrypted segment s.sub.i using the cryptographic hashing algorithm and comparing the hash value to h(s.sub.i) to verify each decrypted segment s.sub.i.

(62) In various embodiment, second peer device 520 reconstructs encrypted file f 513 using m segments of the n segments where m<n using erasure decoding. In other words, if all n segments are created using erasure coding, then only a subset m of the segments needs to be used to reconstruct file f 512.

(63) Method for Sharing a File

(64) FIG. 7 is a flowchart showing a computer-implemented method 700 for sharing a file with searchable metadata among peer devices in a distributed cryptographically secured P2P filesystem, in accordance with various embodiments.

(65) In step 710 of method 700, it is detected that a first user is saving a file f on a first peer device.

(66) In step 720, before storage in a non-volatile memory of the first peer device, the file f, a filename n of the file f, and one or more text strings entered by the first user as searchable metadata for the file f are intercepted using the first peer device.

(67) In step 730, a unique identifier id(f) for the file f is generated using the first peer device.

(68) In step 740, store together the filename n, the one or more text strings, and the identifier id(f) on a content address server using the first peer device;

(69) In step 750, a symmetric encryption key KF is generated for the file f on the first peer device.

(70) In step 760, the file f is divided into n segments on the first peer device.

(71) In step 770, a unique identifier id(s) is generated for each segment s.sub.i of the n segments on the first peer device

(72) In step 780, each segment s.sub.i of the n segments is encrypted using the symmetric key KF using a symmetric encryption algorithm on the first peer device, producing n encrypted segments es.sub.i=ES.sub.KF(S.sub.i).

(73) In step 785, each encrypted segment es.sub.i of the n segments is stored with its identifier id(s.sub.i) on at least one peer device identified by a unique peer identifier l.sub.i using the first peer device.

(74) In step 790, for each encrypted segment es.sub.i, the identifier id(s.sub.i) is stored on the content address server with the identifier id(f) using the first peer device.

(75) In step 795, a public key KU2 of a second user who can share the file f is retrieved from a public key server. The symmetric key KF is encrypted with the public key KU2 using an asymmetric encryption algorithm, producing wrapped key KW2=EA.sub.KU2(KF). The wrapped key KW2 is stored on the content address server with the identifier id(f) using the first peer device.

(76) Method for Finding a File

(77) FIG. 8 is a flowchart showing a computer-implemented method 800 for finding a file with metadata matching a search term in a distributed cryptographically secured P2P filesystem, in accordance with various embodiments.

(78) In step 810 of method 800, one or more search terms entered by a second user of a second peer device into a file search agent of an operating system of the second peer device are intercepted using the second peer device.

(79) In step 820, a content address server is queried with the one or more search terms and a userid u.sub.2 of the second user using the second peer device.

(80) In step 830, from the content address server, for at least one search term of the one or more search terms and the userid u.sub.2, a filename n of a file f is received that matches the at least one search term. Alternatively, a filename n is received that has stored with the filename n and a unique identifier id(f) of the file f at least one text string of searchable metadata for the file f that matches the at least one search term. The identifier id(f), at least one wrapped key KW2 associated with the userid u.sub.2 for the identifier id(f), and a unique identifier id(s) and at least one peer identifier l.sub.i for each encrypted segment es.sub.i of n encrypted segments of the file f are received along with the filename n using the second peer device.

(81) While the present teachings are described in conjunction with various embodiments, it is not intended that the present teachings be limited to such embodiments. On the contrary, the present teachings encompass various alternatives, modifications, and equivalents, as will be appreciated by those of skill in the art.

(82) Further, in describing various embodiments, the specification may have presented a method and/or process as a particular sequence of steps. However, to the extent that the method or process does not rely on the particular order of steps set forth herein, the method or process should not be limited to the particular sequence of steps described. As one of ordinary skill in the art would appreciate, other sequences of steps may be possible. Therefore, the particular order of the steps set forth in the specification should not be construed as limitations on the claims. In addition, the claims directed to the method and/or process should not be limited to the performance of their steps in the order written, and one skilled in the art can readily appreciate that the sequences may be varied and still remain within the spirit and scope of the various embodiments.

Searchable and secure decentralized P2P filesystem

Assignee

Inventors

Cpc classification

Classification Explorer

G06F16/176

PHYSICS

Classification Explorer

H04L9/3239

ELECTRICITY

Classification Explorer

G06F2221/2107

PHYSICS

Classification Explorer

H04L9/085

ELECTRICITY

Classification Explorer

G06F21/6227

PHYSICS

Classification Explorer

H04L9/3236

ELECTRICITY

Classification Explorer

H04L9/0825

ELECTRICITY

Classification Explorer

H04L9/30

ELECTRICITY

Classification Explorer

G06F21/6218

PHYSICS

Classification Explorer

G06F16/1837

PHYSICS

Classification Explorer

G06F21/602

PHYSICS

Classification Explorer

H04L9/50

ELECTRICITY

Classification Explorer

H04L9/0866

ELECTRICITY

Classification Explorer

H04L9/0869

ELECTRICITY

Classification Explorer

H04L9/0863

ELECTRICITY

Classification Explorer

H04L9/3213

ELECTRICITY

Classification Explorer

H04L9/0643

ELECTRICITY

Classification Explorer

H04L9/0894

ELECTRICITY

Classification Explorer

G06F21/606

PHYSICS

International classification

Classification Explorer

G06F16/176

PHYSICS

Classification Explorer

H04L9/08

ELECTRICITY

Classification Explorer

G06F16/182

PHYSICS

Classification Explorer

H04L9/30

ELECTRICITY

Classification Explorer

H04L9/32

ELECTRICITY

Abstract

Claims

Description