Machine-Learning Program, Method, and Apparatus for Measuring, by Pore Electric Resistance Method, Transient Change in Ion Current Associated with Passage of Target Particles through Pores and for Analyzing Pulse Waveform of Said Transient Change
20220155277 · 2022-05-19
Cpc classification
G01N15/12 (PHYSICS)
G01N27/4161 (PHYSICS)
G01N33/48721 (PHYSICS)
G06F18/2148 (PHYSICS)
Abstract
An apparatus uses a feature value extracted from a pulse waveform, which represents a transient change in the ion current flowing between electrodes when a particle passes through a pore, as teacher data and as data subject to analysis for machine learning. The apparatus includes a machine-learning program, a searcher, a host attribute table, and a feature value table. The host attribute table is searched using first host attribute information as a search key to extract a first host ID and a second host ID associated with the first host attribute information. The feature value table is searched using the first host ID as a search key to extract a first teacher feature value group obtained from first known particles of a first type, and using the second host ID as a search key to extract a second teacher feature value group obtained from second known particles of the first type. Learning is performed using the first teacher feature value group and the second teacher feature value group as teacher data and first particle type information representing the first type as a teacher label to calculate machine learning optimization parameters. The machine learning optimization parameters are then used, with an input value that is a feature value group subject to analysis obtained from an unknown particle with a first host attribute, to discriminate whether or not the unknown particle is of the first type.
Claims
1. An apparatus for utilizing a structure in which two chambers to be filled with an electrolytic solution containing particles are connected through a pore that a particle can pass through, the two chambers each using a sensor having electrodes to be in contact with the electrolytic solution, wherein a voltage is applied between the electrodes of the sensor, and a feature value extracted from a pulse waveform representing a transient change in ion current flowing between the electrodes when a particle passes through the pore is used as teacher data and data subject to analysis, thereby performing machine learning, wherein the apparatus includes storage means, wherein the storage means includes: a machine-learning program; a searcher; a host attribute table that stores host attribute information on a particle in association with a host ID used to identify the host of the particle; and a feature value table that stores a feature value group extracted from a pulse waveform output from the sensor, and particle type information indicating a type of the particle in association with the host ID, wherein the searcher is configured to search the host attribute table using first host attribute information as a search key, and extract a first host ID and a second host ID associated with the first host attribute information, wherein the searcher is configured to search the feature value table using the first host ID as a search key and extract a first teacher feature value group obtained from first known particles of a first type, and search the feature value table using the second host ID as a search key and extract a second teacher feature value group obtained from second known particles of the first type, wherein the machine-learning program is configured to learn using the first teacher feature value group and the second teacher feature value group collectively as teacher data, and first particle type information representing the first type as a teacher label to calculate machine learning optimization parameters, and wherein the machine-learning program is configured to use the machine learning optimization parameters with an input value that is a feature value group subject to analysis obtained from an unknown particle having the first host attribute information to discriminate whether or not the unknown particle is of the first type.
2. The apparatus according to claim 1, wherein the apparatus is a server that is connectable to the sensor via a network.
3. A machine-learning program, configured to carry out the steps of: connecting a sensor, wherein two chambers to be filled with an electrolytic solution containing known particles are connected through a pore that known particles can pass through, the two chambers each being connected to the sensor having electrodes to be in contact with the electrolytic solution; applying a voltage between the electrodes of the sensor to obtain a transient change in ion current flowing between the electrodes when the known particle passes through the pore as a teacher waveform, extracting a teacher feature value from the teacher waveform, and learning the teacher feature value as learning data and the type of the known particle as teacher data to calculate a machine learning optimization parameter; applying a voltage between the electrodes of the sensor to obtain a transient change in ion current flowing between the electrodes when an unknown particle passes through the pore as a waveform subject to analysis, and identifying the type of the unknown particle by using a feature value subject to analysis extracted from the waveform subject to analysis, and the machine learning optimization parameter; obtaining, as learning data, a first teacher feature value from first known particles from a first host and of a first type, and a second teacher feature value obtained from second known particles from a second host and of the first type, and learning with the first teacher feature value and the second teacher feature value collectively used as teacher data to calculate a machine learning optimization parameter; and inputting an input value that is a first feature value subject to analysis obtained from a first unknown particle from a third host, and using the machine learning optimization parameter to discriminate whether or not the first unknown particle is of the first type.
4. A machine-learning program, configured to carry out the steps of: connecting a sensor, wherein two chambers to be filled with an electrolytic solution containing known particles are connected through a pore that known particles can pass through, the two chambers each being connected to the sensor having electrodes to be in contact with the electrolytic solution, applying a voltage between the electrodes of the sensor to obtain a transient change in ion current flowing between the electrodes when the known particle passes through the pore as a teacher waveform, extracting a teacher feature value from the teacher waveform, and learning the teacher feature value as learning data and the type of the known particle as teacher data to calculate a machine learning optimization parameter, applying a voltage between the electrodes of the sensor to obtain a transient change in ion current flowing between the electrodes when an unknown particle passes through the pore as a waveform subject to analysis, and identifying the type of the unknown particle by using a feature value subject to analysis extracted from the waveform subject to analysis, and the machine learning optimization parameter; calculating a machine learning optimization parameter by learning a pair of a first teacher feature value group obtained from first known particles from a first host with a first host attribute and of a first type and first host attribute information representing the first host attribute, and a pair of a second teacher feature value group obtained from second known particles from a second host with a second host attribute and of the first type and second host attribute information representing the second host attribute that are collectively used as teacher data, and first particle type information representing the first type which is used as a teacher label; and inputting input values that are a first feature value group subject to analysis obtained from an unknown particle from a third host with a third host attribute and third host attribute information representing the third host, and using the machine learning optimization parameter to discriminate whether or not the unknown particle is of the first type.
5. The machine-learning program according to claim 3, wherein the known particle and the unknown particle are viruses or bacteria.
6. The machine-learning program according to claim 4, wherein the known particle and the unknown particle are viruses or bacteria.
7. The machine-learning program according to claim 3, further configured to carry out the steps of: having received the teacher waveform and the waveform subject to analysis from the sensor, generating, by an information terminal, the first teacher feature value, the second teacher feature value, and the first feature value subject to analysis; sending, from the information terminal to a server via a network, the first teacher feature value, the second teacher feature value, and the first feature value subject to analysis; and executing, by the server, the learning and the discrimination.
8. The machine-learning program according to claim 4, further configured to carry out the steps of: having received the teacher waveform and the waveform subject to analysis from the sensor, generating, by an information terminal, the first teacher feature value group, the second teacher feature value group, and the first feature value group subject to analysis, sending, from the information terminal to a server via a network, the first teacher feature value group, the second teacher feature value group, and the first feature value group subject to analysis; and executing, by the server, the learning and the discrimination.
Description
BRIEF DESCRIPTION OF DRAWINGS
[0045]
[0046]
[0047]
[0048]
[0049]
[0050]
[0051]
[0052]
[0053]
[0054]
DESCRIPTION OF EMBODIMENTS
Configuration
[0055]
[0056] In order to identify or discriminate the type of particles to be identified, first, an electrolytic solution containing the particles to be identified 190 is introduced from the inlet 111 or 121, and the chambers 110, 120 and the pore 140 are filled with the electrolytic solution. The particles to be identified may be present in both chambers 110 and 120, or in only one of them. The power supply 152 then applies a voltage between the electrodes 112 and 122. The voltage drives the target particles 190 in the charged chamber through the pore 140, for example from the chamber 110 to the chamber 120. While a particle is in the pore 140, it displaces electrolytic solution from the pore, so the ion current between the electrodes 112 and 122 is reduced. This transient change in the ion current is amplified by the amplifier 150 and then monitored by the ammeter 151. Note that
[0057]
[0058]
[0059]
[0060] In the example shown in
[0061] In the following description, the feature value (group) extracted from the pulse waveform caused when known particles pass through the pore is referred to as the teacher feature value (group), and the feature value (group) extracted when unknown particles pass through the pore is referred to as the feature value (group) subject to analysis.
[0062]
[0063] The storage 530 may hold a feature value table 531, a host attribute table 532, and an optimization parameter table 533. The roles of these tables will be explained in detail later.
[0064] Here, the host ID is used to identify the place, environment, process, conditions, and the like where the known particles to be the teacher and the unknown particles to be analyzed were generated. For example, when the particles are a virus, the host ID may be used to identify the living body where the virus was generated. For example, when an embodiment of the present invention is applied to virus identification in clinical practice, a separate host ID is assigned to the virus particles collected from patient A and to the virus particles collected from patient B in order to distinguish them. The host IDs need not be used only to distinguish the individuals from which the particles are derived; they may also be used to distinguish part or all of the information about the place and environment where the particles were generated, the method and process for generating the particles, and the like.
[0065] After calculating the machine learning optimization parameters, the server 360 receives the host ID of unknown particles and the feature value set subject to analysis from the network I/O 550 and stores them in the storage 530. The received feature value set subject to analysis is input to the machine learning algorithm having the machine learning optimization parameters, and the probability that the particle from which the feature value set subject to analysis is derived is the same type of particle as the teacher label is calculated. The process allows the type of the unknown particle to be estimated.
[0066] One feature value set subject to analysis is generated for each pulse waveform generated when one particle passes through the pore. Therefore, with the present method, each time a single particle passes through the pore, the type of the particle may be estimated.
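The per-pulse extraction described above may be sketched as follows. This is a minimal illustration, not the feature value extractor of the embodiment: it assumes a uniformly sampled current trace, a known baseline current, and a fixed detection threshold. The feature names follow the pulse depth, pulse width, and pulse asymmetry columns of the feature value table, but their exact definitions here, and the function name `extract_pulse_features`, are assumptions.

```python
from typing import Dict, List


def extract_pulse_features(
    current: List[float],
    baseline: float,
    threshold: float,
    dt: float,
) -> List[Dict[str, float]]:
    """Return one feature value set per pulse found in a sampled ion-current trace.

    A pulse is assumed to be a run of consecutive samples whose drop below
    `baseline` exceeds `threshold`. All feature definitions are illustrative.
    """
    feature_sets: List[Dict[str, float]] = []
    start = None
    # Append one baseline sample so a pulse touching the end of the trace is closed.
    for i, value in enumerate(current + [baseline]):
        in_pulse = (baseline - value) > threshold
        if in_pulse and start is None:
            start = i
        elif not in_pulse and start is not None:
            segment = current[start:i]
            drops = [baseline - v for v in segment]
            depth = max(drops)                      # largest current reduction
            width = len(segment) * dt               # pulse duration in seconds
            peak_index = drops.index(depth)
            # Assumed asymmetry measure: relative position of the peak (0 to 1).
            asymmetry = (peak_index + 0.5) / len(segment)
            feature_sets.append(
                {"depth": depth, "width": width, "asymmetry": asymmetry}
            )
            start = None
    return feature_sets
```

Each dictionary returned corresponds to one feature value set, so a trace containing two pulses yields two sets that can be classified independently.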
[0067] Next, the information processing performed according to an embodiment of the present invention will be explained with reference to the flow charts shown in
Learning
[0068]
[0069] First, an electrolytic solution containing virus particles is prepared from a first sample of first known particles collected from a first living body, and is then introduced into the sensor module 101. When a voltage is applied to the electrodes of the sensor module 101, a transient change in the ion current occurs each time a virus particle passes through the pore; the change is amplified and digitized by the measuring instrument 320 and sent to the information terminal 340 as a first pulse waveform (Step S601). When the I/O 450 receives the pulse waveform, the pulse waveform is sent to the storage 420. Further, the information terminal 340 acquires information indicating the type of the first known particles, the first host ID for identifying the first living body, and the first host attribute information indicating the attribute of the first host from a keyboard 551, an optical sensor 552, and the like, and these are stored in the storage 420 via the I/O 450 (Step S602). In the example shown in
[0070] Next, the network I/O 460 sends the first teacher feature value set group, the first teacher label, the first host ID, and the first host attribute information to the server 360 via the network 399. The server 360 stores these pieces of information received at the network I/O 550 in a feature value table 531 and a host attribute table 532 of the storage 530 through the processor 510 (Step S604).
[0071] Here, with reference to
[0072] In one example shown in
[0073] In the embodiment, the server 360 may additionally receive the attribute information for each host from the information terminal 340 and store it in the host attribute table 532 in association with the corresponding host ID.
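The relationship between the feature value table 531 and the host attribute table 532 may be illustrated with the following sketch. It assumes both tables are in-memory mappings keyed by host ID; the class names, the `store` method, and the sample attribute keys are hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

FeatureSet = Dict[str, float]


@dataclass
class FeatureRow:
    """One entry of the feature value table, associated with a host ID."""
    host_id: str
    teacher_label: Optional[str]  # None models the blank label of unknown particles
    feature_sets: List[FeatureSet] = field(default_factory=list)


@dataclass
class ServerStorage:
    """Hypothetical stand-in for the tables held in the storage of the server."""
    feature_table: Dict[str, FeatureRow] = field(default_factory=dict)
    host_attribute_table: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def store(
        self,
        host_id: str,
        teacher_label: Optional[str],
        feature_sets: List[FeatureSet],
        host_attributes: Dict[str, str],
    ) -> None:
        """Store one upload under its host ID in both tables."""
        row = self.feature_table.setdefault(
            host_id, FeatureRow(host_id, teacher_label)
        )
        row.feature_sets.extend(feature_sets)
        self.host_attribute_table[host_id] = dict(host_attributes)
```

Keying both tables by host ID is what later allows a search on host attributes to be translated into a lookup of feature value sets.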
[0074] Referring back to
[0075] For the second known particles also, Steps S601 to S604 are executed in the same manner as for the first known particles. Through such processing, the feature value table 531 stores the second teacher feature value set group in association with the second host ID, and the host attribute table 532 stores the second host attribute information in association with the second host ID. In one example of
[0076] The processor 510 then inputs the teacher label stored in the feature value table 531 and the stored teacher feature value set together as teacher data to the learner 511. The learner 511 optimizes a number of machine learning parameters of the learner 511 itself so as to minimize the error function. The machine learning parameters optimized here are referred to as machine learning optimization parameters. The processor 510 stores the calculated machine learning optimization parameters in the optimization parameter table 533 (Step S605).
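As one concrete illustration of optimizing parameters so as to minimize an error function, the sketch below fits a logistic regression model by batch gradient descent on the cross-entropy error. The embodiment does not specify the learning algorithm, so the choice of logistic regression, the binary labels (1 for the first type, 0 otherwise), the hyperparameters, and all function names are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Params = Tuple[List[float], float]  # (weights, bias)


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(params: Params, x: Sequence[float]) -> float:
    """Probability that a feature value set belongs to the labeled type."""
    weights, bias = params
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(
    features: List[List[float]],
    labels: List[int],
    learning_rate: float = 0.1,
    epochs: int = 2000,
) -> Params:
    """Minimize the mean cross-entropy error by batch gradient descent."""
    n, d = len(features), len(features[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            err = predict_probability((weights, bias), x) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias
```

The returned weights and bias play the role of the machine learning optimization parameters: they are what would be stored after learning and reused at identification time.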
Identification
[0077]
[0078] When the I/O 450 receives the third pulse waveform group, it is sent to the storage 420. Further, the information terminal 340 acquires the third host ID for identifying the third living body and the third host attribute information that represents the attribute of the third host from the keyboard 551, the optical sensor 552, and the like, and these are stored in the storage 420 via the I/O 450 (Step S902). Since a sample usually contains a large number of particles, a large number of pulse waveforms are obtained by one measurement. For this reason, a plurality of pulse waveforms are stored. These will hereinafter be referred to as a third pulse waveform group. The processor 410 then inputs the third pulse waveform group to the feature value extractor 411, and generates a feature value set subject to analysis from each pulse waveform in the third pulse waveform group. As many feature value sets are generated as there are pulses in the third pulse waveform group (Step S903). As there are a plurality of feature value sets, these will hereinafter be referred to as the third feature value set group. Here, a feature value extracted from the third sample is referred to as a feature value subject to analysis in the sense that it is a feature generated from unknown particles to be analyzed.
[0079] Next, the network I/O 460 sends the feature value set group subject to analysis, the third host ID, and the third host attribute information to the server 360 via the network 399. The server 360 stores these pieces of information received at the network I/O 550 in the feature value table 531 and the host attribute table 532 of the storage 530 through the processor 510 (Step S904). In the example shown in
[0080] The processor 510 then inputs the machine learning optimization parameters stored in the optimization parameter table 533 in Step S605 and the feature value set group subject to analysis stored in the feature value table in Step S904 to the learner 511. Then, the learner 511 calculates, for each unknown particle pulse, the probability that the pulse is the same type of pulse as the first known sample (Step S905). In the method, a large number of pulse waveforms of unknown particles are usually monitored in one measurement, and, for each pulse waveform, the probability that the pulse waveform is the same type as the known sample is calculated. The probabilities for the respective pulses are combined to identify whether or not the unknown sample is the same type as the known sample (Step S906). One way to make this identification from the set of probabilities for the individual pulse waveforms is, for example, to calculate the average of the probabilities for the respective pulses. Alternatively, an embodiment of the present invention may use any other calculation method.
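The averaging approach mentioned for Step S906 may be sketched as follows. The decision threshold of 0.5 and the function name `identify_sample` are assumptions made for illustration; the description only states that the per-pulse probabilities are combined, for example by averaging.

```python
from statistics import mean
from typing import Sequence, Tuple


def identify_sample(
    pulse_probabilities: Sequence[float],
    decision_threshold: float = 0.5,
) -> Tuple[float, bool]:
    """Combine per-pulse probabilities into one sample-level decision.

    Returns the average probability and whether it reaches the
    (assumed) decision threshold.
    """
    if not pulse_probabilities:
        raise ValueError("at least one pulse probability is required")
    score = mean(pulse_probabilities)
    return score, score >= decision_threshold
```

Averaging treats every pulse equally, which makes the sample-level result less sensitive to a few atypical pulses than a decision based on any single pulse.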
[0081] As described above, in an embodiment of the present invention, the feature value extractor is located in the information terminal and the learner is located in the server; alternatively, the feature value extractor may be located in the server and feature value extraction in Steps S603 and S903 may be performed in the server 360. The feature value extractor 512 is represented by the dotted line in
Highly Accurate Identification Based on Host Attribute Information
[0082] In another embodiment, in the learning by the machine-learning program described with reference to
[0083] For example, even for the viruses supposed to be of the same type, if there are variants that depend on the attributes of the host, such as the area where the host lives, learning with a machine-learning program using a conventional method causes the learner to learn different features of multiple variants in mixture, which interferes with highly accurate particle identification. Further, for example, even for the particles of the same type in the sense that they act on living cells with the same biological selectivity, in the pore electric resistance method, there may be particles that lead to pulse waveforms having shapes that tend to differ depending on the attributes of the host. In this case also, highly accurate particle identification cannot be achieved for the same reason.
[0084] However, in an embodiment of the present invention, unlike the prior art, for example, a feature value set group is extracted from a pulse waveform obtained only from known particles derived from a host having the same host attribute information, and machine learning optimization parameters may be calculated only using the feature value set group. An example of such processing is shown in the flow chart of
[0085] In the method according to an embodiment of the present invention, as in the example that has been described here, the host attribute table may be searched using one piece of attribute information as a search key, or the host attribute table may be searched using multiple pieces of attribute information as a search key. In this case, learning yields machine learning optimization parameters for each combination of host attributes.
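The search described above may be sketched as a two-stage lookup: host attributes are first resolved to host IDs, and the host IDs are then used to collect teacher feature value sets. The table layouts (plain dictionaries keyed by host ID, with `"label"` and `"feature_sets"` entries) and both function names are assumptions, not the searcher of the embodiment.

```python
from typing import Dict, List

HostAttributeTable = Dict[str, Dict[str, str]]
FeatureTable = Dict[str, dict]


def search_host_ids(
    host_attribute_table: HostAttributeTable,
    search_key: Dict[str, str],
) -> List[str]:
    """Return host IDs whose attributes match every entry of the search key.

    The search key may hold one attribute or several attributes.
    """
    return [
        host_id
        for host_id, attrs in host_attribute_table.items()
        if all(attrs.get(name) == value for name, value in search_key.items())
    ]


def collect_teacher_feature_sets(
    feature_table: FeatureTable,
    host_ids: List[str],
    particle_type: str,
) -> List[List[float]]:
    """Gather teacher feature value sets of one particle type for the given hosts."""
    collected: List[List[float]] = []
    for host_id in host_ids:
        row = feature_table.get(host_id)
        if row is not None and row["label"] == particle_type:
            collected.extend(row["feature_sets"])
    return collected
```

Running the search once per combination of attribute values and training on each result would give one parameter set per combination, which could be kept in a dictionary keyed by that combination.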
[0086] Next, for the identification of unknown particles, prior to the identification flow shown in
[0087] Such processing according to an embodiment of the present invention enables particle identification with high accuracy that is not affected by differences in host attributes. Note that the searcher that has been described here may be either on the server or on the information terminal, and the aforementioned processing may be performed either on the server or on the information terminal. For example, the searcher 413 in
[0088] In another embodiment, as the teacher data given to the learner in the learning in Step S605, besides the feature value set stored in the feature value table 531 in Step S604, the host attribute information stored in the host attribute table 532 may be given as a feature value. For example, in the learning in Step S605, in addition to the feature value sets 722, 723, 724 . . . stored as teacher data in association with the host ID 720 in the feature value table 531, the host attribute information 863 stored in association with the host ID 820 in the host attribute table 532 may be used as a feature value, and the teacher label 721 may be used as the correct answer for learning.
[0089] In still another embodiment, multiple pieces of attribute information stored in the host attribute table 532 may be used as teacher data for learning in Step S605. For example, not only the attribute information 863 but also 862 and 861 may be used as teacher data together with the feature value set associated with the host ID 720. The process allows the machine learning parameters of the machine learning model to undergo optimization including the difference in particles depending on the host. As described above, the machine learning model learned according to an embodiment of the present invention may be used as a machine learning model for particle identification with wider general versatility.
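One way to supply host attribute information to a learner together with the pulse feature values is to encode each attribute value as a one-hot indicator and append it to the feature value set. The sketch below illustrates that idea only; one-hot encoding, the function names, and the sample attribute values are assumptions, since the description does not state how attribute information is encoded.

```python
from typing import Dict, List, Sequence, Tuple

Vocabulary = List[Tuple[str, str]]  # ordered (attribute name, attribute value) pairs


def build_attribute_vocabulary(
    host_attribute_table: Dict[str, Dict[str, str]],
    attribute_names: Sequence[str],
) -> Vocabulary:
    """List every (attribute, value) pair seen in the table, in a stable order."""
    vocabulary: Vocabulary = []
    for name in attribute_names:
        values = sorted(
            {attrs[name] for attrs in host_attribute_table.values() if name in attrs}
        )
        vocabulary.extend((name, value) for value in values)
    return vocabulary


def augment_feature_set(
    feature_set: Sequence[float],
    host_attributes: Dict[str, str],
    vocabulary: Vocabulary,
) -> List[float]:
    """Append one-hot host attribute indicators to a pulse feature value set."""
    indicators = [
        1.0 if host_attributes.get(name) == value else 0.0
        for name, value in vocabulary
    ]
    return list(feature_set) + indicators
```

The same vocabulary must be used for teacher data and for data subject to analysis so that each position in the augmented vector keeps the same meaning at learning time and at identification time.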
[0090] Embodiments of the present invention are able to provide, in addition to the aforementioned method, an apparatus or hardware that can implement the method, a program, and products (e.g., an arbitrary medium, carrier, and module) that store a part or all of the program in a format that is executable by the user.
REFERENCE SIGNS LIST
[0091] 101 Sensor module
[0092] 102 Sensor module
[0093] 103 Sensor module
[0094] 110 Chamber
[0095] 111 Electrolytic solution inlet
[0096] 112 Electrode
[0097] 120 Chamber
[0098] 121 Electrolytic solution inlet
[0099] 122 Electrode
[0100] 130 Partition
[0101] 140 Pore
[0102] 141 Silicon wafer
[0103] 142 Thin film
[0104] 150 Amplifier
[0105] 151 Ammeter
[0106] 152 Power supply
[0107] 190 Target particles
[0108] 201 Current value
[0109] 202 Current value
[0110] 203 Current value
[0111] 320 Measuring instrument
[0112] 340 Information terminal
[0113] 360 Server
[0114] 399 Network
[0115] 410 Processor
[0116] 411 Feature value extractor
[0117] 412 Learner
[0118] 413 Searcher
[0119] 420 Storage
[0120] 421 Feature value table
[0121] 422 Host attribute table
[0122] 430 Memory
[0123] 440 Display
[0124] 450 I/O
[0125] 460 Network I/O
[0126] 510 Processor
[0127] 511 Learner
[0128] 512 Feature value extractor
[0129] 513 Searcher
[0130] 520 Memory
[0131] 530 Storage
[0132] 531 Feature value table
[0133] 532 Host attribute table
[0134] 533 Optimization parameter table
[0135] 540 Display
[0136] 550 Network I/O
[0137] 551 Keyboard
[0138] 552 Optical sensor
[0139] 700 Teacher label
[0140] 710 Heading row
[0141] 711 Column showing feature values for pulse depth
[0142] 712 Column showing feature values for pulse width
[0143] 713 Column showing feature values for pulse asymmetry
[0144] 720 First host ID
[0145] 721 Teacher label
[0146] 722 Teacher feature value set
[0147] 723 Teacher feature value set
[0148] 724 Teacher feature value set
[0149] 730 Second host ID
[0150] 731 Teacher label
[0151] 732 Teacher feature value set
[0152] 733 Teacher feature value set
[0153] 734 Teacher feature value set
[0154] 740 Third host ID
[0155] 741 Blank
[0156] 742 Feature value set subject to analysis
[0157] 743 Feature value set subject to analysis
[0158] 744 Feature value set subject to analysis
[0159] 810 Heading row
[0160] 820 First host ID
[0161] 830 Second host ID
[0162] 840 Third host ID
[0163] 851 Column showing host attribute information (gender)
[0164] 852 Column showing host attribute information (age)
[0165] 853 Column showing host attribute information (area)
[0166] 861 Host attribute information
[0167] 862 Host attribute information
[0168] 863 Host attribute information
[0169] 873 Host attribute information