METHOD FOR PREDICTING YIELD PERFORMANCE OF A CROP PLANT
20220155298 · 2022-05-19
Inventors
- Monika HEILMANN (Limburgerhof, DE)
- Pilar PUENTE (Limburgerhof, DE)
- Oliver THIMM (Ludwigshafen, DE)
- Iain PROCTOR (Limburgerhof, DE)
- Girish SRINIVAS (Limburgerhof, DE)
Cpc classification
International classification
G16B20/00
PHYSICS
Abstract
The invention relates to a method for predicting yield performance of a crop plant, comprising the steps of receiving metabolite measurements of the crop plant; determining new metabolite features by combining the received metabolite measurements, wherein at least one new metabolite feature is based on a classified average; providing the new metabolite features to a trained machine learning model; and determining yield performance of the crop plant using the provided model. It also relates to a method for training a machine learning model for predicting yield performance of a crop plant; a control unit configured to execute the method for predicting yield performance; to a plant breeding method and a farming method that apply said method; and the use of new metabolite features as determined in said method for prediction of yield performance.
Claims
1. A method for predicting yield performance of a crop plant, the method comprising: receiving (S1) metabolite measurements (M) of the crop plant (50); determining (S2) new metabolite features (Mn) by combining the received metabolite measurements (M), wherein at least one new metabolite feature (Mn) is based on a classified average; providing (S3) the new metabolite features to a trained machine learning model (13); and determining (S4) yield performance (Yp) of the crop plant (50) using the provided model (13).
2. The method of claim 1, further comprising: receiving hyperspectral data (Dh) of the crop plant (50); determining vegetation indices (I), relating to a combination of spectral bands from the crop plant (50), preferably having physiological meaning, from the hyperspectral data (Dh); and providing the vegetation indices to the trained machine learning model (13).
3. The method of claim 1, further comprising: determining the metabolite measurements (M) based on a crop plant sample (S) by chromatography, preferably polar gas chromatography (GCP), lipid gas chromatography (GCL), polar liquid chromatography (LCP) and/or lipid liquid chromatography (LCL).
4. The method of claim 1, wherein the classified average is determined by a) assigning the received metabolite measurements (M) to at least one ontology (F1, F2); and b) determining the average of the metabolite measurements (M) that are assigned to the same ontology.
5. The method of claim 4, wherein the ontology includes metabolite measurements (M) at different points in time during the crop cycle.
6. The method of claim 4, wherein the ontology is based on a chemical or biochemical generalization of metabolites.
7. The method of claim 4, wherein the metabolite measurements (M) are assigned to at least two hierarchy levels of ontologies (F1, F2), preferably wherein the first ontology level is defined according to a biomolecular or bio-functional classification of metabolites; more preferably wherein the second ontology level is defined according to biochemical relation of metabolites.
8. The method of claim 1, wherein new metabolite features (Mn) are determined by a) assigning the received metabolite measurements (M) to different ontologies (F3) based on a classification of metabolites as substrate(s) or product(s) of an enzymatically catalyzed reaction; and b) determining a ratio between product metabolite measurements and substrate metabolite measurements.
9. The method of claim 1, wherein the received metabolite measurements (M) and the new metabolic features (Mn) are provided to the trained machine learning model (13).
10. The method of claim 1, wherein the yield performance (Yp) is determined based on metabolite measurements from the vegetative and/or reproductive growth stage of the crop plant (50).
11. A method for training a machine learning model for predicting yield performance of a crop plant, the method comprising: receiving historical data sets comprising metabolite measurements in connection with a measured yield performance, wherein each data set comprises metabolite measurements for different points in time of the growth cycle for one or more crop plant(s); determining new metabolite features combining the received historical data sets, wherein at least one new metabolite feature is based on a classified average; generating a training data set and a test data set based on the historical data sets with new metabolite features; providing a machine learning model and training the machine learning model based on the training data set; and testing the trained machine learning model based on the test data set.
12. The method of claim 11, further comprising: on training, validating the yield performance (Yp) and providing validation data (V) by comparing the predicted yield performance (Yp) with the actual yield performance (Ya) of the respective crop plant (50); and adjusting the model (13) based on the validation data (V).
13. The method of claim 11, further comprising: adjusting a parametrization (P) of a machine learning algorithm determining the model (13) based on the validation data (V).
14. The method of claim 11, further comprising: determining a best new metabolite feature (Mb) from the new metabolite features (Mn) based on the validation data (V); wherein the best metabolite feature (Mb) comprises the metabolite measurements (M) with the highest impact on the expected yield performance; and wherein preferably the best metabolite feature (Mb) comprises metabolite measurements (M) extracted by polar gas chromatography (GCP).
15. A control unit (10) being configured for executing the method of claim 1.
16. A yield evaluation platform (100), comprising: a profiling platform (20) configured for determining metabolite measurements (M) from a crop plant sample (S); and a control unit (10) of claim 15.
17. A plant breeding method, comprising: determining yield performance (Yp) per plant of more than one crop plant (50) using the method of claim 1; and selecting the crop plants (50) with a predicted yield performance (Yp) according to predicted yield performance (Yp) for future breeding cycles.
18. A farming method, comprising: determining yield performance (Yp) of one or more crop plant(s) (50) using the method of claim 1; providing an expected yield performance of the crop plant(s) (50) depending on the determined yield performance (Yp); and adjusting farming conditions by the farmer depending on the expected yield performance of the crop plant(s) (50).
19. (canceled)
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0142] Exemplary embodiments will be described in the following with reference to the following drawings:
DETAILED DESCRIPTION OF EMBODIMENTS
[0148]
[0149] To develop relevant biomarkers, crop field trials have to be run on different setups, in particular at different levels of abiotic stress such as drought stress. To this end, different crop plants 50 or areas of crop plants 50 on the crop field 40 are continuously subjected to constant levels of drought stress. For example, two simultaneous field trials are set up using a randomized block design with three different levels of water treatment. As a comparison, a control group of crop plants 50 that are not stressed is also added. Crop plants 50 are subjected to these different treatment levels at vegetative or reproductive growth stages, respectively. The same trials are conducted for several subsequent years. The crop plant trials are run from the vegetative growth stage, about thirty days, through the reproductive phase, about sixty days, of the crop plant 50. Ideally, a biomarker can be found that has a high field predictive power and allows assumptions on the yield or the yield performance of a crop plant 50 within the vegetative growth stage. Thus, assumptions on the yield or the yield performance of a crop plant 50 in the field can be made while the crop plant 50 is still in the greenhouse, preferably at an early stage of the vegetative growth stage.
[0150] The crop field trials use drought stress as abiotic stress to evaluate different biomarkers on their predictive power regarding the yield performance of the crop plant 50 under drought stress. The expected yield performance of a crop plant 50 decreases with the amount of drought stress applied to the crop plant 50.
[0151] For each experimental setup in crop field trials, at least two different varieties of crop seeds are used, but at least one common variety is maintained throughout the crop field trials.
[0152] To find a valid biomarker, crop plant samples S of the different crop plants 50 are taken at different stages of the crop field trial. For example, the crop plant sample S is corn leaf tissue. For example, crop plant samples S are taken at three different time points. The crop plant samples S are then provided to a profiling platform 20, which generates metabolite measurements M from the crop plant samples S and provides them to the control unit 10, as described in detail in
[0153] Additional information about the crop field 40 is gathered by remote sensing. For this purpose, a hyperspectral sensor 31, which is preferably mounted on a drone 30, gathers hyperspectral and thermal information, as well as information about the volume and the height of the crop plants 50. The hyperspectral sensor 31 is configured for gathering hyperspectral data Dh, in particular by spectral imaging with visible and near-infrared (VNIR) and/or short wavelength infrared (SWIR). The hyperspectral data Dh is provided to the control unit 10.
[0154] Instead of a drone 30, the hyperspectral sensor 31 can be mounted on any manned or unmanned working machine.
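As a non-limiting illustration of how a vegetation index I could be derived from the hyperspectral data Dh, the following minimal sketch computes the widely used normalized difference vegetation index (NDVI) from a reflectance spectrum. The disclosure does not name a specific index. The choice of NDVI, the band limits in nanometres, the function names and the sample spectrum are all assumptions made for illustration only.

```python
def band_mean(spectrum, low_nm, high_nm):
    """Mean reflectance over all bands whose wavelength lies in [low_nm, high_nm]."""
    values = [refl for wl, refl in spectrum.items() if low_nm <= wl <= high_nm]
    if not values:
        raise ValueError(f"no bands between {low_nm} and {high_nm} nm")
    return sum(values) / len(values)


def ndvi(spectrum):
    """NDVI = (NIR - red) / (NIR + red); the band limits below are assumed, not from the disclosure."""
    red = band_mean(spectrum, 630, 690)
    nir = band_mean(spectrum, 760, 900)
    return (nir - red) / (nir + red)


# Hypothetical hyperspectral data Dh for one plot: wavelength in nm -> reflectance.
spectrum = {650: 0.05, 670: 0.05, 800: 0.45, 850: 0.45}
index_value = ndvi(spectrum)
print(round(index_value, 3))
```

Averaging several narrow bands before forming the index is one simple way to combine spectral bands; other indices would follow the same pattern with different band limits.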
[0155]
[0156] As is shown, from a single crop plant sample S four different data sets can be received. The total number of metabolite measurements M identified with all four data sets is around 750 metabolite measurements M. The metabolite measurements M are determined by polar gas chromatography GCP, lipid gas chromatography GCL, polar liquid chromatography LCP and/or lipid liquid chromatography LCL and then provided to the control unit 10.
[0157]
[0158] In this example, the metabolite measurements M determined by polar gas chromatography GCP are partly assigned to a first ontology F1, for example comprising organic acids, amino acids and related compounds, and carbohydrates and related compounds.
[0159] In this example, the metabolite measurements M determined by polar gas chromatography GCP are partly assigned to a second ontology F2, for example comprising sugar alcohols, sugar phosphates and free sugars.
[0160] In this example, further metabolite features Mn are defined by different ratios between two metabolite measurements M determined by polar gas chromatography GCP that are defined as product and substrate in an enzyme mapping F3.
[0161] Using this method, new metabolite features Mn are determined by combining the received metabolite measurements M. The new metabolite features Mn are then provided to a model 13 of the control unit 10.
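The determination of new metabolite features Mn described above can be sketched as follows. This is a minimal illustration only, assuming that the metabolite measurements M are available as a mapping from metabolite name to a relative abundance. The metabolite names, their values, the class assignments and the single substrate and product pair are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical metabolite measurements M: metabolite name -> relative abundance.
measurements = {
    "glucose": 4.0,
    "fructose": 2.0,
    "sorbitol": 1.0,
    "mannitol": 3.0,
    "citrate": 5.0,
}

# Hypothetical ontology assignment (in the spirit of F2): metabolite -> class.
ontology = {
    "glucose": "free sugars",
    "fructose": "free sugars",
    "sorbitol": "sugar alcohols",
    "mannitol": "sugar alcohols",
    "citrate": "organic acids",
}

# Hypothetical enzyme mapping (in the spirit of F3): (substrate, product) pairs.
enzyme_pairs = [("glucose", "sorbitol")]


def classified_averages(measurements, ontology):
    """Average all measurements that are assigned to the same ontology class."""
    groups = defaultdict(list)
    for name, value in measurements.items():
        if name in ontology:
            groups[ontology[name]].append(value)
    return {cls: mean(values) for cls, values in groups.items()}


def product_substrate_ratios(measurements, enzyme_pairs):
    """Ratio of product to substrate for each mapped reaction."""
    ratios = {}
    for substrate, product in enzyme_pairs:
        s = measurements.get(substrate)
        p = measurements.get(product)
        if s and p is not None:  # skip missing values and a zero substrate
            ratios[f"{product}/{substrate}"] = p / s
    return ratios


# New metabolite features Mn: classified averages plus product/substrate ratios.
new_features = {
    **classified_averages(measurements, ontology),
    **product_substrate_ratios(measurements, enzyme_pairs),
}
print(new_features)
```

In this sketch, each ontology class contributes one feature regardless of how many metabolites it contains, which reduces the number of inputs compared with the raw measurements.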
[0162]
[0163] The different ontologies F1 to F12 are validated and best new metabolite features Mb are determined from the new metabolite features Mn, wherein the best metabolite features Mb comprise the metabolite features Mn with the highest impact on the expected yield performance. In this case the first ontology F1 and the second ontology F2 are the best new metabolite features Mb.
[0164]
[0165] The model 13 is provided with parameters P from the machine learning unit 15. Based on the parameters P and the provided data from the ontology unit 11 and the hyperspectral sensor 31, the model 13 is trained. The model 13 is then used to provide yield prediction data Yp of the respective crop plant 50. The model is preferably trained and tested to predict two classes of yield: “yield loss” and “no yield loss”. The yield prediction data Yp of the model 13 is provided to the validation unit 14. If available, the validation unit 14 is additionally provided with actual yield data Ya. The validation unit 14 then compares the yield prediction data Yp with the actual yield data Ya and determines validation data V, representing the accuracy of the yield prediction data Yp. The validation data V is provided to the machine learning unit 15, which adjusts the parameters P provided to the model 13 based on the validation data V.
[0166] The control unit 10, the ontology unit 11, the vegetation indices unit 12, the model 13, the validation unit 14 and/or the machine learning unit 15 may refer to a data processing element such as a microprocessor, microcontroller, field programmable gate array (FPGA), central processing unit (CPU) or digital signal processor (DSP) capable of receiving crop field data, e.g. via a universal serial bus (USB), a physical cable, Bluetooth, or another form of data connection. The respective units may be several independent devices. However, several or all of the respective units may be integrated into one device.
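The feedback between the model 13, the validation unit 14 and the machine learning unit 15 can be illustrated by the following minimal sketch. It deliberately swaps in a very simple stand-in for the model, namely a single threshold on one feature, and a grid search as the parameter adjustment. Neither is stated in the disclosure. The feature values, the candidate parameters and all function names are hypothetical.

```python
def predict(feature_value, threshold):
    """Stand-in for model 13: predict 'yield loss' when the feature is below the threshold."""
    return "yield loss" if feature_value < threshold else "no yield loss"


def validate(predicted, actual):
    """Stand-in for validation unit 14: fraction of predictions Yp that match actual yield Ya."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


def fit_threshold(features, actual, candidates):
    """Stand-in for machine learning unit 15: keep the parameter P with the best validation data V."""
    best_p, best_v = None, -1.0
    for p in candidates:
        yp = [predict(x, p) for x in features]
        v = validate(yp, actual)
        if v > best_v:
            best_p, best_v = p, v
    return best_p, best_v


# Hypothetical feature values (one per crop plant) and actual yield classes Ya.
features = [0.2, 0.4, 0.5, 0.9, 1.1, 1.3]
actual = ["yield loss"] * 3 + ["no yield loss"] * 3

best_p, best_v = fit_threshold(features, actual, candidates=[0.3, 0.7, 1.0])
print(best_p, best_v)
```

For brevity, the sketch scores the parameter on the same data it was selected on. In practice, a separate test data set would be held back to assess the trained model.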
[0167]
[0168] In step S1, metabolite measurements M of the crop plant 50 are received. In step S2, new metabolite features Mn are determined by combining the received metabolite measurements M. In step S3, a model 13 is determined by a machine learning algorithm based on the new metabolite features Mn. In step S4, yield performance prediction data Yp of the crop plant 50 is determined using the determined model 13.
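The sequence of steps S1 to S4 can be sketched as a single function. The sketch assumes, in line with the claims, that the model is already trained when it is provided with the features, and it represents the model as a plain callable. The toy model, its threshold of 2.5 and the sample data are hypothetical and serve only to make the example executable.

```python
from statistics import mean


def predict_yield(measurements, ontology, model):
    """Chain steps S1 to S4 for one crop plant."""
    # S1: receive metabolite measurements M (passed in as a name -> value mapping).
    # S2: determine new metabolite features Mn as averages per ontology class.
    groups = {}
    for name, value in measurements.items():
        groups.setdefault(ontology.get(name, "unassigned"), []).append(value)
    new_features = {cls: mean(values) for cls, values in groups.items()}
    # S3: provide the new features to the model; S4: obtain the yield prediction Yp.
    return model(new_features)


def toy_model(features):
    """Hypothetical trained model: an assumed threshold on one class average."""
    return "yield loss" if features.get("free sugars", 0.0) < 2.5 else "no yield loss"


yp = predict_yield(
    {"glucose": 4.0, "fructose": 2.0},
    {"glucose": "free sugars", "fructose": "free sugars"},
    toy_model,
)
print(yp)
```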
REFERENCE SIGNS
[0169] 10 control unit
[0170] 11 ontology unit
[0171] 12 vegetation indices unit
[0172] 13 model
[0173] 14 validation unit
[0174] 15 machine learning unit
[0175] 20 profiling platform
[0176] 21 preparation unit
[0177] 30 drone
[0178] 31 hyperspectral sensor
[0179] 40 crop field
[0180] 50 crop plant
[0181] 100 yield evaluation platform
[0182] S crop plant sample
[0183] Yp yield prediction data
[0184] Ya actual yield data
[0185] P parameter
[0186] V validation data
[0187] M metabolite measurements
[0188] Mn new metabolite features
[0189] Mb best metabolite features
[0190] GCP polar gas chromatography
[0191] GCL lipid gas chromatography
[0192] LCP polar liquid chromatography
[0193] LCL lipid liquid chromatography
[0194] F1 first ontology
[0195] F2 second ontology
[0196] F3 ratio between product metabolites and substrate metabolites
[0197] F4 to F12 fourth to twelfth ontology
[0198] Dh hyperspectral data
[0199] I vegetation indices
[0200] S1 receiving metabolite measurements
[0201] S2 determining new metabolite features
[0202] S3 determining a model
[0203] S4 determining yield performance prediction data