System and Method for Crop Monitoring

20230292647 · 2023-09-21

    Abstract

    Disclosed is a method of automated crop monitoring based on the processing and analysis of a large number of high resolution aerial images that map an area of interest using computer vision and machine learning techniques. The method comprises receiving 120 or retrieving image data containing a plurality of high resolution images of crops in an area of interest for monitoring, identifying 130 one or more crop features of each crop in each image, determining 140, for each identified crop feature, one or more crop feature attributes, and generating or determining 160 one or more crop monitoring outputs based, at least in part, on the crop features and crop feature attributes. Also disclosed is a method of generating field camera-specific training data for the machine learning model used to analyse the received image data.

    Claims

    1. A method of automated crop monitoring, comprising: receiving image data containing a plurality of images of crops in an area of interest for monitoring; identifying one or more crop features of each crop in each image; determining, for each identified crop feature, one or more crop feature attributes; and generating one or more crop monitoring outputs based, at least in part, on the crop features and crop feature attributes.

    2. The method of claim 1, wherein the one or more crop monitoring outputs include one or more of: a crop feature population count, a crop feature population density map, a volumetric crop yield prediction, a crop loss map, a diseased crop map, and one or more intervention instructions.

    3. The method of claim 2, wherein the one or more intervention instructions comprise instructions to apply one or more treatments to one or more regions of the area of interest, and optionally or preferably, wherein the instructions are machine integrated instructions for one or more agricultural machinery units or vehicles to apply the one or more treatments to the one or more regions.

    4. The method of claim 1, further comprising generating or updating a spatially resolved model of the identified crop features in the area of interest, wherein each crop feature is associated/tagged with an attribute vector comprising its respective one or more crop feature attributes, and optionally, wherein the model comprises a three-dimensional point cloud, where each three-dimensional point represents a crop feature associated/tagged with its attribute vector.

    5. (canceled)

    6. The method of claim 4, wherein the image data is generated at a first time or date, and the method comprises: receiving second image data containing a second plurality of images of crops in the area of interest generated at a second time or date; identifying one or more crop features of each crop in each image; determining, for each identified crop feature, one or more crop feature attributes; generating, based on the crop features and crop feature attributes, one or more crop monitoring outputs; and updating the model to include the crop features and crop feature attributes for the second time or date.

    7. The method of claim 1, wherein the one or more crop features in each image are identified using a machine learning model trained on a training dataset of crop images to identify the one or more crop features in the respective image based, at least in part, on one or more image features extracted from each respective image; and optionally or preferably, wherein identifying a crop feature includes identifying a crop feature type.

    8. The method of claim 1, wherein determining the one or more crop feature attributes comprises extracting one or more primary crop feature attributes from each identified crop feature based, at least in part, on the image pixel values and/or based on one or more image features extracted from each respective image, and wherein the one or more primary crop feature attributes include any one or more of: a location, a color, a dimension, and a sub-feature count, the location of each crop feature determined, at least in part, using geolocation data of each respective image in the image data.

    9-10. (canceled)

    11. The method of claim 1, wherein determining the one or more crop feature attributes comprises determining one or more secondary crop feature attributes for each identified crop feature using a machine learning model trained on a training dataset of crop images to determine the one or more secondary crop feature attributes based, at least in part, on one or more image features extracted from each respective image, and wherein the one or more secondary crop feature attributes include one or more of: diseased and disease type, pest-ridden and pest type, weed-ridden and weed type, healthy, and unhealthy.

    12-13. (canceled)

    14. The method of claim 6, wherein the one or more image features comprise any one or more of: edges, corners, ridges, blobs, RGB colour composition, area range, shape, aspect ratio, and feature principal axis.

    15. The method of claim 6, wherein the image data is generated by a field camera, and the machine learning model is trained on training data specific to the image resolution of the field camera; and optionally or preferably, wherein the training data is generated from hyperspectral images of crops in a control growth environment, and/or by the method of claim 26.

    16. The method of claim 1, wherein each image is mapped to a different geolocation in the area of interest, and/or the plurality of images form an orthomosaic map of the area of interest.

    17. The method of claim 1, wherein the plurality of images include multiple viewpoints of each crop in the area of interest, and the step of determining, for each identified crop feature, one or more crop feature attributes comprises: for each identified crop feature, combining each respective crop feature attribute extracted from each respective viewpoint to provide one or more composite crop feature attributes.

    18. The method of claim 1, wherein the plurality of images have a pixel resolution of at least 32 pixels per meter, and/or a pixel size of less than 25 mm.

    19. (canceled)

    20. The method of claim 1, comprising generating the image data using at least one field camera mounted to a drone; and optionally mapping each image to a different geolocation in the area of interest, and/or generating an orthomosaic map of the area of interest from the plurality of images.

    21. A crop monitoring system, comprising: a processing device with processing circuitry and a machine readable medium containing instructions which, when executed on the processing circuitry, cause the processing device to: receive image data containing a plurality of images of crops in an area of interest for monitoring; identify one or more crop features of each crop in each image; determine, for each identified crop feature, one or more crop feature attributes; and generate one or more crop monitoring outputs based, at least in part, on the crop features and crop feature attributes.

    22. The system of claim 21, comprising one or more imaging systems for generating the image data, wherein the one or more imaging systems comprises a drone; and optionally wherein the drone is configured to receive flight control instructions from the processing device for generating the image data and optionally send the generated image data to the processing device.

    23. (canceled)

    24. The system of claim 21, comprising one or more agricultural machinery units or vehicles for applying a treatment to one or more regions of the area of interest based on the one or more intervention instructions.

    25. (canceled)

    26. A method of generating training data for a machine learning model used to determine crop feature attributes of crop features in images of crops generated by a field camera for crop monitoring, the method comprising: receiving image data containing a hyperspectral training image of crops generated in a controlled growth environment using a hyperspectral training camera; generating one or more field camera-specific training images from the hyperspectral training image, the field camera-specific training images having an equivalent image resolution to that of a field camera used to generate the field images; identifying one or more crop features of each crop in the field camera-specific training images; labelling a sub-set of identified crop features with the one or more crop feature attributes; and storing the labelled classified crop features in a database as a training data set for the machine learning model.

    27. The method of claim 26, wherein the step of labelling comprises determining, for each identified crop feature, one or more primary crop feature attributes based on the pixel attributes of the respective identified crop feature.

    28. The method of claim 27, wherein the one or more primary attributes comprise one or more geometric and/or spectral attributes derived from the pixel attributes of the respective identified crop feature; and, optionally or preferably wherein the geometric attributes include one or more of: location, dimensions, area, aspect ratio, sub-feature size and/or count; and/or wherein the spectral attributes include one or more of: dominant colour, RGB, red edge and/or NIR pattern, hyperspectral signature, normalised difference vegetation index (NDVI), and normalised difference water index (NDWI).

    29. The method of claim 27, wherein the step of labelling further comprises determining, for each identified crop feature, one or more secondary crop feature attributes based at least in part on the primary crop feature attributes and ground control data for the crops and/or image; and, optionally or preferably wherein the ground control data comprises known information including one or more of: crop type, disease type, weed type, growth conditions, and crop age.

    30. The method of claim 26, wherein the step of generating one or more field camera-specific training images comprises: modifying the pixel values of the hyperspectral image based on the spectral response of the field camera, optionally by determining a set of spectral filter weights for each spectral band of the field camera based on the spectral response of the respective spectral band of the field camera, and applying the set of filter weights to the spectral bands of each pixel of the hyperspectral image; and generating the one or more field camera-specific training images from the modified pixel values of the hyperspectral image; and, optionally or preferably wherein the one or more field camera-specific training images comprise one or more of: an RGB, near infrared and red-edge image.

    31. The method of claim 26, wherein the step of generating one or more field camera-specific training images comprises re-sampling the hyperspectral training image to substantially match spatial and/or pixel resolution of the field camera; and, optionally or preferably wherein the re-sampling is based on one or more equivalence parameters of the field camera.

    32. The method of claim 26, wherein the image data comprises a series or plurality of hyperspectral training images of the crops, each hyperspectral training image taken at a different point in time, and wherein the method comprises: generating one or more field camera-specific training images from each hyperspectral training image in the time series; identifying, for each point in time, one or more crop features of each crop in the field camera-specific training images; and labelling, for each point in time, a sub-set of identified crop features with the one or more crop feature attributes including a respective time stamp.

    33. The method of claim 32, comprising applying one or more geometric and/or spectral corrections to the hyperspectral training images or the one or more field camera-specific training images associated with each different point in time to account for temporal variations in camera position and lighting conditions.

    34. The method of claim 33, comprising: assigning one of the hyperspectral training images or field camera-specific training images associated with a given point in time as a reference image; applying a geometric transformation to the other hyperspectral training images or the other field camera-specific training images associated with different points in time to substantially match the spatial location and pixel sampling of the reference image, optionally based on the location and size of one or more pixels of one or more ground control points in each image; and/or applying a white balance to the other hyperspectral training images or the other field camera-specific training images associated with different points in time to substantially match the white balance of the reference image, optionally based on one or more pixel values of one or more ground control points in each image.

    35. The method of claim 26, comprising training a machine learning model to identify crop features and determine crop feature attributes of crop features in images of crops generated by a field camera using the field camera-specific training images and training data set; and, optionally or preferably, wherein the machine learning model is or comprises a deep or convolutional neural network.

    36. The method of claim 26, comprising generating the image data by taking a plurality of hyperspectral images over a period of time using a hyperspectral camera in substantially the same position relative to the crops; and/or wherein each hyperspectral image is taken from substantially the same position relative to the crops.

    37. (canceled)

    Description

    BRIEF DESCRIPTION OF DRAWINGS

    [0079] In order that the invention can be well understood, embodiments will now be discussed by way of example only with reference to the accompanying drawings, in which:

    [0080] FIGS. 1(a) and 1(b) show illustrations of wheat crops and a wheat head respectively;

    [0081] FIG. 2 shows an image of stripe rust and leaf rust on a leaf;

    [0082] FIG. 3 shows a method of automated crop monitoring according to the invention;

    [0083] FIG. 4 shows a method of generating image data;

    [0084] FIG. 5 shows a schematic illustration of a drone imaging crops;

    [0085] FIGS. 6(a) and 6(b) show a low resolution satellite image of a field indicating an area of interest and a corresponding normalised difference vegetation index (NDVI) map of the field, respectively;

    [0086] FIGS. 7(a) and 7(b) show a high resolution image of crops and the same image highlighting detected crop features, respectively;

    [0087] FIG. 8 shows a method of identifying crop features;

    [0088] FIG. 9(a) shows an orthomosaic map of an area of interest;

    [0089] FIG. 9(b) shows the same image in FIG. 9(a) overlaid with a crop feature density map;

    [0090] FIG. 9(c) shows a zoomed-in view of a region of the area of interest in FIG. 9(a);

    [0091] FIG. 10 shows a schematic diagram of a system for implementing the method of FIG. 3;

    [0092] FIG. 11 shows a method of generating training data;

    [0093] FIG. 12 shows a schematic diagram of a controlled growth setting for generating training image data;

    [0094] FIG. 13 shows an example training image of crops generated in a controlled growth setting with a map of individual crops in the image;

    [0095] FIG. 14 shows a hyperspectral cube representation of an example hyperspectral training image;

    [0096] FIG. 15 shows a composite image of different areas of crops generated from the training image of FIG. 13 after geometric correction with the map of individual crops in the image;

    [0097] FIG. 16 shows example spectral response curves for the spectral bands of a field camera used for generating field camera-specific training images; and

    [0098] FIGS. 17(a) to 17(c) shows example composite RGB, near infrared and red-edge field camera-specific training images generated from the training image of FIG. 13.

    [0099] It should be noted that the figures are diagrammatic and may not be drawn to scale. Relative dimensions and proportions of parts of these figures may have been shown exaggerated or reduced in size, for the sake of clarity and convenience in the drawings. The same reference signs are generally used to refer to corresponding or similar features in modified and/or different embodiments.

    DETAILED DESCRIPTION

    [0100] FIGS. 1(a) and 1(b) show example illustrations of crops 10 and crop features 12, 14, 16 and sub-features 12a, which in this example are wheat plants 10 with heads 12, stems 14, leaves 16, and kernels 12a (grains/seeds). The crop features are inspected by farmers to monitor crop health, loss and yield. A crop feature may have several attributes that correlate with crop health, loss and yield, and which a farmer can look for to base various conclusions and decisions on. For example, pests such as aphids and mites can be detected upon close inspection of crop features. In addition, many diseases manifest as visible deterioration on the crop features that can be detected by experienced farmers at a relatively early stage, such as stripe rust on leaves 16 shown in FIG. 2. Also, the number of kernels 12a can be counted. By way of example, a wheat head 12 may have 25-50 kernels 12a depending on the health and nutrition of the crop. A high yielding crop of wheat may have 45-50 kernels per head, but this is reduced if nitrogen supply is limited. As such, a farmer can visually inspect the number of kernels 12a to conclude that nitrogen supply is limited or that yield is high. Based on detection of disease, pests, weeds, and/or poor health/yield, the farmer can intervene by applying one or more treatments to the crops, e.g. fungicide, pesticide, weed killer, nitrogen, etc. However, this manual process has several drawbacks: it is extremely time consuming; it is not feasible to inspect every crop, so in practice only small samples or areas of crops are inspected and results are extrapolated across the whole field; and results can vary depending on the knowledge and experience of the farmer doing the crop monitoring.

    [0101] The present invention automates and substantially improves upon manual crop inspection by processing and analysing a large number (typically hundreds or thousands) of aerial images of an area of interest (AOI) containing crops using computer vision and machine learning techniques to automatically detect, classify and analyse crop features and extract their attributes in a consistent manner throughout the AOI, and provide various quantitative outputs that a farmer can use to efficiently monitor and maintain the crops, such as accurate crop yield predictions, crop population and dimension statistics, crop-loss/disease diagnosis, spatially resolved maps of crop attributes and intervention instructions, as will be described in more detail below.

    [0102] FIG. 3 shows an exemplary method 100 of automated crop monitoring according to an embodiment of the invention.

    [0103] In step 120, image data containing a plurality of images I.sub.HR of crops in the AOI is received or retrieved, e.g. from a storage medium or database. The images I.sub.HR are high resolution (HR) digital multi-spectral aerial images. In this context, multi-spectral means that each HR image I.sub.HR comprises at least red, green, blue (RGB) colour channels, and may also include one or more infrared (IR) channels. Each HR image I.sub.HR captures a different region of the AOI with sufficient pixel resolution to resolve individual crops 10 and crop features 12, 14, 16 in it, and contains geo-location (i.e. X, Y, or longitude and latitude) and time/date metadata. As such, the HR images I.sub.HR contain the necessary geo-location information to reference them to the AOI, to a particular crop and to each other. The HR images I.sub.HR are spatially overlapping to ensure complete coverage of the AOI. The HR images I.sub.HR may be referred to as “field” images generated by a “field” camera with specific image resolution and/or spectral response (e.g. of each band/channel).

    [0104] The required pixel resolution (pixels per meter) of the HR images I.sub.HR is dependent on the type and size of crop being monitored. It will be appreciated that the size of the crop feature will depend on the time in the crop cycle. In the case of wheat crops, a fully grown wheat head 12 typically has a length L of approximately 8-12 cm, and a width W of approximately 2 cm (see FIG. 1(b)), with kernels approximately 5-10 mm in size. In this case, the HR images I.sub.HR should have a pixel size of less than 10 mm, and preferably less than 2 mm if sub-feature recognition is required.
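
    The pixel-size requirement above can be related to camera parameters and flight altitude through the standard ground sampling distance (GSD) relation. The sketch below uses hypothetical camera parameters (the pixel pitch and focal length are illustrative, not taken from the patent):

```python
def ground_sample_distance_mm(pixel_pitch_um, focal_length_mm, altitude_m):
    """Ground sampling distance (mm per pixel) for a nadir-pointing camera:
    GSD = sensor pixel pitch x altitude / focal length (all in mm)."""
    return (pixel_pitch_um * 1e-3) * (altitude_m * 1e3) / focal_length_mm

# Hypothetical camera: 2.4 um pixel pitch, 8 mm focal length, flown at 15 m AGL
ground_sample_distance_mm(2.4, 8.0, 15.0)  # -> 4.5 mm per pixel
```

    A 4.5 mm pixel size would resolve wheat heads but not individual kernels; halving the altitude would halve the GSD.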

    [0105] In an embodiment, the HR images I.sub.HR are generated/captured by a camera mounted to an unmanned aerial vehicle (UAV) 202 or drone 202 (i.e. “drone” images), as illustrated in FIG. 5. The drone images are captured/generated at a constant altitude/height above ground level (AGL) in a drone survey of the AOI. The altitude AGL for generating the HR drone images I.sub.HR is in the range 5 m to 50 m, and preferably 10-20 m. It will be appreciated that for a given camera pixel resolution, image resolution is increased at lower altitudes AGL. Each image I.sub.HR is captured with the drone 202 in a different location in the AOI such that the images form an array whereby adjacent images overlap. In an embodiment the image overlap is at least 50% (50% in FIG. 5), ensuring that each crop 10 is captured in multiple images, each from a different viewpoint. For example, where the array is a 2D array and the overlap is 50%, a crop in the AOI is imaged four times by the drone (with fixed camera orientation), where each image is taken when the drone is in a different position relative to the crop. As such, each image of the crop 10 is taken from a different position, providing different viewpoints/angles of the crop 10 from different sides. Imaging a particular crop 10 from multiple viewpoints increases the amount of input data per crop 10, increasing the reliability of the determined crop attributes (the same attributes can be determined from each viewpoint and combined), as described in more detail below. Alternatively, satellite images with the required resolution can be used (not shown).
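
    The relationship between overlap fraction and viewpoints per crop can be sketched as follows (an approximation for a regular survey grid, ignoring edge effects):

```python
import math

def views_per_point(overlap_frac):
    """Approximate number of frames covering a ground point along one survey
    axis, for a regular grid with fractional overlap between adjacent frames."""
    return math.ceil(1.0 / (1.0 - overlap_frac))

def views_per_point_2d(forward_overlap, side_overlap):
    # Counts along the two grid axes multiply for a 2D survey array.
    return views_per_point(forward_overlap) * views_per_point(side_overlap)

views_per_point_2d(0.5, 0.5)  # -> 4, the four viewpoints per crop noted above
```

    Raising the overlap to 75% in both directions would give roughly 16 viewpoints per crop, at the cost of a proportionally longer survey.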

    [0106] The image data may initially be stored in an on-board memory of the drone 202 and later transferred for storage, processing and analysis. Once the image data is received, the geolocation information from each HR image I.sub.HR is extracted and the images I.sub.HR are mapped/referenced to the AOI. Where the image data comprises video data, the plurality of HR images I.sub.HR are first extracted as image frames/image stills from the video data. In an embodiment, an orthomosaic map of the AOI is generated by stitching the HR images I.sub.HR together, as is known in the art. In the orthomosaic map, each pixel is referenced to a geo-location. The images I.sub.HR may then be adjusted for variations in altitude and colour scaling, as is known in the art. For example, a reference tile with a known size (e.g. 1 m×1 m) can be positioned within the AOI to appear in the orthomosaic map, and the images I.sub.HR can be normalised based on the reference tile's apparent size and RGB composition in the image.
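
    The colour part of the reference-tile normalisation can be sketched as a per-channel gain correction. This is a minimal illustration, not the patent's exact procedure; the tile's known colour `reference_rgb` is a hypothetical value:

```python
import numpy as np

def tile_colour_gains(image, tile_mask, reference_rgb=(200.0, 200.0, 200.0)):
    """Per-channel gains mapping the reference tile's observed mean colour to
    its known colour, so images across a survey share a common colour scale.

    `reference_rgb` is an assumed known tile colour, not from the patent.
    """
    observed = image[tile_mask].reshape(-1, image.shape[-1]).mean(axis=0)
    return np.asarray(reference_rgb) / observed

# Toy image whose tile region reads half as bright as the tile's known colour
img = np.full((4, 4, 3), 100.0)
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True          # pixels covered by the reference tile
gains = tile_colour_gains(img, mask)
normalised = img * gains     # tile region now matches its known colour
```

    The apparent-size check works analogously: the ratio of the tile's known side length to its measured extent in pixels gives the true metres-per-pixel scale.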

    [0107] The method 100 may include the step 110 of generating the image data. Where the images I.sub.HR are drone images, step 110 may comprise providing flight control instructions to a drone 202 to survey the AOI at a specified altitude AGL and image overlap.

    [0108] The AOI may be a whole field or a region of a field. Where the AOI is a region of a field, step 110 may comprise identifying an AOI. The AOI may be identified empirically based on historical data, for example, the locations of historical crop loss events (not shown). Alternatively, it may be identified by remote image sensing techniques including, but not limited to, normalised difference vegetation index (NDVI), visible atmospherically resistant index (VARI), normalised difference water index (NDWI), surface soil moisture index (SSMI), and soil organic-carbon index (SOCI). Each of these indexes can be extracted or derived from multi-spectral or hyperspectral satellite images, as is known in the art. VARI uses RGB channels/bands, whereas NDVI, NDWI, SSMI and SOCI use RGB and IR channels/bands. NDVI is a basic quantitative measure of live green plants, determined from the ratio NDVI=(NIR−red)/(NIR+red) where red and NIR are the red and near-infrared (NIR) light spectral channels of the image, and ranges from −1 to 1. NDVI emphasises the green colour of a healthy plant and is commonly used as an indicator of chlorophyll content in several different types of crops, including corn, alfalfa, soybean, and wheat. NDVI is therefore used as a coarse indication of crop health, 1 being healthy. VARI provides similar information to NDVI but is less sensitive to atmospheric effects, allowing for vegetation to be estimated in a wide variety of environments. NDWI is correlated to the plant water content, and is therefore a good indicator of plant stress. NDWI is determined from the ratio NDWI=(NIR−SWIR)/(NIR+SWIR) where NIR and SWIR are the NIR and shortwave-infrared (SWIR) light spectral channels of the image, and ranges from −1 to 1. SSMI provides information on the relative water content of the top few centimetres of soil, describing how wet or dry the soil is in its topmost layer, expressed in percent saturation. It therefore provides insights into local soil conditions. 
    SSMI is described in “Retrieving soil moisture in rainfed and irrigated fields using Sentinel-2 observations and a modified OPTRAM approach” by A. Ambrosone et al., International Journal of Applied Earth Observation and Geoinformation 89, 102113 (2020). SOC refers to the carbon component of organic compounds in soil, which contributes to nutrient retention and turnover, soil structure, and moisture retention. SOCI is described in “Estimating soil organic carbon in cultivated soils using test data, remote sensing imagery from satellites (Landsat 8 and PlanetScope), and web soil survey data” by M. Halil Koparan, Thesis, South Dakota State University (2019). The values of and spatial variations in any of these indexes across a field can therefore be used to identify an AOI.
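
    The band-ratio indices above (NDVI, NDWI, VARI) are simple per-pixel arithmetic on the spectral channels, and can be sketched directly from their definitions:

```python
import numpy as np

def ndvi(red, nir):
    """Normalised difference vegetation index, (NIR - red) / (NIR + red)."""
    red, nir = np.asarray(red, float), np.asarray(nir, float)
    return (nir - red) / (nir + red)

def ndwi(nir, swir):
    """Normalised difference water index, (NIR - SWIR) / (NIR + SWIR)."""
    nir, swir = np.asarray(nir, float), np.asarray(swir, float)
    return (nir - swir) / (nir + swir)

def vari(red, green, blue):
    """Visible atmospherically resistant index, using RGB bands only."""
    red, green, blue = (np.asarray(b, float) for b in (red, green, blue))
    return (green - red) / (green + red - blue)

# Healthy vegetation reflects strongly in NIR and absorbs red: NDVI near 1
ndvi(red=0.05, nir=0.60)  # ~0.85
```

    Applied to whole band arrays, these functions yield the per-pixel index maps to which the threshold of step 116 is applied.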

    [0109] FIG. 4 shows example steps of identifying an AOI using remote image sensing and generating the HR image data. In step 114, image data containing one or more multi-spectral images I.sub.LR of a field is received. The image(s) I.sub.LR used for AOI identification may be relatively low resolution (LR) compared to the HR images I.sub.HR described above. In an embodiment, the LR image data contains a single whole-of-field satellite image (see FIG. 6(a)). Alternatively, the image data may contain a plurality of overlapping drone or satellite images from which an orthomosaic map of the field is generated. Where LR drone images are used, these are captured from a greater height AGL than the HR drone images, e.g. AGL>100 m. In step 116, the AOI is identified based on one or more of VARI, NDVI, NDWI, SOCI and SSMI maps of the field calculated from the spectral bands/channels of the LR images I.sub.LR. For example, a predefined threshold may be applied to the VARI, NDVI, NDWI, SOCI and/or SSMI maps to indicate or identify areas of less healthy crops. In step 118, image data containing the HR images I.sub.HR of the AOI is generated e.g. by drone survey, as described above.

    [0110] FIG. 6(a) shows an example RGB (colour) satellite image of a field 2 of crops 10 obtained from the Sentinel-2 satellite 201. FIG. 6(b) shows the same satellite image overlaid with the NDVI map of the field 2. Regions of healthy and less healthy crops can be identified from the high and low NDVI values as shown and used to identify the AOI 4 for HR imaging in step 118.

    [0111] In step 130, one or more crop features 12, 14, 16 of each crop in each HR image I.sub.HR are identified using a machine learning model trained on a dataset of crop images to detect and classify the one or more crop features in the respective HR image I.sub.HR. In an embodiment, the machine learning model comprises a convolutional neural network (CNN) and leverages the Google Vision, OpenCV, and SciPy libraries.

    [0112] Identifying the crop features 12, 14, 16 in an image I.sub.HR involves extracting and/or calculating one or more image features from the respective image I.sub.HR using one or more feature extraction filters. Image features may be specific structural features in the image itself, such as points, edges or objects, or statistical information calculated or determined from the pixel values, such as averages, standard deviations, or distributions. The one or more image features may be low-level image features, e.g. related to pixel attributes. The one or more image features may be extracted from one or more, or all, of the channels/bands of the respective image I.sub.HR. The one or more image features may comprise any one or more of: edges, corners, ridges, blobs, RGB colour composition, area range, shape, aspect ratio, and feature principal axis. In an embodiment, the one or more image features include at least edges.

    [0113] The machine learning model takes the one or more image features as inputs, detects objects, including the location of each object in the image, and outputs a probability of each detected object being a crop feature belonging to one of the crop feature types (e.g. head 12, stem 14, leaf 16). The geo-location of each detected crop feature is determined in order to combine information on the same crop feature from different images (see below).

    [0114] The general approach is to train the machine learning model using a dataset of crop images with known crop features and attributes which have been identified by an expert. Once trained, a new image can then be input to the machine learning model to detect and classify crop features and attributes in it. The machine learning model can be trained on historical training data, or where this is not present, using a subset of HR images I.sub.HR from the image data of the AOI. Continual expert input is not needed. The model requires training for each new crop or feature, but once a critical mass of training data is achieved, further expert input is not necessary. In an embodiment, the machine learning model is trained on a training dataset generated specifically for the field camera used to generate the HR images I.sub.HR, as described below with reference to FIGS. 13 to 17.

    [0115] The image features may be extracted from the images using various known (and open source) image feature detection algorithms or filters. Many different feature detection algorithms are known in the field of image processing. By way of example only, edges can be extracted using Canny or Sobel edge detection algorithms, blobs can be detected using a Laplacian of Gaussian (LoG) algorithm, and corners may be detected using a Harris corner detection algorithm, etc. Each filter typically represents a portion of code that can be run to extract the image feature(s) in question. As such, it will be appreciated that a separate filter may be configured to extract each separate image feature, or one or more filters may be configured to extract a plurality of image features.
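
    Two of the filters named above (Sobel edges and LoG blobs) can be sketched with SciPy, one of the libraries the embodiment leverages; this is an illustrative sketch of the filter stage, not the patent's specific implementation:

```python
import numpy as np
from scipy import ndimage

def edge_magnitude(gray):
    """Sobel edge magnitude, one of the low-level image features fed to the model."""
    gx = ndimage.sobel(gray, axis=1, output=float)  # horizontal gradient
    gy = ndimage.sobel(gray, axis=0, output=float)  # vertical gradient
    return np.hypot(gx, gy)

def blob_response(gray, sigma=2.0):
    """Laplacian-of-Gaussian response; local maxima indicate blob-like features
    (bright regions roughly sigma pixels in radius, e.g. wheat heads)."""
    return -ndimage.gaussian_laplace(np.asarray(gray, float), sigma=sigma)

# A bright square on a dark background: edges respond on the border only
img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0
edges = edge_magnitude(img)
blobs = blob_response(img, sigma=4.0)
```

    In practice the filter outputs are stacked per pixel and passed, together with the raw channels, as input planes to the CNN of step 136.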

    [0116] An example process flow for the crop feature identification process 130 is shown in FIG. 8. In step 132, the type(s) of crop features to be detected are retrieved from a database 220. This may involve retrieving certain attributes associated with each crop feature, such as expected colour, dimensions, etc. The database 220 may store the training images and cropped images of various crop features for use by the machine learning model. In step 134, noise in the images is removed. Noise elimination involves removing parts of the image that are definitely not part of the crop feature to be detected, based on information retrieved from the database 220. For example, blobs that do not fit the dimensions of the crop feature can be removed, and parts of the image that show soil, which is brown and does not match the expected colour of a crop feature, can be removed (e.g. pixels set to zero). This reduces the noise in the image to increase the signal for feature detection in the machine learning model, e.g. CNN. In step 136, the machine learning model (e.g. CNN) is used to detect and classify the crop features in the image I.sub.HR. This may involve processing the image I.sub.HR using the one or more filters (e.g. at least an edge detection filter).
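
    The soil-removal part of step 134 can be sketched as a colour mask. The green-dominance rule below is an illustrative heuristic standing in for the expected-colour information retrieved from the database 220, not the patent's exact rule:

```python
import numpy as np

def suppress_soil_pixels(image):
    """Zero out pixels that cannot belong to a green crop feature.

    Illustrative heuristic: brown soil has red >= green, while healthy
    vegetation is green-dominant in RGB.
    """
    r, g, b = image[..., 0], image[..., 1], image[..., 2]
    vegetation = (g > r) & (g > b)
    out = image.copy()
    out[~vegetation] = 0  # non-crop pixels set to zero, as in step 134
    return out

pixels = np.array([[[120, 80, 40],     # brown (soil) -> removed
                    [40, 160, 60]]])   # green (crop) -> kept
masked = suppress_soil_pixels(pixels)
```

    A size filter on the remaining connected regions would similarly discard blobs that do not fit the expected crop-feature dimensions.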

    [0117] FIG. 7(a) shows an example input image I.sub.HR containing wheat crops 10, and FIG. 7(b) shows the same image I.sub.HR where the wheat heads 12 have been identified using the above process 130.

    [0118] FIG. 9(a) shows an example geo-referenced orthomosaic map of an AOI containing wheat crops 10 that may be input to the above process 130. The orthomosaic map was generated from a drone survey at a height AGL of 12 meters. The pixel size is approximately 6-7 mm and the scale bar is 9 meters. In this case, the drone survey was performed at a relatively early tillering stage in the wheat crop cycle before heading (so there are no wheat heads 12), such that the crop features to be detected are stems 14 and leaves 16. In step 132, these features are retrieved from the database 220. FIG. 9(b) shows a feature density map overlaid on the orthomosaic map, produced from the crop features and locations obtained in step 136 (noise elimination was performed in step 134). The feature density map is the density of crop features per unit area, and this can be used to display “health density”, by combining the various crop feature attributes determined in step 140, described below. FIG. 9(c) shows the same (in the left and right panels) for a zoomed in region of the orthomosaic map in which individual wheat crops 10 can be distinguished.

    [0119] In step 140, one or more crop feature attributes are determined for each identified crop feature. This may involve determining one or more primary, secondary and/or tertiary attributes.

    [0120] Primary attributes are derivable directly from the image pixel attributes/values and/or extracted image features in the respective image I.sub.HR and include, but are not limited to any one or more of: location, dominant colour, dimension(s), and sub-feature 12a size and/or count. The location of each crop feature may be determined, at least in part, using geolocation data/information of each respective image in the image data. Where the images form an orthomosaic map of the AOI, the locations of crop features can be determined directly from the pixel values.

    [0121] Secondary attributes are derivable/determined indirectly from the images I.sub.HR using a machine learning model similar to process 130. Secondary attributes include, but are not limited to any one or more of: diseased and disease type, pest-ridden and pest type, weed-ridden and weed type, healthy, and unhealthy. In this case, the machine learning model, which may be the same machine learning model used for feature classification 130, is trained on a dataset of images of crop features (e.g. stored in the database 220) with known secondary attributes, such as known diseases and pests. For example, the machine learning model may be trained to detect various diseases such as septoria, stripe rust and leaf rust, various pests such as aphids, mites, orange wheat blossom midge (OWBM), locusts, and various weeds such as black grass. A crop feature may be classed as healthy if no disease, pests or weeds are detected.

    [0122] Tertiary attributes are derived using additional input data, e.g. satellite images and ground data, and include any one or more of: NDVI, soil information, and weight.

    [0123] The one or more crop feature attributes are determined for each image/viewpoint of a particular crop, resulting in multiple values for each respective attribute for each crop feature. These respective crop feature attributes are then combined or stacked, e.g. averaged and/or weighted, to provide composite crop feature attributes that are more reliable and accurate than those determined from a single image/viewpoint, similar to image stacking.
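The stacking of per-viewpoint attribute values described above can be sketched as a weighted mean. The measurements and confidence weights below are invented for illustration; the method does not prescribe a particular weighting scheme.

```python
def stack_attribute(values, weights=None):
    """Combine per-viewpoint measurements of one crop-feature
    attribute into a composite value via an (optionally weighted)
    mean, analogous to image stacking."""
    if weights is None:
        weights = [1.0] * len(values)
    total = sum(w * v for v, w in zip(values, weights))
    return total / sum(weights)

# e.g. wheat-head length (mm) measured from three viewpoints,
# weighted by a hypothetical per-image confidence score
lengths = [92.0, 95.0, 89.0]
confidence = [0.9, 0.8, 0.5]
composite = stack_attribute(lengths, confidence)
```

Noise in any single viewpoint is averaged down, so the composite attribute is more reliable than any one measurement.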

    [0124] In step 160, a spatially resolved model of the crops in the AOI is generated based on the identified crop features and attributes. The crop model is a three-dimensional (3D) point cloud model, where each 3D point represents the location of a crop feature in the AOI, which is tagged or associated with a respective attribute vector containing all the determined attributes. The Z-component of each 3D point may be relative. Relative Z can be determined via the drone using an on-board range sensor (e.g. LIDAR) or altitude sensor (e.g. barometer), if available. If these instruments are not available, a digital elevation model (DEM) is created from the highest resolution satellite image data of the AOI available. The crop model is generated from the plurality (typically hundreds or thousands) of HR images I.sub.HR of crops taken from different angles. The reliability of the model data, i.e. detected crop features and attributes, is increased by stacking several feature attributes from different viewpoints to decrease noise (as described above), while not disturbing the primary signal (feature). A reference frame for the model is built from the extracted geo-location information of each image. Where an orthomosaic map has been generated from the HR images I.sub.HR, this can form the reference frame for the model. The reference frame can then be populated with 3D points for the individual crop features in the AOI using their determined locations (primary attribute). Each 3D point is associated/tagged with its respective attribute vector.
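The 3D point cloud data structure described in step 160 can be sketched as follows. The field and attribute names are illustrative only; the method specifies only that each 3D point carries a location and an associated attribute vector.

```python
from dataclasses import dataclass, field

@dataclass
class CropFeaturePoint:
    """One 3-D point in the crop model: a geolocated crop feature
    tagged with its attribute vector (illustrative field names)."""
    x: float          # easting within the AOI reference frame
    y: float          # northing
    z: float          # relative height (e.g. from LIDAR or a DEM)
    attributes: dict = field(default_factory=dict)

# The crop model is then simply a collection of tagged points
# populated within the geo-referenced reference frame.
model = []
model.append(CropFeaturePoint(
    x=451210.3, y=5411870.9, z=0.82,
    attributes={"type": "wheat_head", "length_mm": 93.1,
                "kernel_count": 42, "diseased": False}))
```

To support the temporal analysis of paragraph [0125], the `attributes` dict would be extended to an attribute matrix keyed by observation date.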

    [0125] The model is the primary data structure for the crop monitoring method 100 that can be stored (e.g. in database 220), referenced and updated over time. Where the crop model is already generated (e.g. based on the crop monitoring method being performed at an earlier time) step 160 may instead comprise updating the crop model with the (newly) identified crop features and attributes. For example, the model can comprise crop features and attributes for an AOI extracted from image data generated at multiple different times to provide for temporal analysis. The temporal data can also be used to further increase the reliability of detected crop features and attributes by stacking the feature attributes obtained at different times to decrease noise (as described above). For example, the image data may be generated and/or may capture the crop's state at a first time or date, and the method may be repeated for additional image data generated and/or capturing the crop's state at a different time or date, and that data can be added to the crop model. Each 3D point may then be tagged with an attribute matrix describing the attributes at different times.

    [0126] Using different viewpoints to identify crop features increases the fidelity of the model by stacking those different angles. The same applies to stacking those different angles over time. For example, the image data may not include every possible viewpoint for a crop feature; however, if a crop feature is missed on one flight/mission/survey but captured on the next or previous one, its primary attributes can be interpolated temporally. For example, if a leaf of wheat is detected on a drone flight mission but the required vantage point cannot be obtained on the next mission two weeks later, that leaf is interpolated temporally (the leaf would grow depending on the crop stage of the wheat plant).

    [0127] In step 160 one or more crop monitoring outputs are generated, at least in part, based on the crop features and crop feature attributes. The one or more crop monitoring outputs include one or more of: a crop feature population count, a volumetric crop yield prediction, one or more intervention instructions, and various spatially resolved crop feature or sub-feature attribute maps or meshes. The maps may include a crop feature population density map, dimensions map (wheat head 12 length or stem 14 length), sub-feature count map (e.g. number of kernels 12a per crop), crop loss and/or disease map. The crop monitoring outputs may be generated from the data held in the crop model.

    [0128] In one example, a volumetric crop yield prediction can be generated based on the dimensions of each crop feature 12 and/or sub-feature 12a, the crop feature 12 count and/or sub-feature 12a count, and weight information such as average weight per kernel 12a.
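The volumetric yield prediction of paragraph [0128] reduces to a simple product of counts and weights. The figures below are invented for illustration and do not come from the disclosure.

```python
def predict_yield_kg(head_count, mean_kernels_per_head,
                     mean_kernel_weight_g):
    """Volumetric yield estimate: heads x kernels/head x g/kernel,
    converted from grams to kilograms."""
    return head_count * mean_kernels_per_head * mean_kernel_weight_g / 1000.0

# e.g. 2.1 million detected wheat heads (crop feature count), ~40
# kernels per head (sub-feature count), ~0.045 g per kernel (weight
# information) -- all illustrative values
yield_kg = predict_yield_kg(2_100_000, 40, 0.045)
```

In practice the head count and kernel count per head would be read from the crop model, with the per-kernel weight supplied as ground data (a tertiary attribute).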

    [0129] The one or more intervention instructions may comprise instructions to apply one or more treatments to one or more regions of the AOI, e.g. applying water, seeds, nutrients, fertiliser, fungicide (e.g. for disease treatment), and/or herbicide (e.g. for weed killing) based on the data in the crop model. The one or more interventions may include identifying high-productivity regions in the AOI combined with identifying regions for targeted seed planting, targeted fertilising (e.g. nitrogen, phosphorus, potassium) and other yield improvement measures, herbicides (e.g. targeted black grass spray interventions) and other weed killing interventions, targeted fungicides and other disease treatments and mitigation interventions, water stress/pooling monitoring with targeted interventions; as well as monitoring the effectiveness of any of these interventions over time. For example, regions of the AOI where diseased crops are detected may be targeted for disease treatments.

    [0130] In an embodiment, the instructions may be machine integrated instructions for one or more agricultural machinery units or vehicles to apply the one or more treatments to the one or more regions. Farm machinery is increasingly automated, and satellite guided (GPS, GNSS, etc), hence the machinery understands where it is located in that field or AOI. Some machinery is able to follow a predetermined track on autopilot (e.g. John Deere's fully-autonomous tractor concept). The intervention instructions can include a shapefile which integrates with this existing farm machinery in such a way that the tractor, variable rate fertiliser applicator, or any other form of farm machinery, understands where it is in the field relative to the shapefile, and what action it is meant to perform at that location (e.g. spray a certain volume of nitrogen, herbicide, fungicide).
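The machine-integrated instruction of paragraph [0130] pairs a georeferenced region with an action. The disclosure describes a shapefile; the sketch below uses GeoJSON purely as a dependency-free stand-in for the same idea, and the polygon, product and rate values are illustrative.

```python
import json

# One intervention instruction: a polygon region of the AOI plus the
# action the machinery should perform inside it (illustrative values).
instruction = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[0.0, 0.0], [0.0, 30.0], [20.0, 30.0],
                         [20.0, 0.0], [0.0, 0.0]]],
    },
    "properties": {
        "action": "spray",
        "product": "fungicide",
        "rate_l_per_ha": 1.5,   # hypothetical application rate
    },
}
payload = json.dumps(instruction)
```

A satellite-guided applicator consuming such a file knows, from its own position fix, when it is inside the polygon and what action (and rate) to apply there.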

    [0131] The method 100 can be used to develop optimal relationships between extracted primary crop feature attributes and derived secondary or tertiary attributes such as overall crop health. For example, currently NDVI is typically used to correlate to crop yield, with an error margin of approximately +/−33%. By contrast, the method 100 relies primarily on feature recognition to predict yield, which is more direct and accurate.

    [0132] FIG. 10 shows an exemplary system 200 for implementing the method 100 described above. The system 200 comprises: one or more processing devices 210; a database 220 in communication with the processing device(s) 210; and a drone 202 for generating the HR images I.sub.HR of the AOI 4. The system 200 may also comprise a satellite 201 for providing LR and HR images of the field 2 and/or AOI 4, and/or farm machinery 230 for applying treatments to one or more regions of the AOI 4 based on one or more of the outputs generated by the method 100. Alternatively, the system 200 may comprise any suitable imaging system for generating the HR image data, such as a smart phone or digital camera (not shown). Steps 120-160 are computer-implemented data processing steps which can be implemented in software or one or more software modules using the one or more processing devices 210. The system 100, 200 can be implemented on multiple platforms, such as a web page interface or an app interface.

    [0133] The processing device(s) 210 may include one or more processors 210-1 and a memory 210-2. The memory 210-2 may comprise instructions that, when executed on the one or more processors 210-1, cause the one or more processors 210-1 to perform at least steps 120-160 described above. The one or more processors 210-1 may be single-core or multi-core processors. Merely by way of example, the one or more processors 210-1 may include a central processing unit (CPU), an application-specific integrated circuit (ASIC), an application-specific instruction-set processor (ASIP), a graphics processing unit (GPU), a physics processing unit (PPU), a digital signal processor (DSP), a field programmable gate array (FPGA), a programmable logic device (PLD), a controller, a microcontroller unit, a reduced instruction-set computer (RISC), a microprocessor, or the like, or any combination thereof.

    [0134] The processing device(s) 210 may be implemented on a computing device or a server. The server may be a single server, or a server group. The server group may be centralized, or distributed (e.g., server 210 may be a distributed system). The server 210 may be implemented on a cloud platform. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.

    [0135] The database 220 may store data and/or instructions, such as the HR and LR image data, the crop model, and training images for the machine learning models. Although shown as separate from the processing device(s) 210 in FIG. 10, the processing device(s) 210 may comprise the database 220 in its memory 210-2. The database 220 and/or the memory 210-2 may include a mass storage, a removable storage, a volatile read-and-write memory, a read-only memory (ROM), or the like, or any combination thereof. Exemplary mass storage may include a magnetic disk, an optical disk, a solid-state drive, etc. Exemplary removable storage may include a flash drive, a floppy disk, an optical disk, a memory card, a zip disk, a magnetic tape, etc. Exemplary volatile read-and-write memory may include a random access memory (RAM). Exemplary RAM may include a dynamic RAM (DRAM), a double data rate synchronous dynamic RAM (DDR SDRAM), a static RAM (SRAM), a thyristor RAM (T-RAM), and a zero-capacitor RAM (Z-RAM), etc. Exemplary ROM may include a mask ROM (MROM), a programmable ROM (PROM), an erasable programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), a compact disk ROM (CD-ROM), and a digital versatile disk ROM, etc.

    [0136] Exchange of information and/or data between the processing device(s) 210, satellite 201, drone 202, and database 220 may be via wired or wireless network (not shown). Merely by way of example, the network may include a cable network, a wireline network, an optical fiber network, a telecommunications network, an intranet, an Internet, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), a metropolitan area network (MAN), a public telephone switched network (PSTN), a Bluetooth network, a ZigBee network, a near field communication (NFC) network, or the like, or any combination thereof.

    [0137] FIG. 11 shows an exemplary method 300 of generating a database 220 of training data for training the machine learning model according to an embodiment of the invention. The training data can include training images of crops 10 and/or crop features 12, 14, 16 obtained at various stages of the crop's lifecycle. Each crop feature 12, 14, 16 is labelled or tagged with one or more crop feature attributes (primary, secondary and/or tertiary attributes) used for classifying the crop feature. The crop feature attributes can include one or more primary, secondary and/or tertiary attributes, as described above. In this case, all the crop feature attribute labels are derived from high resolution hyperspectral training images I.sub.TR while the crops are growing in a controlled environment or laboratory under controlled growth conditions, as described below. In particular, the method 300 produces training data adapted to the specific imaging device or camera used in the field for the crop monitoring process 100 (referred to hereafter as the “field camera”). For example, a specific field camera will have a specific image resolution (e.g. pixel resolution, spatial resolution and spectral resolution/responsivity) depending on the type of camera it is. It may also have a specific focal length and other equivalence parameters affecting the resulting images. As such, the same crop imaged by two different field cameras under identical conditions (e.g. same lighting conditions and same position relative to the crop) may look slightly different, e.g. in RGB and/or NIR channels, which may yield potentially different classification outcomes when using a machine learning model trained on or utilising the same training data. As such, the generation and use of camera-specific training data, e.g. with a pixel and spectral resolution that matches that of the field camera, can improve the accuracy and reliability of the classification and crop monitoring process. 
In addition, the method 300 applies the same geometric and spectral pre-processing to each lab-based training image to reduce noise and make the images directly comparable for temporal analysis. Comparing and iterating over large quantities of noisy training data is already an issue with known methodologies, and would be compounded by the differing effects of lab and field conditions. As such, each lab training dataset generated by the method 300 is unique.

    [0138] In this context, hyperspectral means that each pixel of each image contains a large number of narrow spectral bands (i.e. a spectrum) covering a wide spectral range, e.g. visible to NIR, as opposed to multi-spectral images that contain a relatively low number of broad spectral bands or colour channels such as R, G, B and optionally NIR.

    [0139] To understand the highest level of feature analysis it is necessary to start with and analyse the highest possible resolution image data. In step 310, high spectral, temporal, and spatial resolution training images I.sub.TR of crops are generated, obtained or collected. The training images I.sub.TR are obtained in a controlled setting or environment 400, such as a laboratory, using a HR hyperspectral imaging device or camera 410 (referred to hereafter as a training camera 410), as shown schematically in FIG. 12. In the controlled setting 400, crops 10 are grown under controlled conditions and training images I.sub.TR are captured from a fixed position above the crops 10 at different times throughout the crop cycle. This produces a set of training images I.sub.TR for the crop 10. For example, training images I.sub.TR may be taken over the course of several months, with several images taken each month. The growth conditions of the crops 10 are controlled and can be varied to develop specific crop feature attributes. Individual crops 10 can be deliberately infected with a known disease, and/or be supplied with different levels of nutrients, water and/or sunlight to controllably vary their health, yield and associated crop feature attributes. As such, step 310 may comprise a step of growing crops 10 in a controlled setting wherein at least some of the crops 10 have one or more known crop feature attributes, and/or controlling one or more growth conditions of the crops 10.

    [0140] In the example controlled setting 400 of FIG. 12, the training camera 410 is mounted to a support structure or gantry 420, allowing the position of the training camera 410 relative to the crops 10 to be controlled and fixed over long periods of time. The height h of the training camera 410 above the crops 10, which is equivalent to the height above ground level (AGL) in field data such as a drone survey, is relatively low compared to typical field data to maximise the spatial resolution of the training images I.sub.TR. In an embodiment, the height h is less than 1 m. The setting 400 may also include one or more UV lights to simulate sunlight and control crop growth (not shown). Individual crops 10 are grown in predefined positions in an array 10A of pots or containers 10C.

    [0141] Depending on the size of the array 10A and/or number of arrays 10A, the training camera 410 can be mounted to a motorised translation stage (not shown) to adjust its position (x and/or y) and capture images of the different areas or arrays 10A at the same height h sequentially. In one example, the training camera can be used to capture a panoramic scan of the one or more arrays 10A at a constant height h. Alternatively, a plurality of training cameras 410 can be used to capture images I.sub.TR of different areas of an array 10A or different arrays 10A. In this case, identical training cameras 410 are positioned at the same height h and capture images simultaneously. Multiple images from the same or different training cameras 410 can be stitched together to generate an orthomosaic map of the crops 10 at the given point in time, as described above.

    [0142] FIG. 13 shows an example raw RGB training image I.sub.TR-1 of a set of four 6×3 arrays 10A containing wheat crops 10. The image I.sub.TR-1 was captured as a panoramic scan. There are 72 containers 10C divided between four arrays 10A. Each array 10A contains a different category of crop 10, as indicated by the letters m (mixed blackgrass and healthy wheat), b (blackgrass), f (fusarium head blight), and s (septoria).

    [0143] Each training image I.sub.TR comprises x-y pixel data in a plurality of narrow spectral bands. For example, a hyperspectral camera 410 can contain 973 discrete spectral bands spanning a broad spectral range e.g. λ˜400-1000 nm, whereas a multi-spectral camera 410 typically contains 3-5 bands such as red (λ˜650 nm), green (λ˜560 nm), blue (λ˜450 nm), red edge (λ˜730 nm) and NIR (λ˜840 nm). Where hyperspectral images are acquired, each training image I.sub.TR can be represented by a three dimensional hyperspectral data cube I.sub.TR(x, y, λ), where x and y are the two spatial dimensions of the scene and λ is the spectral dimension comprising a range of wavelengths, as shown in FIG. 14. Different spectral bands reveal different features of a crop and therefore contain different information relevant to identifying and classifying crop features and attributes. For example, certain diseases such as septoria are more visible in the NIR than in the visible wavelength range (RGB). The different spectral bands in the image data I.sub.TR(x, y, λ) can be analysed individually or combined to produce specific types of composite spectral images. As such, various different types of spectral images can be derived from each hyperspectral training image I.sub.TR(x, y, λ), including, but not limited to, traditional RGB images, red edge images (λ˜730 nm) and NIR images (λ˜840 nm), as well as the more advanced remote sensing image indices described above such as NDVI, VARI, NDWI, SSMI and SOCI.
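Deriving band images and indices from the data cube I.sub.TR(x, y, λ) can be sketched as below. The cube dimensions and band-selection width are toy values chosen for illustration; a real cube would have, e.g., 973 bands as noted above.

```python
import numpy as np

# Toy hyperspectral cube I_TR(x, y, lam): 4x4 pixels, 61 bands
# covering 400-1000 nm in 10 nm steps (illustrative dimensions).
wavelengths = np.linspace(400, 1000, 61)
cube = np.random.default_rng(0).random((4, 4, wavelengths.size))

def band(cube, wavelengths, centre, width=10):
    """Average the cube over spectral bands within +/- width nm of
    a centre wavelength, yielding a single-band image."""
    sel = np.abs(wavelengths - centre) <= width
    return cube[..., sel].mean(axis=-1)

red = band(cube, wavelengths, 650)   # red band image
nir = band(cube, wavelengths, 840)   # NIR band image
ndvi = (nir - red) / (nir + red)     # standard NDVI definition
```

RGB, red edge and the other indices (VARI, NDWI, etc.) are derived the same way, from their respective band combinations.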

    [0144] Temporal analysis of image features in the training images I.sub.TR(x, y, λ) involves comparing training images and crop features in the images taken at different times. As such, any meaningful temporal analysis requires each training image I.sub.TR(x, y, λ) in the set (which is acquired at a different time) to have comparable geometric properties and spectral signature so that they can be accurately aligned/overlaid. Because each training image I.sub.TR(x, y, λ) in the set is acquired at a different time, with a time interval between images on the order of days, the position of the training camera 410 relative to the crops 10 (e.g. its height h, lateral x, y position, and/or tilt) is subject to possible drift and the lighting conditions (e.g. natural and/or artificial light) in the controlled setting 400 may vary between images. As the spatial resolution of the training images I.sub.TR is so high, any misalignment and/or change in lighting conditions introduces error into the analyses performed in the subsequent steps. In particular, it is difficult to quantitatively compare image features taken at different times, or crop features in the same image, if the crop features or images taken at different times have different lighting conditions. As such, any changes in geometric and/or spectral characteristics are corrected for in steps 320a and 320b.

    [0145] In step 320a, the set of raw training images I.sub.TR-1 are processed to apply geometric corrections. Geometric corrections involve reorienting and stretching the training images to ensure each training image has the same pixel size and shape. Step 320a comprises assigning one of the training images of the set as a reference image, and adjusting each other training image to the spatial location and pixel sampling of the reference image. This involves determining or identifying one or more ground control point (GCP) locations in each training image, and applying a transformation (e.g. an affine transformation) to adjust the pixel locations and sampling to match that of the reference training image based on the location and size of one or more pixels at a GCP, as is known in the art. Example GCPs are shown in FIG. 13, which are checkered reference tiles with known dimensions.
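The GCP-based geometric correction of step 320a can be sketched as fitting an affine transformation by least squares. The GCP coordinates below are invented for illustration; in practice they would be the pixel locations of the checkered reference tiles in each training image and in the reference image.

```python
import numpy as np

def affine_from_gcps(src_pts, dst_pts):
    """Least-squares affine transform mapping GCP pixel locations in
    a drifted training image onto the corresponding GCP locations in
    the reference image (requires >= 3 non-collinear point pairs)."""
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    A = np.hstack([src, np.ones((len(src), 1))])   # rows [x, y, 1]
    coeffs, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return coeffs                                   # 3x2 matrix

def apply_affine(coeffs, pts):
    pts = np.asarray(pts, dtype=float)
    return np.hstack([pts, np.ones((len(pts), 1))]) @ coeffs

# Hypothetical GCP tile locations: drifted image vs. reference image
src = [(10, 10), (110, 12), (12, 108)]
dst = [(0, 0), (100, 0), (0, 100)]
M = affine_from_gcps(src, dst)
corrected = apply_affine(M, src)
```

Resampling the whole image through the fitted transform then gives every training image the same pixel size, shape and orientation as the reference image.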

    [0146] In step 320b, the set of training images are processed to apply a spectral correction. Spectral correction involves applying a white balance or a global adjustment to the spectral intensities (increase or decrease) of the training images so that all the training images have the same whiteness level (also known as colour balance, grey balance or neutral balance). Step 320b comprises adjusting the whiteness level of each other training image to match the whiteness level of the reference training image. This involves identifying one or more lighting control point (LCP) locations in each training image and adjusting the whiteness level to match that of the reference training image based on the spectrum of one or more pixels at the LCP, as is known in the art. An LCP is a reference tile with neutral colours such as white and grey. Step 320b can be performed before or after step 320a.
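The LCP-based spectral correction can be sketched as a per-channel gain. The pixel values below are invented for illustration; the gains would in practice be computed from the LCP tile's spectrum in each image versus the reference image.

```python
import numpy as np

def white_balance(img, lcp_patch, ref_white):
    """Scale each channel so that the mean colour of the neutral LCP
    tile in this image matches the reference image's LCP colour."""
    channels = img.shape[-1]
    measured = lcp_patch.reshape(-1, channels).mean(axis=0)
    gain = np.asarray(ref_white, dtype=float) / measured
    return img * gain

# Hypothetical values: a grey LCP tile observed as (100, 100, 100)
# should appear as (200, 180, 160) per the reference image.
img = np.full((2, 2, 3), 100.0)
lcp = img[0:1, 0:1]
balanced = white_balance(img, lcp, (200.0, 180.0, 160.0))
```

The same per-band gains extend naturally from three RGB channels to the full hyperspectral band set.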

    [0147] In the example shown in FIG. 13, each array 10A of crop containers 10C represents an area of interest (AOI) with crops 10 of interest. All other areas and features around and between the arrays 10A (e.g. floor, table etc.) are of no interest for crop feature detection and should preferably be removed to reduce the noise and increase the signal for crop feature detection in step 340. As such, step 320a may further involve a further geometric correction of clipping/cropping the transformed training images so that they contain only an AOI with crops 10 of interest (whether healthy or not) and not features or objects of no interest. This step comprises determining pixel locations of the corners of the or each AOI in each transformed training image, and extracting a sub-image of the or each AOI from each transformed training image. Where there are multiple AOIs in the training images, this step may further comprise creating a composite image of the AOIs for each transformed training image of the set. In the example of FIG. 13, the corners of each array 10A are determined and the images of each array 10A are extracted, to create a composite image of the four arrays 10A for each training image of the set, as shown in FIG. 15.

    [0148] In step 330, one or more field camera-specific training images are generated or extracted from each hyperspectral training image I.sub.TR(x, y, λ) of the set based on the spectral response/sensitivity characteristics and spatial/pixel resolution of the field camera. The resulting field camera-specific training images have an equivalent image resolution to the images generated in the field by the field camera. Step 330 may be performed before or after steps 320a, 320b.

    [0149] Step 330 comprises determining spectral filtering weights for the field camera at each spectral band of the training images I.sub.TR(x, y, λ) based on the spectral response of the field camera and applying the filtering weights to the pixel values (i.e. intensities) of the hyperspectral training images I.sub.TR(x, y, λ) at the respective spectral bands to produce one or more field camera-specific training images. In an example, the field camera-specific training images include RGB, red edge and NIR images (see FIGS. 17(a)-17(c)).

    [0150] The spectral filtering weights for the field camera are determined from the spectral response/sensitivity characteristics or quantum efficiency (QE) curves of the field camera. FIG. 16 shows example QE curves for the red (R), green (G) and blue (B) spectral bands of a multi-spectral field camera. Each curve R, G and B represents the relative spectral response/sensitivity for each colour channel, which is typically associated with separate colour filters or image sensors of the field camera. The filtering weights represent the weight or coefficient needed to be applied to each spectral band of the hyperspectral training images I.sub.TR(x, y, λ) in order to simulate or match the spectral response of the field camera. A different set of filter weights is determined for each spectral band of the field camera (e.g. red, green and blue in the example of FIG. 16) to simulate each spectral band of the field camera. Field-specific training images, corresponding to the types of images produced by the field camera with its spectral response, such as RGB, red edge and NIR images, can then be produced or extracted from the filtered hyperspectral training images I.sub.TR(x, y, λ), as is known in the art.
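Applying the spectral filtering weights of step 330 amounts to a weighted sum over the spectral axis of the cube. The Gaussian stand-in for the field camera's green QE curve and the toy cube dimensions below are assumptions for illustration; real weights would be read from the camera's published QE data.

```python
import numpy as np

# Toy hyperspectral cube: 2x2 pixels, bands every 10 nm, 400-700 nm
wavelengths = np.arange(400, 701, 10, dtype=float)
cube = np.ones((2, 2, wavelengths.size))

def simulate_band(cube, wavelengths, qe_curve):
    """Collapse the spectral axis with normalised filter weights
    taken from the field camera's QE curve sampled at the cube's
    spectral bands, simulating that camera's channel response."""
    w = qe_curve(wavelengths)
    w = w / w.sum()
    return cube @ w   # weighted sum over the last (spectral) axis

# Hypothetical Gaussian QE curve for the field camera's green channel
green_qe = lambda lam: np.exp(-0.5 * ((lam - 560.0) / 30.0) ** 2)
green_image = simulate_band(cube, wavelengths, green_qe)
```

Repeating this with the red and blue QE curves (and red edge/NIR where the field camera has them) yields the full set of field camera-specific channel images.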

    [0151] FIGS. 17(a) to 17(c) show example field camera-specific RGB, NIR and red-edge images generated from the training image of FIG. 13 using the above process.

    [0152] Step 330 also comprises image re-sampling (down-sampling) to decrease the spatial/pixel resolution of the training images to match that of the field camera (e.g. drone camera, mobile camera or other). Image re-sampling is based on field camera-specific equivalence parameters, such as pixel resolution, focal length, field/angle of view, aperture diameter, and/or depth of field. Equivalence parameters are known for every camera (e.g. from the technical specification of the field camera). Step 330 therefore results in one or more field camera-specific training images for each time point with the same pixel/spatial resolution and spectral characteristics as the field camera images.
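The spatial re-sampling part of step 330 can be sketched as block averaging by an integer factor. The factor of 8 below is illustrative, chosen to reflect the rough ratio between sub-millimetre lab pixels and the ~6-7 mm drone pixels mentioned earlier.

```python
import numpy as np

def downsample(img, factor):
    """Block-average an (H, W) image by an integer factor to match
    the field camera's coarser pixel resolution (H and W must be
    divisible by the factor)."""
    h, w = img.shape
    return img.reshape(h // factor, factor,
                       w // factor, factor).mean(axis=(1, 3))

# e.g. lab pixels down-sampled 8x to approximate drone-survey pixels
lab = np.arange(64, dtype=float).reshape(8, 8)
field_like = downsample(lab, 8)
```

Non-integer resolution ratios would instead use an interpolating resampler, but the principle of matching the field camera's pixel size is the same.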

    [0153] Steps 320a, 320b and 330 generate field camera-specific training images in a consistent format for crop feature analysis and attribute labelling in the subsequent steps. These processing steps help to relate crop feature attributes derived from training data to real world crop feature attributes derived from field camera data, and to interpret and label the complex spectral and spatial image data.

    [0154] In step 340, crop features and attributes are derived from the field camera-specific training images at each time instance. This step involves a similar process to that described above with reference to FIG. 8. Image segmentation is performed to identify and/or extract objects including crop features 12, 14, 16 (such as leaves, stems, heads etc.) in each image. Various image segmentation algorithms are known in the art and typically utilise edge detection (e.g. the Canny or Sobel edge detection algorithms commonly used in the image processing field). Image segmentation defines objects in the image bounded by edges, also referred to as object polygons. Some of the detected objects are crop features, and others are of no interest. An object filter can be used to exclude objects in the image that are not crop features (e.g. soil) and/or are not of interest for crop feature classification (e.g. small objects that should be included as children of a parent object such as leaf blight). The remaining crop features 12, 14, 16 or feature polygons are then classified in terms of their geometric and spectral attributes. The geometric and spectral attributes correspond to primary attributes extracted directly from the detected crop feature polygon geometry and pixel values (see above). Geometric attributes include one or more of dimensions, area, aspect ratio, sub-feature 12a size and/or count. Spectral attributes include one or more of dominant colour, RGB, red edge and NIR pattern and hyperspectral signature, each with an associated temporal stamp and location in the image. “Patterns” refer to a collection of image features such as blobs or edges. For example, certain diseases may produce specific characteristic patterns detectable in RGB, NIR and red-edge images, such as leaf rust and stripe rust shown in FIG. 2. A hyperspectral signature refers to a spectral histogram, extracted from a specific crop feature or polygon. 
The spectral attributes may also include one or more remote sensing indices typically obtained through satellite images, such as NDVI and NDWI. These attributes correspond to tertiary attributes described above.
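The primary attribute extraction described above can be illustrated with a short sketch. The patent does not fix any particular implementation, so the following is a minimal illustration only: the function names, the choice of attributes, and the band layout (same-shape 2-D reflectance arrays for the red, NIR and green bands, plus a boolean mask for one detected feature polygon) are all assumptions introduced here for clarity. NDVI and NDWI follow their standard definitions, (NIR−Red)/(NIR+Red) and (Green−NIR)/(Green+NIR).

```python
import numpy as np

def geometric_attributes(mask: np.ndarray) -> dict:
    """Primary geometric attributes of one crop-feature polygon,
    given its binary pixel mask (True inside the feature)."""
    ys, xs = np.nonzero(mask)
    height = int(ys.max() - ys.min() + 1)   # bounding-box height in pixels
    width = int(xs.max() - xs.min() + 1)    # bounding-box width in pixels
    return {
        "area_px": int(mask.sum()),
        "bbox_height": height,
        "bbox_width": width,
        "aspect_ratio": width / height,
    }

def spectral_attributes(red, nir, green, mask) -> dict:
    """Primary spectral attributes plus NDVI/NDWI remote sensing
    indices averaged over the masked region. Band arrays are
    same-shape 2-D reflectance maps (an assumed input format)."""
    eps = 1e-9  # avoid division by zero on dark pixels
    ndvi = (nir - red) / (nir + red + eps)
    ndwi = (green - nir) / (green + nir + eps)
    return {
        "mean_red": float(red[mask].mean()),
        "mean_nir": float(nir[mask].mean()),
        "ndvi_mean": float(ndvi[mask].mean()),
        "ndwi_mean": float(ndwi[mask].mean()),
    }
```

In practice the mask would come from the edge-detection-based segmentation step; here it is supplied directly so the attribute computation can be shown in isolation.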

    [0155] In step 350, a sub-set of classified crop features is labelled with one or more attributes by an expert, based at least partly on the geometric and spectral (i.e. primary) attributes derived above in step 340 and on ground control data. Ground control data may include information on the type of disease that a crop 10 has been infected with, the type of pests and/or weeds present, and the growth conditions that the crop has been subjected to, e.g. whether the crop 10 has received enough or too much water, enough sunlight (UV) or not enough, and/or the temperature or soil (e.g. moisture) conditions. The labels may include one or more of crop feature type, healthy crop, crop with specific disease or weed or pest (e.g. septoria, blackgrass, fusarium head blight), and others. These attributes correspond to the secondary attributes described above.

    [0156] Labelling may comprise labelling or tagging each crop feature with an attribute vector containing its primary, secondary and/or tertiary attributes determined or derived from the training images.
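One way to picture such a tagged attribute vector is as a record grouping the primary (measured), secondary (expert-labelled) and tertiary (index-derived) attributes together with the temporal stamp and image location that tie the feature to other images. This is an illustrative data structure only; the field names and types are assumptions, not part of the disclosed method.

```python
from dataclasses import dataclass, field

@dataclass
class CropFeatureRecord:
    """One labelled crop feature: primary attributes measured from the
    image, secondary labels supplied by the expert, tertiary indices,
    plus the temporal stamp and image location used to relate the same
    feature across images taken at different times."""
    feature_id: str
    timestamp: str                 # acquisition time of the source image
    location_px: tuple             # (row, col) centroid in the image
    primary: dict                  # geometric + spectral attributes
    secondary: dict = field(default_factory=dict)  # expert labels
    tertiary: dict = field(default_factory=dict)   # e.g. NDVI, NDWI

    def attribute_vector(self) -> dict:
        """Flatten all attribute groups into a single tagged vector."""
        return {**self.primary, **self.secondary, **self.tertiary}
```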

    [0157] Each crop feature and its attributes can therefore be related to the same crop feature in other images taken at different points in time through the temporal stamp and location in the image. Attributes can be combined to generate unique composite or compound attributes containing richer information. For example, the observation or detection of a specific spectral signature or colour together with a specific aspect ratio may provide a stronger indicator of a certain disease or weed than any one single attribute taken in isolation. In particular, the external properties of some crops/features correlate with certain diseases. For example, blackgrass exhibits darker matter on the plant/image, which can be readily identified when combined with certain geometric properties extracted from the crop feature.
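The blackgrass example above can be sketched as a simple composite-attribute rule: a spectral cue (darker matter, i.e. low mean brightness) combined with a geometric cue (an elongated blade shape). The thresholds and attribute names below are purely illustrative assumptions; in practice such a rule would be learned or calibrated, not hard-coded.

```python
def blackgrass_indicator(attrs: dict) -> bool:
    """Composite attribute: a spectral cue AND a geometric cue together
    give a stronger indicator than either attribute in isolation.
    Threshold values are illustrative only."""
    dark = attrs.get("mean_brightness", 1.0) < 0.25      # darker matter
    elongated = attrs.get("aspect_ratio", 1.0) > 4.0     # blade-like shape
    return dark and elongated
```

Either cue alone (a dark but round object, or a bright elongated one) does not fire the indicator, which is precisely the point of combining attributes.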

    [0158] In step 360, this sub-set of labelled classified crop features is stored in the database 220 as a training data set for the machine learning model. The remaining crop features in the training images are used as test data for the machine learning model. The training data set includes at least the minimum number of classified features needed to be statistically relevant, which may be about 30% of the classified crop features.

    [0159] The machine learning model can then be trained to identify crop features in an image and to determine one or more crop feature attributes for each identified crop feature, using the training data set and test data in the usual manner. Training is initially supervised, and can then become semi-autonomous. The result is a bespoke machine learning model trained on the most appropriate training data engineered for the specific field camera. This machine learning model can then be used in the crop monitoring method 100 described above.
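The supervised training step can be illustrated with a deliberately minimal stand-in model. The patent does not specify a model architecture, so the nearest-centroid classifier below is an assumption chosen only because it fits in a few lines: it learns one centroid per class in attribute-vector space from the labelled subset, and classifies held-out features by distance to the nearest centroid.

```python
import numpy as np

def train_nearest_centroid(X: np.ndarray, y: list) -> dict:
    """Fit one centroid per class from the labelled training subset.
    X: (n_samples, n_attributes) attribute vectors; y: class labels.
    A minimal illustrative stand-in for the trained model, not the
    architecture disclosed in the patent."""
    labels = np.array(y)
    return {c: X[labels == c].mean(axis=0) for c in sorted(set(y))}

def predict(centroids: dict, x: np.ndarray) -> str:
    """Assign the class whose centroid is nearest in attribute space."""
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))
```

The held-out (unlabelled-subset) features would be pushed through `predict` to estimate accuracy, standing in for the test-data evaluation described in step 360.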

    [0160] Steps 330 to 360 can be repeated for any number of field cameras, to populate the database 220.

    [0161] From reading the present disclosure, other variations and modifications will be apparent to the skilled person. Such variations and modifications may involve equivalent and other features which are already known in the art, and which may be used instead of, or in addition to, features already described herein.

    [0162] Although the appended claims are directed to particular combinations of features, it should be understood that the scope of the disclosure of the present invention also includes any novel feature or any novel combination of features disclosed herein either explicitly or implicitly or any generalisation thereof, whether or not it relates to the same invention as presently claimed in any claim and whether or not it mitigates any or all of the same technical problems as does the present invention.

    [0163] Features which are described in the context of separate embodiments may also be provided in combination in a single embodiment. Conversely, various features which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable sub-combination.

    [0164] For the sake of completeness it is also stated that the term “comprising” does not exclude other elements or steps, the term “a” or “an” does not exclude a plurality, and any reference signs in the claims shall not be construed as limiting the scope of the claims.