You are provided with a tabular training dataset of 44,298 soil samples collected at different locations. Features include the geo-location, the date, the depth and the extractable amount of each nutrient. A full description of the dataset and its features is given in dataset_data_dictionary.csv.
In addition, you are required to source and download satellite data to complement the provided dataset. No other public datasets are allowed. We encourage you to share the code used for downloading the data so it is accessible to all participants. Your solution must also include the preprocessing code for any additional data.
Here is a list of suggested sources for extra data:
- Sentinel-2 (ESA)
- Landsat (NASA)
- CHIRPS Climate Data
- WorldClim Climate Data
- FAOSTAT and KNBS Agricultural Statistics
- IFPRI / HarvestChoice Crop Maps
Please pay attention to the file called TargetPred_To_Keep.csv. For some elements, there are no reference values at specific locations. The corresponding entries are denoted with zeroes. Before submitting your submission file, ensure that the values predicted for these entries are 0 so that they won't affect your final score.
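
The zeroing step can be sketched as follows. This is a minimal illustration, not an official script: it assumes that TargetPred_To_Keep.csv and your predictions share an identifier column (called `ID` here, a hypothetical name) and the same nutrient columns, so check the real headers before using it. The helper `zero_out_unscored` and the toy columns `N` and `P` are made up for the example.

```python
import pandas as pd


def zero_out_unscored(preds: pd.DataFrame, keep: pd.DataFrame,
                      id_col: str = "ID") -> pd.DataFrame:
    """Set a prediction to 0 wherever the keep table holds a 0.

    Assumes both frames have the column `id_col` (hypothetical name) and
    that every nutrient column in `preds` also exists in `keep`.
    """
    # Align the keep table to the order of the prediction rows.
    keep_aligned = keep.set_index(id_col).reindex(preds[id_col])
    nutrient_cols = [c for c in preds.columns if c != id_col]

    # True where the keep table marks "no reference value" with a zero.
    no_reference = (keep_aligned[nutrient_cols] == 0).to_numpy()

    out = preds.copy()
    out[nutrient_cols] = out[nutrient_cols].mask(no_reference, 0.0)
    return out


# Toy data standing in for the real files (values are illustrative only).
preds = pd.DataFrame({"ID": ["a", "b"], "N": [1.5, 2.0], "P": [0.3, 0.4]})
keep = pd.DataFrame({"ID": ["b", "a"], "N": [7.0, 0.0], "P": [0.0, 3.0]})
masked = zero_out_unscored(preds, keep)
```

With the real files, `keep` would come from `pd.read_csv("TargetPred_To_Keep.csv")`. Aligning on the identifier rather than on row position guards against the two tables being sorted differently.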