EY Biodiversity Challenge

$3 500 USD
~2 months left
Classification
Feature Engineering
Geospatial Data
Geospatial Analysis
588 joined
170 active
Start
Mar 27, 26
Close
May 24, 26
Reveal
May 24, 26
Clarification on use of external data
1 Apr 2026, 10:24 · 2

Hello, I noticed that the tip in the benchmark notebook mentions exploring "other climate or environmental datasets" beyond TerraClimate, but the competition rules in the Info tab say "you may use only the datasets provided."

So I want to confirm whether the use of external datasets is allowed as long as they are public.

Discussion 2 answers
Koleshjr
Multimedia University of Kenya

From the data page

Note:

  • Participants should NOT use latitude and longitude as “predictor” variables in their model. Using these data will create a spatially autocorrelated model that is not applicable to other regions and thus not generalized. The latitude and longitude locations should only be used to query the TerraClimate dataset to understand the environmental conditions surrounding the specific location.
  • You may only use the TerraClimate dataset as a source for your “predictor” variables. It is suggested that participants utilize all the TerraClimate variables and consider alterations to the time window or statistical variations of the variables. Also, variable scaling and normalization should be considered.
  • Students may use any common machine learning technique. This might include Random Forest, SVM, CNN, or regression variations.
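
For what it's worth, here is a rough sketch of how I read those notes in practice: coordinates are dropped before modelling, each TerraClimate series is summarized over a time window, and the resulting features are standardized. The sample values, the choice of `tmax` and `ppt`, and the helper names `summarize` and `standardize` are all my own assumptions for illustration, not anything from the organizers' notebook.

```python
from statistics import mean, pstdev

# Hypothetical rows: coordinates plus monthly TerraClimate-style series.
# All numbers are made up for illustration only.
samples = [
    {"latitude": -33.9, "longitude": 151.2,
     "tmax": [26.0, 24.0, 22.0], "ppt": [90.0, 120.0, 60.0]},
    {"latitude": -37.8, "longitude": 145.0,
     "tmax": [25.0, 21.0, 17.0], "ppt": [45.0, 60.0, 75.0]},
    {"latitude": -27.5, "longitude": 153.0,
     "tmax": [29.0, 28.0, 27.0], "ppt": [150.0, 130.0, 140.0]},
]

# Coordinates are only for querying TerraClimate, never predictors.
COORD_KEYS = {"latitude", "longitude"}


def summarize(sample):
    """Collapse each monthly series into mean/min/max over the time window."""
    features = {}
    for name, series in sample.items():
        if name in COORD_KEYS:
            continue  # skip latitude/longitude entirely
        features[f"{name}_mean"] = mean(series)
        features[f"{name}_min"] = min(series)
        features[f"{name}_max"] = max(series)
    return features


def standardize(rows):
    """Z-score each feature column (population std); constant columns map to 0."""
    scaled = [dict() for _ in rows]
    for key in rows[0]:
        column = [row[key] for row in rows]
        mu, sigma = mean(column), pstdev(column)
        for out, value in zip(scaled, column):
            out[key] = (value - mu) / sigma if sigma else 0.0
    return scaled


features = [summarize(s) for s in samples]
scaled_features = standardize(features)
```

The `scaled_features` rows could then be passed to whichever model you prefer (Random Forest, SVM, and so on); changing the length of each series is one way to try a different time window.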
1 Apr 2026, 10:26
Upvotes 0

Okay, thanks for the response. I just needed clarification because the provided notebook contradicts that statement.