Multilingual Health Question Answering in Low-Resource African Languages Challenge

$5 000 USD
~2 months left
Large Language Models
NLP
278 joined
16 active
Start: Apr 30, 26
Close: Jun 21, 26
Reveal: Jun 21, 26
About

The training dataset contains maternal, sexual and reproductive health (MSRH) question-and-answer pairs in four African languages (Akan, Amharic, Luganda and Swahili) and in English, spanning nine language-country configurations. It comprises approximately 29,815 training records and 6,686 validation records. It is suited to sequence-to-sequence tasks such as health question answering and text generation in low-resource African languages.

The test dataset follows the same structure. It consists of 2,618 records in total. Unlike the training data, this dataset contains only the input health questions. Participants must use their trained model to generate the corresponding answers, which will be evaluated against the reference answers.
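
As a rough illustration of the workflow described above (read the test questions, generate one answer per question, write the result out), here is a minimal Python sketch. The column names `ID`, `question` and `answer`, the `generate_answer` placeholder and the in-memory sample rows are all assumptions made for illustration; the actual headers and layout are defined by the competition's test and sample submission files.

```python
import csv
import io

# Hypothetical column names; the real files may use different headers.
ID_COL, QUESTION_COL, ANSWER_COL = "ID", "question", "answer"


def generate_answer(question: str) -> str:
    """Placeholder standing in for a trained model's inference call."""
    return "placeholder answer for: " + question


def build_submission(test_csv: str) -> str:
    """Read test questions from CSV text and return submission CSV text."""
    reader = csv.DictReader(io.StringIO(test_csv))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=[ID_COL, ANSWER_COL], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # One generated answer per input question, keyed by the record ID.
        writer.writerow(
            {ID_COL: row[ID_COL], ANSWER_COL: generate_answer(row[QUESTION_COL])}
        )
    return out.getvalue()


# Tiny invented sample in place of the real test file.
sample_test = (
    "ID,question\n"
    "q1,What is antenatal care?\n"
    "q2,How often should I visit the clinic?\n"
)
submission = build_submission(sample_test)
print(submission)
```

In practice `generate_answer` would be replaced by a call to the trained model, and the output columns should be checked against the sample submission file before uploading.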

No additional information is provided.

Files
This file contains the question-and-answer pairs needed to train your model.
This file contains question-and-answer pairs for model validation.
This file contains the questions that your trained model should answer.
This file shows the format and structure of the submission file.