SciRAG-QA: Multi-domain Closed-Question Benchmark Dataset for Scientific QA

Description

One of the most impactful recent applications of the growing capabilities of Large Language Models (LLMs) is their use in Retrieval-Augmented Generation (RAG) systems. RAG applications are inherently more robust against LLM hallucinations and provide source traceability, which is critical in the scientific reading and writing process. However, validating such systems is essential given the stringent requirements of the scientific domain. Existing benchmark datasets are limited in the scope of research areas they cover, often focusing on the natural sciences, which restricts their applicability for validating RAG systems in other scientific fields. To address this gap, we present a closed-question answering (QA) dataset for benchmarking scientific RAG applications. The dataset spans 34 research topics across 10 distinct areas of study and includes 108 manually curated question-answer pairs, each annotated with an answer type, a difficulty level, and a gold reference, along with a link to the source paper. Further details on each of these attributes can be found in the accompanying README.md file.

Please cite the following publication when using the dataset: TBD
The publication is available at: TBD
A preprint version of the publication is available at: TBD

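As a rough illustration of how the annotated attributes might be consumed, the sketch below loads the QA pairs and tallies the difficulty levels. The file name and column names used here are assumptions for illustration only; the actual file layout and schema are documented in the dataset's README.md.

    # Minimal sketch, assuming the QA pairs ship as a CSV file named
    # "sciragqa.csv" with columns such as "question", "answer",
    # "answer_type", "difficulty", "gold_reference", and "source_paper".
    # These names are hypothetical; consult README.md for the real schema.
    import csv
    from collections import Counter

    with open("sciragqa.csv", newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

    # The dataset description states 108 manually curated pairs.
    print(f"{len(rows)} question-answer pairs")

    # Distribution of the annotated difficulty levels across the pairs.
    print(Counter(row["difficulty"] for row in rows))
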
Year of publication

2024

Authors

Zenodo - Publisher

Mahira Ibnath Joytu - Creator

Md Raisul Kibria - Creator

Sébastien Lafond - Creator

Other information

Fields of science

Computer and information sciences

Open access

Open

License

Creative Commons Attribution 4.0 International (CC BY 4.0)
