Hungarian-Colorectal-Screening | Digital pathological slides from Hungarian colorectal cancer screening
DOI: 10.7937/TCIA.9CJF-0127 | Data Citation Required | Image Collection
Location | Species | Subjects | Data Types | Cancer Types | Size | Supporting Data | Status | Updated |
---|---|---|---|---|---|---|---|---|
Colon | Human | 200 | Histopathology, Demographic, Diagnosis | Colorectal Cancer | | Clinical, Image Analyses | Public, Complete | 2022/09/20 |
Summary
This collection publishes 200 digital whole-slide images of hematoxylin-eosin stained colorectal biopsies. The dataset contains the raw data in MIRAX (mrxs) format. The samples were selected from the archives of the 2nd Department of Pathology of Semmelweis University, Budapest, and were scanned with a 3DHistech Pannoramic 1000 Digital Slide Scanner at the highest available magnification, 40x. This is a single-center dataset, which ensures consistent and homogeneous data processing and patient handling. The related publication shows how these data can be used to train an artificial neural network to detect pathological conditions.
Data Access
Version 2: Updated 2022/09/20
Added missing Data0033.dat files from folders 094, 158, 170, & 186
Title | Data Type | Format | Access Points | Subjects | License | |||
---|---|---|---|---|---|---|---|---|
Tissue Slide Images | Histopathology | MRXS and DAT | Download requires IBM-Aspera-Connect plugin | 200 | 200 | CC BY 4.0 |
Clinical data | Demographic, Diagnosis | CSV | CC BY 4.0 |
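The Version 2 note above mentions Data0033.dat files that were missing from some folders. The following is a minimal sketch of how a downloaded slide folder could be checked for gaps in its data files. It assumes the files are named DataNNNN.dat and numbered without gaps starting at 0000; the function name `missing_data_files` is hypothetical and not part of any TCIA or 3DHistech tool.

```python
import re
from pathlib import Path

# Assumption: data files are named DataNNNN.dat and numbered
# consecutively from 0000 within each slide folder.
DATA_FILE_PATTERN = re.compile(r"^Data(\d{4})\.dat$")


def missing_data_files(slide_dir):
    """Return names of DataNNNN.dat files that leave a gap in the numbering."""
    present = set()
    for entry in Path(slide_dir).iterdir():
        match = DATA_FILE_PATTERN.match(entry.name)
        if match:
            present.add(int(match.group(1)))
    if not present:
        return []
    # Every index from 0 up to the highest one found is expected to exist.
    expected = range(max(present) + 1)
    return [f"Data{i:04d}.dat" for i in expected if i not in present]
```

Calling `missing_data_files` on a slide folder such as 094 returns an empty list when no gap is found. A gap check of this kind cannot detect a missing file at the end of the sequence, so it is only a first sanity check.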
Citations & Data Usage Policy
Data Citation Required: Users must abide by the TCIA Data Usage Policy and Restrictions. Attribution must include the following citation, including the Digital Object Identifier:
Data Citation
Pataki, B. Á., Olar, A., Ribli, D., Pesti, A., Kontsek, E., Gyöngyösi, B., Bilecz, Á., Kovács, T., Kovács, K. A., Kramer, Z., Kiss, A., Szócska, M., Pollner, P., & Csabai, I. (2021). Digital pathological slides from Hungarian (Europe) colorectal cancer screening (Version 2) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.9CJF-0127
Detailed Description
Note about the data:
According to the article, these data consist of hematoxylin and eosin (H&E) stained whole-slide images (WSI) with a resolution of 0.1213 μm/pixel, acquired with a 3DHistech Pannoramic 1000 Digital Slide Scanner.
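The stated resolution can be used to convert between pixel counts and physical distances. The sketch below shows the arithmetic. The function names and the `downsample` parameter are illustrative assumptions, where `downsample` is the factor by which a lower-resolution pyramid level is reduced relative to the full-resolution image.

```python
# Full-resolution pixel size stated in the dataset description.
MICRONS_PER_PIXEL = 0.1213


def pixels_to_microns(pixels, downsample=1.0):
    """Physical length in micrometers covered by `pixels` at a given downsample."""
    return pixels * MICRONS_PER_PIXEL * downsample


def microns_to_pixels(microns, downsample=1.0):
    """Nearest whole number of pixels covering `microns` at a given downsample."""
    return round(microns / (MICRONS_PER_PIXEL * downsample))


def tile_area_mm2(width_px, height_px, downsample=1.0):
    """Physical area of a rectangular tile in square millimeters."""
    width_mm = pixels_to_microns(width_px, downsample) / 1000.0
    height_mm = pixels_to_microns(height_px, downsample) / 1000.0
    return width_mm * height_mm
```

For example, a 512-pixel tile edge at full resolution spans about 62.1 μm, and a 100 μm structure spans roughly 824 pixels.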
ICD10 explanations
The supporting CSV metadata contains ICD10 codes for each slide. Below are some helpful links about this standard and the differences you might see depending on which variant is used: the international ICD10 codes (4 characters), the Hungarian ICD10 codes (5 characters), or the extended version of ICD10 used by the 2nd and 1st Departments of Pathology at Semmelweis University (6 characters).
Introduction to ICD10:
https://www.cdc.gov/nchs/data/dvs/icd10fct.pdf
https://icd.who.int/browse10/Content/statichtml/ICD10Volume2_en_2019.pdf
Code structure, restricted characters, and code length (note that the Hungarian variant does not use * and – ):
https://datadictionary.nhs.uk/data_elements/icd-10_code.html
https://www.cdc.gov/nchs/icd/icd10cm_pcs_background.htm
Further notes:
Some ICD10 codes in the list are shorter than 6 characters. These are older samples from when the institute used only 5 characters.
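When working with the CSV, it may help to group codes by length before comparing them. The following is a minimal sketch that labels a code by its character count, following the lengths given above. It assumes each code is stored as a plain string whose length equals its character count; the function names and the variant labels are illustrative, not an official mapping.

```python
from collections import Counter

# Lengths taken from the description above; labels are illustrative.
VARIANT_BY_LENGTH = {
    4: "international",
    5: "Hungarian",
    6: "Semmelweis extended",
}


def icd10_variant(code):
    """Label an ICD10 code by its character count, ignoring outer whitespace."""
    cleaned = code.strip()
    return VARIANT_BY_LENGTH.get(len(cleaned), "unknown")


def summarize_variants(codes):
    """Count how many codes fall into each length-based variant."""
    return Counter(icd10_variant(code) for code in codes)
```

Passing the ICD10 column of the metadata to `summarize_variants` gives a quick view of how many slides carry the older 5-character codes mentioned in the note above.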
Acknowledgements
The research was financed by the Thematic Excellence Programme (Tématerületi Kiválósági Program, 2020-4.1.1.-TKP2020) of the Ministry for Innovation and Technology in Hungary, within the framework of the DigitalBiomarker thematic programme of the Semmelweis University. This work was supported by the National Research, Development and Innovation Office of Hungary grants OTKA 128881 and K128780, the National Quantum Technologies Program and the Hungarian Artificial Intelligence National Laboratory.
Related Publications
Publications by the Dataset Authors
The authors recommended the following as the best source of additional information about this dataset:
Publication Citation
Pataki, B. Á., Olar, A., Ribli, D., Pesti, A., Kontsek, E., Gyöngyösi, B., Bilecz, Á., Kovács, T., Kovács, K. A., Kramer, Z., Kiss, A., Szócska, M., Pollner, P., & Csabai, I. (2022). HunCRC: annotated pathological slides to enhance deep learning applications in colorectal cancer screening. In Scientific Data (Vol. 9, Issue 1). https://doi.org/10.1038/s41597-022-01450-y
No other publications were recommended by dataset authors.
Research Community Publications
TCIA maintains a list of publications that leveraged this dataset. If you have a manuscript you’d like to add, please contact TCIA’s Helpdesk.
Previous Versions
Version 1: Updated 2022/06/03
Title | Data Type | Format | Access Points | License | ||||
---|---|---|---|---|---|---|---|---|
Images | MRXS | Download requires IBM-Aspera-Connect plugin | CC BY 4.0 |
Clinical Data | CSV | CC BY 4.0 |