The Cancer Imaging Archive

Breast-Lesions-USG | A Curated Benchmark Dataset for Ultrasound Based Breast Lesion Analysis

DOI: 10.7937/9WKK-Q141 | Data Citation Required | Image Collection

Location | Species | Subjects | Data Types | Cancer Types | Size | Status | Updated
Breast | Human | 256 | Segmentations, US | Breast Cancer | 66.67MB | Public, Complete | 2024/01/08

Summary

This dataset consists of 256 breast ultrasound scans collected from 256 patients, containing 266 segmented benign and malignant lesions. It includes patient-level labels, image-level annotations, and tumor-level labels, with all cases confirmed by follow-up care or biopsy result. Each scan was manually annotated and labeled by a radiologist experienced in breast ultrasound examination. In particular, each tumor was identified in the image via a freehand annotation and labeled according to BIRADS features. The tumor histopathological classification is stated for patients who underwent a biopsy. Patient-level labels include clinical data such as age, breast tissue composition, and signs and symptoms. Image-level freehand annotations identify the tumor and other abnormal areas in the image. The tumor and image are labeled with a BIRADS category, 7 BIRADS descriptors, and an interpretation of critical findings as the presence of breast diseases. Additional labels include the method of verification, tumor classification, and histopathological diagnosis.
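The three label levels described above can be pictured as nested records. The following Python sketch is illustrative only: the class names, field names, and example values are hypothetical and do not reflect the actual column names in the clinical data spreadsheet or the file layout of the collection.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Lesion:
    """Tumor-level labels (hypothetical field names)."""
    birads_category: str
    descriptors: Dict[str, str]           # BIRADS descriptor name -> value
    classification: str                   # e.g. "benign" or "malignant"
    verification: str                     # e.g. "biopsy" or "follow-up"
    histopathology: Optional[str] = None  # stated only when a biopsy was done


@dataclass
class Scan:
    """One ultrasound scan with its patient-level data and lesions."""
    patient_id: str
    age: Optional[int] = None
    tissue_composition: Optional[str] = None
    symptoms: List[str] = field(default_factory=list)
    lesions: List[Lesion] = field(default_factory=list)


def count_lesions(scans: List[Scan]) -> int:
    """Total number of annotated lesions across all scans."""
    return sum(len(scan.lesions) for scan in scans)
```

Because a scan holds a list of lesions, this layout allows for more lesions than scans, as in the 266 lesions across 256 scans reported here.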

Because machine learning and theoretical computing play an indisputable role in developing augmented inference for cancer detection, the quality of the data used to develop explainable augmented inference methods is extremely important. This dataset can be used as an external testing set for assessing a model’s performance, and for developing explainable AI or supervised machine learning models for the detection, segmentation, and classification of breast abnormalities in ultrasound images.
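As one illustration of external testing for segmentation, a predicted mask can be compared with a reference annotation using the Dice coefficient. The sketch below is a minimal example in plain Python. The choice of Dice as the metric is an assumption made here for illustration, not something prescribed by the dataset, and masks are assumed to be already loaded as equally sized 2D grids of 0/1 values.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[Sequence[int]],
                     truth: Sequence[Sequence[int]]) -> float:
    """Dice overlap between two binary masks given as 2D grids of 0/1."""
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")

    intersection = pred_area = truth_area = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p:
                pred_area += 1
            if t:
                truth_area += 1
            if p and t:
                intersection += 1

    total = pred_area + truth_area
    if total == 0:
        # Both masks are empty; treat this as perfect agreement.
        return 1.0
    return 2.0 * intersection / total
```

Returning 1.0 for two empty masks is a convention chosen for this sketch; some evaluations exclude such cases instead.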

A detailed description of this dataset can be found in the following publication, which should be cited along with the data citation: Pawłowska, A., Ćwierz-Pieńkowska, A., Domalik, A., Jaguś, D., Kasprzak, P., Matkowski, R., Fura, Ł., Nowicki, A., & Zolek, N. A Curated benchmark dataset for ultrasound based breast lesion analysis. Sci Data 11, 148 (2024). https://doi.org/10.1038/s41597-024-02984-z.

Data Access

Version 1: Updated

Title Data Type Format Access Points Subjects Studies Series Images License
Images and segmentations Segmentations, US PNG and ZIP 256 256 522 CC BY 4.0
Clinical data XLSX CC BY 4.0

Additional Resources for this Dataset

The following external resources have been made available by the data submitters. These are not hosted or supported by TCIA, but may be useful to researchers using this collection.

Citations & Data Usage Policy

Data Citation Required: Users must abide by the TCIA Data Usage Policy and Restrictions. Attribution must include the following citation, including the Digital Object Identifier:

Data Citation

Pawłowska, A., Ćwierz-Pieńkowska, A., Domalik, A., Jaguś, D., Kasprzak, P., Matkowski, R., Fura, Ł., Nowicki, A., & Zolek, N. (2024). A Curated Benchmark Dataset for Ultrasound Based Breast Lesion Analysis (Breast-Lesions-USG) (Version 1) [dataset]. The Cancer Imaging Archive. https://doi.org/10.7937/9WKK-Q141

Acknowledgements

We would like to acknowledge the individuals and institutions that have provided data for this collection:

  • The preparation of the dataset was supported by the National Centre for Research and Development, project INFOSTRATEG-I/0042/2021.

Other Publications Using this Data

TCIA maintains a list of publications that leverage TCIA data. If you have a manuscript you’d like to add, please contact TCIA’s Helpdesk.

Publication Citation

Pawłowska, A., Ćwierz-Pieńkowska, A., Domalik, A., Jaguś, D., Kasprzak, P., Matkowski, R., Fura, Ł., Nowicki, A., & Zolek, N. A Curated benchmark dataset for ultrasound based breast lesion analysis. Sci Data 11, 148 (2024). https://doi.org/10.1038/s41597-024-02984-z