What is TCIA?
TCIA is a service that de-identifies and hosts a large archive of medical images of cancer, accessible for public download. The data are organized as “Collections”, typically patients related by a common disease (e.g. lung cancer), image modality (MRI, CT, etc.) or research focus. DICOM is the primary file format used by TCIA for image storage. Supporting data related to the images, such as patient outcomes, treatment details, genomics, pathology, and expert analyses, are also provided when available.
New Collection proposals (primary data) are reviewed by the TCIA Advisory Group. If approved, the Data Collection Center (DCC) provides hands-on support to image providers to de-identify and curate their data. After the data have been processed, they are made available to users in four ways:
- Collection summary pages, accessible from the home page, provide a detailed explanation of each data set as well as direct download links to quickly obtain all images and supporting data for a given Collection.
- The Data Portal provides more advanced searching, browsing and filtering capabilities to select image subsets or download images from multiple Collections which meet search criteria.
- The Programmatic Interface (REST API) allows software developers to build access to TCIA data into their scripts and applications.
- TCIA also encourages the creation of Data Analysis Centers (DACs) that provide additional capabilities for visualizing or analyzing TCIA data, either by connecting to the TCIA Programmatic Interface (REST API) or by mirroring our Collections.
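As a rough sketch of what building TCIA access into a script might look like, the Python snippet below only constructs a query URL; it does not contact any server. The base URL, the `getSeries` endpoint name, the `Collection` and `Modality` parameter names, and the collection value are all assumptions for illustration and should be checked against the current API documentation before use.

```python
from urllib.parse import urlencode

# Assumed base URL; confirm against the current TCIA API documentation.
BASE_URL = "https://services.cancerimagingarchive.net/nbia-api/services/v1"


def build_query_url(endpoint, **params):
    """Return a GET URL for an endpoint, leaving out parameters set to None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{BASE_URL}/{endpoint}"
    return f"{url}?{query}" if query else url


# Hypothetical endpoint, parameter names, and collection name.
series_url = build_query_url("getSeries", Collection="Example-Collection", Modality="MR")
print(series_url)
```

Keeping URL construction in one small function makes it easy to swap in the real endpoint names and to add an HTTP client later without touching the calling code.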
To enhance the value of TCIA’s primary data collections, we also encourage the research community to publish their analysis results. Potential analyses could include tumor segmentations, radiomics features, derived/reprocessed images, and radiologist assessments. You can view the analyses published by other TCIA users in our Analysis Results directory.
A huge number of clinical and research images are collected each year. TCIA organizes and catalogs the images so that they may be used by the research community for a variety of purposes.
- Cancer researchers can use this data to test new hypotheses and develop new analysis techniques to advance our scientific understanding of cancer.
- Engineers and developers can build new analysis tools and techniques using this data as test material for developing and validating algorithms.
- Professors can use it as a teaching tool for introducing students to medical imaging technology and cancer phenotypes.
- The general public can see how cancer appears in diagnostic images and learn about the instruments doctors use to diagnose cancer and measure the success of treatment.
TCIA is designed to make searching, reviewing and downloading DICOM data for research quick and easy. Each collection is linked to its own Wiki page that contains information about the source(s), metadata available, and envisioned research purposes of the data.
- Searching for images – The simple search page allows filtering by collection name, date of image availability, image modality, and encoded patient ID. An advanced search can filter on modality manufacturer, software version, and several additional DICOM elements. For complex queries, a dynamic search option allows querying over 90 elements of the DICOM files. Searches can be saved for future reference.
- Reviewing the results – There are multiple ways to review the data prior to download. These include JPEG thumbnail previews, an interactive Cine tool for scanning quickly through the images, and a link to the full DICOM header showing the elements contained within the series.
- Downloading the data – Images are placed in your download basket for saving to your local computer. The archive utilizes a Java Web Start applet to quickly download the images to the desired location.
- Referencing the data – A ‘Shared List’ feature allows fixed sets of images to be referenced from emails and publications.
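The simple search criteria listed above (collection, modality, patient ID, availability date) can be pictured as a conjunction of optional filters. The following is a minimal sketch over in-memory records, not the Data Portal's implementation; the `SeriesRecord` type, the `simple_search` function, and all record values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SeriesRecord:
    """Hypothetical summary of one image series; not a TCIA data structure."""
    collection: str
    modality: str
    patient_id: str
    available: date


def simple_search(records, collection=None, modality=None,
                  patient_id=None, available_since=None):
    """Return records that satisfy every criterion supplied (None = no filter)."""
    return [
        r for r in records
        if (collection is None or r.collection == collection)
        and (modality is None or r.modality == modality)
        and (patient_id is None or r.patient_id == patient_id)
        and (available_since is None or r.available >= available_since)
    ]


# Made-up records for demonstration only.
records = [
    SeriesRecord("Lung-Example", "CT", "P001", date(2014, 3, 1)),
    SeriesRecord("Lung-Example", "MR", "P002", date(2015, 6, 15)),
    SeriesRecord("Brain-Example", "MR", "P003", date(2016, 1, 10)),
]

recent_mr = simple_search(records, modality="MR", available_since=date(2016, 1, 1))
print([r.patient_id for r in recent_mr])
```

Treating each criterion as optional mirrors how a search form behaves when fields are left blank: an empty field places no restriction on the results.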
TCIA addresses the technical and policy challenges faced by sites wanting to make image data available for public research.
- DICOM PS 3.15 Compliance – All data submitted to the archive is processed with the RSNA’s Clinical Trials Processor software using de-identification scripts that leverage the Attribute Confidentiality Profile (DICOM PS 3.15, Annex E), the official DICOM standard for clinical trials image de-identification.
- Curation and Quality Control – A team of subject matter experts performs curation and quality control on every image submitted to the archive. This review ensures that no protected health information ever makes it into the archive, while verifying that metadata critical to research analysis is not mistakenly removed.
- Submission support – A submission helpdesk is available to assist submitters every step of the way. This includes providing tools for analyzing data sets as well as customizing and pre-configuring the submission software specifically for that data.
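To give a feel for what profile-driven de-identification involves, the sketch below applies a small action table to a header represented as a plain Python dictionary. It is not the Clinical Trials Processor and not TCIA's scripts. The action codes follow the general style of the confidentiality profile (remove, blank, replace UID, keep), but the specific assignments, the `deidentify` and `remap_uid` names, the salt, and the choice to keep unlisted attributes are assumptions made for illustration.

```python
import hashlib

# Illustrative action table keyed by DICOM attribute keyword.
# "X" = remove, "Z" = replace with an empty value, "U" = replace with a consistent UID.
# These assignments are for demonstration, not a copy of the PS 3.15 table.
ACTIONS = {
    "PatientName": "Z",
    "PatientID": "Z",
    "PatientBirthDate": "X",
    "StudyInstanceUID": "U",
}


def remap_uid(uid, salt):
    """Derive a stable replacement UID so the same input always maps to the same output."""
    digest = hashlib.sha256((salt + uid).encode("utf-8")).hexdigest()
    # Use 128 bits of the hash as a decimal integer under the "2.25" prefix.
    return "2.25." + str(int(digest[:32], 16))


def deidentify(header, salt="example-salt"):
    """Return a cleaned copy of the header; the input dictionary is not modified."""
    cleaned = {}
    for keyword, value in header.items():
        action = ACTIONS.get(keyword, "K")  # assumption: unlisted attributes are kept
        if action == "X":
            continue
        if action == "Z":
            cleaned[keyword] = ""
        elif action == "U":
            cleaned[keyword] = remap_uid(value, salt)
        else:
            cleaned[keyword] = value
    return cleaned


# Made-up header values for demonstration only.
header = {
    "PatientName": "DOE^JANE",
    "PatientID": "12345",
    "PatientBirthDate": "19700101",
    "StudyInstanceUID": "1.2.3.4.5",
    "Modality": "CT",
}
print(deidentify(header))
```

Mapping UIDs deterministically rather than randomly keeps images from the same study linked after de-identification. A table-driven approach like this also shows why manual curation is still needed: attributes outside the table, and identifying text inside free-text fields or pixel data, are not handled by it.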
Learn more about what to expect as an image provider.
Most of the data sets in TCIA are accessible to all users. However, TCIA provides the security support to limit access to data sets where appropriate. For example, if a data set needs to be shared on TCIA among collaborators for preliminary analysis, TCIA will support this. At the time of submission, the data provider can provide the TCIA team with a list of collaborators who will have exclusive access. Having the data mounted on TCIA can facilitate making the data public at a later date as well if desired.
Not directly, but we hope that, by promoting new discovery, the scientists, engineers and physicians who use this data will find new ways to fight this disease.
TCIA is a project funded by the Cancer Imaging Program of the National Cancer Institute that uses NBIA software at its core. TCIA provides services to extend the utility of NBIA, including a staff dedicated to curation and quality control for image submissions, a helpdesk to address end-user questions, and a Wiki site for hosting metadata related to the images.