Analysis results, clinical trial data, and other non-image information are also managed by TCIA. A full list of the supporting data that complements the available images can be found on the Collections wiki page, and the details for a given collection are available from that collection's own wiki page. The data itself is sometimes hosted directly on the wiki. Other times it is provided as additional series/annotations that are downloaded in the same way as the image data, or it may be hosted on external web sites.
Supporting data on the wiki
In some cases the supporting data is provided to us in the form of a CSV/XLS file or other relatively small file that can simply be attached to the collection wiki page. For example, the Prostate-Diagnosis wiki page provides links to files that contain clinical metadata and multiple NRRD 3D segmentations.
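Once such a CSV file has been downloaded, it can be read with a few lines of code. The following is only a minimal sketch in Python: the column names (patient_id, age, stage) and the sample rows are made up for illustration and do not describe the actual contents of any TCIA collection file.

```python
import csv
import io

# Made-up stand-in for a clinical metadata CSV attached to a collection
# wiki page. The column names and values are hypothetical.
SAMPLE_CSV = """patient_id,age,stage
SUBJ-0001,61,II
SUBJ-0002,58,III
"""


def load_clinical_metadata(csv_file):
    """Return a dict mapping each patient ID to that patient's row."""
    reader = csv.DictReader(csv_file)
    return {row["patient_id"]: row for row in reader}


# In practice you would pass open("your_file.csv", newline="") instead
# of the in-memory sample used here.
metadata = load_clinical_metadata(io.StringIO(SAMPLE_CSV))
print(metadata["SUBJ-0001"]["stage"])
```

Keying the rows by patient ID makes it straightforward to look up the clinical record that belongs to a given set of images.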
Supporting data downloaded with the image data
Supporting data is sometimes accessible directly via TCIA’s search interface along with the images. As an example, if you select the Collection named ‘QIN Breast DCE-MRI’ and then select the Submit button, a list of subjects will appear. Select the Subject ID ‘QIN-Breast-DCE-MRI-BC01’ by clicking on the Show Studies link. A table appears containing a list of series and descriptions. Under the ‘Description’ column, you will find Matlab Visit 1 and NIfTI Visit 1 files attached as annotations.
You can add either or both of these annotations to your Basket by selecting the corresponding check box and then selecting the ‘Add To Basket’ button. You can then select the ‘View My Basket’ button to see the file(s) you selected to download. Note that the Data Basket includes columns that specify the file size of the annotations in addition to the size of the images.
Supporting data via other web sites
In addition to hosting supporting data directly, we sometimes link to external web sites. A prime example of this is our partnership with The Cancer Genome Atlas (TCGA). The TCGA Data Portal stores extensive genomic, clinical and pathology data for patients in our TCGA image collections. The patient identifiers are the same in both the TCIA and TCGA systems, so the information held about a patient in each system can easily be correlated.
Other collections, such as NSCLC Radiogenomics and NSCLC-Radiomics-Genomics, use Gene Expression Omnibus (GEO) to store the gene sequencing information that corresponds to the images hosted in TCIA. Again, the patient IDs are kept consistent between both systems to allow researchers to easily connect the data sets from each site.
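Because the patient IDs match across systems, records from TCIA and from an external site can be linked with a simple lookup on that ID. The sketch below illustrates the idea in Python under stated assumptions: the field names (patient_id, series_count, expression_file), the IDs, and the function link_by_patient_id are all hypothetical and are not part of any TCIA, TCGA, or GEO interface.

```python
def link_by_patient_id(image_records, external_records):
    """Pair image records with external records that share a patient ID.

    Returns (linked, unmatched): linked is a list of merged dicts, and
    unmatched lists the patient IDs that have no external record.
    """
    external_by_id = {rec["patient_id"]: rec for rec in external_records}
    linked, unmatched = [], []
    for rec in image_records:
        match = external_by_id.get(rec["patient_id"])
        if match is None:
            unmatched.append(rec["patient_id"])
        else:
            linked.append({**rec, **match})
    return linked, unmatched


# Hypothetical records: one list describing images, one describing
# genomic data held on an external site.
images = [
    {"patient_id": "SUBJ-0001", "series_count": 4},
    {"patient_id": "SUBJ-0002", "series_count": 2},
]
genomics = [
    {"patient_id": "SUBJ-0001", "expression_file": "subj-0001.txt"},
]

linked, unmatched = link_by_patient_id(images, genomics)
print(linked)
print(unmatched)
```

Reporting the unmatched IDs separately is a deliberate choice here, since not every imaged patient necessarily has a corresponding record on the external site.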
TCIA also has a history of providing data sets to be used as part of Challenge Competitions. Where applicable we link to the challenge management systems employed by the organizers and help promote awareness of these competitions to encourage participation by TCIA’s user community.