What is primary data?
Primary datasets include DICOM images from cancer patients and supporting data collected about the patient (e.g., demographics, clinical outcomes, treatment information). We encourage the submission of multidisciplinary primary datasets that include genomics, proteomics, pathology, and other non-image data types. For those data types, however, we will generally cross-reference other databases specifically designed to host them rather than attempt to host them on TCIA directly. If you have analyzed TCIA primary data and wish to publish the resulting dataset, you can find instructions for doing that here.
Requesting permission to publish primary data
TCIA is intended to be a resource to the research community. We aim to ensure that every dataset in the archive is one that would be of value to our target audiences. Researchers with the following objectives are encouraged to submit an application:
- Meeting the data sharing requirements set forth by an NCI grant or contract award
- Sharing data for analyzing imaging features to be used as biomarkers
- Sharing data for comparing image features to other data types such as genetics, pathology, or clinical information to create correlative signatures as biomarkers
- Sharing data for the creation of automated or semi-automated algorithms for detection of cancer
- Sharing data as a reference collection for testing and validating quantitative analysis techniques or algorithms in image processing
- Sharing data with unique characteristics for clinical training
If you feel that data at your institution meets one or more of these objectives, please fill out our application form so that we can evaluate the suitability of your data for the archive. Questions about filling out the application may be directed to the help desk. Note that cost-recovery arrangements may be necessary for data contributions in some circumstances.
Applications are reviewed monthly by the TCIA Advisory Group, which is composed of experts in cancer imaging and related technologies. The group reviews each candidate collection against the criteria above and the availability of resources, and then decides whether to accept it, reject it, or request clarification.
Starting the submission process
Once we have determined that your dataset is an appropriate fit for the archive, we will initiate the submission process. You will be guided by a TCIA submission expert who will provide all the required tools for sending your data and will answer any questions you have throughout the process. This process includes the following steps:
- You will install pre-configured Clinical Trials Processor (CTP) Java software and use it to submit your data.
- CTP will perform an initial de-identification of the data according to the DICOM standard (Attribute Confidentiality Profile – DICOM PS 3.15: Annex E) before the data leaves your institution.
- Our TCIA de-identification and curation process is documented in case you have any questions.
- Our quality control and curation staff will work with you to ensure the data are received and fully de-identified. Additional reviews are performed, and any remaining protected health information (PHI) that is found is deleted.
- TCIA will work with you to create a dataset summary page to inform users how your data might be of use to them. Please review our guidance on auxiliary information we like to provide to the user community where available.
- TCIA will publish the final data set and announce its addition via our mailing list and social media channels.
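To make the initial de-identification step above more concrete, here is a minimal Python sketch of how per-attribute actions in the style of the DICOM PS 3.15 confidentiality profile might be applied to a header. It is not CTP's implementation or configuration. The header is modeled as a plain dictionary keyed by DICOM keyword, the ACTIONS table is a small illustrative subset chosen for this example, and the function names remap_uid and deidentify are hypothetical.

```python
import uuid

# Illustrative subset of per-attribute actions, in the spirit of the
# DICOM PS 3.15 confidentiality profile action codes:
#   "X" = remove, "Z" = replace with a zero-length value, "U" = replace UID.
# ASSUMPTION: this table is for demonstration only; it is neither the full
# profile nor the configuration that CTP actually ships with.
ACTIONS = {
    "PatientName": "Z",
    "PatientID": "Z",
    "PatientBirthDate": "Z",
    "AccessionNumber": "Z",
    "InstitutionAddress": "X",
    "StudyInstanceUID": "U",
    "SeriesInstanceUID": "U",
    "SOPInstanceUID": "U",
}


def remap_uid(original_uid: str) -> str:
    """Derive a stable replacement UID under the 2.25 (UUID-derived) root.

    The same input always yields the same output, so references between
    studies, series, and instances stay consistent after replacement.
    """
    derived = uuid.uuid5(uuid.NAMESPACE_OID, original_uid)
    return "2.25." + str(derived.int)


def deidentify(header: dict) -> dict:
    """Return a copy of `header` with the ACTIONS table applied."""
    cleaned = {}
    for keyword, value in header.items():
        action = ACTIONS.get(keyword, "K")  # "K" = keep unchanged
        if action == "X":
            continue  # drop the attribute entirely
        if action == "Z":
            cleaned[keyword] = ""  # keep the attribute, blank its value
        elif action == "U":
            cleaned[keyword] = remap_uid(value)
        else:
            cleaned[keyword] = value
    return cleaned


if __name__ == "__main__":
    example = {
        "PatientName": "DOE^JANE",
        "PatientID": "12345",
        "InstitutionAddress": "1 Main St",
        "StudyInstanceUID": "1.2.840.99999.1",
        "Modality": "CT",
    }
    for keyword, value in deidentify(example).items():
        print(f"{keyword}: {value!r}")
```

Note that this sketch keeps any attribute it does not recognize, which is convenient for illustration but too permissive for real data. A production de-identifier also has to handle private tags, free-text fields, and information burned into the pixel data, which is why the additional reviews described above are still needed.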
Getting credit for data sharing
Journals dedicated to describing datasets are gaining popularity. Below is a list of data journals that recognize TCIA as a Recommended Repository. You can use them to publish detailed descriptions of your TCIA data and gain academic credit (publications and citations) for your efforts, in addition to the novel scientific findings you might publish in traditional journals.
Track your data’s usage
After your dataset is available on TCIA, you can view our Data Usage Statistics page to find out how often users search for or download your data. You can also use your dataset’s Digital Object Identifier (DOI) to track citations.