This article describes the DataBox project, which offers a perspective on a new health data management solution in Germany. DataBox was initially conceptualized as a repository of individual lung cancer patient data (structured and unstructured). The patient is the owner of the data and is able to share his or her data with different stakeholders. Data is transferred, displayed, and stored online, but not archived. In the long run, the project aims to replace the conventional paper- and storage-device-based handling of data for all patients in Germany. The expected result is better organization and availability of data, which would reduce duplicate diagnostic procedures and treatment errors and enable the training and use of artificial intelligence algorithms on large datasets.
JMIR Cancer 2018;4(1):e10160
The development of intelligent storage, sharing, and analysis solutions for health care data has advanced in recent years [- ]. DataBox is a research project based in Germany and funded by the Federal Ministry of Health and the Federal Ministry of Education and Research. It aims to improve health data management for patients and health care providers by creating a platform that is accessible from landline phones, computers, mobile phones, and tablets. DataBox provides individual data spaces for the storage, analysis, and sharing of health data. The collaborating partners are the National Center for Tumor Diseases in Heidelberg (project lead), Köln University Hospital, Charité University Hospital in Berlin, and the German technology companies SAP and Siemens Healthineers. The ethics committees of the three collaborating centers approved the project.
Currently, patients in Germany receive a printed report from their physician at the end of their stay, often accompanied by other sheets of paper, compact disks, or other physical storage devices containing diagnostic data such as radiological files. Patients are expected to carry all this information with them when switching health care providers. This status quo often leads to loss of data, duplicate diagnostic procedures, and treatment errors. It also leaves patients without instant access to their available health data, not only during a hospital stay but also in acute care situations. The missing data is caused not only by patients losing printed reports, disks, or storage devices, or not bringing them to their new care provider, but also by incompatible file types between health care providers.
DataBox aims to solve these problems by providing individual data spaces that are accessible to patients at all levels of digital literacy (from landline phones to smartphones). Patients can access their individual health data as soon as it is available and share it with health care providers of their choice. At the same time, health care providers can use the platform to upload health data and to open shared patient data with an integrated format-agnostic viewer.
The digital format of the data enables the training and use of artificial intelligence algorithms on large datasets, ultimately making digitalized health care data more understandable and valuable for the patient. Machine learning, and more specifically deep learning algorithms for supervised and unsupervised data analysis, are on the rise in the medical field [- ], and their precision may improve with large, well-organized datasets.
According to the General Data Protection Regulation, giving citizens back control of their data is a current task for health care. DataBox not only improves access by instantaneously synchronizing health data in a secured cloud with individual data spaces but also lets the patient choose who may access it.
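The patient-controlled sharing described above can be illustrated with a minimal sketch. It is not the DataBox implementation, whose design is not described here; the class `DataSpace`, its methods, and the identifiers used are hypothetical names chosen only to show the idea that the patient owns the data space and decides which providers may read it.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class DataSpace:
    """Hypothetical per-patient data space (illustration only, not DataBox code)."""
    owner: str
    documents: Dict[str, bytes] = field(default_factory=dict)
    granted: Set[str] = field(default_factory=set)

    def upload(self, name: str, content: bytes) -> None:
        # Store a document (report, image, ...) in the data space.
        # Assumption: uploads are not restricted in this sketch.
        self.documents[name] = content

    def grant(self, actor: str, provider: str) -> None:
        # Only the owning patient may decide who can read the data.
        if actor != self.owner:
            raise PermissionError("only the owner can grant access")
        self.granted.add(provider)

    def revoke(self, actor: str, provider: str) -> None:
        # The owner can withdraw a previously granted permission.
        if actor != self.owner:
            raise PermissionError("only the owner can revoke access")
        self.granted.discard(provider)

    def read(self, actor: str, name: str) -> bytes:
        # The owner and explicitly granted providers may read.
        if actor != self.owner and actor not in self.granted:
            raise PermissionError(f"{actor} has no access")
        return self.documents[name]
```

In this sketch, a provider can read a document only after the patient has called `grant` for that provider, and loses access again once the patient calls `revoke`.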
In the first 18 months (starting in January 2018), the DataBox project will focus on 4,000 lung cancer patients from Germany. However, the vision of the initiators of this government-funded project is to replace the status quo outlined above for the whole German health care system after the 18-month test period.
TJB is the technical lead of the DataBox project and the Head of App Development at the National Center for Tumor Diseases (NCT). SR is the current project manager, and DR is the former project manager. CK is the initiator of the DataBox project and Head of the Department of Translational Oncology at the NCT, which leads the project.
Conflicts of Interest
- Zhang Y, Qiu M, Tsai CW, Hassan MM, Alamri A. Health-CPS: Healthcare Cyber-Physical System Assisted by Cloud and Big Data. IEEE Systems Journal 2017 Mar 17;11(1):88-95.
- Almalki M, Gray K, Sanchez FM. The use of self-quantification systems for personal health information: big data management activities and prospects. Health Inf Sci Syst 2015;3(Suppl 1 HISA Big Data in Biomedicine and Healthcare 2013 Con):S1.
- Krumholz HM. Big data and new knowledge in medicine: the thinking, training, and tools needed for a learning health system. Health Aff (Millwood) 2014 Jul;33(7):1163-1170.
- Bates DW, Saria S, Ohno-Machado L, Shah A, Escobar G. Big data in health care: using analytics to identify and manage high-risk and high-cost patients. Health Aff (Millwood) 2014 Jul;33(7):1123-1131.
- Miotto R, Wang F, Wang S, Jiang X, Dudley JT. Deep learning for healthcare: review, opportunities and challenges. Brief Bioinform 2017 May 06.
- Ching T, Himmelstein D, Beaulieu-Jones B, Kalinin A, Do B, Way G, et al. Opportunities and obstacles for deep learning in biology and medicine. J R Soc Interface 2018 Apr;15(141).
- Burlina P, Pacheco K, Joshi N, Freund D, Bressler NM. Comparing humans and deep learning performance for grading AMD: A study in using universal deep features and transfer learning for automated AMD analysis. Comput Biol Med 2017 Dec 01;82:80-86.
- Cabitza F, Rasoini R, Gensini GF. Unintended Consequences of Machine Learning in Medicine. JAMA 2017 Aug 08;318(6):517-518.
- Becker A, Blüthgen C, Mühlematter U, Boss A. Medicina ex Machina: Machine Learning in der Medizin. Praxis (Bern 1994) 2018 Jan;107(1):19-23.
- Niu Y, Gong E, Xu J, Pauly J, Zaharchuk G. Abstract WP53: Improved Prediction of the Final Infarct From Acute Stroke Neuroimaging Using Deep Learning. Am Heart Assoc; 2018 May 02. Presented at: International Stroke Conference; Jan 24, 2018 – Jan 26, 2018; Hawaii p. A.
- Xu L, Tetteh G, Lipkova J, Zhao Y, Li H, Christ P, et al. Automated Whole-Body Bone Lesion Detection for Multiple Myeloma on Ga-Pentixafor PET/CT Imaging Using Deep Learning Methods. Contrast Media Mol Imaging 2018 May;2018(5):2391925-2391926.
- Gehrmann S, Dernoncourt F, Li Y, Carlson E, Wu J, Welt J, et al. Comparing deep learning and concept extraction based methods for patient phenotyping from clinical narratives. PLoS One 2018;13(2):e0192360.
- Mohamed A, Berg W, Peng H, Luo Y, Jankowitz R, Wu S. A deep learning method for classifying mammographic breast density categories. Med Phys 2018 Jan;45(1):314-321.
- Ekins S, Clark A, Perryman A, Freundlich J, Korotcov A, Tkachenko V. Accessible Machine Learning Approaches for Toxicology. Computational Toxicology: Risk Assessment for Chemicals 2018.
- Hao Y, Khoo H, von Ellenrieder N, Zazubovits N, Gotman J. DeepIED: An epileptic discharge detector for EEG-fMRI based on deep learning. Neuroimage Clin 2018;17:962-975.
- Choi H, Na KJ. A Risk Stratification Model for Lung Cancer Based on Gene Coexpression Network and Deep Learning. Biomed Res Int 2018;2018:2914280.
- Kermany D, Goldbaum M, Cai W, Valentim C, Liang H, Baxter S, et al. Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning. Cell 2018 Feb 22;172(5):1122-1131.e9.
- Lee HC, Ryu HG, Chung EJ, Jung CW. Prediction of Bispectral Index during Target-controlled Infusion of Propofol and Remifentanil: A Deep Learning Approach. Anesthesiology 2018 Mar;128(3):492-501.
- Schirrmeister RT, Springenberg JT, Fiederer LDJ, Glasstetter M, Eggensperger K, Tangermann M, et al. Deep learning with convolutional neural networks for EEG decoding and visualization. Hum Brain Mapp 2017 Nov;38(11):5391-5420.
- Shahin M, Ahmed B, Hamida S, Mulaffer F, Glos M, Penzel T. Deep Learning and Insomnia: Assisting Clinicians With Their Diagnosis. IEEE J Biomed Health Inform 2017 Nov;21(6):1546-1553.
- Rumbold J, Pierscionek B. The Effect of the General Data Protection Regulation on Medical Research. J Med Internet Res 2017 Feb 24;19(2):e47.
Edited by G Eysenbach; submitted 16.02.18; peer-reviewed by A Zeleke, A Benis; comments to author 08.03.18; revised version received 11.03.18; accepted 11.04.18; published 11.05.18
Copyright
©Titus Josef Brinker, Stefanie Rudolph, Daniela Richter, Christof von Kalle. Originally published in JMIR Cancer (http://cancer.jmir.org), 11.05.2018.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Cancer, is properly cited. The complete bibliographic information, a link to the original publication on http://cancer.jmir.org/, as well as this copyright and license information must be included.