This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Cancer, is properly cited. The complete bibliographic information, a link to the original publication on https://cancer.jmir.org/, as well as this copyright and license information must be included.
Freely available cancer literature, databases, and tools have facilitated the advancement of cancer research. The age of genomics and big data has created a need for cooperation and data sharing so that this new information can be used efficiently, particularly during the COVID-19 pandemic. Although many databases exist for cancer research, they are difficult to access because each processes and manages its data differently, and no unified platform manages all of them in a transparent and comprehensible way.
This study provides an improved integrated cancer research database and platform to facilitate deeper statistical insight into the correlation between cancer and the COVID-19 pandemic. The platform unifies almost all previously published cancer databases, defines a model web database for cancer research, and scores databases on the basis of the variety of cancer types, sample size, completeness of omics results, and user interface.
The databases examined and integrated fall into six categories: Data Portal, Genomic, Proteomic, Expression, Gene, and Mutation databases. This launch is expected to sort and preserve these resources, advance understanding of them, and encourage their use in the cancer research environment.
To make valuable information easy to search, 85 cancer databases are provided in the form of a table, and a database of databases named the Cancer Research Database (CRDB) has been built and is presented herein. The CRDB is equipped with navigation tools that allow it to be explored by three methods: any single database can be found by typing its name in the search bar, and each category can be browsed by clicking either the category name or its image expression icon, so that all databases of a category are available with a single click.
The computational platform (PHP, HTML, CSS, and MySQL) used to build the CRDB for the cancer scientific community can be freely investigated and browsed on the internet and is planned to be updated in a timely manner. In addition, based on the proposed platform, the status and diagnosis statistics of cancer during the COVID-19 pandemic have been thoroughly investigated using the CRDB, providing an easy-to-manage, understandable framework that mines knowledge for future researchers.
Cancer is a category of diseases causing irregular cell growth with the ability to infiltrate or spread to other areas of the body. As of 2019, approximately 18 million new cases are reported per year [
Because of the effects of antineoplastic therapy, supportive drugs including steroids, and the immunosuppressive qualities of cancer, people with cancer could be immunocompromised [
Comparison of the Cancer Research Database with other published work.
Database | Databases, n | Type | Year | Component | Journal | Reference |
Cancer research database | 98 | Database+list | 2022 | Cancer | N/Aa | N/A |
Munich Information Center for Protein Sequences | 22 | List | 2011 | Different categories | | [ |
No name | 6 | Database | 2018 | Aging | | [ |
COVID-19 pandemic database | 59 | Database+list | 2021 | COVID-19 | | [ |
Swiss Institute of Bioinformatics | 12 | Database | 2016 | Different categories | | [ |
Human cancer databases | 58 | List | 2015 | Cancer | | [ |
No name | 38 | List | 2014 | Hepatology | | [ |
LiverAtlas | 53 | Databases | 2013 | Liver | | [ |
No name | 16 | List | 2015 | Cancer | | [ |
aN/A: not applicable.
Previously, we have published several articles in well-known journals, such as the database of Phospho-sites in Animals and Fungi [
We integrated the data from multiple different sources including PubMed, Google, Google Scholar, etc. We used various keywords such as “Cancer database,” “cancer database list,” and “database of cancer” as search terms to retrieve published cancer-related databases with the help of PubMed. To circumvent missing data, we have manually collected the latest cancer databases from
Flowchart and procedure for the collection and integration of cancer databases and the construction of the Cancer Research Database (CRDB).
Several articles have been published in this research area [
Cancer expression databases build on marked advancements in DNA microarray technology, which allow the expression levels of thousands of genes to be continuously measured under particular experimental environments and conditions. This technology has made it possible to understand life at the molecular level and enables us to generate large-scale gene expression data. It has also been applied in a wide range of areas such as cancer prediction, diagnosis, and drug discovery, which are very important issues for cancer treatment [
Main pages of some commonly used cancer databases. (A) A screenshot of the Expression Database named GXD, (B) a screenshot of the Data Portal category named CNVs (copy number variations), (C) “IARCTP53”: a database of the Gene category, (D, E, and F) main pages of the Proteomic database, Mutation database, and Genomic database, respectively.
Data Portal is a type of database that provides comprehensive genomic, epigenomic, transcriptomic, and proteomic data. A large amount of data is publicly available to anyone in the research community and is used to diagnose, treat, and prevent cancer [
Gene databases collect various types of gene data and information related to cancer [
Cancer proteome databases encompass tumor tissues, cells, and biological fluids to interpret signaling pathways, identify signatures related to tumor initiation, invasion, and metastasis, and determine analytical, predictive, and prognostic markers [
Mutation databases play an important role in science, diagnostics, and genetic health care and can play a vital role in life and death decisions. These databases are extensively used, but only gene- or locus-specific databases have been previously reviewed for their utility, accuracy, completeness, and currency [
The Cancer Genome Database represents one of numerous international groups dedicated to performing wide-ranging genomic and epigenomic studies of selected cancer types to develop our understanding of disease and provide an open-access resource for international cancer research [
In this work, we have provided almost all cancer databases (Table S1 in
Statistics of the Cancer Research Database (DB): distribution of databases by category.
Year-wise growth of the Cancer Research Database.
Year | Database growth, % |
1999 | 1 |
2004 | 1 |
2005 | 1 |
2006 | 2 |
2007 | 2 |
2008 | 2 |
2009 | 2 |
2010 | 5 |
2011 | 6 |
2012 | 2 |
2013 | 7 |
2014 | 5 |
2015 | 6 |
2016 | 6 |
2017 | 11 |
2018 | 7 |
2019 | 7 |
2020 | 14 |
2021 | 12 |
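The year-wise percentages in the table above can be summarized with a short calculation. The sketch below is illustrative only: it copies the table values, accumulates them by year, and sums the share contributed from a chosen year onward. The function names are hypothetical, and the observation that the column totals 99 rather than 100 is assumed here to be a rounding effect, which the text does not state.

```python
from itertools import accumulate

# Year-wise growth (%) copied from the table above.
GROWTH_BY_YEAR = {
    1999: 1, 2004: 1, 2005: 1, 2006: 2, 2007: 2, 2008: 2, 2009: 2,
    2010: 5, 2011: 6, 2012: 2, 2013: 7, 2014: 5, 2015: 6, 2016: 6,
    2017: 11, 2018: 7, 2019: 7, 2020: 14, 2021: 12,
}


def cumulative_growth(growth):
    """Return {year: running total of the percentages up to that year}."""
    years = sorted(growth)
    totals = accumulate(growth[y] for y in years)
    return dict(zip(years, totals))


def share_since(growth, start_year):
    """Sum the percentages for start_year and every later year."""
    return sum(pct for year, pct in growth.items() if year >= start_year)


cumulative = cumulative_growth(GROWTH_BY_YEAR)
total = cumulative[2021]                      # 99 (assumed rounding in the table)
recent = share_since(GROWTH_BY_YEAR, 2017)    # 51
```

On these figures, the five years from 2017 to 2021 account for 51 of the 99 listed percentage points, that is, roughly half of the collected databases.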
The CRDB has been developed to provide an easy and user-friendly search experience; for easier and faster search, three options are provided for finding cancer databases. First, browsing can be carried out by typing the name of the database in the search bar, which is highlighted in
Browse options of the Cancer Research Database (CRDB). (A) Can be browsed by typing the name. (B) Can be browsed by category name or image expression. (C) An example and the final result.
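The browse options described above can be sketched as two simple queries. This is a minimal illustration, not the CRDB implementation: the CRDB is built with PHP and MySQL, whereas the sketch uses Python with an in-memory SQLite database, and the table name, column names, and the third sample row are assumptions made for the example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with a hypothetical schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cancer_db (name TEXT, category TEXT)")
    conn.executemany(
        "INSERT INTO cancer_db VALUES (?, ?)",
        [
            ("GXD", "Expression"),                  # named in the text
            ("IARCTP53", "Gene"),                   # named in the text
            ("Example Mutation DB", "Mutation"),    # hypothetical placeholder
        ],
    )
    return conn


def search_by_name(conn, text):
    """Option 1: substring match on the name typed in the search bar.

    SQLite's LIKE is case-insensitive for ASCII letters by default.
    """
    rows = conn.execute(
        "SELECT name FROM cancer_db WHERE name LIKE ? ORDER BY name",
        (f"%{text}%",),
    )
    return [row[0] for row in rows]


def browse_by_category(conn, category):
    """Options 2 and 3: list every database in the clicked category."""
    rows = conn.execute(
        "SELECT name FROM cancer_db WHERE category = ? ORDER BY name",
        (category,),
    )
    return [row[0] for row in rows]


conn = build_demo_db()
by_name = search_by_name(conn, "gxd")
by_category = browse_by_category(conn, "Gene")
```

Clicking a category name and clicking its icon would map to the same query in this sketch, since both identify a category. The parameterized queries keep user input out of the SQL string, although the wildcard characters `%` and `_` in the typed text are not escaped here.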
Because of the previously reported reductions in cancer screening and other preventive care visits during the COVID-19 pandemic, the number of new cancer cases in 2020 is likely to be smaller than anticipated. According to one survey of diagnostic results, there was a 46% decrease in the diagnosis of six different cancers (colorectal, pancreatic, breast, lung, esophageal, and stomach cancer) from March 1 to April 18, 2020, relative to the period between January 6, 2019, and February 29, 2020, varying from a 25% decrease in the detection rate of pancreatic cancer to a 52% decrease in that of breast cancer [
People with active cancer are more vulnerable to infectious pathogens as a result of a compromised immune system due to the malignancy and its treatment (eg, surgery and chemotherapy). This has raised fears that COVID-19–related problems and mortality may be more common among patients with cancer [
The number of cancers detected before and during the COVID-19 pandemic.
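The decreases quoted above (46% overall, ranging from 25% to 52% by cancer type) are relative changes against a baseline period. The sketch below shows that arithmetic only; the weekly counts are hypothetical values chosen to reproduce a 46% decrease and are not taken from the survey.

```python
def percent_decrease(baseline, observed):
    """Relative decrease from baseline to observed, expressed in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - observed) / baseline


# Hypothetical mean weekly diagnosis counts, for illustration only.
baseline_weekly = 1000   # assumed pre-pandemic period
pandemic_weekly = 540    # assumed pandemic period

decrease = percent_decrease(baseline_weekly, pandemic_weekly)  # 46.0
```

Comparing mean weekly counts rather than raw totals matters when the two periods differ in length, as they do in the survey cited above.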
According to the American Institute for Cancer Research [
Country-wise cancer rate with an age-standardized average (per 100,000 population).
The American Cancer Society Cancer Action NetworkSM works worldwide to increase the quality of care for patients with and survivors of cancer. With time and the emergence of new cases worldwide, we have compiled a list of the top 10 cancers diagnosed in the United States in 2021.
Rates of new cancers and mortality between male and female patients.
Cancer type | New cancers (male), % | New cancers (female), % | Mortality (male), % | Mortality (female), % |
Prostate | 26 | N/Aa | 11 | N/A |
Breast | N/A | 30 | N/A | 15 |
Lung and bronchial | 12 | 13 | 22 | 22 |
Colorectal | 8 | 8 | 9 | 8 |
Urinary bladder | 7 | —b | 4 | — |
Skin melanoma | 6 | 5 | — | — |
Kidney and renal pelvis | 5 | 3 | — | — |
Non-Hodgkin lymphoma | 5 | 4 | 4 | 3 |
Oral cavity and pharynx | 4 | — | — | — |
Leukemia | 4 | 3 | 4 | 3 |
Pancreatic | 3 | 3 | 8 | 8 |
Uterine corpus | N/A | 7 | N/A | 4 |
Brain and other nervous system regions | — | — | 3 | 3 |
Liver and intrahepatic bile duct | — | — | 6 | 3 |
Esophageal | — | — | 4 | — |
Ovarian | N/A | — | N/A | 5 |
aN/A: not applicable.
b—: not determined.
A biological database provides facilities for storing, organizing, and retrieving biological data such as DNA, RNA, carbohydrates, proteins, and cancers, and its contents can be easily viewed, managed, and modified. A number of papers in this research field have proposed their own classifications of cancer databases based on function, use, certain technical aspects, and species such as human, mouse, plant, and fungi. Following these published studies, we have classified the cancer databases into six categories: Data Portal database, Genomic database, Proteomic database, Expression database, Gene database, and Mutation database. Further, we have collected almost all cancer databases, each with a short introduction, and have updated or removed all nonfunctional links. We have also examined the current situation of cancer and its correlation with COVID-19, for example, the rise and fall in diagnoses, mortality, and new case counts by continent and country. Our database can be searched in an easy-to-use, user-friendly manner: users can click on a category name or its image expression icon, or type the name of the required database in the search bar. The database will be updated over time. In addition, we have examined the status and diagnoses of cancer during the COVID-19 pandemic and have provided easy and understandable information for future researchers.
Cancer databases.
CNV: copy number variation
CRDB: Cancer Research Database
EGA: European Genome-phenome Archive
miRNA: microRNA
TCGA: The Cancer Genome Atlas
This project is supported by Shenzhen's introduction of talents and research start-up (392020) and by the National Natural Science Foundation of China (32100434).
These data will be made available in accordance with the journal's rules and regulations. To avoid future conflicts and plagiarism issues, the CRDB has been uploaded to the internet [
SU supervised the study with TG, DAK, WR, GA, FU, MI, AU and collected and verified the data carefully. SU drafted the manuscript. All authors reviewed the manuscript and agreed to submit it. Both TG and SU are the corresponding authors for this manuscript.
None declared.