Newly Published!

The first papers from Health Data Science, including the inaugural Editorial from Editor-in-Chief Qimin Zhan, have just been published!


Journal profile

The open access journal Health Data Science, published in association with PKU, publishes innovative, scientifically rigorous research to advance health data science.

Editorial board

Health Data Science's Editorial Board is led by Qimin Zhan (Chinese Academy of Engineering and Peking University) and comprises active researchers and experts in health data science from around the world.

Why publish with us?

• Rapid publication: We use the best systems and processes to ensure efficiency and quality.

• Open access: Articles are free to publish through December 2024 and will always be free to read for everyone.

• Impact: Journal articles are promoted by our expert marketing team.

Latest Articles

Review Article

Analysis of COVID-19 Guideline Quality and Change of Recommendations: A Systematic Review

Background. Hundreds of coronavirus disease 2019 (COVID-19) clinical practice guidelines (CPGs) and expert consensus statements have been developed and published since the outbreak of the epidemic. However, these CPGs are of widely variable quality. This review therefore aims to systematically evaluate the methodological and reporting qualities of COVID-19 CPGs, explore factors that may influence their quality, and analyze how recommendations in CPGs changed as evidence was published. Methods. We searched five electronic databases and five websites from 1 January to 31 December 2020 to retrieve all COVID-19 CPGs. The methodological and reporting qualities of CPGs were assessed using the AGREE II instrument and the RIGHT checklist, respectively. Recommendations and evidence used to make recommendations in the CPGs regarding some treatments for COVID-19 (remdesivir, glucocorticoids, hydroxychloroquine/chloroquine, interferon, and lopinavir-ritonavir) were also systematically assessed. Statistical inference was performed to identify factors associated with the quality of CPGs. Results. We included a total of 92 COVID-19 CPGs developed by 19 countries. Overall, the RIGHT checklist reporting rate of COVID-19 CPGs was 33.0%, and the AGREE II domain score was 30.4%. The overall methodological and reporting qualities of COVID-19 CPGs gradually improved during 2020. Factors associated with high methodological and reporting qualities included an evidence-based development process, management of conflicts of interest, and use of established rating systems to assess the quality of evidence and strength of recommendations. The recommendations of only seven (7.6%) CPGs were informed by a systematic review of evidence; these seven CPGs had relatively high methodological and reporting qualities, and six of them fully met the Institute of Medicine (IOM) criteria for guidelines. Of the seven, a rapid advice CPG developed by the World Health Organization (WHO) received the highest overall scores for methodological (72.8%) and reporting (83.8%) qualities. Many CPGs covered the same clinical questions (namely, the effectiveness of remdesivir, glucocorticoids, hydroxychloroquine/chloroquine, interferon, and lopinavir-ritonavir in COVID-19 patients) and were published by different countries or organizations. Although randomized controlled trials and systematic reviews on the effectiveness of these treatments for patients with COVID-19 have been published, the recommendations on those treatments still varied greatly across COVID-19 CPGs published in different countries or regions, which may suggest that the CPGs do not make sufficient use of the latest evidence. Conclusions. Both the methodological and reporting qualities of COVID-19 CPGs increased over time, but there is still room for further improvement. The lack of effective use of available evidence and of management of conflicts of interest were the main reasons for the low quality of the CPGs. The use of formal rating systems for the quality of evidence and strength of recommendations may help to improve the quality of CPGs in the context of the COVID-19 pandemic. During the pandemic, we suggest developing living guidelines whose recommendations are supported by systematic reviews, because this can facilitate the timely translation of the latest research findings into clinical practice. We also suggest that CPG developers register their guidelines on a registration platform at the outset, because this can reduce duplicated development of guidelines on the same clinical question, increase the transparency of the development process, and promote cooperation among guideline developers worldwide. Now that the International Practice Guideline Registry Platform has been created, developers can register guidelines prospectively and internationally on this platform.

Review Article

Cognitive Computing-Based CDSS in Medical Practice

Importance. The last decade has witnessed advances in cognitive computing technologies that learn at scale and reason with purpose in medical studies. From the diagnosis of diseases to the generation of treatment plans, cognitive computing encompasses both data-driven and knowledge-driven machine intelligence to assist health care roles in clinical decision-making. This review provides a comprehensive perspective on research and industrial efforts in cognitive computing-based clinical decision support systems (CDSS) over the last decade. Highlights. (1) A holistic review of both research papers and industrial practice on cognitive computing-based CDSS is conducted to identify the necessity, the characteristics, and the general framework for constructing such systems. (2) Several typical applications of cognitive computing-based CDSS, as well as existing systems in real medical practice, are introduced in detail under the general framework. (3) The limitations of current cognitive computing-based CDSS are discussed, shedding light on future work in this direction. Conclusion. Unlike medical content providers, cognitive computing-based CDSS provides probabilistic clinical decision support by automatically learning and inferring from medical big data. The ability to manage multimodal data and computerize medical knowledge distinguishes cognitive computing-based CDSS from other categories. Given the current state of primary health care, including a high diagnostic error rate and a shortage of medical resources, it is time to introduce cognitive computing-based CDSS to the medical community, which should be more open-minded and embrace the convenience, low cost, and high efficiency that such systems bring.

Research Article

Mobile Phone-Based Population Flow Data for the COVID-19 Outbreak in Mainland China

Background. Human migration is one of the driving forces that amplify localized infectious disease outbreaks into widespread epidemics. During the outbreak of COVID-19 in China, population travel from Wuhan furthered the spread of the virus, as the period coincided with the world's largest population movement to celebrate the Chinese New Year. Methods. We have collected and made public an anonymous and aggregated mobility dataset extracted from mobile phones at the national level, describing the outflows of population travel from Wuhan. We evaluated the correlation between population movements and the virus spread by the dates when the number of diagnosed cases was documented. Results. From Jan 1 to Jan 22 of 2020, a total of 20.2 million movements of at-risk population occurred from Wuhan to other regions in China. A large proportion of these movements occurred within Hubei province (84.5%), and a substantial increase in travel was observed even before the beginning of the official Chinese Spring Festival travel season. The outbound flows from Wuhan before the lockdown were found to be strongly correlated with the number of diagnosed cases in the destination cities (log-transformed). Conclusions. The regions receiving the highest volume of at-risk populations were identified. The movements of the at-risk population were strongly associated with the virus spread. These results, together with province-by-province reports, have been provided to governmental authorities to aid policy decisions at both the state and provincial levels. We believe that the effort to make this data available is extremely important for COVID-19 modelling and prediction.
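As a rough illustration of the kind of analysis the abstract above describes, the following Python sketch computes a Pearson correlation between outbound flows and diagnosed case counts on a log scale. It is not the authors' method: the abstract does not state which correlation measure was used or exactly which variables were log-transformed, so the choice of Pearson's r, the log10(x + 1) transform on both variables, the function names, and all numbers below are assumptions made purely for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def log_flow_case_correlation(outflows, cases):
    """Correlate log10(outflow + 1) with log10(cases + 1) per destination.

    Assumption: both variables are log-transformed; the +1 avoids log(0).
    """
    log_flows = [math.log10(f + 1) for f in outflows]
    log_cases = [math.log10(c + 1) for c in cases]
    return pearson(log_flows, log_cases)


# Hypothetical, made-up values for five destination cities
# (NOT taken from the published dataset).
outflows = [120_000, 45_000, 9_000, 3_500, 800]
cases = [300, 110, 40, 12, 5]

r = log_flow_case_correlation(outflows, cases)
print(f"Pearson r on log scale: {r:.3f}")
```

With real data, the two lists would hold one entry per destination city, aligned by index. A rank-based measure such as Spearman's correlation would be a reasonable alternative if the relationship is monotonic but not linear on the log scale.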

Research Article

Urban-Rural Disparities for COVID-19: Evidence from 10 Countries and Areas in the Western Pacific

Background. Limited evidence on the effectiveness of various types of social distancing measures, from voluntary physical distancing to a community-wide quarantine, exists for the Western Pacific Region (WPR), which has large urban and rural populations. Methods. We estimated the time-varying reproduction number (R_t) in a Bayesian framework using district-level mobility data provided by Facebook (i) to assess how various social distancing policies have contributed to the reduction in transmissibility of SARS-CoV-2 and (ii) to examine within-country variations in behavioural responses, quantified by reductions in mobility, for urban and rural areas. Results. Social distancing measures were largely effective in reducing transmissibility, with estimates decreasing to around the threshold of 1. Within-country analysis showed substantial variation in public compliance across regions. Reductions in mobility were significantly lower in rural and remote areas than in urban areas and metropolitan cities ( ) which had the same scale of social distancing orders in place. Conclusions. Our findings provide empirical evidence that public compliance and consequent intervention effectiveness differ between urban and rural areas in the WPR. Further work is required to ascertain the factors affecting these differing behavioural responses, which can assist in policy-making efforts and increase public compliance in rural areas where populations are older and have poorer access to healthcare.

Review Article

Active Vaccine Safety Surveillance: Global Trends and Challenges in China

Importance. The great success in controlling vaccine-preventable diseases has been accompanied by vaccine safety concerns. This has made vaccine hesitancy one of the top 10 threats to global health. A comprehensive understanding of adverse events following immunization should be entirely based on clinical trials and postapproval surveillance. It has increasingly been recognized worldwide that active surveillance of vaccine safety should be an essential part of immunization programs because its advantages complement those of passive surveillance and clinical trials. Highlights. In the present study, the framework of vaccine safety surveillance was summarized to illustrate the importance of active surveillance and address vaccine hesitancy or safety concerns. Then, the global progress of active surveillance systems was reviewed, mainly focusing on population-based or hospital-based active surveillance. Drawing on these successful paradigms, practical and reliable ways to create robust and similar systems in China were discussed and presented from the perspective of available databases, methodological challenges, policy support, and ethical considerations. Conclusion. Amid the inevitable trend toward a global vaccine safety ecosystem, the establishment of an active surveillance system for vaccine safety in China is urgent and feasible. This process can be accelerated with the consensus and cooperation of regulatory departments, research institutions, and data owners.

Review Article

Advances in Deep Learning-Based Medical Image Analysis

Importance. With the booming growth of artificial intelligence (AI), especially the recent advancements in deep learning, utilizing advanced deep learning-based methods for medical image analysis has become an active research area in both the medical industry and academia. This paper reviewed the recent progress of deep learning research in medical image analysis and clinical applications. It also discussed the existing problems in the field and provided possible solutions and future directions. Highlights. This paper reviewed the advancement of convolutional neural network-based techniques in clinical applications. More specifically, the state-of-the-art clinical applications covered span four major human body systems: the nervous system, the cardiovascular system, the digestive system, and the skeletal system. Overall, according to the best available evidence, deep learning models performed well in medical image analysis, but algorithms derived from small-scale medical datasets still impede clinical applicability. Future directions could include federated learning, benchmark dataset collection, and utilizing domain subject knowledge as priors. Conclusion. Recent advanced deep learning technologies have achieved great success in medical image analysis with high accuracy, efficiency, stability, and scalability. Technological advancements that can alleviate the heavy demand for high-quality large-scale datasets could be one of the future developments in this area.