Submit Your Research

Innovative Technology Evaluation for Healthcare Applications

HDS welcomes research on technologies and analytic approaches for health, including innovative technology evaluation for healthcare applications.

Journal profile

The open access journal Health Data Science, published in association with Peking University (PKU), publishes innovative, scientifically rigorous research to advance health data science.

Editorial board

Health Data Science's Editorial Board is led by Qimin Zhan (Chinese Academy of Engineering and Peking University) and comprises active researchers and experts in health data science from around the world.


Visit our news page to read about the latest developments with Health Data Science, including news releases and the announcement of our 2021 Reviewers of the Year!

Latest Articles

Research Article

Stratification of Patients with Diabetes Using Continuous Glucose Monitoring Profiles and Machine Learning

Background. Continuous glucose monitoring (CGM) offers an opportunity for patients with diabetes to modify their lifestyle to better manage their condition and for clinicians to provide personalized healthcare and lifestyle advice. However, analytic tools are needed to standardize and analyze the rich data that emerge from CGM devices. This would allow glucotypes of patients to be identified to aid clinical decision-making. Methods. In this paper, we develop an analysis pipeline for CGM data and apply it to 148 diabetic patients with a total of 8632 days of follow-up. The pipeline projects CGM data to a lower-dimensional space of features representing centrality, spread, size, and duration of glycemic excursions and the circadian cycle. We then use principal components analysis and k-means to cluster patients’ records into one of four glucotypes and analyze cluster membership using multinomial logistic regression. Results. Glucotypes differ in the degree of control, amount of time spent in range, and the presence and timing of hyper- and hypoglycemia. Patients on the program had statistically significant improvements in their glucose levels. Conclusions. This pipeline provides a fast automatic function to label raw CGM data without manual input.
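
To illustrate the clustering step named in the abstract above, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not the authors' pipeline: the principal components step is omitted, and the function name `kmeans`, the two-feature records, and the choice of starting centroids are all hypothetical.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm; `centroids` are caller-supplied starting centers."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # leave an empty cluster's center in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical two-feature records (assumption: not data from the study).
records = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (9.0, 10.0)]
labels, centers = kmeans(records, centroids=[records[0], records[2]])
print(labels, centers)
```

In practice the number of clusters and the initialization would need to be chosen and validated against the data; this sketch fixes both for the sake of a deterministic example.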

Research Article

Large-Scale Social Media Analysis Reveals Emotions Associated with Nonmedical Prescription Drug Use

Background. The behaviors and emotions associated with and reasons for nonmedical prescription drug use (NMPDU) are not well-captured through traditional instruments such as surveys and insurance claims. Publicly available NMPDU-related posts on social media can potentially be leveraged to study these aspects unobtrusively and at scale. Methods. We applied a machine learning classifier to detect self-reports of NMPDU on Twitter and extracted all public posts of the associated users. We analyzed approximately 137 million posts from 87,718 Twitter users in terms of expressed emotions, sentiments, concerns, and possible reasons for NMPDU via natural language processing. Results. Users in the NMPDU group express more negative emotions and fewer positive emotions, more concerns about family, the past, and body, and fewer concerns related to work, leisure, home, money, religion, health, and achievement compared to a control group (i.e., users who never reported NMPDU). NMPDU posts tend to be highly polarized, indicating potential emotional triggers. Gender-specific analyses show that female users in the NMPDU group express more content related to positive emotions, anticipation, sadness, joy, concerns about family, friends, home, health, and the past, and less about anger than males. The findings are consistent across distinct prescription drug categories (opioids, benzodiazepines, stimulants, and polysubstance). Conclusion. Our analyses of large-scale data show that substantial differences exist between the texts of the posts from users who self-report NMPDU on Twitter and those who do not, and between males and females who report NMPDU. Our findings can enrich our understanding of NMPDU and the population involved.

Review Article

A Review of Three-Dimensional Medical Image Visualization

Importance. Medical images are essential for modern medicine and an important research subject in visualization. However, medical experts are often not aware of the many advanced three-dimensional (3D) medical image visualization techniques that could increase their capabilities in data analysis and assist the decision-making process for specific medical problems. Our paper provides a review of 3D visualization techniques for medical images, intending to bridge the gap between medical experts and visualization researchers. Highlights. Fundamental visualization techniques are revisited for various medical imaging modalities, from computed tomography to diffusion tensor imaging, featuring techniques that enhance spatial perception, which is critical for medical practices. The state of the art in medical visualization is reviewed based on a procedure-oriented classification of medical problems for studies of individuals and populations. This paper summarizes free software tools for different modalities of medical images designed for various purposes, including visualization, analysis, and segmentation, and it provides respective Internet links. Conclusions. Visualization techniques are a useful tool for medical experts to tackle specific medical problems in their daily work. Our review provides a quick reference to such techniques given the medical problem and modalities of associated medical images. We summarize fundamental techniques and readily available visualization tools to help medical experts to better understand and utilize medical imaging data. This paper could contribute to the joint effort of the medical and visualization communities to advance precision medicine.

Research Article

Cost-Utility Analysis of Screening for Diabetic Retinopathy in China

Background. Diabetic retinopathy (DR) has been identified as a primary cause of vision impairment and blindness, yet no studies have focused on the cost-utility of telemedicine-based and community screening programs for DR in China in rural and urban areas separately. Methods. We developed a Markov model to calculate the cost-utility of screening programs for DR in patients with diabetes mellitus (DM) in rural and urban settings from the societal perspective. The incremental cost-utility ratio (ICUR) was calculated for the assessment. Results. In the rural setting, the community screening program obtained 1 QALY with a cost of $4179 (95% CI 3859 to 5343), and the telemedicine screening program had an ICUR of $2323 (95% CI 1023 to 3903) compared with no screening, both of which satisfied the criterion of a significantly cost-effective health intervention. Likewise, community screening programs in urban areas generated an ICUR of $3812 (95% CI 2906 to 4167) per QALY gained, with telemedicine screening at an ICUR of $2437 (95% CI 1242 to 3520) compared with no screening, and both were also cost-effective. Compared with community screening programs, telemedicine screening yielded an ICUR of $1212 (95% CI 896 to 1590) per incremental QALY gained in the rural setting and $1141 (95% CI 859 to 1403) in the urban setting, both of which meet the criterion for a significantly cost-effective health intervention. Conclusions. Both telemedicine and community screening for DR in rural and urban settings were cost-effective in China, and telemedicine screening programs were more cost-effective.
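
For readers unfamiliar with the metric reported above, an ICUR is the difference in cost between two strategies divided by the difference in quality-adjusted life years (QALYs) they yield. The following minimal Python sketch shows only that arithmetic. The function name `icur` and all figures are hypothetical; they are not outputs of the study's Markov model.

```python
def icur(cost_new, qaly_new, cost_ref, qaly_ref):
    """Incremental cost-utility ratio: extra cost per extra QALY gained."""
    delta_qaly = qaly_new - qaly_ref
    if delta_qaly == 0:
        raise ValueError("QALY difference is zero; the ICUR is undefined")
    return (cost_new - cost_ref) / delta_qaly


# Hypothetical per-patient figures (assumption: not taken from the study).
no_screening = {"cost": 1000.0, "qaly": 10.0}
telemedicine = {"cost": 1500.0, "qaly": 10.25}

ratio = icur(
    telemedicine["cost"], telemedicine["qaly"],
    no_screening["cost"], no_screening["qaly"],
)
print(ratio)  # cost per QALY gained relative to no screening
```

An intervention is usually judged cost-effective when this ratio falls below a willingness-to-pay threshold; which threshold applies depends on the setting and is not specified here.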

Research Article

Misinformation versus Facts: Understanding the Influence of News regarding COVID-19 Vaccines on Vaccine Uptake

Background. There is a lot of fact-based information and misinformation in the online discourses and discussions about the COVID-19 vaccines. Method. Using a sample of nearly four million geotagged English tweets and the data from the CDC COVID Data Tracker, we conducted the Fama-MacBeth regression with the Newey-West adjustment to understand the influence of both misinformation and fact-based news on Twitter on the COVID-19 vaccine uptake in the US from April 19, when US adults became vaccine eligible, to June 30, 2021, after controlling for state-level factors such as demographics, education, and the pandemic severity. We identified the tweets related to either misinformation or fact-based news by analyzing the URLs. Results. A one percent increase in fact-related Twitter users is associated with an approximately 0.87 decrease in the number of daily new vaccinated people per hundred. No significant relationship was found between the percentage of fake-news-related users and the vaccination rate. Conclusion. The negative association between the percentage of fact-related users and the vaccination rate might be due to a combination of a larger user-level influence and the negative impact of online social endorsement on vaccination intent.


Social Determinants, Data Science, and Decision Making: The 3-D Approach to Achieving Health Equity in Asia