Summary
Senior Research Statistician at AbbVie. PhD in Biostatistics from the Medical College of Wisconsin. BS in Applied Mathematics and MS in Financial Statistics. Strong background in statistics, biostatistics, and mathematics. Advanced skills in bioinformatics, data analysis, machine learning, programming, mathematical modeling, and quantitative finance. A fast, creative, and energetic learner.
Education
- Medical College of Wisconsin, Milwaukee, WI, USA
Doctor of Philosophy, Biostatistics 9/2020 – 5/2025
- Rutgers University, New Brunswick, NJ, USA
Master of Science, Financial Statistics 9/2018 – 5/2020
- Northwest University, Xi’an, Shaanxi, China
Bachelor of Science, Applied Mathematics 9/2014 – 7/2018
Skills
- Core Domain Expertise: Data Analysis, Biostatistics, Single-cell Analysis (Seurat), Survival Analysis, Machine Learning, Missing Data Imputation, Mathematical Modeling, Programming, Financial Analysis, Regression, Meta-analysis, Feature Selection, High-dimensional Data Analysis, Bayesian Statistics, Statistical Consulting, Quantitative Finance, Time Series Analysis
- Computing and Programming: R, SAS, Python (PyTorch, TensorFlow), MATLAB, Emacs, OpenBUGS, Stan, NIMBLE, SPSS, C++, C, LaTeX, shell scripting, Microsoft Office Suite, LINGO, SQL, R Markdown, Rcpp
- Language skills: English, Chinese, Japanese
Research Experience
Missing data imputation based on sampling methods in single-cell analysis
- Propose an imputation method based on consensus clustering to address dropout events.
- Apply deep learning methods to denoise scRNA-seq data with dropout events.
- Conduct integrated analysis using dropout information to improve downstream analyses.
Pseudo-value approach for informative cluster size
- Implement Kaplan-Meier estimates, competing-risks estimates, and pseudo-values from scratch.
- Run GEE-based simulations to evaluate the performance of the pseudo-value approach under informative cluster size.
Knockoff in case-cohort study based on group LASSO
- Implement group LASSO and knockoffs from scratch. Run simulations to estimate power and false discovery rate.
- Use derandomized knockoffs to improve performance while controlling the false discovery rate.
Use of external information to improve statistical estimation for infant mortality data
- Preprocess infant mortality data and examine relationships between factors and outcomes using regression analysis.
- Perform a Monte Carlo simulation study to compare two estimation methods that use additional information.
Improving outcomes of Chronic Lymphocytic Leukemia: Analysis of the SEER database
- Estimate survival rates with the Kaplan-Meier estimator to assess whether survival improves over time.
- Perform regression analysis on survival data using the Cox proportional hazards model.
- For secondary malignancy data, fit a Poisson regression model to estimate rate ratios.
Patients’ perspective on post-operative success following bariatric surgery
- Using quality-of-life data, explore whether time or type of bariatric surgery influences the Physical Component Score (PCS) and Mental Component Score (MCS) of the SF-36 questionnaire.
- Assess whether PCS and MCS are related to Patient Health Questionnaire-9 (PHQ-9) and Rosenberg Self-Esteem Scale (RSE) scores.
Self-Reported Coping Strategies in Postlingually Deafened Adults and Speech Recognition Outcomes
- In a retrospective cohort study, characterize the degree to which individual coping strategies may influence speech perception following cochlear implantation.
- Perform correlation analysis among quality-of-life measures, speech outcome measures, and coping strategy scores.
Publications
- Juan, W., Ahn, K.W., Chen, Y.G. and Lin, C.W., 2025. CCI: A Consensus Clustering-Based Imputation Method for Addressing Dropout Events in scRNA-Seq Data. Bioengineering, 12(1), p.31.
- Juan, W., 2025. Machine Learning Methods and R Packages for Addressing Dropout Events in scRNA-Seq Data (Doctoral dissertation, The Medical College of Wisconsin).
- Muthiah, C., Narra, R., Atallah, E., Juan, W., Szabo, A. and Murthy, G.S.G., 2024. Evaluating population-level outcomes in chronic lymphocytic leukemia in the era of novel therapies using the SEER registry. Leukemia Research, 140, p.107496.
- Espahbodi, M., Harvey, E., Livingston, A.J., Montagne, W., Kozlowski, K., Jensen, J., Liu, X., Juan, W., Tarima, S., Rusch, M. and Harris, M.S., 2022. Association of self-reported coping strategies with speech recognition outcomes in adult cochlear implant users. Otology & Neurotology, 43(8), pp.e888-e894.
Conferences
- Juan, W. (presenter), Ahn, K. W., Lin, C., 2025. DropDAE: Denoising Autoencoder for Addressing Dropout Events in scRNA-seq Data. Presented at ENAR (Eastern North American Region), New Orleans, LA.
- Juan, W. (presenter), Ahn, K. W., Lin, C., 2024. CCI: A Consensus Clustering-Based Imputation Method for Addressing Dropout Events in scRNA-seq Data. Presented at ENAR (Eastern North American Region), Baltimore, MD.
- Homan, M. E. (presenter), Tarima, S., Juan, W., 2022. Exploratory Analysis of the Wisconsin Pregnancy Risk Assessment Monitoring System to Identify Modifiable Factors Related to Adverse Pregnancy and Birth Outcomes. Presented at the APHA (American Public Health Association) 2022 Annual Meeting and Expo, Boston, MA.
- Tarima, S. (presenter), Juan, W., Zenkova, Z., & Homan, M., 2022. Use of Previously Published Results for More Powerful Testing of Factors Related to Adverse Birth Outcomes. Presented at JSM (Joint Statistical Meetings), Washington, DC.