Non-Federal Acute Care Hospital Health IT Adoption and Use Analysis

Led data acquisition, preprocessing, and imputation for a graduate-level analysis of hospital EHR adoption trends, using a public dataset from the U.S. Department of Health & Human Services.

Designed and implemented a full cleaning pipeline in Databricks, including schema alignment, type correction, and multi-model ML-based imputation (Random Forest, Gradient-Boosted Trees, Linear Regression).
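
To give a rough idea of what type correction followed by model-based imputation looks like, here is a minimal sketch in plain Python. It is not the project's Databricks/PySpark code: the column names (`year`, `pct_adoption`), the sample values, and the helper functions are all made up for illustration, and a single-predictor least-squares line stands in for the Random Forest, Gradient-Boosted Trees, and Linear Regression models named above.

```python
def to_float(value):
    """Type correction: turn raw strings into floats, blanks into None."""
    if value is None or str(value).strip() == "":
        return None
    return float(value)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def impute_with_regression(rows, feature, target):
    """Fill missing target values from a line fitted on the complete rows."""
    complete = [r for r in rows if r[target] is not None]
    intercept, slope = fit_line(
        [r[feature] for r in complete], [r[target] for r in complete]
    )
    filled = []
    for r in rows:
        r = dict(r)  # copy so the input rows are left untouched
        r["imputed"] = r[target] is None
        if r["imputed"]:
            r[target] = intercept + slope * r[feature]
        filled.append(r)
    return filled


# Hypothetical raw records: numbers arrive as strings, one value is missing.
raw = [
    {"year": "2011", "pct_adoption": "20.0"},
    {"year": "2012", "pct_adoption": "30.0"},
    {"year": "2013", "pct_adoption": ""},
    {"year": "2014", "pct_adoption": "50.0"},
]
rows = [{k: to_float(v) for k, v in r.items()} for r in raw]
filled = impute_with_regression(rows, "year", "pct_adoption")
```

The `imputed` flag keeps filled-in values distinguishable from observed ones, which makes it easier to check later whether a trend depends on the imputation.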

(Tools & Stack)

Python · GitHub · SQL · Jupyter Notebook

(Process & Contribution)

Collaborated with faculty for method validation and integrated statistical feedback into imputation strategy.

Registered dataset as a SQL view to enable seamless team access for analysis and modeling.

Developed a dataflow diagram to visualize project stages (import → ML → SQL → visualization).

Created a new YouTube channel to host the final group presentation and recorded the technical intro.

Tools used: Python, PySpark, Databricks, SQL, GitHub, Canva
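
The idea behind registering the cleaned dataset as a SQL view can be sketched as follows. This example uses Python's built-in `sqlite3` module as a stand-in so that it runs anywhere; in a Databricks notebook the equivalent step would typically go through PySpark's `DataFrame.createOrReplaceTempView` or a `CREATE VIEW` statement. The table name, view name, and rows are hypothetical.

```python
import sqlite3

# In-memory database standing in for the shared workspace.
conn = sqlite3.connect(":memory:")

# Hypothetical cleaned table produced by the preprocessing stage.
conn.execute("CREATE TABLE hospital_it_clean (year INTEGER, pct_adoption REAL)")
conn.executemany(
    "INSERT INTO hospital_it_clean VALUES (?, ?)",
    [(2011, 20.0), (2012, 30.0), (2013, None), (2014, 50.0)],
)

# The view gives teammates one stable name to query; here it also hides
# rows that still have a missing value.
conn.execute(
    """
    CREATE VIEW hospital_it_view AS
    SELECT year, pct_adoption
    FROM hospital_it_clean
    WHERE pct_adoption IS NOT NULL
    """
)

view_rows = conn.execute(
    "SELECT year, pct_adoption FROM hospital_it_view ORDER BY year"
).fetchall()
```

Because a view is a stored query rather than a copy, anyone selecting from it sees the current state of the underlying table without re-running the cleaning code.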

Want a consultation or research help for your project? Let’s talk.

* I will get back to you within 24 hours