@jenniferjiangkells (Member) commented Nov 22, 2025

Description

Added ML/tabular data support for HealthChain, enabling FHIR-to-DataFrame conversion and feature extraction for machine learning workflows.

Related Issue

Changes Made

  • FHIR Bundle → pandas DataFrame conversion with configurable feature schemas
  • TabularDataset class for ML-ready data containers with feature validation
  • FHIRFeatureMapper for extracting structured features from FHIR resources
  • YAML-based feature schema configuration for reproducible pipelines
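
The schema-driven conversion listed above can be illustrated with a minimal, standard-library sketch. It does not use HealthChain's actual API: `FEATURE_SCHEMA`, `bundle_to_rows`, and `missing_features` are hypothetical names chosen for illustration, and the schema is written as a plain dict standing in for a parsed YAML file. It assumes each feature maps to a LOINC-coded Observation with a `valueQuantity`.

```python
from collections import defaultdict

# Hypothetical feature schema: feature name -> FHIR resource type and code.
# In a YAML-driven pipeline this dict would be the parsed schema file.
FEATURE_SCHEMA = {
    "heart_rate": {"resource": "Observation", "code": "8867-4"},  # LOINC heart rate
    "body_temp": {"resource": "Observation", "code": "8310-5"},   # LOINC body temperature
}


def bundle_to_rows(bundle, schema):
    """Flatten a FHIR Bundle (as a dict) into one row per patient."""
    # Index schema entries by Observation code for quick lookup.
    code_to_feature = {
        spec["code"]: name
        for name, spec in schema.items()
        if spec["resource"] == "Observation"
    }
    rows = defaultdict(dict)
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Observation":
            continue
        patient_ref = resource.get("subject", {}).get("reference")
        value = resource.get("valueQuantity", {}).get("value")
        if patient_ref is None or value is None:
            continue
        for coding in resource.get("code", {}).get("coding", []):
            feature = code_to_feature.get(coding.get("code"))
            if feature is not None:
                # Later entries overwrite earlier ones for the same feature.
                rows[patient_ref][feature] = value
    # Give every row the same columns; unmatched features become None.
    return [
        {"patient_ref": ref, **{name: feats.get(name) for name in schema}}
        for ref, feats in rows.items()
    ]


def missing_features(rows, schema):
    """Return schema features that have no value in any row."""
    return [name for name in schema if all(row.get(name) is None for row in rows)]


def _heart_rate_observation(patient_ref, value):
    """Build a small example Observation entry (illustrative data only)."""
    return {
        "resource": {
            "resourceType": "Observation",
            "subject": {"reference": patient_ref},
            "code": {"coding": [{"system": "http://loinc.org", "code": "8867-4"}]},
            "valueQuantity": {"value": value, "unit": "beats/minute"},
        }
    }


bundle = {
    "resourceType": "Bundle",
    "entry": [
        _heart_rate_observation("Patient/1", 72),
        _heart_rate_observation("Patient/2", 88),
    ],
}
rows = bundle_to_rows(bundle, FEATURE_SCHEMA)
```

Passing `rows` to `pandas.DataFrame(rows)` would give a table with one row per patient and one column per schema feature. A check like `missing_features` is one simple form of feature validation: it flags schema features that never appear in the data before a model is trained on them.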

FHIR Module Refactor:

  • Split helpers.py into focused modules
  • Added dataframe.py for FHIR → DataFrame conversion
  • Added readers.py for reading FHIR data from files/directories
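
As a rough illustration of reading FHIR data from a directory, the sketch below walks a folder of JSON files and yields those that contain a Bundle. It is not the contents of `readers.py`: the function name `iter_bundles` and the one-Bundle-per-`.json`-file layout are assumptions made for this example.

```python
import json
from pathlib import Path


def iter_bundles(directory):
    """Yield (path, bundle) for each *.json file in `directory` holding a FHIR Bundle."""
    # Sort so the iteration order is deterministic across platforms.
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Skip JSON files that are not FHIR Bundles.
        if isinstance(data, dict) and data.get("resourceType") == "Bundle":
            yield path, data
```

Yielding bundles one at a time keeps memory use bounded when a directory holds many files, at the cost of requiring callers to iterate rather than index.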

MIMIC Loader Enhancement:

  • Added load_as_dict() method for ML workflows

@jenniferjiangkells marked this pull request as ready for review November 28, 2025 15:10
@jenniferjiangkells merged commit dababa9 into main Nov 28, 2025
7 checks passed
@jenniferjiangkells deleted the feature/ml-tabular-data-container branch December 2, 2025 12:56

2 participants