Research

Tools and models for faster materials discovery.

A compact view of projects connecting molecular representation, polymer design, simulation automation, and interpretable machine learning.

Research software · Paper published

Foundation language model for generative polymer design

POLYT5

An encoder-decoder chemical language model that predicts polymer properties and generates new polymer structures conditioned on target performance.

  • Generates chemically valid, synthesizable polymer candidates without exhaustive enumeration.
  • Applied to dielectric polymer discovery, identifying over 20,000 promising candidates.
  • Connected with a general-purpose LLM for natural-language property prediction and design.

Open-source package · Active development

Physics-informed Gaussian Process Regression for materials informatics

matgpr

A public Python package for reproducible materials-informatics workflows with Gaussian Process Regression, uncertainty-aware prediction, materials fingerprints, and physics-informed mean functions.

  • Supports scikit-learn and GPyTorch GPR workflows with validation, learning curves, uncertainty diagnostics, and plotting utilities.
  • Includes physics-aware kernels, target transforms, virtual-observation constraints, derivative-constrained models, and reusable materials-physics equation templates.
  • Designed as a practical platform for future tutorials and blog posts on physics-informed GPR, Bayesian optimization, and small-data materials modeling.
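
To illustrate what a physics-informed mean function does, here is a minimal sketch written with NumPy only. It is not matgpr's API: the function names, the fixed kernel hyperparameters, and the linear `physics_mean` (a stand-in for a real materials-physics equation) are all assumptions made for illustration. The Gaussian process models only the residual that the physics equation leaves unexplained, so predictions far from the training data fall back to the physics prior instead of to zero.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def physics_mean(x):
    """Hypothetical prior mean: a linear trend standing in for a physics equation."""
    return 2.0 * x

def gp_predict(x_train, y_train, x_test, noise=1e-2):
    """GP posterior mean and standard deviation with a non-zero prior mean."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    k_ss = rbf_kernel(x_test, x_test)
    # Fit the GP to the residual between the data and the physics prior.
    resid = y_train - physics_mean(x_train)
    mean = physics_mean(x_test) + k_ts.T @ np.linalg.solve(k_tt, resid)
    cov = k_ss - k_ts.T @ np.linalg.solve(k_tt, k_ts)
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, std

# Toy data: the physics trend plus small, made-up deviations.
x_train = np.array([0.0, 1.0, 2.0, 3.0])
y_train = physics_mean(x_train) + np.array([0.3, -0.2, 0.25, -0.1])

# One test point inside the data range, one far outside it.
x_test = np.array([1.5, 20.0])
mean, std = gp_predict(x_train, y_train, x_test)
print(mean, std)
```

At the far test point the kernel correlation with the training data is essentially zero, so the mean returns to `physics_mean` and the standard deviation returns to the prior value. In a real workflow the hyperparameters would be optimized rather than fixed, for example with a scikit-learn or GPyTorch GPR model.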

Open-source package · Paper published

Autonomous atomic-scale polymer model generation

Polymer Structure Predictor

A Python toolkit that builds a hierarchy of polymer models from repeat-unit SMILES, including oligomers, infinite chains, crystals, and amorphous structures.

  • Generates structures and force-field files for downstream simulations.
  • Supports workflows in VASP, ORCA, LAMMPS, and GAMESS.
  • Automates model building so that polymer simulations need less manual setup.
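
As a rough illustration of the first level of that hierarchy, the sketch below builds a hydrogen-terminated oligomer from a repeat-unit SMILES by plain string manipulation. It is not the toolkit's implementation. It assumes the repeat unit marks its two connection points with `[*]` at the very start and very end of the string, and the helper name `build_oligomer` is hypothetical. A real builder would work on a molecular graph and handle stereochemistry, end groups, and 3D coordinates.

```python
STAR = "[*]"

def build_oligomer(repeat_unit, n_units):
    """Return the SMILES of a linear oligomer made of n_units repeat units.

    Assumes the repeat unit looks like "[*]...[*]": connection points at both
    ends, joined by single bonds. Chain ends are left with implicit hydrogens.
    """
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    has_ends = repeat_unit.startswith(STAR) and repeat_unit.endswith(STAR)
    if not has_ends or len(repeat_unit) <= 2 * len(STAR):
        raise ValueError("expected a repeat unit of the form [*]...[*]")
    core = repeat_unit[len(STAR):-len(STAR)]
    # Reject cases this simple string approach cannot join correctly.
    if STAR in core or core[0] in "=#-" or core[-1] in "=#-(":
        raise ValueError("unsupported repeat-unit pattern for this sketch")
    return core * n_units

print(build_oligomer("[*]CC[*]", 3))          # polyethylene trimer
print(build_oligomer("[*]CC(C)[*]", 2))       # polypropylene dimer
print(build_oligomer("[*]c1ccc(cc1)[*]", 2))  # para-phenylene dimer
```

Ring-closure digits can be repeated safely here because each ring closes inside its own repeat unit before the digit is reused.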

Research theme · Papers published

Machine learning for organic photovoltaic materials

Organic Solar Cell Design

Models and descriptors for predicting device-relevant properties and guiding virtual screening of donor and acceptor materials.

  • Built ML models for power conversion efficiency and device parameters.
  • Screened candidate molecules from common donor/acceptor building blocks.
  • Identified molecular descriptors connected to Voc, Jsc, and fill factor.
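
For context, the three device parameters named above combine into power conversion efficiency through the standard relation PCE = (Voc × Jsc × FF) / P_in. The short sketch below applies it; the function name, the default illumination of 100 mW/cm² (the usual AM1.5G test condition), and the example values are assumptions chosen for illustration, not results from these projects.

```python
def power_conversion_efficiency(voc_v, jsc_ma_cm2, fill_factor, p_in_mw_cm2=100.0):
    """Return power conversion efficiency in percent.

    voc_v:        open-circuit voltage in volts
    jsc_ma_cm2:   short-circuit current density in mA/cm^2
    fill_factor:  fill factor as a fraction between 0 and 1
    p_in_mw_cm2:  incident light power density in mW/cm^2
    """
    if not 0.0 < fill_factor <= 1.0:
        raise ValueError("fill_factor must be a fraction in (0, 1]")
    if p_in_mw_cm2 <= 0.0:
        raise ValueError("p_in_mw_cm2 must be positive")
    # V * mA/cm^2 gives the maximum output power density in mW/cm^2.
    p_max = voc_v * jsc_ma_cm2 * fill_factor
    return 100.0 * p_max / p_in_mw_cm2

# Made-up device values, for illustration only.
pce = power_conversion_efficiency(voc_v=0.85, jsc_ma_cm2=25.0, fill_factor=0.75)
print(f"PCE = {pce:.1f}%")
```

Because PCE is a product of the three parameters, a model that predicts Voc, Jsc, and fill factor separately can show which one limits a candidate material.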

Research theme · Paper published

Informatics-guided design rules for electrical conductivity

Conducting Polymers

Machine-learning models for polymer conductivity using hierarchical molecular descriptors and explainable feature analysis.

  • Screened large candidate spaces for conductive polymer/dopant combinations.
  • Used fragment importance and SHAP analysis to derive design guidelines.
  • Integrated predictive capability into broader polymer informatics platforms.
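
SHAP analysis attributes a model prediction to individual features using Shapley values. The sketch below computes exact Shapley values by brute-force enumeration for a tiny toy model, which is what the SHAP library approximates efficiently for real models. Everything here is an assumption for illustration: the three descriptor names, the toy model and its coefficients, and the all-zero baseline. It does not reproduce the models or features used in this work.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for r in range(n):
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for s in combinations(others, r):
                # Marginal contribution of feature i to coalition s.
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi

def toy_model(z):
    """Hypothetical conductivity score from three made-up descriptors."""
    conjugation, dopant_level, side_chain = z
    return (2.0 * conjugation + 1.0 * dopant_level
            - 0.5 * side_chain + 0.5 * conjugation * dopant_level)

x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
for name, contribution in zip(["conjugation", "dopant_level", "side_chain"], phi):
    print(f"{name}: {contribution:+.2f}")
```

The contributions sum to the difference between the prediction and the baseline prediction, and the interaction term is split evenly between the two features involved. Ranking features by such contributions across many polymers is one way to turn a fitted model into design guidelines.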