Database of 1 million polymer structures for polymer informatics

Author: TENGFEI LUO


Polymer structure database - PI1M (1 million polymers for Polymer Informatics): We present an open-source database of ~1 million polymer structures, stored as SMILES strings, generated by a recurrent neural network (RNN) trained on polymers from PolyInfo. It is intended as a playground for polymer informatics, and we continue to label the polymers with properties.

 

Read the paper at https://pubs.acs.org/doi/abs/10.1021/acs.jcim.0c00726 and download the database from https://github.com/RUIMINMA1996/PI1M


P-SMILES: the polymers are stored in p-SMILES (polymer-SMILES) format, in which “*” marks the polymerization points.
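
As a rough illustration of the p-SMILES convention described above, the sketch below locates the “*” polymerization points in a string using plain Python. The function names and the two example repeat units (polyethylene and polystyrene) are chosen for illustration only and are not taken from the PI1M repository. The two-point check is an assumption that fits simple linear repeat units, not a rule stated by the database.

```python
from typing import List


def polymerization_points(psmiles: str) -> List[int]:
    """Return the string positions of every '*' in a p-SMILES string."""
    return [i for i, ch in enumerate(psmiles) if ch == "*"]


def has_two_polymerization_points(psmiles: str) -> bool:
    """Assumption for this sketch: a linear repeat unit carries exactly two '*'."""
    return len(polymerization_points(psmiles)) == 2


# Illustrative repeat units (hypothetical inputs, not rows copied from PI1M).
examples = {
    "polyethylene": "*CC*",
    "polystyrene": "*CC(*)c1ccccc1",
}

for name, psmiles in examples.items():
    points = polymerization_points(psmiles)
    print(name, psmiles, points, has_two_polymerization_points(psmiles))
```

A plain string scan is enough here because “*” has no other meaning in these strings. Validating the chemistry itself would need a cheminformatics toolkit.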

Polymer embedding: we also propose a polymer counterpart of molecular embedding, called polymer embedding (PE), as an effective machine-learning representation for polymers.