Database of 1 million polymer structures for polymer informatics


Polymer structure database - PI1M (short for 1 Million for Polymer Informatics): We present an open-source database of ~1 million polymer structures, stored as SMILES strings and generated by a recurrent neural network (RNN) trained on polymers from PolyInfo. It is intended as a playground for polymer informatics, and we continue to add property labels.


Read paper:; Download database from  

p-SMILES: the polymers are stored in p-SMILES (polymer-SMILES) format, which uses “*” to mark the polymerization points of the repeat unit.
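
As a rough illustration of the “*” convention, the sketch below locates the polymerization points in a p-SMILES string using only the Python standard library. The function names and the example strings are hypothetical and are not taken from the database, and the check for exactly two points is an assumption about linear repeat units rather than a documented rule of the format.

```python
from typing import List


def polymerization_points(psmiles: str) -> List[int]:
    """Return the string index of every '*' in a p-SMILES string."""
    return [i for i, ch in enumerate(psmiles) if ch == "*"]


def is_linear_repeat_unit(psmiles: str) -> bool:
    """Assumption: a linear repeat unit has exactly two polymerization points."""
    return len(polymerization_points(psmiles)) == 2


# Illustrative strings only: an ethylene-like repeat unit, a styrene-like
# repeat unit, and an ordinary SMILES string with no polymerization points.
examples = ["*CC*", "*CC(*)c1ccccc1", "CCO"]
for s in examples:
    print(s, polymerization_points(s), is_linear_repeat_unit(s))
```

This is plain string handling and does not check chemical validity; a cheminformatics toolkit would be needed for that.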

Polymer embedding: we also propose polymer embedding (PE), the polymer counterpart of molecular embedding, as an effective machine learning representation for polymers.