Open Babel and Open Source Database Systems
Introduction. Chemical data are often stored in databases: relational database systems such as PostgreSQL, MySQL and Oracle routinely hold thousands to millions of compounds. However, such systems provide no built-in functions for handling and converting chemical data. For example, they cannot compute the molecular weight of a molecule stored as a SMILES string. To address this, two complementary projects are under development: Pgchem and Mychem.
Pgchem. Pgchem::tigress is a cheminformatics extension to PostgreSQL. The latest version of the project, v7.1, was released in August 2007. Pgchem is released under the GPL v2.
Mychem. Mychem is a cheminformatics extension to MySQL. The project started in June 2007, and the latest version, v0.5.2, was released in March 2008. Like Pgchem, Mychem is released under the GPL v2 license.
Examples. Both projects enable the RDBMS to handle chemical data through SQL statements. They are based on Open Babel, the well-known cheminformatics toolbox, and they use the same function names wherever possible. This section presents two SQL statements that work with both Pgchem and Mychem.
The first example computes the molecular weight of glycine.
sql> SELECT MOLWEIGHT(`molecule`) FROM `structures` WHERE
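To make concrete what a function such as MOLWEIGHT has to return, here is a minimal Python sketch, independent of Pgchem and Mychem, that sums standard average atomic masses over the molecular formula of glycine (C2H5NO2, SMILES NCC(=O)O). The `ATOMIC_MASS` table and the `molecular_weight` helper are hypothetical names introduced for this illustration only. The extensions themselves rely on Open Babel and start from the stored structure rather than from a formula string.

```python
import re

# Standard average atomic masses (g/mol) for the elements needed here.
# Hypothetical helper table for this sketch, not part of Pgchem or Mychem.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def molecular_weight(formula):
    """Sum atomic masses over a simple formula such as 'C2H5NO2'.

    Only handles element symbols followed by an optional count;
    parentheses, charges and isotopes are out of scope for this sketch.
    """
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total


# Glycine: C2H5NO2
glycine_mw = molecular_weight("C2H5NO2")
print(f"{glycine_mw:.3f}")  # prints 75.067
```

The value differs slightly between toolkits depending on the atomic mass table in use, so a database function may report a result that is close to, but not identical with, the figure above.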
The second example selects structures that are analogues of a reference structure.
sql> SELECT `id` FROM `structures` WHERE
`molecule`) > 0.7;
| id |
| 175 |
| 214 |
| 418 |
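The query above keeps the rows whose similarity score against the reference exceeds 0.7. The listing does not show which similarity function is called, so the following Python sketch rests on an assumption: it uses the Tanimoto coefficient on bit fingerprints, a common choice for this kind of analogue search. The fingerprints, the candidate names and the `tanimoto` helper are all made up for the illustration.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions.

    Defined here as 0.0 when both fingerprints are empty.
    """
    union = len(a | b)
    return len(a & b) / union if union else 0.0


# Hypothetical fingerprints; a real system would derive them from the structures.
reference = {1, 4, 7, 9, 12, 15, 20, 23}
candidates = {
    "close_analogue": {1, 4, 7, 9, 12, 15, 20, 31},  # shares 7 of 9 distinct bits
    "distant": {2, 4, 8, 16, 23},                    # shares 2 of 11 distinct bits
}

THRESHOLD = 0.7  # same cut-off as in the SQL example
hits = [name for name, fp in candidates.items()
        if tanimoto(reference, fp) > THRESHOLD]
print(hits)  # prints ['close_analogue']
```

Running the comparison inside the database, as the extensions do, avoids transferring every structure to the client just to discard most of them.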
Perspectives. Some work remains. For example, the compatibility between Pgchem and Mychem needs to be improved so that a web service framework can use either one interchangeably. A similar project for Oracle is also planned. Finally, the next version of Open Babel will make it possible to add many new functions, such as 1D -> 2D/3D conversions and LogP and LogS computations.