About cheminformatics

The roots of what we now call cheminformatics go back to the early history of computing: statistical models in the 1950s and the first computer representations in the 1960s, developed mainly by curious chemists. However, the term "cheminformatics" was not adopted until the early 1990s (and the spelling, cheminformatics or chemoinformatics, is still in dispute). The bulk of the foundational work was done in the 1970s and 1980s, strongly supported by the pharmaceutical industry and its need for computational drug discovery research.

Reading assignments

I define cheminformatics as follows, although various other definitions have been used:

Cheminformatics: the field of study of all aspects of the representation and use of chemical and related biological information on computers.

The history and possible future directions of cheminformatics are nicely documented in some journal articles:

There are also some good introductory guides:

Note that cheminformatics is closely related to several other fields whose names you are likely to come across:
  • Computational Chemistry: the application of mathematical and computational methods to chemistry, particularly theoretical chemistry
  • Molecular Modeling: using 3D graphics and optimization techniques to help understand the nature and action of compounds and proteins
  • Computer-Aided Drug Design: the discipline of using computational techniques to assist in the discovery and design of drugs
  • Chemogenomics: the study of relationships between chemical compounds and genes

We also have to position it relative to Bioinformatics, Genomics, Biomedical Informatics, and so on.

Cheminformatics has some traditional areas of application (pharmaceutical drug discovery, databases of available chemicals, journal article indexing, patent databases) and some newer ones (pathway databases, probe discovery, polypharmacology, toxicology, etc.). In particular, there has recently been a large increase in the amount of chemical information in the public domain, along with deeper integration with related areas.

Solved problems (kind of...)

  • How do you represent 2D and 3D chemical structures? (Not just as a pretty picture)
  • How do you search databases of chemical structures? (Google doesn’t help much, but it might do soon…)
  • How do you organize large amounts of chemical information?
  • How do you visualize chemical structures & proteins?
  • Can computers predict how chemicals are going to behave … in the test tube? … in the body?
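To make the first question above a little more concrete, here is a minimal sketch of why a structure is "not just a pretty picture": a molecule can be stored as a graph of atoms and bonds that a program can compute on. The names `atoms`, `bonds`, `VALENCE`, `implicit_hydrogens` and `molecular_formula` are hypothetical, chosen for illustration, and the fixed valence table is a deliberate simplification; real cheminformatics toolkits handle charges, aromaticity, stereochemistry and much more.

```python
from collections import Counter

# Heavy atoms of ethanol (CH3-CH2-OH), identified by their list index.
atoms = ["C", "C", "O"]
# Bonds as (atom index, atom index, bond order).
bonds = [(0, 1, 1), (1, 2, 1)]

# Assumed typical valences for a few neutral elements (a simplification).
VALENCE = {"C": 4, "N": 3, "O": 2}


def implicit_hydrogens(atoms, bonds):
    """Return the number of implied hydrogens on each heavy atom."""
    used = [0] * len(atoms)
    for a, b, order in bonds:
        used[a] += order
        used[b] += order
    return [VALENCE[element] - u for element, u in zip(atoms, used)]


def molecular_formula(atoms, bonds):
    """Build a formula string: C first, then H, then the rest alphabetically."""
    counts = Counter(atoms)
    counts["H"] = sum(implicit_hydrogens(atoms, bonds))
    ordered = [e for e in ("C", "H") if counts.get(e)]
    ordered += sorted(e for e in counts if e not in ("C", "H"))
    return "".join(e + (str(counts[e]) if counts[e] > 1 else "") for e in ordered)


print(molecular_formula(atoms, bonds))  # C2H6O
```

Because the structure is held as data rather than as an image, the same graph could in principle be searched, compared with other structures, or fed into a predictive model, which is what the remaining questions in the list are about.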

Unsolved problems

  • Integrating information of multiple types and from many sources
  • Integrating with bioinformatics tools and other fields
  • System-wide prediction of the effect of chemicals on the body (systems chemical biology)
  • Extracting and mining information in journal articles


Magazines & Online Resources