Chemical File Formats


Several file formats can be used to store chemical structures. One of the simplest is a tab-delimited or CSV file containing SMILES, but the connection table can also be stored explicitly. The slides below describe some of the more common formats, such as the SDfile, MOL file, and MOL2 file. These are rich formats that can represent stereochemistry as well as 2D and 3D chemical structure information.
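As an illustration of the simplest case, the sketch below reads a tab-delimited table of SMILES using only the Python standard library. The column names `smiles` and `name`, the in-memory sample text, and the helper `read_smiles_table` are assumptions made for this example; real files may use a different layout.

```python
import csv
import io

# Hypothetical file content: one SMILES and one name per row, tab-separated.
# The column headers "smiles" and "name" are assumptions for this sketch.
SMILES_TSV = (
    "smiles\tname\n"
    "CC(=O)Nc1ccc(O)cc1\tAcetaminophen\n"
    "CC(N)Cc1ccccc1\tAmphetamine\n"
    "NCC1(CC(=O)O)CCCCC1\tGabapentin\n"
)


def read_smiles_table(handle):
    """Return a list of (smiles, name) tuples from a tab-separated file object."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [(row["smiles"], row["name"]) for row in reader]


# io.StringIO stands in for an open file so the example is self-contained.
records = read_smiles_table(io.StringIO(SMILES_TSV))
for smiles, name in records:
    print(f"{name}: {smiles}")
```

A table like this stores only the SMILES string, so atom coordinates are not kept; formats with an explicit connection table are needed when 2D or 3D coordinates matter.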



SMARTS


Before watching the video, you should have a general idea of what SMARTS is. You will find all the information and examples here and here.



The SMILES used in the video are:

CC(=O)Nc1ccc(O)cc1 Acetaminophen
CC(C)NCC(O)COc1ccccc1CC=C Alprenolol
CC(N)Cc1ccccc1 Amphetamine
CC(CS)C(=O)N1CCCC1C(=O)O Captopril
CN(C)CCCN1c2ccccc2Sc3ccc(Cl)cc13 Chlorpromazine
OC(=O)Cc1ccccc1Nc2c(Cl)cccc2Cl Diclofenac
NCC1(CC(=O)O)CCCCC1 Gabapentin
COC(=O)c1ccccc1O Methyl salicylate
Nc1ccc(N=Nc2ccccc2)c(N)n1 Phenazopyridine
C1=CC=C(C(=C1)C(=O)OC2=CC=CC(=C2)CO[N+](=O)[O-])O AC1L1DBO
IC(=O)c1ccccc1 Benzoyl Iodide
CCOP(=S)(OCC)Oc1cc(Cl)cc(Cl)c1 Dichlofenthion
c1c(C)c(O)c(N)cc1
Oc1c(C)cc(N)cc1
Oc1c(C)ccc(N)c1
c1c(C)c(N)c(O)cc1

Example SMARTS used in the Video


[!C;R]
Any atom in a ring that is not aliphatic Carbon
[O;H1]
Hydroxyl group (-OH)
c:c
Two aromatic carbons joined by an aromatic bond
C~N
Carbon and nitrogen attached by any bond
*C(=O)O
Carboxyl Group
[R]C(=O)O
Carboxyl group attached to a ring atom (the ring primitive R must be written inside brackets)
[CX3](=O)[O-]
Hits conjugate bases of carboxylic, carbamic, and carbonic acids.
[CX3](=[OX1])C
Carbonyl with Carbon. Hits aldehyde, ketone, carboxylic acid (except formic), anhydride (except formic), acyl halides. Won't hit carbamic acid/ester, carbonic acid/ester.
[CX3](=[OX1])O
Carbonyl with oxygen. Hits carboxylic acids and esters, for example. Won't hit aldehyde or ketone.
[#16X2H]
Thiol group
[#6][F,Cl,Br,I]
Any carbon attached to any halogen
[CX3](=[OX1])[F,Cl,Br,I]
Acyl halide
[OD2]([#6])[#6]
Ether: oxygen with exactly two explicit connections (D2), both to carbon
[NX3][CX3](=[OX1])[#6]
Amide: aliphatic N with three total connections (X3) bonded to a carbonyl carbon (X3, double bonded to a one-connection oxygen) that also bears a carbon
[NX3;H2,H1;!$(NC=O)]
Primary or secondary amine, not amide: aliphatic N with three total connections and one or two hydrogens that is not attached to a carbonyl carbon (C=O)
[$([NX3](=[OX1])(=[OX1])O),$([NX3+]([OX1-])(=[OX1])O)]
Nitrate group
[NX2]=[NX2]
Azo nitrogen (diazene)
[NX3][NX3]
Hydrazine
[OX2H]
Hydroxyl
[#6][OX2H]
Hydroxyl in alcohol
[OX2H][CX3]=[OX1]
Hydroxyl in carboxylic acid
[OX2H][cX3]:[c]
Phenol: hydroxyl oxygen (two total connections, one hydrogen) attached to an aromatic carbon
[#6,#7;R0]=[#8]
H-bond acceptor: oxygen double bonded to a carbon or nitrogen that is not in a ring
[$([C]aaO);$([C]aaaN)]
Aliphatic carbon that is ortho to an O and meta to an N
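
The patterns above can be tried against the SMILES from the video with a cheminformatics toolkit. The sketch below assumes the open-source RDKit library, which the source does not name; the short labels and the helper functions `parse`, `matching_groups`, and `has_ortho_o_meta_n_carbon` are hypothetical names chosen for this example.

```python
from rdkit import Chem

# SMARTS copied from the list above; the labels are our own shorthand.
PATTERNS = {
    "hydroxyl": "[OX2H]",
    "thiol": "[#16X2H]",
    "amide": "[NX3][CX3](=[OX1])[#6]",
    "amine (not amide)": "[NX3;H2,H1;!$(NC=O)]",
    "azo": "[NX2]=[NX2]",
    "acyl halide": "[CX3](=[OX1])[F,Cl,Br,I]",
}
# Compile each SMARTS once so it can be reused for every molecule.
COMPILED = {label: Chem.MolFromSmarts(s) for label, s in PATTERNS.items()}

# Recursive SMARTS: one atom that satisfies both environments at once.
ORTHO_O_META_N = Chem.MolFromSmarts("[$([C]aaO);$([C]aaaN)]")


def parse(smiles):
    """Parse a SMILES string, raising instead of returning None on failure."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return mol


def matching_groups(smiles):
    """Return the labels of all patterns that hit the molecule, in dict order."""
    mol = parse(smiles)
    return [label for label, patt in COMPILED.items()
            if mol.HasSubstructMatch(patt)]


def has_ortho_o_meta_n_carbon(smiles):
    """True if an aliphatic carbon is ortho to an O and meta to an N."""
    return parse(smiles).HasSubstructMatch(ORTHO_O_META_N)


MOLECULES = {
    "Acetaminophen": "CC(=O)Nc1ccc(O)cc1",
    "Captopril": "CC(CS)C(=O)N1CCCC1C(=O)O",
    "Benzoyl Iodide": "IC(=O)c1ccccc1",
    "Phenazopyridine": "Nc1ccc(N=Nc2ccccc2)c(N)n1",
}
for name, smiles in MOLECULES.items():
    print(name, matching_groups(smiles))

# The four unnamed isomers from the SMILES list differ only in substituent positions.
ISOMERS = [
    "c1c(C)c(O)c(N)cc1",
    "Oc1c(C)cc(N)cc1",
    "Oc1c(C)ccc(N)c1",
    "c1c(C)c(N)c(O)cc1",
]
for smiles in ISOMERS:
    print(smiles, has_ortho_o_meta_n_carbon(smiles))
```

The four isomers make a useful test set for the recursive pattern: the methyl carbon has to satisfy both the ortho-to-O and the meta-to-N environment at the same time, so moving a single substituent changes whether the pattern hits.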