So I went to the ChEMBL homepage where there is a 'Marvin JS' chemical drawing tool by the company ChemAxon. Here was my picture:
[Later note: it isn't actually necessary to draw -O-A here. You can just draw '-O', which the software displays as -OH; the substructure search will still hit things with -O-A unless you specifically draw in an -O-H (i.e. H instead of A).]
Then I clicked on the 'Substructure search - Fetch compounds' button under the 'Marvin JS' chemical drawing tool, on the ChEMBL webpage. This brought back 248 hits.
Then I clicked on the arrow at the top of the 'Max phase' column to sort by phase. This found a few old friends at the top, albendazole and mebendazole, plus a couple of others:
The fourth was a molecule I hadn't seen before:
Some things I noticed:
- For some reason the 'atom toolbar' seemed to disappear when I was using Safari. I tried Firefox instead and it was there again. Not sure why this is.
- When I searched ChEMBL for compounds containing a -SO3 substructure, it returned hits that at first glance didn't seem to contain this substructure. However, when I looked at the ChEMBL pages for those hits, they said 'Alternative forms of this compound in ChEMBL', and those alternative forms did have -SO3 substructures. So I guess it was searching the 'Alternative forms' as well.
- The 'A' atom seems to mean any atom except H.
- I found that when I searched for a benzene ring with -OH attached, it also found hits that have benzene attached to -O-(something else), so it doesn't seem to enforce that the group is -OH.
- Something is wrong with the ChEMBL results page. When I searched for a benzene ring with two -OH groups attached, the default display showed 25 hits per page, and when I clicked through the pages, I saw some compounds repeated on different hit pages, but didn't find hexylresorcinol, which should be a hit. However, when I asked for 100 hits per page, it showed hexylresorcinol! I emailed the ChEMBL helpdesk about this but no reply yet...
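The paging problem in the last point could also be checked outside the browser. Below is a minimal Python sketch, not something I actually ran for this post. It assumes the ChEMBL REST API exposes substructure search at the URL shown and accepts `limit` and `offset` paging parameters. The function names, the resorcinol-like query SMILES and the compound labels are all hypothetical, made up for illustration. The first function only builds the URL for one page of hits; the second takes the compound IDs collected from each page and reports any that are repeated across pages, plus any expected compound that never shows up.

```python
from collections import Counter
from urllib.parse import quote, urlencode

# Assumption: base URL of the ChEMBL REST substructure endpoint (not taken from the post).
CHEMBL_SUBSTRUCTURE_URL = "https://www.ebi.ac.uk/chembl/api/data/substructure"


def substructure_url(smiles, limit=25, offset=0):
    """Build the URL for one page of substructure hits for a SMILES query.

    `limit` and `offset` are assumed paging parameters; the SMILES is
    percent-encoded so that characters like '(' and '#' survive in the path.
    """
    query = urlencode({"limit": limit, "offset": offset})
    return f"{CHEMBL_SUBSTRUCTURE_URL}/{quote(smiles, safe='')}.json?{query}"


def audit_pages(pages, expected=()):
    """Return (repeated, missing) for a list of pages of compound IDs.

    repeated: IDs that occur more than once across all pages.
    missing:  IDs from `expected` that occur on no page.
    """
    counts = Counter(cid for page in pages for cid in page)
    repeated = sorted(cid for cid, n in counts.items() if n > 1)
    missing = sorted(cid for cid in expected if cid not in counts)
    return repeated, missing


# Hypothetical query: a benzene ring with two -OH groups (1,3 arrangement assumed).
url = substructure_url("Oc1cccc(O)c1", limit=25, offset=0)

# Hypothetical page contents, with placeholder labels rather than real ChEMBL IDs.
pages = [["cmpd_1", "cmpd_2"], ["cmpd_2", "cmpd_3"]]
repeated, missing = audit_pages(pages, expected=["cmpd_4"])
```

If the 25-per-page and 100-per-page views really disagree, running `audit_pages` on the IDs from each view should show a non-empty `repeated` or `missing` list for one of them.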