Browser Chemistry Part I

A few months back Noel O’Boyle wrote a series of blog posts in which he used Emscripten to compile a number of chemoinformatics toolkits to Javascript, including OpenBabel, RDKit, and Helium. As someone interested in using the browser to interact with molecules, I thought this was a great series and hoped I might be able to reuse some of the javascript he had generated. However, upon attempting to reuse his smiles-to-svg functions on a different webpage, I discovered that the output of Emscripten is difficult to reuse (or at least the way to do so is not obvious to someone unfamiliar with asm.js). I tried downloading the JS file in order to call the smiles-to-svg function in another context but started running into errors with the generated code. I was still interested in generating images in the browser, however, and in his final post Noel points the way by mentioning the kemia project, a javascript chemistry library that aims to be “the world’s first open source, 100% javascript chemistry toolkit.” The rest of this post describes how I used kemia, ChemDoodle, and clojurescript to make a pure-browser smiles-to-canvas widget.

Kemia, ChemDoodle, Clojurescript

To carry out this project I wanted to use clojurescript, a compile-to-javascript lisp inspired by clojure that has a vibrant and innovative community. On the technical side, clojurescript offers the advantage of using Google’s Closure compiler, which, among other things, allows for dead code elimination. My idea was to use:

  1. kemia's smiles-parsing functionality as well as its structure-diagram generation facilities.
  2. ChemDoodle WebComponents for a nice-looking frontend
  3. clojurescript for all the glue code, DOM handling etc, ideally via React/reagent

The result is below (code here): a pure-browser smiles-parsing widget. Try your own smiles string, or paste in one of the examples. The rest of this post documents how I made it.

  • Enter one of the following, or type your own smiles string into the box, and click the button:
    • Vanillin: O=Cc1ccc(O)c(OC)c1
    • Melatonin: CC(=O)NCCC1=CNc2c1cc(OC)cc2

Packaging and using Kemia

The cljsjs team is working to provide an easy way to use JS libraries. Because packages follow a standard convention, these libraries can be included in new clojure/clojurescript projects using a simple :require statement. I decided to package both the kemia and ChemDoodle Web Components libraries for cljsjs. (Right now they are pull requests; if you need them you can install them from my cljsjs/packages branches.) Because kemia uses the Google Closure library, its source is included using the :libs option, which allows kemia modules and namespaces to be used just as if they were cljs files. Once kemia was packaged, I needed to use it to generate coordinates from a smiles string, which requires both the smiles parser and the coordinate generator. You can see how this is done below, all in a single function, parsesmiles.

(defn parsesmiles
  "use kemia to parse smiles"
  [smiles]
  ;; parse the smiles string, then generate 2D coordinates for the result
  (let [mol ( smiles)
        mol2 (kemia.layout.CoordinateGenerator.generate mol)]
    mol2))

Packaging and using ChemDoodle

ChemDoodle is a great open source library for drawing molecules, and I wanted to use it instead of kemia’s drawing methods. To do so I needed to convert kemia’s molecule information into ChemDoodle-ready JSON. Fortunately, the ChemDoodle and kemia documentation make this a straightforward problem. Kemia has already calculated x/y positions for each atom, so we are simply feeding ChemDoodle the atom positions and the bond information. Note the clj->js function at the end, which converts the clojurescript data structures into pure JS.

(defn process_atom
  "take a kemia Atom and return json ready for chemdoodle"
  [^kemia.model.Atom Atom]
  (let [coord (.-coord Atom)]
    ;; element symbol, x/y position, and charge
    {"l" (.-symbol Atom)
     "x" (.-x coord)
     "y" (.-y coord)
     "c" (.-charge Atom)}))

(defn process_bond
  "take a kemia Bond and return json ready for chemdoodle"
  [^kemia.model.Bond Bond ^kemia.model.Molecule Mol]
  (let [atom1 (.-source Bond)
        atom2 (.-target Bond)]
    ;; bonds refer to their two atoms by index within the molecule
    {"b" (.indexOfAtom Mol atom1)
     "e" (.indexOfAtom Mol atom2)
     "o" (.-order Bond)}))

(defn process_molecule
  "kemia Molecule to chemdoodle JSON"
  [^kemia.model.Molecule Mol]
  (let [atoms (.-atoms Mol)
        bonds (.-bonds Mol)]
    ;; clj->js turns the nested cljs map into a plain JS object
    (clj->js
      {"a" (map process_atom atoms)
       "b" (map #(process_bond % Mol) bonds)})))

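To make the target shape concrete, here is a minimal sketch that builds the same kind of map from plain data instead of kemia objects. The key names ("a", "b", "l", "x", "y", "c", "e", "o") are the ones used in the functions above; the water data, its coordinates, and the helper names atom->cd, bond->cd, and mol->cd are made up for illustration and are not part of kemia or ChemDoodle.

```clojure
;; Hypothetical stand-in for a parsed molecule: plain maps, not kemia objects.
;; The coordinates are invented for illustration only.
(def water
  {:atoms [{:symbol "O" :x 0.0  :y 0.0 :charge 0}
           {:symbol "H" :x 1.0  :y 0.5 :charge 0}
           {:symbol "H" :x -1.0 :y 0.5 :charge 0}]
   ;; each bond refers to its two atoms by index into :atoms
   :bonds [{:source 0 :target 1 :order 1}
           {:source 0 :target 2 :order 1}]})

(defn atom->cd
  "plain atom map -> atom entry with the keys used above"
  [a]
  {"l" (:symbol a)
   "x" (:x a)
   "y" (:y a)
   "c" (:charge a)})

(defn bond->cd
  "plain bond map -> bond entry (begin index, end index, order)"
  [b]
  {"b" (:source b)
   "e" (:target b)
   "o" (:order b)})

(defn mol->cd
  "plain molecule map -> map with an atom list under \"a\" and a bond list under \"b\""
  [mol]
  {"a" (mapv atom->cd (:atoms mol))
   "b" (mapv bond->cd (:bonds mol))})

(mol->cd water)
```

The result is still a clojurescript map; in the widget it is the clj->js step that turns a map like this into the plain JS object ChemDoodle reads.
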
Once we have ChemDoodle-compatible JSON, we can use ChemDoodle to parse it and associate the newly created ChemDoodle molecule with a viewer canvas. We make a molecule, chemmol, and associate it with the viewer canvas while setting the bond width (we can add more options at a later point). ChemDoodle needs the ID of an html canvas, or it will create a new canvas where it is called. On this webpage there is a <canvas id="chemdoodle"> tag.

(defn chemdoodle 
  "put your chemdoodle JSON into Chemdoodle"
  [id json]
  (let [viewer (new ChemDoodle.ViewerCanvas id 500 500)
       chemmol  (-> (new
                    (.molFrom json))]
      (set! (.. viewer -specs -bonds_width_2D) 0.6 )
      (.scaleToAverageBondLength chemmol 15)
    (.loadMolecule viewer chemmol)))

Event Handling in Clojurescript

Now we have just about everything we need on the chemistry side: a smiles parser, a coordinate generator, and a way to depict the compounds. But we still need some control code. My initial idea of using a React-based solution fell short when I encountered a lot of callback errors for incomplete smiles strings. So I chose a simpler route where a button triggers evaluation, in this case with the help of the dommy library and a nice blog post on its usage.

Dommy listens for the button click and calls the smiles-to-chemdoodle function which will pull the smiles string and depict it. Voila!

(defn smiles-to-chemdoodle!
  "read the smiles string from the input box and draw it on the canvas"
  [_event]
  (let [smiles-input (sel1 :#smiles)
        smiles (dommy/value smiles-input)
        mol (parsesmiles smiles)
        json (process_molecule mol)]
    (println "smiles" smiles)
    (chemdoodle "chemdoodle" json)))

;; redraw whenever the submit button is clicked
(dommy/listen! (sel1 :#submit) :click smiles-to-chemdoodle!)
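
One way to revisit the live-update idea later would be to guard the parse step, so that a half-typed smiles string simply leaves the canvas alone. The sketch below assumes the parser signals bad input by throwing, which is an assumption about kemia rather than something checked here. The names try-parse, render-if-valid!, and the fake-parse stand-in are hypothetical and not part of kemia, ChemDoodle, or dommy.

```clojure
(defn try-parse
  "Call parse-fn on smiles; return nil instead of throwing on bad input."
  [parse-fn smiles]
  (try
    (parse-fn smiles)
    ;; in ClojureScript, :default catches any thrown value
    (catch :default _ nil)))

(defn render-if-valid!
  "Render only when parsing succeeded; returns true if something was drawn."
  [parse-fn render-fn! smiles]
  (if-let [mol (try-parse parse-fn smiles)]
    (do (render-fn! mol) true)
    false))

;; Stand-in parser for illustration only: it rejects strings whose
;; parentheses do not balance, which is far cruder than a real smiles parser.
(defn fake-parse [smiles]
  (let [opens  (count (filter #{\(} smiles))
        closes (count (filter #{\)} smiles))]
    (if (= opens closes)
      {:smiles smiles}
      (throw (ex-info "incomplete smiles" {:smiles smiles})))))

(render-if-valid! fake-parse println "CC(=O)")  ; parses, so render-fn! is called
(render-if-valid! fake-parse println "CC(=O")   ; unbalanced, so nothing is drawn
```

In the widget, parse-fn would be parsesmiles and render-fn! would be something like #(chemdoodle "chemdoodle" (process_molecule %)).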


In trying to make a relatively straightforward interactive widget I ended up packaging two javascript libraries for use in clojurescript and learning a bit about the clojurescript toolchain and JS/CLJS interop. I also came across a number of helpful posts, including this one on js interop and this one on the use of the dommy library. I think the idea of a browser-based chemistry toolkit is fantastic, and I believe kemia is a great foundational library. However, kemia appears to have been inactive for a number of years, and if you run through a number of difficult smiles strings (try daptomycin and tetracycline, for example) you can see that kemia is not as battle-tested as CDK or RDKit.

However, I think the power of the web is only growing and that it would be a worthwhile expenditure of time (whose time, you may ask) to improve kemia. I think the power of React could allow for a very sophisticated and interactive rendering engine. What might be the minimal-work path? As kemia is based off of CDK, the first thing would be to focus on bringing the kemia structure diagram generation code up to date with CDK’s. The second is to think about the depiction code. Currently kemia uses goog.path to draw its lines as SVG. I like SVG, but the codebase would need a lot of work to match the beautiful images of CDK’s new Standard Generator. Alternatively, it is possible to use ChemDoodle canvases, with better immediate appearance but a sacrifice in usability/programmability (maybe canvas is not so bad to learn, but it is something else to learn).

I suppose most users don’t need a pure-browser solution, as interop à la the ipython notebook and R’s htmlwidgets provides fantastic bindings that give you the best of both worlds: native code with battle-tested libraries, and nice-looking graphics up front. I think the primary need for something purely browser-based would be interactive webpages, particularly chemistry-themed animations of metabolic and biosynthetic pathways, for example. All in all, this has been interesting to build and I hope to be able to expand on some of these ideas down the line.