Tunnel search with MOLE 2.0 and MOLEonline 2.0

In short, a tunnel is a path between a point inside the macromolecular structure and its boundary. A tunnel is formed by its centerline and profile that specifies tunnel radius in each of centerline points.

In essence, to find a tunnel the previous version of MOLE required the user to specify a starting point and from this point, the graph formed by the dual of the Delaunay triangulation (the Voronoi diagram, see bellow) got traversed in a depth first manner (paths with the lower cost were visited first). Each time a boundary vertex was reached, a tunnel got reported.

There are numerous problems with this approach. For example, when the first tunnel got reported, the next one was very similar to the first one (and therefore provided no useful additional information) because the path "branched" near the end point of the first tunnel. Another issue was caused by small "ridges" that were formed by vertices near the boundary of the diagram. As a result, a large part of the tunnel was often "going along" the surface of the protein and therefore they didn't provide a very relevant information about the structure of the tunnel. It was also very difficult to identify interesting starting points of the tunnels, unless prior knowledge.

The current version of the new MOLE 2.0 algorithm addresses these problems issues by preprocessing the Voronoi diagram, splitting it into several smaller parts (called cavity diagrams) and identifying suitable start and end points. Finally, Dijkstra's algorithm is used to find the tunnels. This algorithm is used also in MOLEonline 2.0 web application.

MOLE 2.0 Algorithm outline

MOLE 2.0 algorithm overview

The tunnel computation in MOLE 2.0 is performed in several steps:

1. Macromolecular Voronoi diagram representation

Voronoi diagram

The Voronoi diagram divides a metric space according to the distances between discrete sets of specified objects. In MOLE 2.0 case objects are the centres of the atom with van der Waals spheres. vdW radii of the spheres are taken from AMBER force field. The edges of the diagram represent equidistant positions between pairs of the closest atoms. The Voronoi diagram can be computed as a dual of the Delaunay triangulation of vdW sphere centers. Therefore, the vertices of the diagram correspond to the tetrahedrons of the triangulation and are located at the tetrahedrons' circumcenters. Similarly, the edges of the diagram represent the adjacency of the tetrahedrons.

2. Preprocessing the diagram

The preprocessing works in several steps:

3. Detection of Start points and End points

Once the diagram is split into several smaller cavity diagrams, these are analyzed for suitable tunnel start points and end points between vertices of the cavity. The general idea here is that the start point is the "deepest" vertex in the cavity and the end point is the "largest" boundary vertex (i.e. corresponding to the largest tetrahedron).

Start Points

There are two ways to specify a start point for a tunnel - user specified list of residues, XYZ coordinates or (MOLE 2.0 only) automatic computation:

End Points

Similarly to the start points, endpoints are either user defined (MOLE 2.0 desktop version only) or calculated:

4. Tunnel computation

Finally, when the set of start and end points is identified for each cavity, the Dijkstra's Shortest Path algorithm is used to find the tunnels between all pairs of start and end points points (or the molecular surface if the algorithm reaches it before it finds the tunnel exit). The edge weight function used in the algorithm takes into account the distance to the surface of the closest vdW sphere and the edge length.

Then the tunnel centerline is represented as a 3D natural spline representation defined by the Voronoi diagram vertices that form the path found by the Dijkstra's algorithm.

Because of the density of the computed exits, the algorithm might find duplicate tunnels. Therefore, in the last post-processing step, these duplicate tunnels are removed.

Assignment of physico-chemical properties

When tunnel is shown, a set of unique residues lining it is shown, while those surrounding the tunnel with side chain are noted. According to side chain, a list of properties is calculated (for lining aminoacid side chains only):

Note on advanced options

Advanced settings offers possibility to adjust parameters determining tunnel searching algorithm. All parameters are set in Angstroms (0.1 nm):

Final Note on the complexity of the algorithm

The worst case complexity of the algorithm is O(M log M) where M is the number of vertices of the Voronoi diagram. In the worst case, M = N2, where N is the number of atoms. However, in most practical cases M is almost equal to N. Additionally, the complexity of calculating the Voronoi diagram is O(N log N). The complexity of all steps of the pre-processing phase is at most O(M log M). Finding the paths using the Dijkstra's algorithm is O(K M logM) where K is number of tunnels.

Together, the complexity of the MOLE 2.0 tunnel search algorithm is O(K N2 log N), where K is the number of detected tunnels and N is the number of atoms in the molecule.