This tool was designed and developed as part of my dissertation research on the development of Bayesian networks from medical domain ontologies. The dissertation gives a thorough description of the design and the motivations behind it. Development was done in R using the Shiny package, and the tool employs the rrdf, RHugin, shinysky, shinyBS, and shinyIncubator libraries.
The primary use of this software is to semi-automate the construction of dependency networks (Bayesian networks) from a domain ontology knowledge base. The domain ontology specifies which concepts depend on which others, so it is not necessary to know the dependencies a priori. When the user selects a set of concepts of interest, the tool automatically creates the network nodes and arcs in a graph object and displays the structure of the network, among other things (described in more detail below). This eases the development of Bayesian networks in that the user need not reconstruct a network from scratch for every use case, nor search the literature or interview domain experts to establish an appropriate network structure. The interface is also interactive: changes in concept selection, dependency level, and other parameters result in real-time updating of the graphics and other reactive features.
To use this tool, you must select or upload an ontology. The main sidebar panel has options for both: you can select a preloaded ontology or upload one of your own. An uploaded ontology appears in the selectable list of available ontologies. A simple ontology ("testontology.owl") is available for experimenting with software features. These ontologies contain object properties that define the dependencies between classes. Once an ontology is selected, the software automatically reads the class-subclass hierarchy and recreates it in the main sidebar panel as a folder tree. This folder tree is an instance of jstree and is therefore interactive and selectable.
With an ontology loaded, you can select among the concepts shown in the folder tree. Hold the Ctrl key to select multiple concepts. Selecting a concept or category of concepts adds it and any subconcepts to the checkbox list just to the right of the folder tree. Some ontologies are too large to view easily in a folder display, so this list keeps track of everything you have selected so far. The checkbox group is also selectable, so network nodes can be removed from and added back to the graph without re-navigating the folder tree.
If dependencies exist among the selected concepts, the network graph for this set of concepts is immediately computed and displayed in this panel [1]. If the dependency properties in the ontology carry numeric tags specifying the strength of dependency ("dependsOn3", for example), then some statistical metrics describing the network (nodes and edges) are computed and displayed above the graph. The "Mean Evidence" metric is the mean of the dependency arc strengths. The term "Evidence" comes from a medical context, where it denotes a ranking system for the strength of the results measured in a clinical trial or research study, though the value could represent some other type of dependency more generally. For an ontology that contains a layered dependency such as this, the "Dependency Slider" can be changed to exclude nodes below the selected level. If no specific strength is provided, all arcs are considered level 1 strength. Networks are updated in real time as this value is changed. With respect to levels of evidence, you can use this slider to adjust or examine the strength of the overall network. As previously mentioned, because the software is reactive, changes in inputs (ontology choice, selected nodes, dependency slider, etc.) automatically result in recomputation and redrawing of the network display and of any dependent elements on other tab panels. Fortunately, there are ways to save your work as you go (see Download Panel).
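As a rough illustration of the strength tags and the "Mean Evidence" metric, the following Python sketch parses the numeric suffix of a "dependsOnN" property and averages the strengths, treating an untagged "dependsOn" as level 1. It is not the tool's own R code; the edge list, the function names strength and mean_evidence, and the regular expression are assumptions made for the example.

```python
import re
from statistics import mean

# Hypothetical edge list: (concept, dependency property, concept it depends on).
edges = [
    ("fever", "dependsOn3", "influenza"),
    ("cough", "dependsOn2", "influenza"),
    ("fatigue", "dependsOn", "fever"),  # no numeric tag, so level 1
]

def strength(predicate):
    """Return the numeric suffix of a 'dependsOnN' property, or 1 if there is none."""
    match = re.fullmatch(r"dependsOn(\d*)", predicate)
    if match is None:
        raise ValueError(f"not a dependency property: {predicate}")
    digits = match.group(1)
    return int(digits) if digits else 1

def mean_evidence(edge_list):
    """Mean strength over all dependency arcs in the network."""
    return mean(strength(predicate) for _, predicate, _ in edge_list)

print(mean_evidence(edges))
```

Here the three arcs have strengths 3, 2, and 1, so the mean is taken over those three values.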
The network topology graph is also interactive. Clicking on a node retrieves any existing annotations or definitions of that term (via a SPARQL query of the ontology) and displays them just below the graph. This works by mapping click locations in the client-side graph layout to the server-side layout and identifying the nodes. The nodes are not "clickable objects" per se, as they would be in a d3 model, so the layout matching is somewhat inexact, but it is close enough to be functional. The feature is useful for understanding terms in larger, more complex graphs or terms that are acronyms or domain jargon.
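One simple way to picture the click-to-node matching is a nearest-neighbour lookup against the layout coordinates. The Python sketch below is only an illustration under that assumption, not the tool's actual R code: the layout dictionary, the nearest_node function, and the tolerance value are all hypothetical.

```python
import math

# Hypothetical layout: node name -> (x, y) in the server-side plot coordinates.
layout = {
    "influenza": (0.0, 1.0),
    "fever": (-1.0, 0.0),
    "cough": (1.0, 0.0),
}

def nearest_node(click_x, click_y, layout, tolerance=0.25):
    """Return the node closest to the click, or None if none lies within tolerance."""
    best_name, best_dist = None, math.inf
    for name, (x, y) in layout.items():
        dist = math.hypot(click_x - x, click_y - y)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= tolerance else None

print(nearest_node(-0.9, 0.1, layout))  # a click close to the "fever" node
```

The tolerance is what makes the matching "inexact but functional": a click slightly off a node still resolves to it, while a click in empty space resolves to nothing.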
[1] A bit about the logic: if there are no dependencies among the selected terms, nothing will be displayed. Similarly, if a selected node has no edge connection to any other terms in the set, it will not be displayed or included in the graph structure.
The "Network Edges" panel displays a tabulated version of the edge connections, and their dependency types, contained in the network generated from the user-selected choices. It is searchable and sortable.
[2] Hugin Expert A/S, Aalborg, Denmark
The core component of this software is a path searching algorithm that leverages the similarity between the RDF semantic web "subject-predicate-object" formalism and the "node-arc-node" dependency network formalism. The creation of Bayesian networks relies on the presence of a transitive dependency layer in the RDF domain ontology. This layer, as previously mentioned, involves the use of an object property term "dependsOn" applied to various concepts in the ontology. In a classic diagnostic sense, the ontology represents a translation of natural language statements such as "the probability of fever depends on the presence of influenza" into a triple formalism: fever-dependsOn-influenza. When the ontology is read into the system, it is translated into an array of triples, including any reference annotations (the domain knowledge source).
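To make the triple-to-network correspondence concrete, here is a minimal Python sketch of an array of triples and the selection of its dependency layer. It is an illustration only, not the tool's R data structure: the Triple type, its reference field, the sample triples, and the dependency_edges function are assumptions for the example.

```python
from typing import NamedTuple, Optional

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    reference: Optional[str] = None  # hypothetical slot for a source annotation

# Hypothetical ontology contents, already flattened into triples.
triples = [
    Triple("fever", "dependsOn", "influenza", reference="placeholder reference"),
    Triple("cough", "dependsOn2", "influenza"),
    Triple("influenza", "subClassOf", "infection"),
]

def dependency_edges(triples):
    """Keep only dependency triples and read each one as node-arc-node."""
    return [
        (t.subject, t.predicate, t.obj)
        for t in triples
        if t.predicate.startswith("dependsOn")
    ]

for node, arc, other in dependency_edges(triples):
    print(f"{node} -[{arc}]-> {other}")
```

The class hierarchy triple ("subClassOf") is left out, since only the dependency layer contributes arcs to the network.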
From this array of triples, we can extract subsets of the concepts and dependency relations for network construction and updating. The extraction is performed via an exhaustive non-binary level-order search algorithm. The set of concepts of interest selected from the folder tree, X = {X1, X2, X3, ...}, and the ontology array are passed to a path-finding function, which is run for all pairs (Xi, Xj), i ≠ j, against all the concepts in the ontology array. However, this pathway search returns a new set of triples that often includes concepts not in the original requested set X. Therefore, this software also employs a node pruning algorithm.
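The pairwise search can be sketched with a level-order (breadth-first) traversal. The Python below is one possible reading, not the tool's R implementation: it assumes edges are plain (source, target) pairs, and it collects every edge that lies on some directed path between two selected concepts. The function names reachable and path_edges are hypothetical.

```python
from collections import defaultdict, deque
from itertools import permutations

def reachable(start, adjacency):
    """Level-order (breadth-first) traversal: every node reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def path_edges(selected, edges):
    """Collect every edge lying on some directed path between two selected concepts."""
    forward, backward = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        forward[src].add(dst)
        backward[dst].add(src)
    found = set()
    for xi, xj in permutations(selected, 2):  # all ordered pairs with i != j
        downstream = reachable(xi, forward)   # nodes that can be reached from xi
        if xj not in downstream:
            continue
        upstream = reachable(xj, backward)    # nodes from which xj can be reached
        found.update((s, d) for s, d in edges if s in downstream and d in upstream)
    return found

edges = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "E")]
print(sorted(path_edges(["A", "D"], edges)))
```

With A and D selected, the result contains the intermediate concepts B and C, which were not requested, while the side branch to E is ignored. This is the situation that makes a pruning step necessary.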
For pruning, leaf nodes can simply be deleted from the network, but within-path nodes that separate nodes of interest demand additional treatment. Traditional Bayes net nomenclature refers to dependent nodes in a child-parent relation. For clarity, we use a familial analogy for BN pathways and consider the parent of a parent a grandparent and the child of a child a grandchild; for example, the path A→B→C involves grandparent A, parent B, and grandchild C. Because "dependsOn" relations are by our definition transitive, all pathway diagrams in the set are commutative. For the d-separated nodes of interest (A, C), given no information about the parent node B, dependency is retained between A and C by way of deductive reasoning. Removing the parent B therefore means generating new dependency relationships between all grandparent and all grandchild nodes, and then deleting the parent node and its former dependencies in the local network, resulting in A→C. Removing multiple nodes, e.g. removing (B, C) from the pathway A→B→C→D, is simply a matter of applying the one-parent-node pruning method iteratively. Pruning can also be done based on dependency level. The tool takes the value of the dependency slider, evaluates all the "dependsOnN" triples, and adds any concept whose dependency tag N is less than the slider value to the list of "nodes to be pruned". These nodes are then pruned in the same manner as described above. Other tags of this nature can be added to the software as they are found useful.
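The single-node pruning rule and its iterative use can be sketched as follows. This Python version is an illustration of the rule as described, not the tool's R code; edges are assumed to be (source, target) pairs, and the names prune_node and prune are hypothetical.

```python
def prune_node(edges, node):
    """Remove one node, linking each of its grandparents-side neighbours
    (nodes pointing into it) to each of its grandchildren-side neighbours
    (nodes it points to)."""
    grandparents = {src for src, dst in edges if dst == node}
    grandchildren = {dst for src, dst in edges if src == node}
    kept = {(src, dst) for src, dst in edges if node not in (src, dst)}
    bypass = {(gp, gc) for gp in grandparents for gc in grandchildren if gp != gc}
    return kept | bypass

def prune(edges, unwanted):
    """Apply the single-node rule once per unwanted node."""
    remaining = set(edges)
    for node in unwanted:
        remaining = prune_node(remaining, node)
    return remaining

pathway = {("A", "B"), ("B", "C"), ("C", "D")}
print(prune(pathway, ["B", "C"]))  # removing B, then C, from A→B→C→D
```

Leaf nodes need no special case here: a node with nothing downstream produces no bypass edges, so it is simply deleted along with its incoming arcs.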
After node selection and pruning, the software creates a directed graph from the remaining nodes and displays it in the "Network Topology" panel. When you switch to another panel, any reactive elements that depend on selection changes will also update, such as the edges in the "Network Edges" panel or the list of selectable nodes in the "Pathway Explorer" panel. The basic path of reactive dependency is ontology→folder tree→graph/edges→statistics/downloads. The network graph also depends on the slider inputs. A more detailed overview of Shiny reactivity is available in the Shiny documentation.
Viewing requirements: This software requires a modern browser. It has been tested in Chrome, Iceweasel, and Firefox.
There are known issues with Shiny applications not working in Internet Explorer 9 and below. Most other problems can be resolved by restarting your browser. More specific issues are addressed below:
Error messages:
I am an Assistant Professor (Medical Physics) at the University of Washington in the Department of Radiation Oncology, University of Washington Medical Center. I have BS and MS degrees in Physics and a PhD in Biomedical Informatics. My CV can be found here or via my personal page. For questions or comments about this software, contact Alan Kalet at amkalet@uw.edu.