This demo uses the real-time co-occurrence analytics capabilities introduced in MarkLogic Server 4.0 to produce interactive visualizations of how various diseases, treatments, symptoms and diagnostic methods are mentioned together in the text of abstracts of medical research articles.
The database behind this demo contains about 2.5 million XML-encoded abstracts randomly chosen from the well-known MEDLINE dataset. Diseases, treatments, symptoms and diagnostic methods within the abstracts were identified by TEMIS' Luxid Annotation Factory. The entities extracted by TEMIS were then re-inlined as XML markup within the original abstracts by MarkLogic Server.
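As an illustration of what "re-inlining" extracted entities can look like, the following is a minimal Python sketch. It is not the TEMIS or MarkLogic implementation: the element names (`disease`, `treatment`), the `(start, end, kind)` annotation format, and the helper names `span` and `inline_entities` are all assumptions made for this example.

```python
from xml.sax.saxutils import escape

def span(text, phrase, kind):
    """Hypothetical helper: build a (start, end, kind) annotation for the first match of phrase."""
    start = text.index(phrase)
    return (start, start + len(phrase), kind)

def inline_entities(text, annotations):
    """Wrap each annotated character span of text in an element named after its kind.

    Assumes the spans do not overlap; overlapping spans raise ValueError.
    """
    parts = []
    cursor = 0
    for start, end, kind in sorted(annotations):
        if start < cursor:
            raise ValueError("overlapping annotations are not handled in this sketch")
        parts.append(escape(text[cursor:start]))            # plain text before the entity
        parts.append(f"<{kind}>{escape(text[start:end])}</{kind}>")
        cursor = end
    parts.append(escape(text[cursor:]))                     # trailing plain text
    return "".join(parts)

# Invented example sentence, not taken from MEDLINE.
text = "Metformin improves glycemic control in type 2 diabetes."
annotations = [
    span(text, "type 2 diabetes", "disease"),
    span(text, "Metformin", "treatment"),
]
print(inline_entities(text, annotations))
```

Once entities are inline elements like these, they can be indexed and counted in the same way as any other element in the abstract.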
In the demo, you can specify a simple keyword search, which filters the abstracts in the database into a result set on which real-time analytics are performed. You are then presented with:
- a listing of the abstracts most relevant to the search term(s) provided
- four "facets" listing the most frequently occurring diseases, treatments, symptoms and diagnostic methods in abstracts of articles in the result set
- a co-occurrence tool, which visualizes the pairings of disease and treatment that most frequently co-occur in abstracts of articles in the result set
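The search, facet and co-occurrence steps listed above can be sketched in a few lines of Python. This is only a conceptual illustration over a tiny in-memory list, not how MarkLogic Server computes co-occurrence; the sample records and the function names `search`, `facet` and `co_occurrence` are hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical records: each abstract reduced to its text and the entities tagged in it.
abstracts = [
    {"text": "Inhaled corticosteroids in childhood asthma",
     "disease": {"asthma"}, "treatment": {"corticosteroids"}},
    {"text": "Asthma and rhinitis: corticosteroids versus antihistamines",
     "disease": {"asthma", "rhinitis"}, "treatment": {"corticosteroids", "antihistamines"}},
    {"text": "Insulin therapy in type 2 diabetes",
     "disease": {"diabetes"}, "treatment": {"insulin"}},
    {"text": "Bronchodilators for acute asthma",
     "disease": {"asthma"}, "treatment": {"bronchodilators"}},
]

def search(abstracts, keyword):
    """Naive keyword filter standing in for the real search: keeps matching abstracts."""
    kw = keyword.lower()
    return [a for a in abstracts if kw in a["text"].lower()]

def facet(result_set, kind, top=10):
    """Most frequent values of one entity kind, counted once per abstract."""
    counts = Counter()
    for a in result_set:
        counts.update(sorted(a[kind]))
    return counts.most_common(top)

def co_occurrence(result_set, kind_a="disease", kind_b="treatment", top=10):
    """Most frequent (kind_a, kind_b) pairs that appear together in the same abstract."""
    pairs = Counter()
    for a in result_set:
        pairs.update(product(sorted(a[kind_a]), sorted(a[kind_b])))
    return pairs.most_common(top)

results = search(abstracts, "asthma")
print(facet(results, "disease"))        # [('asthma', 3), ('rhinitis', 1)]
print(co_occurrence(results, top=1))    # [(('asthma', 'corticosteroids'), 2)]
```

Each pair is counted at most once per abstract, so the counts read as "number of abstracts in the result set that mention both values".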
The visualization tool can be used to further explore and filter the result set, as follows:
- Clicking on a value will make that value the axis for co-occurrence. For instance, clicking on a treatment will show you what diseases most frequently co-occur with that treatment. Likewise, clicking on a disease will show you what treatments most frequently co-occur with it. This exploration activity can continue indefinitely.
- Alt-clicking (or command-clicking) on a value refines the result set by restricting it to articles whose abstracts contain the specified value. Co-occurrence will be recalculated for the refined result set. Filters set using the visualization tool accumulate at the left of the visualization display.
- If filters have previously been set, holding down the Alt (or Command) key displays, at left, the sequence of filters in the order in which they were set.
- Alt-clicking (or command-clicking) on a filter at left will "revert" to that state, undoing any filters set after that point.
- Clicking on the arrow at left will "go back" one step in the visualization, undoing any filters applied in that step.
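The interactions above can be modeled roughly as a pivot function plus an ordered stack of filters. The Python sketch below is an assumption about the behavior described, not the demo's Flex or server-side code; the sample records and the names `pivot` and `FilterHistory` are hypothetical, and "go back" is simplified to undoing the most recent filter.

```python
from collections import Counter

# Hypothetical records: the entity values tagged in each abstract.
abstracts = [
    {"disease": {"asthma"}, "treatment": {"corticosteroids"}},
    {"disease": {"asthma", "rhinitis"}, "treatment": {"corticosteroids", "antihistamines"}},
    {"disease": {"diabetes"}, "treatment": {"insulin"}},
    {"disease": {"asthma"}, "treatment": {"bronchodilators"}},
]

def pivot(result_set, kind, value, other_kind, top=10):
    """Plain click: make value the axis and count the other_kind values that co-occur with it."""
    counts = Counter()
    for a in result_set:
        if value in a[kind]:
            counts.update(sorted(a[other_kind]))
    return counts.most_common(top)

class FilterHistory:
    """Ordered list of (kind, value) filters, oldest first."""

    def __init__(self):
        self.filters = []

    def refine(self, kind, value):
        """Alt-click on a value: add a filter on top of the existing ones."""
        self.filters.append((kind, value))

    def back(self):
        """Arrow at left: undo the most recent filter, if there is one."""
        if self.filters:
            self.filters.pop()

    def revert_to(self, index):
        """Alt-click on a filter at left: keep it and drop every filter set after it."""
        if not 0 <= index < len(self.filters):
            raise IndexError("no filter at that position")
        del self.filters[index + 1:]

    def apply(self, result_set):
        """Keep only the abstracts that contain every filtered value."""
        return [a for a in result_set
                if all(value in a[kind] for kind, value in self.filters)]

print(pivot(abstracts, "treatment", "corticosteroids", "disease"))  # [('asthma', 2), ('rhinitis', 1)]

history = FilterHistory()
history.refine("disease", "asthma")
history.refine("treatment", "corticosteroids")
print(len(history.apply(abstracts)))  # 2
history.back()
print(len(history.apply(abstracts)))  # 3
```

Keeping the filters in the order they were set is what makes both "go back" and "revert" cheap: each is just a truncation of the list, after which co-occurrence is recomputed on the refined result set.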
The facets can also be used to further filter the result set, as desired:
- Clicking on a facet further refines the result set by restricting it to articles whose abstracts contain the specified facet value.
- Clicking on the "Remove All Facet Selections" link will clear all previously set filters, whether those filters were set using the facet interface or the visualization tool.
Notes
This demo is running on a commodity Linux server with 2 dual-core x64 chips and 8 GB of RAM. The server is a shared resource that supports multiple demos simultaneously. The visualization tool runs in Flex and requires that Adobe Flash Player 9 (version 124 or later) be installed. To update your Flash Player, go to http://www.adobe.com/go/getflashplayer.
The complete source code for this demo is provided in the Samples/pairs directory of the MarkLogic Server 4.0 release package. Information about its use can be found in Samples/samples-license.txt.
Note: this site is provided for demonstration purposes only. The dataset behind it is incomplete and should not be used for diagnostic, treatment or research purposes. It may only be used as a demonstration of the co-occurrence analytics capability of MarkLogic Server.
Mark Logic, MarkLogic Server and the Mark Logic logo are trademarks of Mark Logic Corporation; TEMIS and Luxid are trademarks of TEMIS S.A.; and MEDLINE is a registered trademark of the National Institutes of Health National Library of Medicine.

