MarkLogic Server 4.0 Co-Occurrence Demo

This demo uses the real-time co-occurrence analytics capabilities introduced in MarkLogic Server 4.0 to produce interactive visualizations of how various diseases, treatments, symptoms and diagnostic methods are mentioned together in the text of abstracts of medical research articles.

The database behind this demo contains about 2.5 million XML-encoded abstracts randomly chosen from the well-known MEDLINE dataset. Diseases, treatments, symptoms and diagnostic methods within the abstracts were identified by TEMIS' Luxid Annotation Factory. The entities extracted by TEMIS were then re-inlined as XML markup within the original abstracts by MarkLogic Server.
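As a rough illustration of what re-inlining means here, the sketch below wraps annotated character spans in XML elements. The annotation format (start offset, end offset, entity type) and the element names are assumptions made for this example only. They are not the actual TEMIS output format or the markup used in the demo database, and the demo itself performs this step inside MarkLogic Server rather than in Python.

```python
from xml.sax.saxutils import escape


def inline_entities(text, annotations):
    """Wrap each annotated span of `text` in an element named after its type.

    `annotations` is an iterable of (start, end, kind) tuples holding
    non-overlapping character offsets. This tuple layout is hypothetical.
    """
    parts = []
    cursor = 0
    for start, end, kind in sorted(annotations):
        if start < cursor:
            raise ValueError("overlapping annotations are not handled in this sketch")
        # Copy the untouched text before the entity, escaping XML specials.
        parts.append(escape(text[cursor:start]))
        # Wrap the entity text in an element named after its type.
        parts.append(f"<{kind}>{escape(text[start:end])}</{kind}>")
        cursor = end
    # Copy whatever follows the last entity.
    parts.append(escape(text[cursor:]))
    # The <abstract> wrapper element is also an illustrative assumption.
    return "<abstract>" + "".join(parts) + "</abstract>"


example = inline_entities(
    "Aspirin relieves fever.",
    [(0, 7, "treatment"), (17, 22, "symptom")],
)
print(example)
```

Once entities are inline elements like these, they can be counted and paired per abstract, which is what the analytics described below rely on.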

In the demo, you can specify a simple keyword search, which filters the abstracts in the database into a result set on which real-time analytics are performed. You are then presented with:

  1. a listing of the abstracts most relevant to the search term(s) provided
  2. four "facets" listing the most frequently occurring diseases, treatments, symptoms and diagnostic methods in abstracts of articles in the result set
  3. a co-occurrence tool, which visualizes the pairings of disease and treatment that most frequently co-occur in abstracts of articles in the result set
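The three outputs listed above can be sketched in a few lines of Python. This is a minimal in-memory illustration only: the sample records, field names and function names are hypothetical, and the demo itself computes these results inside MarkLogic Server, not with code like this.

```python
from collections import Counter
from itertools import product

# Hypothetical, hand-made records; not taken from the demo's MEDLINE data.
ABSTRACTS = [
    {"text": "Aspirin reduced fever in patients with influenza.",
     "disease": {"influenza"}, "treatment": {"aspirin"}, "symptom": {"fever"}},
    {"text": "Oseltamivir shortened fever duration in influenza.",
     "disease": {"influenza"}, "treatment": {"oseltamivir"}, "symptom": {"fever"}},
    {"text": "Aspirin eased fever and headache during influenza.",
     "disease": {"influenza"}, "treatment": {"aspirin"},
     "symptom": {"fever", "headache"}},
    {"text": "Insulin therapy for diabetes.",
     "disease": {"diabetes"}, "treatment": {"insulin"}, "symptom": set()},
]


def search(abstracts, keyword):
    """Keep the abstracts whose text contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [a for a in abstracts if needle in a["text"].lower()]


def facet(results, kind, top=5):
    """Count how many abstracts in the result set mention each entity of `kind`."""
    counts = Counter()
    for abstract in results:
        counts.update(abstract.get(kind, ()))
    return counts.most_common(top)


def co_occurrences(results, left="disease", right="treatment", top=5):
    """Count (left, right) entity pairs that appear in the same abstract."""
    pairs = Counter()
    for abstract in results:
        pairs.update(product(sorted(abstract.get(left, ())),
                             sorted(abstract.get(right, ()))))
    return pairs.most_common(top)


results = search(ABSTRACTS, "fever")
print(facet(results, "treatment"))
print(co_occurrences(results))
```

Note that every count is taken over the filtered result set, not the whole collection, so changing the search term changes both the facets and the pairings.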

The visualization tool can be used to further explore and filter the result set, as follows:

The facets can also be used to further filter the result set, as desired:

Start the demo by either:

  1. Selecting one of these predetermined queries:
  2. Entering your own search term:

Notes

This demo is running on a commodity Linux server with 2 dual-core x64 chips and 8 GB of RAM. The server is a shared resource that supports multiple demos simultaneously. The visualization tool runs in Flex and requires that Adobe Flash Player 9 (version 124 or later) be installed. To update your Flash Player, go to http://www.adobe.com/go/getflashplayer.

The complete source code for this demo is provided in the Samples/pairs directory of the MarkLogic Server 4.0 release package. Information about its use can be found in Samples/samples-license.txt.

Note: this site is provided for demonstration purposes only. The dataset behind it is incomplete and should not be used for diagnostic, treatment or research purposes. It may only be used as a demonstration of the co-occurrence analytics capability of MarkLogic Server.