An Integrated Framework For Histological Image Data Analytics
Automated image analysis enables the mining of rich information from digitized histological slides. A major challenge is the complexity and large size of the images. Whole-slide images contain a multitude of different structures, such as tissue sections, regions of different tissue types, and the cells they contain. To make sense of these structures, multiple analysis solutions must often be combined. A common example is the initial identification of regions of interest and the subsequent evaluation of cellular structures with respect to these regions.
There is no general standard for representing image analysis data. Different analysis solutions may represent analysis data as XML or JSON documents, spreadsheets, or images. When combining multiple analysis solutions, the inconsistent data representation makes it necessary to convert information between different formats and to match related entities. This complicates data analytics considerably. To overcome this problem, we describe an integrated framework for histological image data analytics.
The framework represents image analysis data in an open relational data model. Image regions and cellular structures are represented as individual entities with properties and mutual relations. The framework incorporates multiple image analysis solutions for identifying image regions or cellular structures with machine-learning methods. The solutions are executed sequentially and populate the data model with more and more information from the image. Every step can take advantage of data generated in previous steps in order to target image processing operations to specific regions, or in order to reuse previously computed image features.
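As an illustration of this idea, the following is a minimal sketch of a relational model that is populated by two sequential steps. It is not the framework's actual schema: the table names (`region`, `cell`), the bounding-box columns, the hard-coded detections, and the use of SQLite are all assumptions made for the example. The second step reads the regions written by the first step and only stores cells that fall inside a region of the targeted tissue type.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: regions and cells are separate entities,
    # and each cell row points to the region that contains it.
    conn.executescript("""
        CREATE TABLE region (
            id INTEGER PRIMARY KEY,
            tissue_type TEXT NOT NULL,
            x0 REAL, y0 REAL, x1 REAL, y1 REAL
        );
        CREATE TABLE cell (
            id INTEGER PRIMARY KEY,
            x REAL NOT NULL,
            y REAL NOT NULL,
            region_id INTEGER NOT NULL REFERENCES region(id)
        );
    """)


def add_regions(conn, boxes):
    # Step 1: store regions, given as (tissue_type, x0, y0, x1, y1).
    conn.executemany(
        "INSERT INTO region (tissue_type, x0, y0, x1, y1) "
        "VALUES (?, ?, ?, ?, ?)",
        boxes,
    )


def add_cells(conn, detections, tissue_type):
    # Step 2: reuse the regions from step 1 and keep only detections
    # that lie inside a region of the requested tissue type.
    for x, y in detections:
        row = conn.execute(
            "SELECT id FROM region "
            "WHERE tissue_type = ? "
            "AND ? BETWEEN x0 AND x1 AND ? BETWEEN y0 AND y1 "
            "LIMIT 1",
            (tissue_type, x, y),
        ).fetchone()
        if row is not None:
            conn.execute(
                "INSERT INTO cell (x, y, region_id) VALUES (?, ?, ?)",
                (x, y, row[0]),
            )


def build_demo():
    # Toy data standing in for the output of real analysis solutions.
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    add_regions(conn, [
        ("tumor", 0, 0, 100, 100),
        ("stroma", 100, 0, 200, 100),
    ])
    add_cells(conn, [(10, 10), (50, 60), (150, 20), (300, 300)], "tumor")
    return conn
```

In a real pipeline the regions and detections would come from the image analysis solutions themselves; the point of the sketch is only that a later step can query what an earlier step stored.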
The relational data model greatly simplifies data analytics in histological images. Region-specific statistics about cellular structures, or heat maps of their spatial distribution, can simply be computed by database queries. Furthermore, the relational data model enables the efficient management of the huge amounts of data generated by histological image analysis. We demonstrate the generic applicability of the framework by three example applications for the region-specific analysis of nuclear positivity, steatosis and inflammation in whole-slide images.
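To make the query-based analytics concrete, here is a minimal sketch of both kinds of query on toy data. The schema (`region`, `nucleus`), the `positive` flag, the tile size, and the use of SQLite are assumptions for illustration and are not taken from the framework. The first query computes the nucleus count and the positive fraction per region; the second bins nuclei into square tiles, which gives the counts behind a density heat map.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE region (id INTEGER PRIMARY KEY, tissue_type TEXT);
    CREATE TABLE nucleus (
        id INTEGER PRIMARY KEY,
        region_id INTEGER REFERENCES region(id),
        x REAL, y REAL,
        positive INTEGER  -- 1 if the nucleus is stained positive, else 0
    );
""")
conn.executemany(
    "INSERT INTO region VALUES (?, ?)",
    [(1, "tumor"), (2, "stroma")],
)
conn.executemany(
    "INSERT INTO nucleus (region_id, x, y, positive) VALUES (?, ?, ?, ?)",
    [
        (1, 10, 10, 1), (1, 20, 15, 1), (1, 120, 40, 0), (1, 130, 45, 1),
        (2, 300, 300, 0), (2, 310, 320, 1),
    ],
)


def positivity_by_region(conn):
    # Region-specific statistic: nucleus count and positive fraction.
    return conn.execute("""
        SELECT r.tissue_type, COUNT(*), AVG(n.positive)
        FROM nucleus n
        JOIN region r ON r.id = n.region_id
        GROUP BY r.id, r.tissue_type
        ORDER BY r.id
    """).fetchall()


def density_tiles(conn, tile=100.0):
    # Heat-map data: number of nuclei per square tile of side `tile`.
    # Coordinates are assumed non-negative, so truncation equals floor.
    return conn.execute("""
        SELECT CAST(x / ? AS INTEGER) AS tile_x,
               CAST(y / ? AS INTEGER) AS tile_y,
               COUNT(*)
        FROM nucleus
        GROUP BY tile_x, tile_y
        ORDER BY tile_y, tile_x
    """, (float(tile), float(tile))).fetchall()
```

Calling `positivity_by_region(conn)` returns one row per region, and `density_tiles(conn)` returns one row per non-empty tile, which could then be rendered as an image overlay.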
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.