The Image Data Resource: a scalable resource for FAIR biological imaging data

Abstract number
80
Presentation Form
Poster
DOI
10.22443/rms.elmi2024.80
Corresponding Email
[email protected]
Session
Poster Session
Authors
Frances Wong (4), Dominik Lindner (4), Jean-Marie Burel (4), Sebastien Besson (3), Josh Moore (2), Will Moore (4), Petr Walczysko (4), Khaled Mohamed (4), David Gault (4), Ugis Sarkans (1), Matthew Hartley (1), Jason Swedlow (4)
Affiliations
1. EMBL-EBI
2. German BioImaging e.V.
3. Glencoe Software, Inc.
4. University of Dundee
Abstract text

Access to primary research data is fundamental for the advancement of science. Much of the published research in the life sciences is based on image datasets that sample 3D space, time, and the spectral characteristics of detected signal to provide quantitative measures of cell, tissue and organismal processes and structures. However, the sheer size and heterogeneity of original image datasets (multi-dimensional image stacks combined with experimental metadata and analytic results) make image data handling and publication extremely complex and, in practice, rarely achieved.

To address this challenge, we have built a next-generation imaging database, the Image Data Resource (IDR; http://idr.openmicroscopy.org). IDR is an added-value resource that combines and integrates data from multiple independent imaging experiments and many different imaging modalities into a single public resource. IDR supports browsing, search, visualisation and computational processing within and across datasets acquired from a wide variety of imaging domains. IDR stores, publishes and integrates >380 TB of super-resolution, high content screening, timelapse and histological whole slide imaging data with metadata related to experimental design, image acquisition, downstream analysis and interpretation. Data from >125 studies are available for search and query through a user-friendly web interface, with links from imaging data to reagents, methods and phenotypes via published ontologies. Cloud-based re-analysis of IDR data is enabled using JupyterHub. Reference image data submitted to IDR are also published in EMBL-EBI’s BioImage Archive, assuring sustainability and long-term data availability.

We will show recent updates to IDR, including cloud-optimised OME-NGFF file formats for large datasets and the emergence of independent, federated IDRs at the national level.