Remote (image) data access

Prerequisites

Before starting this lesson, you should be familiar with:

Learning Objectives

After completing this lesson, learners should be able to:
  • TODO

Motivation

TODO

Concept map

graph TD
    WWW("WWW") --- WWWP("WWW Protocols")



Figure


Accessing remote (cloud hosted) data typically relies on specific protocols, such as HTTP and FTP.



Cloud compatible serving of big image data

Aim: sharing big image data with collaborators at different institutions or the general public.

Considerations that led to the implementation of (OME-)Zarr

  • Security: A simple URL download link is an easy and safe way to share data via the web
  • Efficiency: Downloading the whole image can be slow and inefficient if it is large (>10 GB)
  • Chunking and multi-resolution are established methods for accessing parts of large image data
  • “One chunk = one file = one download URL” seemed the simplest web compatible implementation of chunking
    • This led to the development of Zarr (not specifically for image data, but for generic arrays of numerical data)
    • OME-Zarr is Zarr with bioimaging specific metadata
  • S3 object stores are a well-established web server technology for efficiently serving many files in parallel; thus OME-Zarr is often hosted on S3 object stores
    • Technically, efficient parallelisation is important because each HTTP request typically has ~100 ms of overhead. Accessing chunks sequentially would therefore be slow (slower than on a hard disk, where the overhead per read is lower)
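The "one chunk = one file = one download URL" idea and the benefit of parallel requests can be illustrated with a minimal Python sketch. It is not the Zarr library's implementation: the function names, the base URL `https://example.org/image.zarr/0`, and the simulated `fake_fetch` (which sleeps instead of issuing a real HTTP request) are hypothetical and only serve the illustration. The sketch assumes chunk files are named by their grid indices joined with a separator, here `/`.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def chunk_keys(shape, chunks, roi_start, roi_stop, separator="/"):
    """Keys of all chunks that overlap the half-open region [roi_start, roi_stop)."""
    index_ranges = []
    for size, chunk, lo, hi in zip(shape, chunks, roi_start, roi_stop):
        lo, hi = max(lo, 0), min(hi, size)   # clamp the region to the array bounds
        first, last = lo // chunk, (hi - 1) // chunk
        index_ranges.append(range(first, last + 1))
    return [separator.join(map(str, idx)) for idx in product(*index_ranges)]


def chunk_urls(base_url, keys):
    """One chunk = one file = one URL (base_url is a hypothetical array location)."""
    return [base_url.rstrip("/") + "/" + key for key in keys]


def fake_fetch(url, latency_s=0.1):
    """Stand-in for an HTTP GET: only simulates the per-request overhead."""
    time.sleep(latency_s)
    return url


def fetch_all(urls, fetch=fake_fetch, max_workers=16):
    """Issue the requests concurrently so the per-request overheads overlap."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


if __name__ == "__main__":
    # A 1000 x 1000 array stored in 256 x 256 chunks; view rows and columns 200..599.
    keys = chunk_keys((1000, 1000), (256, 256), (200, 200), (600, 600))
    urls = chunk_urls("https://example.org/image.zarr/0", keys)

    start = time.perf_counter()
    for url in urls:
        fake_fetch(url)
    print(f"sequential: {len(urls)} chunks in {time.perf_counter() - start:.2f} s")

    start = time.perf_counter()
    fetch_all(urls)
    print(f"parallel:   {len(urls)} chunks in {time.perf_counter() - start:.2f} s")
```

In this example the requested region touches 9 chunks. With roughly 100 ms of overhead per request, fetching them one after another costs about 0.9 s of overhead alone, whereas issuing them concurrently costs roughly the overhead of a single request, plus the actual transfer time in both cases.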



Activities

<a href=#act_ref>Activity title</a>









Assessment

Fill in the blanks

  1. TODO ___ .
  2. TODO ___ .

Solution

  1. TODO
  2. TODO




Follow-up material

Recommended follow-up modules:

Learn more: