After completing this lesson, learners should be able to:
Concept map
graph TD
WWW("WWW") --- WWWP("WWW Protocols")
Accessing remote (cloud hosted) data typically relies on specific protocols, such as HTTP and FTP.
Cloud compatible serving of big image data
Aim: sharing big image data with collaborators at different institutions or the general public.
Considerations that led to the implementation of (OME-)Zarr
Security: A simple URL download link is an easy and safe way to share data via the web
Efficiency: Downloading the whole image can be slow and inefficient if it is large (>10 GB)
Chunking and multi-resolution are established methods for accessing parts of large image data
“One chunk = one file = one download URL” seemed the simplest web-compatible implementation of chunking
This led to the development of Zarr (which is not specific to image data, but handles generic arrays of numerical data)
OME-Zarr is Zarr with bioimaging specific metadata
S3 object stores are a well-established web server technology for efficiently serving many files in parallel; OME-Zarr is therefore often hosted on S3 object stores
Technically, efficient parallelisation is important because each HTTP request typically has ~100 ms of overhead. Accessing chunks sequentially would therefore be slow (slower than on a hard disk, where the overhead per read is lower)
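The two technical points above (one URL per chunk, and the cost of per-request overhead) can be illustrated with a minimal Python sketch. It makes several assumptions that are not part of the lesson: the base URL `https://example.org/image.zarr/0` is hypothetical, the `/`-separated chunk key layout is only one possible convention (the actual layout depends on the Zarr version and its dimension separator), and `fake_fetch` does not touch the network but merely sleeps for the assumed per-request overhead.

```python
import itertools
import math
import time
from concurrent.futures import ThreadPoolExecutor


def chunk_urls(base_url, shape, chunks):
    """Return one URL per chunk ("one chunk = one file = one download URL")."""
    # Number of chunks along each axis (the last chunk may be partial).
    grid = [math.ceil(size / chunk) for size, chunk in zip(shape, chunks)]
    return [
        base_url + "/" + "/".join(str(i) for i in index)
        for index in itertools.product(*(range(n) for n in grid))
    ]


def fake_fetch(url, overhead_s):
    """Stand-in for an HTTP GET: only the per-request overhead is simulated."""
    time.sleep(overhead_s)
    return url


def timed_fetch(urls, overhead_s, workers):
    """Fetch all URLs using `workers` threads and return the elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda u: fake_fetch(u, overhead_s), urls))
    return time.perf_counter() - start


# Hypothetical 1024 x 1024 image stored in 256 x 256 chunks: a 4 x 4 chunk grid.
urls = chunk_urls("https://example.org/image.zarr/0", shape=(1024, 1024), chunks=(256, 256))
print(len(urls), "chunks, first URL:", urls[0])

# workers=1 fetches one chunk after another; workers=len(urls) fetches all at once.
print(f"sequential: {timed_fetch(urls, 0.1, workers=1):.2f} s")
print(f"parallel:   {timed_fetch(urls, 0.1, workers=len(urls)):.2f} s")
```

With the assumed 100 ms overhead and 16 chunks, the sequential run should take roughly 16 × 0.1 s = 1.6 s, whereas the parallel run should take little more than a single request. Real transfers add download time on top of this overhead, so the sketch only shows why a server that handles many simultaneous requests matters.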