Big image data formats
Prerequisites
Before starting this lesson, you should be familiar with:
Learning Objectives
After completing this lesson, learners should be able to:
Understand the concepts of lazy-loading, chunking and scale pyramids
Understand some file formats that implement chunking and scale pyramids
Motivation
Modern microscopy frequently generates image data in the GB-TB range. Such data cannot simply be opened in full. First, the data may not fit into the working memory (RAM) of your computer. Second, loading all of the data into memory would take a long time. Thus, it is important to know about dedicated concepts and implementations that enable swift interaction with such big image data.
Concept map
Figure
Similarities of big microscopy data with Google Maps
We can think of the data in Google Maps as one very big 2D image. Loading all the data in Google Maps into your phone or computer is not possible, because it would take too long and your device would run out of memory.
Another important aspect is that if you are currently looking at a whole country, it is not useful to load very detailed data about individual houses in one city, because the monitor of your device would not have enough pixels to display this information.
Thus, to offer you a smooth browsing experience, Google Maps lazy-loads only the part of the world (chunk) that you are currently looking at, at a resolution level that is appropriate for the number of pixels of your phone or computer monitor.
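The choice of resolution level described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `choose_pyramid_level`, its parameters and the fixed downsampling factor of 2 between levels are hypothetical and not taken from any particular viewer or file format.

```python
def choose_pyramid_level(region_width_px, screen_width_px, n_levels, downsample=2):
    """Pick the coarsest level that still offers at least one data pixel per screen pixel.

    region_width_px: width of the viewed region, in full-resolution pixels
    screen_width_px: width of the display, in screen pixels
    n_levels:        number of levels in the pyramid (level 0 = full resolution)
    downsample:      assumed size reduction between consecutive levels
    """
    level = 0
    width = region_width_px
    # Move to a coarser level while that level would still fill the screen.
    while level < n_levels - 1 and width / downsample >= screen_width_px:
        width /= downsample
        level += 1
    return level


# Viewing a region 100000 pixels wide on a 2000 pixel wide monitor.
print(choose_pyramid_level(100_000, 2_000, n_levels=8))
# Zoomed in: the region is smaller than the monitor, so full resolution is used.
print(choose_pyramid_level(1_500, 2_000, n_levels=8))
```

In the first call the region is about 50 times wider than the monitor, so a strongly downsampled level is sufficient; in the second call no downsampling is possible without losing visible detail.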
Chunking
The efficiency with which parts (chunks) of image data can be loaded from your hard disk into your computer’s memory depends on how the image data is laid out (chunked) on the hard disk. This is a longer, very technical discussion, and the optimal layout probably also depends on the exact storage medium that you are using. Essentially, you want your chunks to be small enough that your hardware can load one chunk very fast, but big enough to minimise the number of chunks that you need to load. The reason for the latter is that for each chunk your software has to tell your computer “please go and load this chunk”, which in itself takes time, even if the chunk is very small. Thus, big image data formats typically let you choose the chunking so that you can optimise it for your hardware and access patterns.
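To make the trade-off more concrete, the following sketch counts how many chunks, and how many voxels in total, must be loaded to display a given region under two chunk layouts. The volume size, the chunk shapes and the function names are made up for illustration, and the sketch assumes that a chunk is always loaded as a whole.

```python
import math


def chunks_touched(region_start, region_stop, chunk_shape):
    """Count the chunks that intersect a box-shaped region.

    region_start is inclusive and region_stop exclusive, one entry per axis (z, y, x).
    """
    count = 1
    for start, stop, size in zip(region_start, region_stop, chunk_shape):
        first_chunk = start // size
        last_chunk = (stop - 1) // size
        count *= last_chunk - first_chunk + 1
    return count


def voxels_loaded(region_start, region_stop, chunk_shape):
    """Voxels read from disk, assuming every touched chunk is loaded completely."""
    return chunks_touched(region_start, region_stop, chunk_shape) * math.prod(chunk_shape)


# Hypothetical volume of 512 x 2048 x 2048 voxels (z, y, x).
xy_plane = ((100, 0, 0), (101, 2048, 2048))   # one z-plane
xz_slice = ((0, 1000, 0), (512, 1001, 2048))  # a view "from the side"

for name, chunk_shape in [("plane-wise", (1, 2048, 2048)), ("cubic", (64, 64, 64))]:
    for label, (start, stop) in [("xy plane", xy_plane), ("xz slice", xz_slice)]:
        n_chunks = chunks_touched(start, stop, chunk_shape)
        n_voxels = voxels_loaded(start, stop, chunk_shape)
        print(f"{name:10s} {label}: {n_chunks} chunks, {n_voxels} voxels")
```

With these numbers, plane-wise chunks need 1 chunk for the xy plane but 512 chunks (the entire volume) for the xz slice, whereas cubic chunks need 1024 chunks for the xy plane and 256 chunks for the xz slice. Which layout is preferable therefore depends on the access pattern.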
Resolution pyramids
TODO
Activities
Lazy load from a TIFF stack
- Download TODO: Large TIFF stack
- Inspect the size of the file on disk and compare to your computer’s memory
- Open the whole file
- Observe that this takes some time
- Observe that your memory fills up
- Lazy-access the file
- Observe that TIFF chunking is plane-wise, which means that slicing “at an angle” requires loading everything.
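The difference between lazy plane access and a side view can also be sketched in code. The example below does not read TIFF; as a stand-in it writes a tiny raw file in which the planes are stored one after the other, which is an assumption made only to keep the sketch self-contained. All names and sizes are hypothetical.

```python
import os
import tempfile

NZ, NY, NX = 4, 3, 5  # tiny hypothetical stack, one byte per pixel


def write_raw_stack(path):
    """Write NZ planes back to back; every pixel in plane z has the value z."""
    with open(path, "wb") as f:
        for z in range(NZ):
            f.write(bytes([z]) * (NY * NX))


def read_plane(path, z):
    """Lazy access: jump to plane z and read only the bytes of that plane."""
    plane_bytes = NY * NX
    with open(path, "rb") as f:
        f.seek(z * plane_bytes)
        return f.read(plane_bytes)


def read_xz_slice(path, y):
    """Side view at row y: one row is needed from every plane,
    so a reader that loads whole planes has to load all NZ planes."""
    rows = []
    for z in range(NZ):
        plane = read_plane(path, z)
        rows.append(plane[y * NX:(y + 1) * NX])
    return rows


def demo():
    """Create a temporary stack, read one plane and one side view, then clean up."""
    fd, path = tempfile.mkstemp(suffix=".raw")
    os.close(fd)
    try:
        write_raw_stack(path)
        plane = read_plane(path, 2)    # touches one plane only
        side = read_xz_slice(path, 1)  # touches every plane
    finally:
        os.remove(path)
    return plane, side


plane, side = demo()
print(len(plane), "bytes read for one plane;", len(side), "planes visited for the side view")
```

Reading a single plane touches only a small part of the file, while the side view has to visit every plane, which mirrors what you observe when slicing a plane-wise chunked stack “at an angle”.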
ImageJ GUI
- Open Fiji
- [ Plugins > Utilities > Monitor Memory… ]
- Use [ File > Open ] to open the entire TIFF stack
- Observe that this takes time and that your computer’s memory fills up
- Close the image and observe that memory is freed
- If necessary, use [ Plugins > Utilities > Collect Garbage ] to enforce freeing the memory
- Use [ Plugins > Bio-Formats > Bio-Formats Importer ] to lazy open the TIFF stack
- Select the “Open virtual” option (this is the key step!)
- Observe that initial opening is faster and your memory is not filling up as much
- Move up and down along the z-axis
- Observe that this is a bit slow because it needs to fetch the data
- Observe that your memory fills up while you move
- Use [ Image > Stacks > Orthogonal Views ] to look at the data from the side
- Observe that now it needs to load all data
Key points
- “Bio-Formats Importer” with the “Open virtual” option allows you to lazy load image data into Fiji
- “Bio-Formats Importer” only supports plane-wise lazy loading from a single resolution level
Create Imaris files
Create chunked, multi-resolution, HDF5-based Imaris files.
Imaris File Converter
- Install Imaris File Converter
- Open this website: https://imaris.oxinst.com/microscopy-imaging-software-free-trial
- Scroll all the way to the bottom
- Download and install the Imaris File Converter for your OS
- TODO
Assessment
Fill in the blanks
- Opening data piece-wise on demand is also called ___ .
- Storing data piece-wise is also called ___ .
- In order to enable fast inspection of spatial data at different scales (like on Google Maps) one can use ___ .
Solution
- lazy-loading
- chunking
- resolution pyramids
Follow-up material
Recommended follow-up modules:
Learn more: