After completing this lesson, learners should be able to:
Understand the concepts of lazy-loading, chunking and scale pyramids
Understand some file formats that implement chunking and scale pyramids
Motivation
Modern microscopy frequently generates image data in the GB-TB range. Such data cannot be opened naively: first, it may not fit into the working memory (RAM) of your computer; second, loading it into memory would take a long time. It is therefore important to know the dedicated concepts and implementations that enable swift interaction with such big image data.
Concept map
graph TD
BIG("Big image data") --- RP("Resolution pyramids")
BIG --- C("Chunking")
C --- LL("Lazy loading")
Figure
Big image data formats typically support flexible chunking of data and resolution pyramids. Chunking enables efficient loading of image subregions. Resolution pyramids avoid loading unnecessary detail when the view is zoomed out.
Similarities of big microscopy data with Google Maps
We can think of the data in Google Maps as one very big 2D image. Loading all of it onto your phone or computer is not possible, because it would take too long and your device would run out of memory.
Another important aspect is that if you are currently looking at a whole country, it is not useful to load very detailed data about individual houses in one city, because the monitor of your device would not have enough pixels to display this information.
Thus, to offer you a smooth browsing experience, Google Maps lazy-loads only the part of the world (chunk) that you are currently looking at, at a resolution level that is appropriate for the number of pixels of your phone or computer monitor.
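The choice of resolution level described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `choose_pyramid_level` is hypothetical, and a fixed downsampling factor of 2 between levels is assumed. It is not how Google Maps or any particular viewer is implemented.

```python
def choose_pyramid_level(region_width_px, screen_width_px, num_levels,
                         downsample_factor=2):
    """Pick the coarsest level that still provides one image pixel per screen pixel.

    Level 0 is full resolution; each further level is assumed to be
    `downsample_factor` times smaller along each axis.
    """
    level = 0
    width = region_width_px  # width of the viewed region at the current level
    # Move to a coarser level as long as it still has enough pixels for the screen
    while level < num_levels - 1 and width / downsample_factor >= screen_width_px:
        width /= downsample_factor
        level += 1
    return level


# Viewing a region that is 100000 pixels wide on a screen that is 1000 pixels wide
print(choose_pyramid_level(100_000, 1_000, num_levels=8))
# Zoomed in so far that the region is smaller than the screen: full resolution
print(choose_pyramid_level(800, 1_000, num_levels=8))
```

The further you zoom out, the larger `region_width_px` becomes and the coarser the level that is sufficient, so the amount of data to load stays roughly constant.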
Chunking
The efficiency with which parts (chunks) of image data can be loaded from your hard disk into your computer's memory depends on how the image data is laid out (chunked) on the hard disk. This is a longer, very technical discussion, and the optimal layout probably also depends on the exact storage medium that you are using. Essentially, you want chunks small enough that your hardware can load one chunk very fast, but big enough to minimise the number of chunks that you need to load. The reason for the latter is that for each chunk your software has to tell your computer “please go and load this chunk”, which in itself takes time, even if the chunk is very small. Thus, big image data formats typically let you choose the chunking so that you can optimise it for your hardware and access patterns.
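The trade-off between small and large chunks can be made concrete with a toy cost model. The sketch below is not a benchmark: the per-request overhead and the read bandwidth are assumed values chosen for illustration, and the function names are hypothetical. It estimates the time to load one 512 x 512 plane from a volume stored with three different chunk shapes.

```python
import math


def chunks_touched(region_start, region_size, chunk_shape):
    """Count the chunks that intersect an axis-aligned region."""
    count = 1
    for start, size, chunk in zip(region_start, region_size, chunk_shape):
        first = start // chunk                 # index of first chunk along this axis
        last = (start + size - 1) // chunk     # index of last chunk along this axis
        count *= last - first + 1
    return count


def estimated_load_seconds(region_start, region_size, chunk_shape,
                           bytes_per_voxel=1,
                           request_overhead_s=0.005,      # assumed cost per chunk request
                           bandwidth_bytes_per_s=200e6):  # assumed read throughput
    """Toy model: fixed overhead per chunk plus transfer time for whole chunks."""
    n = chunks_touched(region_start, region_size, chunk_shape)
    bytes_read = n * math.prod(chunk_shape) * bytes_per_voxel
    return n * request_overhead_s + bytes_read / bandwidth_bytes_per_s


# One XY plane of 512 x 512 voxels, loaded with three chunk shapes
plane_start, plane_size = (0, 0, 0), (512, 512, 1)
for chunk_shape in [(8, 8, 8), (64, 64, 64), (512, 512, 512)]:
    n = chunks_touched(plane_start, plane_size, chunk_shape)
    t = estimated_load_seconds(plane_start, plane_size, chunk_shape)
    print(f"chunks {chunk_shape}: {n} requests, about {t:.2f} s")
```

Under these assumptions the tiny chunks are dominated by request overhead and the single huge chunk by transferring voxels that are never displayed, so the intermediate chunk size comes out fastest. With different hardware numbers the optimum shifts, which is why the formats leave the choice to you.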
Open 3-D chunked data directly with BigDataViewer
Use [ Plugins › Utilities › Monitor Memory… ] to keep an eye on the memory
Open xyz_uint8__em_platy__3d_chunk.xml via [ Plugins › BigDataViewer › Open XML/HDF5 ]
Use Shift X, Y, Z to view orthogonal planes
Use the mouse wheel to move along the current axis
Use the arrow keys to zoom
Key observations:
Chunks are lazy-loaded on demand
BDV is non-blocking: One can move around even while data is being loaded
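The first observation, that chunks are loaded only when they are needed, can be mimicked with a small cache. This is a minimal sketch and not BigDataViewer's implementation: `LazyChunkedImage` and `fake_load_chunk` are hypothetical names, and the "disk read" is replaced by a function that computes synthetic voxel values.

```python
CHUNK = (4, 4, 4)  # assumed chunk shape in voxels (x, y, z)


def fake_load_chunk(chunk_index):
    """Stand-in for a disk read: voxel value = (global x + y + z) % 256."""
    ox, oy, oz = (i * c for i, c in zip(chunk_index, CHUNK))
    return {(x, y, z): (ox + x + oy + y + oz + z) % 256
            for x in range(CHUNK[0])
            for y in range(CHUNK[1])
            for z in range(CHUNK[2])}


class LazyChunkedImage:
    """Loads a chunk the first time one of its voxels is requested, then caches it."""

    def __init__(self, shape, chunk_shape, load_chunk):
        self.shape = shape
        self.chunk_shape = chunk_shape
        self._load_chunk = load_chunk
        self._cache = {}
        self.loads = 0  # number of chunk loads performed so far

    def voxel(self, x, y, z):
        cx, cy, cz = self.chunk_shape
        chunk_index = (x // cx, y // cy, z // cz)
        if chunk_index not in self._cache:
            self._cache[chunk_index] = self._load_chunk(chunk_index)
            self.loads += 1
        return self._cache[chunk_index][(x % cx, y % cy, z % cz)]


image = LazyChunkedImage((16, 16, 16), CHUNK, fake_load_chunk)
print("loads after opening:", image.loads)       # nothing has been read yet
image.voxel(0, 0, 0)
image.voxel(1, 2, 3)                             # same chunk, served from the cache
print("loads after two voxels:", image.loads)
image.voxel(5, 0, 0)                             # neighbouring chunk triggers a new load
print("loads after a third voxel:", image.loads)
```

Opening the image costs nothing, and memory use grows only with the chunks that were actually visited. A real viewer would additionally evict old chunks from the cache and load chunks in background threads so that the user interface does not block.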
Open 3-D chunked data through Bio-Formats with BigDataViewer
Open Fiji
Open xyz_uint8__em_platy__3d_chunk.xml via drag & drop on Fiji menu bar
The Bio-Formats dialog will open; select “Use virtual stack”
Use [ Plugins › BigDataViewer › Open Current Image ]
Use Shift X, Y, Z to view orthogonal planes
Key observations:
Compared to the above, the loading performance is very slow, because Bio-Formats can only lazy-load planes, which does not match the 3-D chunking of the data
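Why a mismatch between the unit of access and the unit of storage hurts can be illustrated with simple arithmetic. The sketch below is a simplified model and does not describe Bio-Formats internals: it assumes a hypothetical 512 x 512 x 512 volume with one byte per voxel and compares how many bytes must be read to display one slice when the data is stored as XY planes versus 64 x 64 x 64 cubes.

```python
import math


def bytes_read(region_size, chunk_shape, bytes_per_voxel=1):
    """Bytes read to obtain a region that starts at the volume origin.

    Whole chunks must be read even if only part of a chunk is needed.
    """
    n_chunks = math.prod(math.ceil(r / c) for r, c in zip(region_size, chunk_shape))
    return n_chunks * math.prod(chunk_shape) * bytes_per_voxel


# Assumed storage layouts (chunk shapes in x, y, z)
layouts = {"XY planes": (512, 512, 1), "64^3 cubes": (64, 64, 64)}
# Orthogonal slices through the volume, as shown in BigDataViewer
views = {"XY slice": (512, 512, 1), "XZ slice": (512, 1, 512), "YZ slice": (1, 512, 512)}

for layout_name, chunk_shape in layouts.items():
    for view_name, region in views.items():
        needed = math.prod(region)              # bytes actually displayed
        read = bytes_read(region, chunk_shape)  # bytes that had to be read
        print(f"{layout_name:>10} | {view_name}: reads {read // needed}x the bytes displayed")
```

In this model, plane-wise storage is ideal for XY slices but forces the whole volume to be read for a single XZ or YZ slice, whereas cubic chunks cost the same moderate overhead in every direction. Any reader that works plane by plane on data stored in 3-D chunks pays a similar penalty in the other direction, because every plane it delivers touches many chunks.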
Open 3-D chunked multi-resolution data directly with BigDataViewer
Open Fiji
Use [ Plugins › Utilities › Monitor Memory… ] to keep an eye on the memory
Open xyz_uint8__em_platy__3d_chunk_multires.xml via [ Plugins › BigDataViewer › Open XML/HDF5 ]
Key observations:
The viewing performance is improved compared with the above 3-D chunked data that did not have multi-resolution
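The improvement comes at a modest storage cost, which a short calculation can show. The sketch below assumes a hypothetical volume size and isotropic downsampling by a factor of 2 per level; real files may use other, possibly anisotropic, factors. The function name `pyramid_shapes` is made up for this example.

```python
import math


def pyramid_shapes(shape, downsample=2, min_size=64):
    """List the shape of each level until every axis is at most `min_size` voxels."""
    shapes = [tuple(shape)]
    while max(shapes[-1]) > min_size:
        shapes.append(tuple(max(1, s // downsample) for s in shapes[-1]))
    return shapes


shapes = pyramid_shapes((2048, 2048, 1024))  # hypothetical full-resolution volume
full = math.prod(shapes[0])
total = sum(math.prod(s) for s in shapes)

for level, level_shape in enumerate(shapes):
    print(f"level {level}: {level_shape}, {math.prod(level_shape):,} voxels")
print(f"pyramid size relative to full resolution: {total / full:.3f}")
```

Because each 3-D level has only one eighth of the voxels of the previous one, all coarser levels together add less than one seventh to the full-resolution data, while a zoomed-out view needs to read only a tiny fraction of the voxels.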
Open 3-D chunked multi-resolution data with Bio-Formats
Open xyz_uint8__em_platy__3d_chunk_multires.xml via drag & drop on Fiji menu bar
Key observations:
Since the normal ImageJ viewer does not support multi-resolution data, Bio-Formats asks you to choose one resolution layer
python bioio
# %%
# Open a CZI image file
# minimal conda env for this module
# conda create -n ImageFileFormats python=3.10
# conda activate ImageFileFormats
# pip install bioio bioio-tifffile bioio-lif bioio-czi bioio-ome-tiff bioio-ome-zarr notebook
# Note: for only dealing with .czi just do pip install bioio bioio-czi
# %%
# Load BDV file
# - Observe that BioImage chooses the correct reader plugin
from pathlib import Path

from bioio import BioImage

bioimage = BioImage(Path.cwd() / 'xyz_uint8__em_platy__3d_chunk_multires.xml')
print(bioimage)
print(type(bioimage))

# %%
# load whole data
# - this reads the full array into memory
image_data = bioimage.data

# %%
# lazy load data
# - dask_data is a lazy dask array; nothing is read until .compute() is called
image_data = bioimage.dask_data
print(image_data)

# %%
# load specific image plane
bioimage_data = bioimage.dask_data[:, :, :, 10, :].compute()
Assessment
Fill in the blanks
Opening data piece-wise on demand is also called ___ .
Storing data piece-wise is also called ___ .
In order to enable fast inspection of spatial data at different scales (like on Google Maps) one can use ___ .