After completing this lesson, learners should be able to:
Open various image file formats
Understand the difference between image data and metadata
Motivation
There are numerous ways to save image data on disk, and virtually every microscope vendor has its own file format. It is thus very important to understand how to open those files and inspect their content. Moreover, some software will open only specific image file formats, so it is sometimes necessary to re-save the data. During such file format conversions, information can be lost; it is important to be aware of this and to avoid such information loss as much as possible.
Note that opening the CZI file via drag and drop also triggers the Bio-Formats plugin. This is not always the case: it only happens if ImageJ does not open the image with its own reader (see the above activity about TIFF files, where this can become confusing).
python BioIO
# %%
# Open a CZI image file
# minimal conda env for this module
# conda create -n ImageFileFormats python=3.10
# conda activate ImageFileFormats
# pip install bioio bioio-tifffile bioio-lif bioio-czi bioio-ome-tiff bioio-ome-zarr notebook
# Note: for only dealing with .czi just do pip install bioio bioio-czi
# %%
# Load .czi file
# file needs first to be downloaded from https://github.com/NEUBIAS/training-resources/raw/master/image_data/xyz__multiple_images.czi
# save file in an ExampleImages folder next to this notebook
# - Observe that BioImage chooses the correct reader plugin
from pathlib import Path
from bioio import BioImage
bioimage = BioImage(Path.cwd() / 'ExampleImages/xyz__multiple_images.czi')
print(bioimage)
print(type(bioimage))
# %%
# Inspect number of images in object
print(bioimage.scenes)
# %%
# Inspect both images in the object
for image in bioimage.scenes:
    print(f'Image name: {image}')
    # Select image:
    bioimage.set_scene(image)
    # Inspect dimension and shape of image
    print(f'Image dimension: {bioimage.dims}')
    print(f'Dimension order is: {bioimage.dims.order}')
    print(f'Image shape: {bioimage.shape}')
    # Extract image data (5D)
    image_data = bioimage.data
    print(f'Image type: {type(image_data)}')
    print(f'Image array shape: {image_data.shape}')
    # Extract specific image part
    image_data = bioimage.get_image_data('YX')
    print(f'Image type: {type(image_data)}')
    print(f'Image array shape: {image_data.shape}')
    # Read pixel size
    print(f'Pixel size: {bioimage.physical_pixel_sizes}')
    # Read metadata
    print(f'Metadata type: {type(bioimage.metadata)}')
    print(f'Metadata: {bioimage.metadata}')
    print('\n')
Discuss that for a 2D file series, it is not obvious where to store the z spacing, because the individual files are only 2D and thus may not have a metadata tag for the z-dimension
Discuss that using pattern files might be useful for such data that is distributed across multiple files
A dialog will appear that lets you configure how to load the multiple files “with similar names”
Study the dialog, but you should not need to change anything
Press [OK] to open the TIFF series
Use [Image > Properties] to observe that the pixel size metadata is missing/wrong
Visit https://www.ebi.ac.uk/empiar/EMPIAR-10982/ to find and then enter the pixel calibration data
python BioIO
# %%
# Open a TIFF series
# minimal conda env for this module
# conda create -n ImageFileFormats python=3.10
# conda activate ImageFileFormats
# pip install bioio bioio-tifffile bioio-lif bioio-czi bioio-ome-tiff bioio-ome-zarr notebook
# Note: for only dealing with .tif just do pip install bioio bioio-tifffile
# %%
# create list of image file names and order them
from pathlib import Path
path_to_files = Path('/Users/fschneider/Training/training-resources/image_data/xyz_8bit__em_volume_tiff_series')
tiff_files = [file for file in path_to_files.glob("*.tif")]
# order files
tiff_files.sort()
print(tiff_files)
# %%
# Open each image with BioIO
from bioio import BioImage
bioimages = [BioImage(file) for file in tiff_files]
# Print pixel sizes
for bioimage in bioimages:
    print(bioimage.physical_pixel_sizes)
# %%
# Concatenate in 3D volume
em_volume = [bioimage.data.squeeze() for bioimage in bioimages]
# make numpy array
import numpy as np
em_volume = np.stack(em_volume, axis=0)
print(em_volume.shape)
# %%
# Lazy load
import dask.array as da
# index T=0, C=0 and Z=0 so that each plane is 2D (YX), matching the eager version above
em_volume = [bioimage.dask_data[0, 0, 0, :, :] for bioimage in bioimages]
em_volume = da.stack(em_volume, axis=0)
print(em_volume.shape)
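The stacked volume is a bare array, so the z spacing that the individual 2D files could not store has to be carried separately. The following is a minimal sketch of one way to do that, using synthetic planes instead of the TIFF files. The voxel sizes are purely hypothetical placeholders, and the names `voxel_size_um` and `physical_extent_um` are illustrative and not part of BioIO.

```python
import numpy as np

# Synthetic stand-ins for ten 2D planes read from a TIFF series (Y=4, X=5).
planes = [np.full((4, 5), i, dtype=np.uint8) for i in range(10)]

# Stack along a new first axis to obtain a ZYX volume.
volume = np.stack(planes, axis=0)

# Hypothetical calibration in microns, keyed by axis name (Z, Y, X).
# Real values must come from the data provider, not from this sketch.
voxel_size_um = {"Z": 0.5, "Y": 0.25, "X": 0.25}

# Physical extent of the volume along each axis.
physical_extent_um = {
    axis: size * voxel_size_um[axis]
    for axis, size in zip("ZYX", volume.shape)
}

print(volume.shape)
print(physical_extent_um)
```

Keeping the calibration in a small dictionary next to the array is only one option; the point is that the z spacing has to be recorded somewhere explicit once the planes are combined.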
For us, the image data seemingly opened incorrectly, showing some odd-looking RGB data
Now open the VSI file using [Plugins > Bio-Formats > Bio-Formats Importer]
Display metadata
Display OME-XML Metadata
Press [OK]
In the upcoming dialog select the first image “series”, which represents the actual image data
Press [OK]
The image data and metadata will open and can be inspected
Copy the VSI file to a different location and try opening it again. This should NOT work anymore, because the link to the actual data, which is in the ETS file, is now broken
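Whether the link to the ETS data is intact can be checked before opening the file. The sketch below assumes a layout in which `name.vsi` sits next to a folder called `_name_` that holds the `.ets` files. This layout is an assumption about how the acquisition software exports the data, and `find_ets_files` is a hypothetical helper, not part of Bio-Formats or BioIO.

```python
from pathlib import Path
import tempfile


def find_ets_files(vsi_path):
    """Return the .ets files in the companion folder of a VSI file.

    Assumes the companion folder is named '_<stem>_' and sits next to the
    .vsi file; returns an empty list if that folder is missing.
    """
    vsi_path = Path(vsi_path)
    companion = vsi_path.parent / f"_{vsi_path.stem}_"
    if not companion.is_dir():
        return []
    return sorted(companion.rglob("*.ets"))


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    original = root / "original" / "sample.vsi"
    ets = root / "original" / "_sample_" / "stack1" / "frame_t.ets"
    ets.parent.mkdir(parents=True)
    original.touch()
    ets.touch()

    # Copying only the .vsi file leaves the pixel data behind.
    copied = root / "copy" / "sample.vsi"
    copied.parent.mkdir()
    copied.touch()

    n_original = len(find_ets_files(original))
    n_copied = len(find_ets_files(copied))

print(n_original, n_copied)
```

If the helper returns an empty list for a copied file, the companion folder most likely needs to be copied along with the `.vsi` file.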
python BioIO
# TODO: Change the below code to open the VSI dataset
# %%
# Open a VSI image file
# minimal conda env for this module
# conda create -n ImageFileFormats python=3.10
# conda activate ImageFileFormats
# pip install bioio bioio-tifffile bioio-lif bioio-czi bioio-ome-tiff bioio-ome-zarr notebook
# Note: for only dealing with .lif just do pip install bioio bioio-lif
# %%
# Load .lif file
# - Observe that BioImage chooses the correct reader plugin
from bioio import BioImage
image_url = "https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_xyc__two_images.lif"
bioimage = BioImage(image_url)
print(bioimage)
print(type(bioimage))
# %%
# Inspect number of images in object
print(bioimage.scenes)
# %%
# Inspect both images in the object
for image in bioimage.scenes:
    print(f'Image name: {image}')
    # Select image:
    bioimage.set_scene(image)
    # Inspect dimension and shape of image
    print(f'Image dimension: {bioimage.dims}')
    print(f'Dimension order is: {bioimage.dims.order}')
    print(f'Image shape: {bioimage.shape}')
    # Extract image data (5D)
    image_data = bioimage.data
    print(f'Image type: {type(image_data)}')
    print(f'Image array shape: {image_data.shape}')
    # Extract specific image part
    image_data = bioimage.get_image_data('YX')
    print(f'Image type: {type(image_data)}')
    print(f'Image array shape: {image_data.shape}')
    # Read pixel size
    print(f'Pixel size: {bioimage.physical_pixel_sizes}')
    # Read metadata
    print(f'Metadata type: {type(bioimage.metadata)}')
    print(f'Metadata: {bioimage.metadata}')
    print('\n')
[Image > Adjust > Brightness/Contrast] such that cells appear saturated
[File > Save As > GIF…]
Open with Fiji
Pixel values have changed
Open with a web browser
Movie plays and looks as when you saved it
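One plausible reason for the changed pixel values is that GIF stores 8-bit values, so the chosen display range ends up baked into the exported data. The following is a minimal sketch of such a display-range mapping, assuming a simple linear scaling with clipping. The exact conversion that ImageJ performs may differ, and `apply_display_range` is a hypothetical helper.

```python
import numpy as np


def apply_display_range(image, display_min, display_max):
    """Linearly map [display_min, display_max] to 0..255 and clip the rest."""
    scaled = (image.astype(np.float64) - display_min) / (display_max - display_min)
    return np.round(np.clip(scaled, 0.0, 1.0) * 255).astype(np.uint8)


# Synthetic 16-bit intensities; the display range saturates everything >= 1000.
original = np.array([0, 250, 500, 1000, 2000, 4000], dtype=np.uint16)
exported = apply_display_range(original, display_min=0, display_max=1000)

print(original)
print(exported)
```

In this sketch the three brightest input values all end up at 255, so their original differences cannot be recovered from the exported data.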
python BioIO
# %%
# Load different image files and access various levels of metadata
# minimal conda env for this module
# conda create -n ImageFileFormats python=3.10
# conda activate ImageFileFormats
# pip install bioio bioio-tifffile bioio-lif bioio-czi bioio-ome-tiff bioio-ome-zarr notebook
# IMPORTANT: bioio by default expects z,y,x to be in microns and T to be in frames per second
# %%
# Load .tif file with minimal metadata
# - Observe that BioImage chooses the correct reader plugin
# - Observe that the return object is not the image matrix
from bioio import BioImage
image_url = 'https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_8bit__nuclei_PLK1_control.tif'
bioimage = BioImage(image_url)
print(bioimage)
print(type(bioimage))
# %%
# Print some object attributes
# - Observe that the object is 5 dimensional with most dimensions being empty
# - Observe that the dimension order is always time, channel, z, y, x, (TCZYX)
print(bioimage.dims)
print(bioimage.shape)
print(f'Dimension order is: {bioimage.dims.order}')
print(type(bioimage.dims.order))
print(f'Size of X dimension is: {bioimage.dims.X}')
# %%
# Extract image data
# - Observe that the returned numpy.array is still 5 dimensional
image_data = bioimage.data
print(type(image_data))
print(image_data)
print(image_data.shape)
# %%
# Extract specific part of image data
# - Observe that numpy.array is reduced to populated dimensions only
yx_image_data = bioimage.get_image_data('YX')
print(type(yx_image_data))
print(yx_image_data)
print(yx_image_data.shape)
# %%
# Access pixel size
import numpy as np
print(bioimage.physical_pixel_sizes)
print(f'A pixel has a length of {np.round(bioimage.physical_pixel_sizes.X, 2)} microns in X dimension.')
# %%
# Access general metadata
print(type(bioimage.metadata))
print(bioimage.metadata)
# %%
# Load .tif file with extensive metadata
image_url = "https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_16bit__collagen.md.tif"
bioimage = BioImage(image_url)
print(bioimage)
print(type(bioimage))
# - Observe that the image is larger than the previous
print(bioimage.dims)
# %%
# Access image and reduce to only populated dimensions
yx_image_data = bioimage.data.squeeze()
print(type(yx_image_data))
print(yx_image_data)
print(yx_image_data.shape)
# %%
# Access pixel size
print(bioimage.physical_pixel_sizes)
print(f'A pixel has a length of {np.round(bioimage.physical_pixel_sizes.Y, 2)} microns in Y dimension.')
# Access general metadata
# - Observe that metadata are more extensive than in the previous image
print(type(bioimage.metadata))
print(bioimage.metadata)
# %%
# Load .lif file
# - Observe that BioImage chooses the correct reader plugin
# - Observe that the return object has 4 different channels
# - Observe that the general metadata are an abstract element
image_url = "https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_xyc__two_images.lif"
bioimage = BioImage(image_url)
print(bioimage)
print(type(bioimage))
print(bioimage.dims)
print(bioimage.metadata)
print(type(bioimage.metadata))
# %%
# Access channel information
print(bioimage.channel_names)
# %%
# Access image data for all channels
img_4channel = bioimage.data.squeeze()
# Alternative
img_4channel = bioimage.get_image_data('CYX')
# - Observe that numpy.array shape is 3 dimensional representing channel,y,x
print(img_4channel.shape)
# Access only one channel
img_1channel = bioimage.get_image_data('YX', C=0)
# Alternative
img_1channel = img_4channel[0]
# - Observe that numpy.array shape is 2 dimensional representing y,x
print(img_1channel.shape)
# %%
# Access different images in one image file (scenes)
# - Observe that one image file can contain several scenes
# - Observe that they can be different in various aspects
print(bioimage.scenes)
print(f'Current scene: {bioimage.current_scene}')
# - Observe that the image in the current scene has 4 channels and Y/X dimensions have the size of 1024
print(bioimage.dims)
print(bioimage.physical_pixel_sizes)
# Switch to second scene
# - Observe that the image in the other scene has only one channel and Y/X dimensions are half as large as the first scene
# - Observe that the pixel sizes are doubled
bioimage.set_scene(1)
print(bioimage.dims)
print(bioimage.physical_pixel_sizes)
# %%
# Load .czi file
# file needs first to be downloaded from https://github.com/NEUBIAS/training-resources/raw/master/image_data/xyz__multiple_images.czi
# save the file locally and adapt the path below accordingly
# - Observe that BioImage chooses the correct reader plugin
# - Observe that the return object has a z dimension
bioimage = BioImage('/Users/fschneider/skimage-napari-tutorial/ExampleImages/xyz__multiple_images.czi')
print(bioimage)
print(type(bioimage))
# %%
# little exercise in between
# Access image dimensions
print(bioimage.dims)
# Access general metadata
# - Observe that metadata are abstract
print(bioimage.metadata)
print(type(bioimage.metadata))
# Access pixel size
print(bioimage.physical_pixel_sizes)
# Access image data for all z-planes
img_3d = bioimage.data.squeeze()
# Alternative
img_3d = bioimage.get_image_data('ZYX')
# - Observe that numpy.array shape is 3 dimensional representing z,y,x
print(img_3d.shape)
# Access only one z-plane
img_2d = bioimage.get_image_data('YX', Z=0)
# Alternative
img_2d = img_3d[0]
# - Observe that numpy.array shape is 2 dimensional representing y,x
print(img_2d.shape)
# %%
# little exercise:
# participants should try to open one of their files with python
Resaving images in different file formats very often leads to a loss of metadata or distortion of the pixel values. It is critical to be aware of this!
Checks to be done after each resaving
Open the resaved image in all relevant applications and check whether the pixel values and/or metadata are different from the original image
This is critical for the scientific integrity of the resaving
Open the image in a web browser and observe how the image is rendered
This can be useful to share previews with collaborators
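The pixel and metadata checks can also be scripted. The following is a minimal sketch with synthetic arrays and plain dictionaries standing in for the original and the resaved image. The function `compare_resaved` and the metadata keys are hypothetical; with real files, the arrays and metadata would come from whichever reader is used.

```python
import numpy as np


def compare_resaved(original, resaved, original_meta, resaved_meta):
    """Report whether pixel values and metadata survived resaving."""
    same_pixels = (
        original.shape == resaved.shape
        and original.dtype == resaved.dtype
        and np.array_equal(original, resaved)
    )
    # Keys that are missing or carry a different value after resaving.
    changed_keys = sorted(
        key for key in original_meta
        if resaved_meta.get(key) != original_meta[key]
    )
    return same_pixels, changed_keys


original = np.arange(12, dtype=np.uint16).reshape(3, 4)
original_meta = {"pixel_size_x_um": 0.2, "pixel_size_y_um": 0.2, "unit": "micron"}

# Simulate a lossy export: 8-bit pixels and no pixel size.
resaved = original.astype(np.uint8)
resaved_meta = {"unit": "micron"}

same_pixels, changed_keys = compare_resaved(
    original, resaved, original_meta, resaved_meta
)
print(same_pixels, changed_keys)
```

Comparing the data type in addition to the values is a deliberate choice here: a silent change in bit depth counts as a difference even when the numbers happen to match.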
Assessment
True or false
One could use Excel’s XLSX file format for saving image data.
Solution
One could use Excel’s XLSX file format for saving image data. True, the matrix of each sheet could represent one image plane and one could use the first sheet to store metadata and the mapping of each sheet (image plane) to the zct coordinates, e.g. sheet 12 c 2 z 3 t 1.
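The mapping between sheets and image planes mentioned in the solution can be generated instead of written by hand. The following is a minimal sketch, assuming that the channel index changes fastest, then z, then t, and that sheet 1 holds the metadata. Both choices are illustrative, and `sheet_for_plane` and `plane_for_sheet` are hypothetical names.

```python
import numpy as np

# Hypothetical image size: 2 time points, 3 z-planes, 2 channels.
shape_tzc = (2, 3, 2)


def sheet_for_plane(t, z, c):
    """Return the 1-based sheet number for zero-based (t, z, c) indices."""
    plane_index = np.ravel_multi_index((t, z, c), shape_tzc)
    return int(plane_index) + 2  # sheet 1 is reserved for metadata


def plane_for_sheet(sheet):
    """Return zero-based (t, z, c) indices for a 1-based sheet number."""
    t, z, c = np.unravel_index(sheet - 2, shape_tzc)
    return int(t), int(z), int(c)


first = sheet_for_plane(0, 0, 0)
last = sheet_for_plane(1, 2, 1)
print(first, last, plane_for_sheet(last))
```

Any fixed ordering works as long as it is written down in the metadata sheet, since a reader has no other way to recover which sheet belongs to which plane.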
Discuss
What are the pros and cons of converting an image into another format?
What are the pros and cons of splitting metadata and image pixel data into separate files?
Do you know any good file formats for image metadata?
Solution
(A) Sometimes it is necessary to convert to another format to be able to open the image in a specific software. (B) Converting an image to another format typically loses information, e.g. because the file format that you are saving to cannot represent all the metadata of the original image file. Thus, it is in general recommended to keep the original image file. (C) Converting to a file format with good compression may save you considerable disk space.
(A) Metadata is typically much smaller than the pixel data. Thus, it can be a good idea to keep metadata in a separate file that can be readily inspected (inspecting the potentially TB-sized pixel data files can be tricky). (B) The best file formats for metadata and pixel data can be very different due to the nature of the data, thus splitting can make sense. (C) Having separate files always bears the risk that you lose one of them, e.g. you may forget to copy both to a new folder.
TXT, XML, and JSON are good formats for image metadata, because they are human-readable standard formats that can be opened with any text editor.
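As an illustration of a human-readable metadata file, the following minimal sketch writes metadata to a JSON file and reads it back without touching any pixel data. The keys, values, and file name are hypothetical and do not follow any particular metadata standard.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical metadata for an image; keys and values are illustrative only.
metadata = {
    "dimension_order": "TCZYX",
    "shape": [1, 1, 10, 512, 512],
    "pixel_size_um": {"Z": 1.0, "Y": 0.2, "X": 0.2},
    "channel_names": ["DAPI"],
}

with tempfile.TemporaryDirectory() as tmp:
    sidecar = Path(tmp) / "image_metadata.json"

    # Write indented JSON that any text editor can open.
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")

    # Read it back without loading any pixel data.
    restored = json.loads(sidecar.read_text(encoding="utf-8"))

print(restored["pixel_size_um"])
```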