Image data formats

Prerequisites

Before starting this lesson, you should be familiar with:

Learning Objectives

After completing this lesson, learners should be able to:
  • Open and save various image file formats

  • Understand the difference between image voxel data and metadata

  • Understand that converting between image file formats likely leads to loss of information

Motivation

There are numerous ways to save image data on disk. Virtually every microscope vendor has their own file format. It is thus very important to understand how to open those files and inspect their content. Moreover, some software will open only specific image file formats, so it is sometimes necessary to re-save the data. During such file format conversions information can be lost; it is important to be aware of this and to avoid such information loss as much as possible.

Concept map

graph TD
    F("TIFF, JPEG, XML/HDF5, CZI, LIF, ...")
    F --> PD("Pixel data")
    PD --> Values
    PD --> Dimensions
    F --> MD("Metadata")
    MD --> IC("Image calibration")
    MD --> MS("Microscope settings")
    MD --> DS("Display settings")
    MD --> NA("...")



Figure


Image pixel data are saved as binary data on disk. Essential metadata is needed to load the binary data into an image array.
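
The following minimal sketch illustrates this with the Python standard library only. The pixel values, image size, and data type are made up for illustration. The same bytes give back the intended image only if width, height, and data type are known; with wrong metadata they decode to a different image.

```python
import struct

# Hypothetical 2 x 3 image (height x width) with 16-bit unsigned pixel values
pixels = [[100, 200, 300],
          [400, 500, 600]]

# Pixel data as stored on disk: a flat run of bytes without any structure
# ('<' = little-endian, 'H' = unsigned 16-bit integer)
flat = [value for row in pixels for value in row]
raw = struct.pack(f'<{len(flat)}H', *flat)
print(f'Number of bytes: {len(raw)}')


def decode(raw_bytes, width, height, type_code):
    """Rebuild rows of pixels from raw bytes using the given metadata."""
    count = width * height
    values = struct.unpack(f'<{count}{type_code}', raw_bytes)
    return [list(values[row * width:(row + 1) * width]) for row in range(height)]


# Correct metadata recovers the original image
restored = decode(raw, width=3, height=2, type_code='H')
print(restored)

# Wrong metadata (8-bit pixels, 4 x 3 image) yields a different image
# from exactly the same bytes
misread = decode(raw, width=4, height=3, type_code='B')
print(misread)
```

Image file formats store this kind of information (and much more) alongside the pixel data.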






Activities

Open TIF image data

Data


ImageJ GUI

  • Open the file mentioned in the activity by dragging and dropping it onto the Fiji window:
    • Inspect the image
    • Inspect the metadata via [Image > Show info…]
  • Close the image and open it again via Bio-Formats Importer:
    • [Plugins > Bio-Formats > Bio-Formats Importer]
    • Select your image
    • Display metadata
    • Click [OK]

python BioIO

# %% 
# Open a tif image file
# Minimal conda environment for this module:
# conda create -n ImageFileFormats python=3.10
# conda activate ImageFileFormats
# pip install bioio bioio-tifffile bioio-lif bioio-czi bioio-ome-tiff bioio-ome-zarr notebook
# Note: if you only work with .tif files, pip install bioio bioio-tifffile is sufficient


# %%
# Load .tif file
# - Observe that BioImage chooses the correct reader plugin
from bioio import BioImage
image_url = 'https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_8bit__nuclei_PLK1_control.tif'
bioimage = BioImage(image_url)
print(bioimage)
print(type(bioimage))

# %%
# Inspect dimension and shape of image
print(f'Image dimension: {bioimage.dims}')
print(f'Dimension order is: {bioimage.dims.order}')
print(f'Image shape: {bioimage.shape}')

# %%
# Extract image data (5D)
image_data = bioimage.data
print(f'Image type: {type(image_data)}')
print(f'Image array shape: {image_data.shape}')
# Extract specific image part
image_data = bioimage.get_image_data('YX')
print(f'Image type: {type(image_data)}')
print(f'Image array shape: {image_data.shape}')

# %%
# Read pixel size
print(f'Pixel size: {bioimage.physical_pixel_sizes}')
# Read metadata
print(f'Metadata type: {type(bioimage.metadata)}')
print(f'Metadata: {bioimage.metadata}')

# %%
# Load .tif file with extensive metadata
image_url = "https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_16bit__collagen.md.tif"
bioimage = BioImage(image_url)

# %%
# Read pixel size
print(f'Pixel size: {bioimage.physical_pixel_sizes}')
# Read metadata
print(f'Metadata type: {type(bioimage.metadata)}')
print(f'Metadata: {bioimage.metadata}')



Open CZI image data

Data


ImageJ GUI

  • Open the file mentioned in the activity using:
    • [Plugins > Bio-Formats > Bio-Formats Importer]
      • Display metadata
      • Display OME-XML Metadata
    • Press [OK]
    • Select both “Series”
    • Look at the images
    • Inspect the metadata

python BioIO

# %% 
# Open a CZI image file
# Minimal conda environment for this module:
# conda create -n ImageFileFormats python=3.10
# conda activate ImageFileFormats
# pip install bioio bioio-tifffile bioio-lif bioio-czi bioio-ome-tiff bioio-ome-zarr notebook
# Note: if you only work with .czi files, pip install bioio bioio-czi is sufficient


# %%
# Load .czi file
# The file first needs to be downloaded from https://github.com/NEUBIAS/training-resources/raw/master/image_data/xyz__multiple_images.czi
# Save the file in the same directory as this notebook
# - Observe that BioImage chooses the correct reader plugin
from bioio import BioImage
bioimage = BioImage('xyz__multiple_images.czi')
print(bioimage)
print(type(bioimage))

# %%
# Inspect number of images in object
print(bioimage.scenes)

# %%
# Inspect both images in the object
for image in bioimage.scenes:
    print(f'Image name: {image}')
    # Select image:
    bioimage.set_scene(image)
    # Inspect dimension and shape of image
    print(f'Image dimension: {bioimage.dims}')
    print(f'Dimension order is: {bioimage.dims.order}')
    print(f'Image shape: {bioimage.shape}')
    # Extract image data (5D)
    image_data = bioimage.data
    print(f'Image type: {type(image_data)}')
    print(f'Image array shape: {image_data.shape}')
    # Extract specific image part
    image_data = bioimage.get_image_data('YX')
    print(f'Image type: {type(image_data)}')
    print(f'Image array shape: {image_data.shape}')
    # Read pixel size
    print(f'Pixel size: {bioimage.physical_pixel_sizes}')
    # Read metadata
    print(f'Metadata type: {type(bioimage.metadata)}')
    print(f'Metadata: {bioimage.metadata}')
    print('\n')



Open Leica LIF image data

Data


ImageJ GUI

  • Open the file mentioned in the activity using:
    • [Plugins > Bio-Formats > Bio-Formats Importer]
      • Open all series (you can check this to open all series automatically)
      • Color mode: Composite
      • Display metadata
      • Display OME-XML Metadata
    • Press [OK]
    • Select both series
    • Look at the images
    • Inspect the different metadata for these two images.

python BioIO

# %% 
# Open a LIF image file
# Minimal conda environment for this module:
# conda create -n ImageFileFormats python=3.10
# conda activate ImageFileFormats
# pip install bioio bioio-tifffile bioio-lif bioio-czi bioio-ome-tiff bioio-ome-zarr notebook
# Note: if you only work with .lif files, pip install bioio bioio-lif is sufficient


# %%
# Load .lif file
# - Observe that BioImage chooses the correct reader plugin
from bioio import BioImage
image_url = "https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_xyc__two_images.lif"
bioimage = BioImage(image_url)
print(bioimage)
print(type(bioimage))

# %%
# Inspect number of images in object
print(bioimage.scenes)

# %%
# Inspect both images in the object
for image in bioimage.scenes:
    print(f'Image name: {image}')
    # Select image:
    bioimage.set_scene(image)
    # Inspect dimension and shape of image
    print(f'Image dimension: {bioimage.dims}')
    print(f'Dimension order is: {bioimage.dims.order}')
    print(f'Image shape: {bioimage.shape}')
    # Extract image data (5D)
    image_data = bioimage.data
    print(f'Image type: {type(image_data)}')
    print(f'Image array shape: {image_data.shape}')
    # Extract specific image part
    image_data = bioimage.get_image_data('YX')
    print(f'Image type: {type(image_data)}')
    print(f'Image array shape: {image_data.shape}')
    # Read pixel size
    print(f'Pixel size: {bioimage.physical_pixel_sizes}')
    # Read metadata
    print(f'Metadata type: {type(bioimage.metadata)}')
    print(f'Metadata: {bioimage.metadata}')
    print('\n')
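
The loop above can also be wrapped into a small helper that collects the properties of each scene for side-by-side comparison. This is only a sketch: `summarize_scenes` and `FakeImage` are hypothetical names, and `FakeImage` is a made-up stand-in with invented values so that the example runs without downloading a file. With a real file, pass the `bioimage` object instead.

```python
def summarize_scenes(image):
    """Collect the shape and pixel sizes of every scene in a multi-image file.

    `image` is expected to offer `scenes`, `set_scene()`, `shape` and
    `physical_pixel_sizes`, i.e. the attributes used in the loop above.
    """
    summary = {}
    for scene in image.scenes:
        image.set_scene(scene)
        summary[scene] = {
            'shape': image.shape,
            'pixel_sizes': image.physical_pixel_sizes,
        }
    return summary


class FakeImage:
    """Hypothetical stand-in for a two-scene file; all values are made up."""

    _data = {
        'Image 1': ((1, 4, 1, 1024, 1024), (None, 0.1, 0.1)),
        'Image 2': ((1, 1, 1, 512, 512), (None, 0.2, 0.2)),
    }

    def __init__(self):
        self.scenes = tuple(self._data)
        self._current = self.scenes[0]

    def set_scene(self, scene):
        self._current = scene

    @property
    def shape(self):
        return self._data[self._current][0]

    @property
    def physical_pixel_sizes(self):
        return self._data[self._current][1]


for name, properties in summarize_scenes(FakeImage()).items():
    print(name, properties)
```

Note that the helper leaves the object set to the last scene it visited.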



Open volume EM TIFF series

Data


ImageJ GUI

  • Open the file mentioned in the activity using:
    • [Plugins > Bio-Formats > Bio-Formats Importer]
      • Select one of the EM TIFF slice files
      • Group files with similar names
        • This will load all the slices (see below)
      • Display metadata
      • Display OME-XML Metadata
    • Press [OK]
    • A dialog will appear that lets you configure how to load the multiple files “with similar names”
    • Study the dialog, but you should not need to change anything
    • Press [OK] to open the TIFF series
    • Use [Image > Properties] to observe that the pixel size metadata is missing/wrong
    • Visit https://www.ebi.ac.uk/empiar/EMPIAR-10982/ to find and then enter the pixel calibration data

python BioIO

# %% 
# Open a TIFF series
# Minimal conda environment for this module:
# conda create -n ImageFileFormats python=3.10
# conda activate ImageFileFormats
# pip install bioio bioio-tifffile bioio-lif bioio-czi bioio-ome-tiff bioio-ome-zarr notebook
# Note: if you only work with .tif files, pip install bioio bioio-tifffile is sufficient


# %%
# Create a sorted list of the image file names
from pathlib import Path
# Adjust this path to the location of the TIFF series on your computer
path_to_files = Path('/Users/fschneider/Training/training-resources/image_data/xyz_8bit__em_volume_tiff_series')
tiff_files = sorted(path_to_files.glob("*.tif"))
print(tiff_files)

# %%
# Open each image with BioIO
from bioio import BioImage
bioimages = [BioImage(file) for file in tiff_files]

# Print pixel sizes
for bioimage in bioimages:
    print(bioimage.physical_pixel_sizes)

# %%
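# Stack the 2D slices into one 3D volume
import numpy as np
# Remove the empty dimensions of each slice, leaving a 2D (YX) array
em_volume = [bioimage.data.squeeze() for bioimage in bioimages]
# Combine the list of 2D arrays into one 3D (ZYX) numpy array
em_volume = np.stack(em_volume, axis=0)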
print(em_volume.shape)
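
Sorting the file names orders them as text. That gives the correct slice order when the slice numbers in the file names are zero-padded (e.g. 001, 002, ...), but not otherwise. The following sketch, with made-up file names and a hypothetical helper `natural_key`, shows the problem and one possible way around it.

```python
import re
from pathlib import Path


def natural_key(path):
    """Split a file name into text and number parts so that numbers sort numerically."""
    parts = re.split(r'(\d+)', Path(path).name)
    return [int(part) if part.isdigit() else part for part in parts]


# Hypothetical file names without zero-padding
names = ['slice_10.tif', 'slice_2.tif', 'slice_1.tif']

# Plain text sorting puts slice_10 before slice_2
text_sorted = sorted(names)
print(text_sorted)

# Sorting with the key compares the numeric parts as numbers
natural_sorted = sorted(names, key=natural_key)
print(natural_sorted)
```

With the list from above, this could be applied as `tiff_files.sort(key=natural_key)`.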



Open Olympus VSI image data

Data


ImageJ GUI

  • Drag and drop the .VSI file on Fiji
    • What happens?
    • For us, it seemingly opened the image data wrongly, showing some weird RGB data
  • Now open the VSI file using [Plugins > Bio-Formats > Bio-Formats Importer]
    • Display metadata
    • Display OME-XML Metadata
    • Press [OK]
    • In the upcoming dialog select the first image “series”, which represents the actual image data
    • Press [OK]
  • The image data and metadata will open and can be inspected

Copy the VSI file to a different location and try opening it again. This should NOT work anymore, because the link to the actual data, which is in the ETS file, is now broken.

python BioIO

# TODO: Change the code below to open the VSI dataset; it currently still opens the LIF example file

# %% 
# Open a VSI image file
# Minimal conda environment for this module:
# conda create -n ImageFileFormats python=3.10
# conda activate ImageFileFormats
# pip install bioio bioio-tifffile bioio-lif bioio-czi bioio-ome-tiff bioio-ome-zarr notebook
# Note: if you only work with .lif files, pip install bioio bioio-lif is sufficient


# %%
# Load .lif file
# - Observe that BioImage chooses the correct reader plugin
from bioio import BioImage
image_url = "https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_xyc__two_images.lif"
bioimage = BioImage(image_url)
print(bioimage)
print(type(bioimage))

# %%
# Inspect number of images in object
print(bioimage.scenes)

# %%
# Inspect both images in the object
for image in bioimage.scenes:
    print(f'Image name: {image}')
    # Select image:
    bioimage.set_scene(image)
    # Inspect dimension and shape of image
    print(f'Image dimension: {bioimage.dims}')
    print(f'Dimension order is: {bioimage.dims.order}')
    print(f'Image shape: {bioimage.shape}')
    # Extract image data (5D)
    image_data = bioimage.data
    print(f'Image type: {type(image_data)}')
    print(f'Image array shape: {image_data.shape}')
    # Extract specific image part
    image_data = bioimage.get_image_data('YX')
    print(f'Image type: {type(image_data)}')
    print(f'Image array shape: {image_data.shape}')
    # Read pixel size
    print(f'Pixel size: {bioimage.physical_pixel_sizes}')
    # Read metadata
    print(f'Metadata type: {type(bioimage.metadata)}')
    print(f'Metadata: {bioimage.metadata}')
    print('\n')



Explore various image file formats

Example image data


ImageJ GUI

  • Open the files mentioned in the activity:
    • [Plugins > Bio-Formats > Bio-Formats Importer]
      • Display metadata
      • Display OME-XML Metadata
        • Should be the same information as above but in XML (sometimes it is more correct than the above)
  • For ICS/IDS and XML/HDF5:
    • The ICS and XML file are the entry points that should be opened (the respective other file will be read automatically).
    • Also inspect the ICS and XML files in a simple text editor.
  • Saving 8 bit single channel image as TIFF:
    • Open xy_8bit__nuclei_PLK1_control.tif
    • [Image > Adjust > Brightness/Contrast] such that cells appear saturated
    • [File > Save As > TIFF…]
      • Open with Fiji
        • LUT metadata has changed, but pixel values and calibration metadata are preserved
      • Open with a web browser
        • It may not open at all
  • Saving 8 bit single channel image as JPEG:
    • Open xy_8bit__nuclei_PLK1_control.tif
    • [Image > Adjust > Brightness/Contrast] such that cells appear saturated
    • [File > Save As > JPEG…]
      • Open with Fiji
        • Pixel values have changed
        • Calibration metadata is gone
      • Open with a web browser
        • It should look the same as when you saved it
  • Saving 16 bit two channel movie as JPEG: xyzct_16bit__mitosis.tif
    • Select a timepoint in the middle of the movie
    • [File > Save As > JPEG…]
      • Open JPEG with Fiji
      • Image dimensions, data type, pixel values, and metadata have changed
  • Saving 8 bit single channel movie as GIF: xyt_8bit__mitocheck_incenp.tif
    • [Image > Adjust > Brightness/Contrast] such that cells appear saturated
    • [File > Save As > GIF…]
      • Open with Fiji
        • Pixel values have changed
      • Open with a web browser
        • Movie plays and looks as when you saved it

python BioIO

# %% 
# Load different image files and access various levels of metadata
# Minimal conda environment for this module:
# conda create -n ImageFileFormats python=3.10
# conda activate ImageFileFormats
# pip install bioio bioio-tifffile bioio-lif bioio-czi bioio-ome-tiff bioio-ome-zarr notebook

# %%
# Load .tif file with minimal metadata
# - Observe that BioImage chooses the correct reader plugin
# - Observe that the return object is not the image matrix
from bioio import BioImage
image_url = 'https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_8bit__nuclei_PLK1_control.tif'
bioimage = BioImage(image_url)
print(bioimage)
print(type(bioimage))

# %%
# Print some object attributes
# - Observe that the object is 5 dimensional with most dimensions being empty
# - Observe that the dimension order is always time, channel, z, y, x, (TCZYX)
print(bioimage.dims)
print(bioimage.shape)
print(f'Dimension order is: {bioimage.dims.order}')
print(type(bioimage.dims.order))
print(f'Size of X dimension is: {bioimage.dims.X}')

# %%
# Extract image data
# - Observe that the returned numpy.array is still 5 dimensional
image_data = bioimage.data
print(type(image_data))
print(image_data)
print(image_data.shape)

# %%
# Extract specific part of image data
# - Observe that numpy.array is reduced to populated dimensions only
yx_image_data = bioimage.get_image_data('YX')
print(type(yx_image_data))
print(yx_image_data)
print(yx_image_data.shape)

# %%
# Access pixel size
import numpy as np
print(bioimage.physical_pixel_sizes)
print(f'A pixel has a length of {np.round(bioimage.physical_pixel_sizes.X,2)} microns in the X dimension.')

# %%
# Access general metadata
print(type(bioimage.metadata))
print(bioimage.metadata)

# %%
# Load .tif file with extensive metadata
image_url = "https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_16bit__collagen.md.tif"
bioimage = BioImage(image_url)
print(bioimage)
print(type(bioimage))

# - Observe that the image is larger than the previous
print(bioimage.dims)

# %%
# Access image and reduce to only populated dimensions
yx_image_data = bioimage.data.squeeze()
print(type(yx_image_data))
print(yx_image_data)
print(yx_image_data.shape)

# %%
# Access pixel size
print(bioimage.physical_pixel_sizes)
print(f'A pixel has a length of {np.round(bioimage.physical_pixel_sizes.Y,2)} microns in the Y dimension.')

# Access general metadata
# - Observe that metadata are more extensive than in the previous image
print(type(bioimage.metadata))
print(bioimage.metadata)

# %%
# Load .lif file
# - Observe that BioImage chooses the correct reader plugin
# - Observe that the return object has 4 different channels
# - Observe that the general metadata are an abstract element
image_url = "https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_xyc__two_images.lif"
bioimage = BioImage(image_url)
print(bioimage)
print(type(bioimage))
print(bioimage.dims)
print(bioimage.metadata)
print(type(bioimage.metadata))

# %%
# Access channel information
print(bioimage.channel_names)

# %%
# Access image data for all channels
img_4channel = bioimage.data.squeeze()

# Alternative
img_4channel = bioimage.get_image_data('CYX')

# - Observe that numpy.array shape is 3 dimensional representing channel,y,x
print(img_4channel.shape)

# Access only one channel
img_1channel = bioimage.get_image_data('YX',C=0)

# Alternative
img_1channel = img_4channel[0]

# - Observe that numpy.array shape is 2 dimensional representing y,x
print(img_1channel.shape)

# %%
# Access different images in one image file (scenes)
# - Observe that one image file can contain several scenes
# - Observe that they can be different in various aspects
print(bioimage.scenes)
print(f'Current scene: {bioimage.current_scene}')

# - Observe that the image in the current scene has 4 channels and that the Y/X dimensions have a size of 1024
print(bioimage.dims)
print(bioimage.physical_pixel_sizes)

# Switch to second scene
# - Observe that the image in the other scene has only one channel and that its Y/X dimensions are half as large as in the first scene
# - Observe that the pixel sizes are doubled
bioimage.set_scene(1)
print(bioimage.dims)
print(bioimage.physical_pixel_sizes)

# %%
# Load .czi file
# The file first needs to be downloaded from https://github.com/NEUBIAS/training-resources/raw/master/image_data/xyz__multiple_images.czi
# Save the file in the same directory as this notebook
# - Observe that BioImage chooses the correct reader plugin
# - Observe that the returned object has a z dimension
bioimage = BioImage('xyz__multiple_images.czi')
print(bioimage)
print(type(bioimage))

# %%
# Little exercise in between
# Access image dimensions
print(bioimage.dims)

# Access general metadata
# - Observe that metadata are abstract
print(bioimage.metadata)
print(type(bioimage.metadata))

# Access pixel size
print(bioimage.physical_pixel_sizes)

# Access image data for all z-slices
img_3d = bioimage.data.squeeze()

# Alternative
img_3d = bioimage.get_image_data('ZYX')

# - Observe that numpy.array shape is 3 dimensional representing z,y,x
print(img_3d.shape)

# Access only one z-slice
img_2d = bioimage.get_image_data('YX',Z=0)

# Alternative
img_2d = img_3d[0]

# - Observe that numpy.array shape is 2 dimensional representing y,x
print(img_2d.shape)

# %%
# Little exercise:
# participants should try to open one of their own files with Python
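
The observation about the two LIF scenes above (half as many pixels in Y/X, doubled pixel size) means that both scenes cover the same physical area. A short worked example, using hypothetical pixel sizes rather than the values stored in the file:

```python
# Hypothetical values for illustration; read the real ones from
# bioimage.dims and bioimage.physical_pixel_sizes
scene_1 = {'pixels_x': 1024, 'pixel_size_um': 0.25}
scene_2 = {'pixels_x': 512, 'pixel_size_um': 0.5}


def field_of_view_um(scene):
    """Physical width of the image: number of pixels times pixel size."""
    return scene['pixels_x'] * scene['pixel_size_um']


for name, scene in [('scene 1', scene_1), ('scene 2', scene_2)]:
    print(f'{name}: {field_of_view_um(scene)} microns')
```

This kind of calculation only works when the calibration metadata is preserved, which is one reason to check the pixel size after every conversion.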



Resave images in various file formats

Resaving images in different file formats very often leads to a loss of metadata or distortion of the pixel values. It is critical to be aware of this!

Checks to be done after each resaving
Resave 8 bit single channel image as TIFF
Resave 8 bit single channel image as JPEG
Resave 16 bit two channel movie as JPEG
Resave 8 bit single channel movie as GIF


ImageJ GUI

  • Saving 8 bit single channel image as TIFF:
    • Open xy_8bit__nuclei_PLK1_control.tif
    • [Image > Adjust > Brightness/Contrast] such that cells appear saturated
    • [File > Save As > TIFF…]
      • Open with Fiji
        • LUT metadata has changed, but pixel values and calibration metadata are preserved
      • Open with a web browser
        • It may not open at all
  • Saving 8 bit single channel image as JPEG:
    • Open xy_8bit__nuclei_PLK1_control.tif
    • [Image > Adjust > Brightness/Contrast] such that cells appear saturated
    • [File > Save As > JPEG…]
      • Open with Fiji
        • Pixel values have changed
        • Calibration metadata is gone
      • Open with a web browser
        • It should look the same as when you saved it
  • Saving 16 bit two channel movie as JPEG: xyzct_16bit__mitosis.tif
    • Select a timepoint in the middle of the movie
    • [File > Save As > JPEG…]
      • Open JPEG with Fiji
      • Image dimensions, data type, pixel values, and metadata have changed
  • Saving 8 bit single channel movie as GIF: xyt_8bit__mitocheck_incenp.tif
    • [Image > Adjust > Brightness/Contrast] such that cells appear saturated
    • [File > Save As > GIF…]
      • Open with Fiji
        • Pixel values have changed
      • Open with a web browser
        • Movie plays and looks as when you saved it
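
One reason why pixel values change in the 16 bit to JPEG example is the reduction in bit depth, since JPEG files typically store 8 bits per channel. The sketch below uses made-up pixel values and a simple linear rescaling, which is an assumption for illustration and not necessarily what Fiji does when saving. It shows that different 16 bit values collapse onto the same 8 bit value and cannot be recovered afterwards.

```python
# Hypothetical 16 bit pixel values (possible range 0 to 65535)
pixels_16bit = [0, 1, 255, 256, 1000, 1001, 40000, 65535]


def to_8bit(values, maximum=65535):
    """Linearly rescale values from 0..maximum to 0..255 and round."""
    return [round(value * 255 / maximum) for value in values]


def back_to_16bit(values, maximum=65535):
    """Invert the rescaling as well as possible."""
    return [round(value * maximum / 255) for value in values]


pixels_8bit = to_8bit(pixels_16bit)
restored = back_to_16bit(pixels_8bit)

print(f'16 bit original: {pixels_16bit}')
print(f'8 bit version:   {pixels_8bit}')
print(f'16 bit restored: {restored}')
print(f'Distinct values before: {len(set(pixels_16bit))}, after: {len(set(pixels_8bit))}')
```

JPEG compression itself can change the pixel values further, so the original file should be kept.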






Assessment

True or false

  1. One could use Excel’s XLSX file format for saving image data.

Solution

  1. One could use Excel’s XLSX file format for saving image data. True: the matrix of each sheet could represent one image plane, and the first sheet could store the metadata and the mapping of each sheet (image plane) to its zct coordinates, e.g. sheet 12: c 2, z 3, t 1.
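
The mapping between sheets and zct coordinates mentioned in the solution could look like the following sketch. The image dimensions, the ordering (z changes fastest, then c, then t), the zero-based coordinates, and the function names are all assumptions for illustration. Image planes are numbered from 1, not counting a metadata sheet.

```python
# Hypothetical image dimensions: 3 z-planes, 2 channels, 4 timepoints
n_z, n_c, n_t = 3, 2, 4


def sheet_to_zct(sheet, n_z, n_c):
    """Map an image plane number (1, 2, 3, ...) to zero-based (z, c, t).

    Assumed order: z changes fastest, then c, then t.
    """
    index = sheet - 1
    t, rest = divmod(index, n_z * n_c)
    c, z = divmod(rest, n_z)
    return z, c, t


def zct_to_sheet(z, c, t, n_z, n_c):
    """Inverse mapping: zero-based (z, c, t) to image plane number."""
    return 1 + z + n_z * (c + n_c * t)


# Print the mapping for a few example planes
for sheet in (1, 4, 12, n_z * n_c * n_t):
    z, c, t = sheet_to_zct(sheet, n_z, n_c)
    print(f'plane {sheet}: z {z} c {c} t {t}')
```

Storing such a mapping explicitly in the metadata avoids having to guess the dimension order when reading the file back.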

Discuss

  1. What are the pros and cons of converting an image into another format?
  2. What are the pros and cons of splitting metadata and image pixel data into separate files?
  3. Do you know any good file formats for image metadata?

Solution

  1. (A) Sometimes it is necessary to convert to another format to be able to open the image in a specific software. (B) Converting an image to another format typically loses information, e.g. because the file format that you are saving to cannot represent all the metadata of the original image file. Thus, it is in general recommended to keep the original image file. (C) Converting to a file format with good compression may save you considerable disk space.
  2. (A) Metadata is typically much smaller than the pixel data. Thus, it can be a good idea to keep metadata in a separate file that can be readily inspected (inspecting the potentially TB-sized pixel data files can be tricky). (B) The best file formats for metadata and pixel data can be very different due to the nature of the data, thus splitting can make sense. (C) Having separate files always bears the risk that you lose one of them, e.g. you may forget to copy both to a new folder.
  3. TXT, XML, and JSON are good formats for image metadata, because they are human-readable standard formats that can be opened with any text editor.
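
As a small illustration of the last point, the following sketch writes metadata as JSON text and reads it back using only the Python standard library. The metadata keys and values are made up.

```python
import json

# Hypothetical metadata for illustration; keys and values are made up
metadata = {
    'file': 'example_image.tif',
    'dimension_order': 'TCZYX',
    'shape': [1, 1, 1, 512, 512],
    'pixel_size_um': {'x': 0.2, 'y': 0.2},
}

# Serialize to human-readable text, as it would be saved in a .json file
text = json.dumps(metadata, indent=2)
print(text)

# Reading the text back restores the same content
restored = json.loads(text)
print(restored == metadata)
```

The printed text can be read and edited in any text editor, which is what makes such formats convenient for metadata.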




Follow-up material

Recommended follow-up modules:

Learn more: