Data types

Prerequisites

Before starting this lesson, you should be familiar with:

Learning Objectives

After completing this lesson, learners should be able to:
  • Understand that images have a data type which limits the values that the pixels in the image can have.

  • Understand common data types such as 8-bit, 12-bit and 16-bit unsigned integer.

Motivation

Images contain numerical values that must be stored on disk or in computer memory. To do so, a certain amount of space (memory), usually measured in bits, must be allocated for each pixel. Generally, the more bits you allocate, the larger the numbers you can store, but the more space you need. Thus, choosing the right data type is usually a balance between what you can represent and how much space you are willing to spend on it. Especially for large image data such as volume EM and light-sheet data, the choice of data type can have a considerable impact on your purse. In addition, certain operations on images can yield results with values outside the range of the original data type; this is a serious and frequently occurring source of mistakes when handling image data and thus must be well understood!
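As a rough illustration of both points, here is a minimal NumPy sketch. The volume shape is a made-up example and not taken from a real dataset; the second half shows one way an operation can leave the value range of its data type.

```python
import numpy as np

# Hypothetical volume size (z, y, x), chosen only for illustration
shape = (1000, 2048, 2048)
n_pixels = shape[0] * shape[1] * shape[2]

# Storage cost grows with the number of bytes per pixel
for dtype in (np.uint8, np.uint16, np.float32):
    bytes_per_pixel = np.dtype(dtype).itemsize
    size_gb = n_pixels * bytes_per_pixel / 1e9
    print(f"{np.dtype(dtype).name}: {bytes_per_pixel} byte(s) per pixel, {size_gb:.1f} GB")

# Operations can leave the value range of the data type:
# unsigned integer arrays silently wrap around on overflow
a = np.array([200], dtype=np.uint8)
b = np.array([100], dtype=np.uint8)
wrapped = a + b                                    # 300 does not fit into uint8
safe = a.astype(np.uint16) + b.astype(np.uint16)   # widen first, then add
print(wrapped, safe)
```

Widening to a larger data type before the operation avoids the wrap-around, at the price of more memory.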

Concept map

graph TD
    I("Image") -->|has| DT("Data type")
    DT -->|limits| PV("Pixel values")
    DT -->|has| BD("Bit depth")
    DT -->|has| VR("Value range")



Figure


Examples for data types of different bit depths.



Image data types

The pixels in an image have a certain data type. The data type limits the values that pixels can take.

For example, unsigned N-bit integer images can represent values from 0 to 2^N - 1, e.g.

  • 8-bit unsigned integer: 0 - 255
  • 12-bit unsigned integer: 0 - 4095
  • 16-bit unsigned integer: 0 - 65535
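The ranges listed above follow directly from the 2^N - 1 formula. The sketch below recomputes them; `unsigned_range` is a hypothetical helper name introduced here for illustration only.

```python
import numpy as np

def unsigned_range(n_bits):
    """Return (min, max) of an unsigned integer type with n_bits bits."""
    return 0, 2**n_bits - 1

for n_bits in (8, 12, 16):
    print(f"{n_bits}-bit unsigned integer:", unsigned_range(n_bits))

# NumPy provides 8-bit and 16-bit unsigned types, but no 12-bit type
print(np.iinfo(np.uint8).max, np.iinfo(np.uint16).max)
```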

Intensity clipping (saturation)

If the value of a pixel in an N-bit unsigned integer image is equal to either 0 or 2^N - 1, you cannot know for sure whether you lost information at some point during image acquisition or image storage. For example, if there is a pixel with the value 255 in an unsigned integer 8-bit image, it may be that the actual intensity “was higher”, e.g. would have corresponded to a gray value of 302. One speaks of “saturation” or “intensity clipping” in such cases. It is important to realise that clipping can also occur at the lower end of the range (some microscopes have an unfortunate “offset” slider that can be set to negative values, which can cause this).
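The following sketch simulates clipping with made-up "true" intensities (302 is the example value from the text). It is only meant to show why pixels at the limits are ambiguous, not how any particular microscope behaves.

```python
import numpy as np

# Made-up "true" intensities, for illustration only
true_signal = np.array([-12, 0, 100, 255, 302])

# Storing as 8-bit unsigned integer: values outside 0..255 are clipped
info = np.iinfo(np.uint8)
stored = np.clip(true_signal, info.min, info.max).astype(np.uint8)
print(stored)

# Pixels at the limits are candidates for clipping, not proof of it
suspicious = (stored == info.min) | (stored == info.max)
print(suspicious)
```

Note that the true values 0 and 255 are flagged as well, although they were not clipped: from the stored image alone the two cases cannot be distinguished.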




Activities

Find saturated pixels in an 8-bit image

Saturation, i.e. pixel values at the upper end of the data type's value range, is a typical problem in fluorescence microscopy images.


skimage napari

# %% 
# Check for saturation in an 8-bit image 

# %%
# Import libraries and instantiate napari
import napari
import numpy as np
import matplotlib.pyplot as plt
from OpenIJTIFF import open_ij_tiff

viewer = napari.Viewer()

# %%
# Open an image and view it
image, *_ = open_ij_tiff('https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_8bit__nuclei_intensity_clipping_issue_a.tif')
viewer.add_image(image)

# TODO: This would be nice https://forum.image.sc/t/add-hilo-colormap-to-napari/95601

# %% 
# Check the image's datatype
print(image.dtype)
print(np.iinfo(image.dtype)) # Useful as it also prints the value range

# %%
# Check for clipping, i.e. pixel values at the limits of the value range
# This is important for many reasons, for example: 
# - Pixel values at the limit of the value range typically cannot be used for intensity quantification 
# - Important algorithms, e.g. for spot detection, do not work well in regions with intensity clipping
print("Min:", image.min()) # Are there any clipped pixels?
print("Max:", image.max()) # Are there any clipped pixels?
print("Number of 0 pixels:", np.sum(image==0)) # How many clipped pixels are there?
print("Number of 255 pixels:", np.sum(image==255))
# Cast to int to avoid uint8 overflow; "+ 2" gives the maximum value its own bin
plt.hist(image.flatten(), bins=np.arange(int(image.min()), int(image.max()) + 2));



Find saturated pixels in a 12-bit image

Saturation, i.e. pixel values at the upper end of the data type's value range, is a typical problem in fluorescence microscopy images.

Inspecting 12-bit images, as acquired by some camera-based systems, is particularly tricky, because 12-bit images are typically represented as 16-bit images, both on disk and within analysis software.
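A simple heuristic for spotting such cases is to ask which bit depth the observed maximum would fit into. The sketch below uses synthetic data; `smallest_bit_depth` and its list of candidate bit depths are assumptions made for illustration, and the result is only a hint (a dim 16-bit image would also be reported as a lower bit depth), so the image metadata remains the authoritative source.

```python
import numpy as np

def smallest_bit_depth(image, candidates=(8, 10, 12, 14, 16)):
    """Return the smallest candidate bit depth whose range holds image.max()."""
    max_value = int(image.max())
    for n_bits in candidates:
        if max_value <= 2**n_bits - 1:
            return n_bits
    return None

# Synthetic 12-bit data stored in a 16-bit container
image = np.array([[100, 4095], [4095, 2000]], dtype=np.uint16)
n_bits = smallest_bit_depth(image)
print(image.dtype, n_bits)

# Check for saturation against the 12-bit maximum, not the uint16 maximum
print("Pixels at the 12-bit maximum:", int(np.sum(image == 2**12 - 1)))
```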


skimage napari

# %% 
# Check for saturation in a 12 bit image

# %%
# Import libraries and instantiate napari
import napari
import numpy as np
import matplotlib.pyplot as plt
from OpenIJTIFF import open_ij_tiff

# %%
# Open an image and view it in napari
image, *_ = open_ij_tiff('https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_12bit__saturated_plant.tif')
viewer = napari.Viewer()
viewer.add_image(image)

# %%
# Napari:
# - Hover with the mouse to look for saturation

# %% 
# Check the image's datatype
print(image.dtype)
print(np.iinfo(image.dtype)) # Useful as it also prints the value range

# %%
# Check for clipping, i.e. pixel values at the limits of the value range
# This is important for many reasons, for example: 
# - Pixel values at the limit of the value range typically cannot be used for intensity quantification 
# - Important algorithms, e.g. for spot detection, do not work well in regions with intensity clipping
print("Min:", image.min()) # Are there any clipped pixels?
print("Max:", image.max()) # Are there any clipped pixels?

# %% 
# Compute the maximal value of various data types,
# and observe that, suspiciously, our image's maximum value 
# matches that of a 12-bit image
print("8 bit max:", 2**8-1)
print("12 bit max:", 2**12-1)
print("16 bit max:", 2**16-1)

# %% 
# Check how many saturated pixels we have
print("Number of 4095 pixels:", np.sum(image==4095))

# %%
# To double check that this really is a 12 bit image 
# you can try to inspect the image metadata
# - If you open the image in Fiji you can do: Image > Show Info
# - TODO: find out how to do this in python



Inspect a binary image

This activity shows that correctly handling binary images can be tricky because there typically is no dedicated binary datatype for storing images on disk.


skimage napari

# %% 
# Open and inspect a binary image 

# %%
# Import libraries and instantiate napari
import napari
import numpy as np
import matplotlib.pyplot as plt
from OpenIJTIFF import open_ij_tiff

viewer = napari.Viewer()

# %%
# Open image and view it
image, *_ = open_ij_tiff('https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_8bit_binary__h2b.tif')
viewer.add_image(image)

# %% 
# Check the image's datatype and values
# - From the datatype alone we cannot tell that this is a binary image (aka a mask)
# - But the fact that it only has two values suggests that it in fact is a binary image
print(np.iinfo(image.dtype)) 
print("Min:", image.min())
print("Max:", image.max()) 
print(np.unique(image))

# %%
# Convert to a boolean binary image
# - For working with this mask in python it will probably be more convenient to convert it to a boolean type image
# - The issue is that boolean type images cannot be saved as such on disk, because e.g. TIFF does not support this datatype
binary_image = ( image == 255 ) 
print(image.shape, binary_image.shape) # ensure we did not mess up the shape
# np.iinfo is not implemented for bool
print(binary_image.dtype)
print(np.unique(binary_image))
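Going the other way, a boolean mask can be turned back into an 8-bit image before saving. This is a minimal sketch on a synthetic mask; the convention of foreground 255 and background 0 is assumed here to match the image used above, and other conventions (e.g. 0 and 1) exist.

```python
import numpy as np

# Synthetic boolean mask, for illustration only
mask = np.array([[True, False], [False, True]])

# Assumed convention: foreground 255, background 0
mask_uint8 = mask.astype(np.uint8) * 255
print(mask_uint8.dtype, np.unique(mask_uint8))

# Reading it back: any non-zero value counts as foreground
restored = mask_uint8 > 0
print(np.array_equal(mask, restored))
```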



Inspect a ScanR microscope 12-bit image

This activity shows that correctly handling 12-bit data can be tricky, because there is typically no 12-bit datatype, neither on disk nor in memory.


skimage napari

# %% 
# Explore the pixel values of an image acquired with a 12-bit camera 

# %%
# Import libraries and instantiate napari
import napari
import numpy as np
import matplotlib.pyplot as plt
from OpenIJTIFF import open_ij_tiff

viewer = napari.Viewer()

# %%
# Open image and view it
image, *_ = open_ij_tiff('https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_16bit__scanR_datatype_issue.tif')
viewer.add_image(image)

# %% 
# Check the image's datatype and value range
# - The intensity values reside "strangely" in the middle of the 16 bit data type range
print(image.dtype) 
dtype_min = np.iinfo(image.dtype).min
dtype_max = np.iinfo(image.dtype).max
print(dtype_min, dtype_max)

print(image.min(), image.max())

plt.hist(image.flatten(), bins=np.arange(dtype_min, dtype_max + 2))  # "+ 2" gives the maximum value its own bin
plt.yscale("log")

# %%
# This is a bit advanced/annoying 
# but here's how to shift the image values to be better interpretable
# and then how to check for saturation
# Cast to a signed type first, so that values below the offset cannot wrap around
image_rescaled = image.astype(np.int32) - 2**15  # remove offset due to misinterpretation of the first bit (signed/unsigned)
print(image_rescaled.min(), image_rescaled.max()) 
print(0, 2**12-1) # print uint12 data range, which is the range of the camera the image was acquired with
print(np.sum(image_rescaled == 2**12-1)) # check number of saturated pixels
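The offset removal can be tried out on synthetic numbers. The sketch below assumes, as in the activity, a 12-bit signal stored with an offset of 2**15 in a 16-bit unsigned image; the pixel values themselves are made up. It also shows why it is safer to convert to a signed type before subtracting.

```python
import numpy as np

# Synthetic stand-in for a 12-bit image stored with a 2**15 offset
offset = 2**15
raw = np.array([offset + 0, offset + 1500, offset + 4095], dtype=np.uint16)

# Widen to a signed type before subtracting the offset
shifted = raw.astype(np.int32) - offset
print(shifted)
print("Saturated pixels:", int(np.sum(shifted == 2**12 - 1)))

# What goes wrong without widening: a value below the offset wraps around
below = np.array([offset - 1], dtype=np.uint16)
wrapped = below - np.uint16(offset)          # stays uint16
widened = below.astype(np.int32) - offset    # signed result
print(wrapped, widened)
```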



Explore various image data types

Open the following images and discuss their data type and whether there are any signs of intensity clipping.


skimage napari

# %% 
# Explore image data types and value ranges

# %%
# Import libraries and instantiate napari
import napari
import numpy as np
import matplotlib.pyplot as plt
from OpenIJTIFF import open_ij_tiff

viewer = napari.Viewer()

# %%
# Open an image and view it
image, *_ = open_ij_tiff('https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_8bit__nuclei_intensity_clipping_issue_a.tif')
viewer.add_image(image)

# https://forum.image.sc/t/add-hilo-colormap-to-napari/95601

# %% 
# Check the image's datatype
print(image.dtype)
print(np.iinfo(image.dtype)) # Useful as it also prints the value range

# %%
# Check for clipping, i.e. pixel values at the limits of the value range
# This is important for many reasons, for example: 
# - Pixel values at the limit of the value range typically cannot be used for intensity quantification 
# - Important algorithms, e.g. for spot detection, do not work well in regions with intensity clipping
print("Min:", image.min()) # Are there any clipped pixels?
print("Max:", image.max()) # Are there any clipped pixels?
print("Number of 0 pixels:", np.sum(image==0)) # How many clipped pixels are there?
print("Number of 255 pixels:", np.sum(image==255))
# Cast to int to avoid uint8 overflow; "+ 2" gives the maximum value its own bin
plt.hist(image.flatten(), bins=np.arange(int(image.min()), int(image.max()) + 2));


#image, *_ = open_ij_tiff('https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_8bit_binary__h2b.tif')
#image, *_ = open_ij_tiff('https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_16bit__autophagosomes.tif')
image, *_ = open_ij_tiff('https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_16bit__scanR_datatype_issue.tif')

# View the image
viewer.add_image(image)

# %%
# Check the image's datatype and its value limits
print(np.iinfo(image.dtype))

# Check the image's minimum and maximum intensity
print(np.min(image),np.max(image))


ImageJ GUI

  • For each image mentioned in the activity, perform the tasks below.
  • Open the image in Fiji.
  • Use various ways to inspect the image and verify the comments that are given below the respective image in the activity.
  • To this end, useful tools are:
    • Image › Show Info...
    • Inspect pixel values by hovering over the image with the mouse.
    • Analyze › Histogram
    • Analyze › Plot Profile



Explore more image data types and their metadata

Observe that for some software the datatype of the loaded image does not match the datatype given in the metadata.

The reason is that some software only supports data types where the bit depth is a multiple of 8. For example, unsigned integer 12-bit data may not be supported.

This is very important as you may misinterpret whether your image contains saturated pixels or not.

Image data

ImageJ GUI

  • Download any of the above images
  • Open the image using Plugins > Bio-Formats > Bio-Formats Importer
    • Select [X] Display OME-XML metadata
    • Click [ OK ]
  • Check whether the information in Image > Type is the same as the one mentioned in the displayed metadata (look for SignificantBits and Type)
  • Also check the maximum value in the image, e.g. using Analyze > Histogram
  • How does this maximum value compare to the image datatype?
    • For example, you may find a value of 4095, which is the maximum of an unsigned integer 12-bit image, which may be the datatype mentioned in the image metadata, however ImageJ may represent this image as a 16-bit image. Appreciate that this can be confusing!
    • If you find the maximum of the image to be identical to the maximum that the datatype of the image can represent, you may have an issue with saturation! Check this:
      • by hovering with the mouse over bright regions
      • using the HiLo LUT with appropriate contrast settings, i.e. the maximum should be the maximum of your datatype!






Assessment

True or false? Discuss with your neighbor!

  1. Changing pixel data type never changes pixel values.
  2. Converting from 16-bit unsigned integer to 32-bit floating point never changes the pixel values.
  3. Changing from 32-bit floating point to 16-bit unsigned integer never changes the pixel values.
  4. There is only one correct way to convert from 16-bit to 8-bit.
  5. If the highest value in an image is 255, one can conclude that it is an 8-bit unsigned integer image.
  6. If the highest value in an image is 1034, one can conclude that it is not an 8-bit unsigned integer image.
  7. If the bit-depth is 16 and there are a lot of neighboring pixels with the value 4095 and no pixels with a higher value, most likely this image was acquired with 12-bit camera.

Solution

  1. False
  2. True
  3. False
  4. False
  5. False
  6. True
  7. True
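Several of these answers can be checked with a few lines of NumPy. The sketch below uses made-up pixel values; the two 16-bit to 8-bit conversions shown are just two of several possible choices, not a recommendation.

```python
import numpy as np

values = np.array([0, 1034, 4095, 65535], dtype=np.uint16)

# Statement 2: uint16 -> float32 keeps every value exactly
as_float = values.astype(np.float32)
print(np.array_equal(as_float, values))

# Statement 3: float32 -> uint16 drops the fractional part
floats = np.array([0.5, 1034.7], dtype=np.float32)
as_uint16 = floats.astype(np.uint16)
print(as_uint16)

# Statement 4: two of several possible 16-bit -> 8-bit conversions
clipped = np.clip(values, 0, 255).astype(np.uint8)   # keep values, clip at 255
scaled = (values // 257).astype(np.uint8)            # map 0..65535 onto 0..255
print(clipped, scaled)
```

Clipping preserves small values but saturates everything above 255, whereas scaling preserves relative brightness but changes every value; which one is appropriate depends on what the image is used for.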




Follow-up material

Recommended follow-up modules:

Learn more: