Batch processing

Prerequisites

Before starting this lesson, you should be familiar with:

Learning Objectives

After completing this lesson, learners should be able to:
  • Automatically process a number of images

Motivation

Scientific discovery relies on reproducibility, so the same analysis workflow is commonly applied to many images, often covering different biological conditions. It is therefore important to know how to efficiently “batch process” many images.

Concept map

graph TD
    I1("Image 1") --> S("Analysis workflow")
    I2("Image 2") --> S
    IN("...") --> S
    S --> R1("Result 1")
    S --> R2("Result 2")
    S --> RN("...")



Figure: Batch processing of several images, yielding as many segmentations and object measurement tables.






Activities

Batch analysis of nuclear shapes

FUNCTION analyse(image_path, output_folder)
    PRINT "Analyzing:", image_path
END FUNCTION

FOR each image_path in image_paths
    CALL analyse(image_path, output_dir)
END FOR
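
The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the lesson's reference implementation: the helper name `batch_process`, the `*.tif` file pattern, and the `_result.txt` output name are assumptions made for this example, and `analyse` is only a placeholder that reports which image it would process.

```python
from pathlib import Path


def analyse(image_path, output_folder):
    """Placeholder analysis step: only reports which image would be processed."""
    print("Analyzing:", image_path)
    # Hypothetical output name; a real workflow would write a result file here
    return Path(output_folder) / f"{Path(image_path).stem}_result.txt"


def batch_process(input_folder, output_folder, pattern="*.tif"):
    """Apply analyse() to every file in input_folder that matches pattern."""
    # sorted() makes the processing order reproducible across runs
    image_paths = sorted(Path(input_folder).glob(pattern))
    return [analyse(image_path, output_folder) for image_path in image_paths]
```

Collecting the paths with a file pattern, rather than typing them by hand, means that new images placed in the input folder are picked up without changing the code.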


ImageJ SciJava Macro

/**
 * 2D Nuclei area measurement
 *
 * Requirements:
 *   - Update site: IJPB-Plugins (MorpholibJ)
 */

// Scijava script parameters
// Use the [ Batch ] button in the Fiji script editor to automatically analyse multiple files
#@ File (label="Input image") inputImageFile
#@ File (label="Output directory", style="directory") outputDir

// Processing parameters
threshold = 25;

// Clean up and avoid popping up of image windows during run
run("Close All");
run("Clear Results");
setBatchMode(true);

// init options
run("Options...", "iterations=1 count=1 black do=Nothing");

// open and process
open(inputImageFile);

// extract image name to create output file names (s.b.)
imageName = File.getNameWithoutExtension(inputImageFile);

// segment
setThreshold(threshold, 65535);
run("Convert to Mask");
run("Connected Components Labeling", "connectivity=4 type=[8 bits]");
run("glasbey_on_dark");
// save segmentation
saveAs("Tiff", outputDir + File.separator + imageName + "_labels.tif");
// measure
run("Analyze Regions", "area");
// save measurements
saveAs("Results", outputDir + File.separator + imageName + ".txt");

run("Close"); // close results table

skimage python

# %% 
# Batch analysis of 2D nuclei shape measurements


# %%
# Import python modules
from OpenIJTIFF import open_ij_tiff, save_ij_tiff
from skimage.measure import label, regionprops_table
from skimage.filters import threshold_otsu
import pandas as pd
import pathlib
from pathlib import Path


# %%
# Create a function that analyses one image
# Below, this function will be called several times, for all images
def analyse(image_filepath, output_folder):

    # This prints which image is currently analysed
    print("Analyzing:", image_filepath)

    # Convert the image_filepath String to a Path,
    # which is more convenient to create the output files
    image_filepath = pathlib.Path(image_filepath)

    image, axes, scales, units = open_ij_tiff(image_filepath)

    # Binarize the image using auto-thresholding
    threshold = threshold_otsu(image)
    print("Threshold:", threshold)
    binary_image = image > threshold

    # Perform connected components analysis (i.e. create labels)
    # Note that label returns 32 bit data, which save_ij_tiff below can't handle.
    # We can safely convert to 16 bit as we know that we don't have too many objects
    label_image = label(binary_image).astype('uint16')

    # Save the labels
    label_image_filepath = output_folder / f"{image_filepath.stem}_labels.tif"
    save_ij_tiff(label_image_filepath, label_image, axes, scales, units)

    # Measure calibrated (scaled) nuclei shapes
    # A tuple (rather than a set) keeps the column order deterministic
    df = pd.DataFrame(regionprops_table(
        label_image,
        properties=('label', 'area', 'centroid'),
        spacing=scales))

    # Round all measurements to 2 decimal places.
    # This increases the readability a lot,
    # but depending on your scientific question,
    # you may not want to round that much!
    df = df.round(2)

    # Add the image and label filepaths to the data-frame
    df['image'] = image_filepath
    df['labels'] = label_image_filepath

    # Return the data-frame
    return df
    

# %%
# Assign an output folder 
# Note: This uses your current working directory; you may want to change this to another folder on your computer
output_dir = Path.cwd()


# %%
# Create a list of the paths to all input images
# Note: the images are expected to be located in output_dir
image_paths = [output_dir / "xy_8bit__mitocheck_incenp_t1.tif",
               output_dir / "xy_8bit__mitocheck_incenp_t70.tif"]
# Create an empty list for the measurement results
result_dfs = []


# %%
# The loop which performs the analysis
for image_path in image_paths:

    # Computes the analysis and returns a data-frame with the resulting measurements
    result_df = analyse(image_path, output_dir)

    # Append the resulting data-frame to the list initialized before the loop
    result_dfs.append(result_df)


# %%
# Concatenate the result data-frames to a single one which contains all results
final_df = pd.concat(result_dfs, ignore_index=True)
# Save the final results to disk
# Note: the file is tab-separated (sep='\t') although its name ends in .csv
final_df.to_csv(output_dir / 'batch_processing_results.csv', sep='\t', index=False)
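
Once all measurements are in one table, a per-image summary is often the next step. The following is a minimal sketch of one way to do this with pandas, assuming the combined table has the `image` and `area` columns created by the script above. The function name `summarise_per_image` and the values in the toy table are made up for illustration only.

```python
import pandas as pd


def summarise_per_image(final_df):
    """Return the object count and mean area for each image in the combined table."""
    return (final_df
            .groupby("image")["area"]
            .agg(object_count="count", mean_area="mean")
            .reset_index())


# Toy table with made-up values, only to show the expected column layout
toy_df = pd.DataFrame({
    "image": ["a.tif", "a.tif", "b.tif"],
    "area": [10.0, 20.0, 30.0],
})
print(summarise_per_image(toy_df))
```

Because every row carries the path of the image it came from, results from different biological conditions can be grouped and compared without re-running the analysis.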






Assessment

Fill in the blanks

  1. If you have thousands of images to process, you should consider using a ___.
  2. Batch processing refers to ___ processing many data sets.

Solution

  1. computer cluster (HPC)
  2. automatically
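
On a computer cluster, a batch is typically split into several independent jobs that each process a share of the images. The sketch below shows one simple way to divide the work, assuming each job knows its own index and the total number of jobs (for example from the scheduler's job-array index). The function name `paths_for_job` and the image names are hypothetical, and submitting the jobs to a scheduler is not shown.

```python
def paths_for_job(image_paths, job_index, n_jobs):
    """Return the subset of image_paths that job number job_index should process.

    Images are dealt out round-robin, so the job sizes differ by at most one.
    """
    if not 0 <= job_index < n_jobs:
        raise ValueError("job_index must be in the range [0, n_jobs)")
    return image_paths[job_index::n_jobs]


# Example: 5 hypothetical images split across 2 jobs
image_paths = [f"image_{i}.tif" for i in range(5)]
for job_index in range(2):
    print(job_index, paths_for_job(image_paths, job_index, 2))
```

Each job can then run the same per-image analysis function on its own subset, and the resulting tables can be concatenated afterwards.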




Follow-up material

Recommended follow-up modules:

Learn more: