Batch processing


Scientific discovery relies on reproducibility. It is therefore very common to apply the same analysis workflow to many images, often covering several biological conditions. Doing this efficiently requires knowing how to “batch process” images.


Prerequisites

Before starting this lesson, you should be familiar with:

Learning Objectives

After completing this lesson, learners should be able to:
  • Automatically process a number of images

Concept map

graph TD
    I1("Image 1") --> S("Analysis workflow")
    I2("Image 2") --> S
    IN("...") --> S
    S --> R1("Result 1")
    S --> R2("Result 2")
    S --> RN("...")

Figure


Batch processing of several images, yielding one segmentation and one object measurement table per image.



Activities

Batch analysis of nuclear shapes

FUNCTION analyse(image_path, output_folder)
    PRINT "Analyzing:", image_path
END FUNCTION

FOR each image_path in image_paths
    CALL analyse(image_path, output_dir)
END FOR
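
The pseudocode above maps almost directly onto Python. The following is a minimal sketch, not the lesson's reference implementation: `collect_image_paths` and `analyse_all` are hypothetical helpers, the `.tif`/`.tiff` filter is an assumption about how the images are stored, and `analyse` is a placeholder that only prints the path (as in the pseudocode) and returns the label file name it would write.

```python
from pathlib import Path


def collect_image_paths(input_folder, suffixes=(".tif", ".tiff")):
    """Return the sorted image files found directly inside input_folder.

    Hypothetical helper: the suffix filter is an assumption for illustration.
    """
    folder = Path(input_folder)
    return sorted(
        path for path in folder.iterdir()
        if path.is_file()
        and not path.name.startswith(".")      # skip hidden files
        and path.suffix.lower() in suffixes    # keep only TIFF files
    )


def analyse(image_path, output_folder):
    """Placeholder analysis: only reports which image would be processed."""
    print("Analyzing:", image_path)
    # Return the output path a real analysis might write (assumed naming)
    return Path(output_folder) / f"{Path(image_path).stem}_labels.tif"


def analyse_all(input_folder, output_folder):
    """Apply the same analysis to every image found in input_folder."""
    return [analyse(path, output_folder)
            for path in collect_image_paths(input_folder)]
```

For example, `analyse_all("data", "results")` would call `analyse` once per TIFF file in a folder named `data`; both folder names are placeholders.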


ImageJ Macro

/**
 * 2D Nuclei area measurement
 * 
 * Requirements:
 *   - Update site: IJPB-Plugins (MorpholibJ)
 */


// Get input and output directories from the user
// Note that one could also just specify the folders in the script, without the GUI elements
// Pro: Better for documentation 
// Con: Maybe less user friendly
inputDir = getDirectory("Choose Input Directory");
if (inputDir == "") exit("User cancelled input directory selection.");
outputDir = getDirectory("Choose Output Directory");
if (outputDir == "") exit("User cancelled output directory selection.");

// Processing parameters
// Note that one could decide to expose the below parameters in the above UI, too.
// But then one would need to somehow document their values for each run of the script.
// Like this, one could save the whole script with the analysis results for documentation purposes.
threshold = 25;


// Clean up and avoid popping up of image windows during run
run("Close All");
run("Clear Results");
setBatchMode(true);

// init options
run("Options...", "iterations=1 count=1 black do=Nothing");

// Get list of files in the input directory
list = getFileList(inputDir);

// Loop through all files in the directory
for (i = 0; i < list.length; i++) {
    fileName = list[i];
    // Skip directories and non-image files (basic check)
    if (File.isDirectory(inputDir + fileName) || startsWith(fileName, ".") || !(endsWith(fileName, ".tif") || endsWith(fileName, ".tiff"))) {
        print("Skipping: " + fileName);
        continue;
    }
    
    print("Processing: " + inputDir + fileName);

    // open and process
    //
    open(inputDir + fileName);

    // extract image name to create output file names (see below)
    imageName = File.getNameWithoutExtension(fileName);

    // segment
    setThreshold(threshold, 65535);
    run("Convert to Mask");
    run("Connected Components Labeling", "connectivity=4 type=[8 bits]");
    run("glasbey_on_dark");
    // save segmentation
    saveAs("Tiff", outputDir + File.separator + imageName + "_labels.tif");
    // measure
    run("Analyze Regions", "area");
    // save measurements
    saveAs("Results", outputDir + File.separator + imageName + ".txt");
    run("Clear Results"); // Clear results for the next image
    close("*"); // Close all images opened for this file (mask and label image)
}

setBatchMode(false); // Restore normal display mode
run("Close All"); // Close any remaining windows (like results table if not cleared properly)
print("Batch processing finished.");

ImageJ SciJava Macro

/**
 * 2D Nuclei area measurement
 * 
 * Requirements:
 *   - Update site: IJPB-Plugins (MorpholibJ)
 */

// Scijava script UI parameters
// Use the [ Batch ] button in the Fiji script editor to automatically analyse multiple files
#@ File (label="Input image") inputImageFile
#@ File (label="Output directory", style="directory") outputDir

// Processing parameters
// Note that one could decide to expose the below parameters in the above UI, too.
// But then one would need to somehow document their values for each run of the script.
// Like this, one could save the whole script with the analysis results for documentation purposes.
threshold = 25;

// Clean up and avoid popping up of image windows during run
run("Close All");
run("Clear Results");
setBatchMode(true);

// init options
run("Options...", "iterations=1 count=1 black do=Nothing");

// open and process
//
open(inputImageFile);

// extract image name to create output file names (see below)
imageName = File.getNameWithoutExtension(inputImageFile);

// segment
setThreshold(threshold, 65535);
run("Convert to Mask");
run("Connected Components Labeling", "connectivity=4 type=[8 bits]");
run("glasbey_on_dark");
// save segmentation
saveAs("Tiff", outputDir + File.separator + imageName + "_labels.tif");
// measure
run("Analyze Regions", "area");
// save measurements
saveAs("Results", outputDir + File.separator + imageName + ".txt");

run("Close"); // close results table

skimage python

# %% 
# Batch analysis of 2D nuclei shape measurements


# %%
# Import python modules
from OpenIJTIFF import open_ij_tiff, save_ij_tiff
from skimage.measure import label, regionprops_table
from skimage.filters import threshold_otsu
import pandas as pd
import pathlib
from pathlib import Path


# %%
# Create a function that analyses one image
# Below, this function will be called several times, for all images
def analyse(image_filepath, output_folder):

    # This prints which image is currently analysed
    print("Analyzing:", image_filepath)

    # Convert the image_filepath String to a Path,
    # which is more convenient to create the output files
    image_filepath = pathlib.Path(image_filepath)

    image, axes, scales, units = open_ij_tiff(image_filepath)

    # Binarize the image using auto-thresholding
    threshold = threshold_otsu(image)
    print("Threshold:", threshold)
    binary_image = image > threshold

    # Perform connected components analysis (i.e., create labels)
    # Note that label returns 32-bit data, which save_ij_tiff below can't handle.
    # We can safely convert to 16 bit as we know that we don't have too many objects
    label_image = label(binary_image).astype('uint16')

    # Save the labels
    label_image_filepath = output_folder / f"{image_filepath.stem}_labels.tif"
    save_ij_tiff(label_image_filepath, label_image, axes, scales, units)

    # Measure calibrated (scaled) nuclei shapes
    df = pd.DataFrame(regionprops_table(
        label_image,
        properties=('label', 'area', 'centroid'),
        spacing=scales))

    # Round all measurements to 2 decimal places.
    # This increases the readability a lot,
    # but depending on your scientific question,
    # you may not want to round that much!
    df = df.round(2)

    # Add the image and label filepaths to the data-frame
    df['image'] = image_filepath
    df['labels'] = label_image_filepath

    # Return the data-frame
    return df
    

# %%
# Assign an output folder 
# Note: This uses your current working directory; you may want to change this to another folder on your computer
output_dir = Path.cwd()


# %%
# Create a list of the paths to all input images (here expected to be in output_dir)
image_paths = [output_dir / "xy_8bit__mitocheck_incenp_t1.tif",
               output_dir / "xy_8bit__mitocheck_incenp_t70.tif"]
# Create an empty list for the measurement results
result_dfs = []


# %%
# The loop which performs the analysis
for image_path in image_paths:

    # Computes the analysis and returns a data-frame with the resulting measurements
    result_df = analyse(image_path, output_dir)

    # Append the result data-frame to the list initialized before the loop
    result_dfs.append(result_df)


# %%
# Concatenate the result data-frames to a single one which contains all results
final_df = pd.concat(result_dfs, ignore_index=True)
# Save the final results to disk (tab-separated, despite the .csv extension)
final_df.to_csv(output_dir / 'batch_processing_results.csv', sep='\t', index=False)
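
When a batch contains many images, a single unreadable file would stop the loop above and the remaining images would not be analysed. One possible way to make the loop more tolerant is sketched below, under stated assumptions: `run_batch` is a hypothetical helper that is not part of the lesson's script, and it accepts any per-image function with the same `(image_path, output_folder)` signature as `analyse`.

```python
def run_batch(image_paths, output_folder, analyse_function):
    """Call analyse_function on every path and keep going if one image fails.

    Hypothetical helper for illustration. Returns the successful results
    and a list of (image_path, error_message) pairs for the failures.
    """
    results = []
    failures = []
    for image_path in image_paths:
        try:
            results.append(analyse_function(image_path, output_folder))
        except Exception as error:  # deliberately broad: report and continue
            print("Failed:", image_path, "-", error)
            failures.append((image_path, str(error)))
    return results, failures
```

In the script above, `result_dfs, failures = run_batch(image_paths, output_dir, analyse)` could take the place of the loop; inspecting `failures` afterwards shows which images need attention.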






Assessment

Fill in the blanks

  1. If you have thousands of images to process, you should consider using a ___.
  2. Batch processing refers to ___ processing many data sets.

Solution

  1. computer cluster (HPC)
  2. automatically
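
On a cluster, a batch is often split so that each job handles only part of the image list. The sketch below shows one simple way to do that split in Python. It assumes a scheduler that starts the same script several times, each time with a different job index; `select_job_paths`, `job_index` and `n_jobs` are hypothetical names and do not belong to any particular scheduler's API.

```python
def select_job_paths(image_paths, job_index, n_jobs):
    """Return the subset of image_paths that job number job_index should process.

    Paths are sorted first so that every job sees the same order, then dealt
    out round-robin: job 0 gets items 0, n_jobs, 2 * n_jobs, and so on.
    """
    if not 0 <= job_index < n_jobs:
        raise ValueError("job_index must be in the range 0 .. n_jobs - 1")
    return sorted(image_paths)[job_index::n_jobs]
```

Every image is assigned to exactly one job, so the per-job result tables can be concatenated afterwards without duplicates.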




Follow-up material

Recommended follow-up modules:

Learn more: