mednet.data.segment.jsrt

Japanese Society of Radiological Technology dataset for lung segmentation.

The database includes 154 nodule and 93 non-nodule images, for a total of 247 images with a resolution of 2048 x 2048 pixels, digitized from original radiographs with a laser scanner. One set of ground-truth lung annotations is available.

  • Database references:

Important

Raw data organization

The JSRT base datadir, which you should configure following the Setup instructions, must contain at least the following directories:

  • All247images/ (directory containing the CXR images, in raw format)

  • scratch/ (must contain masks downloaded from JSRT-Annotations)
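As a minimal sketch, the layout requirement above can be checked before any loading is attempted. The helper name `missing_subdirs` is hypothetical and not part of mednet; only the two directory names come from the list above.

```python
from pathlib import Path

# Directory names taken from the list above.
REQUIRED_SUBDIRS = ("All247images", "scratch")


def missing_subdirs(datadir: Path) -> list[str]:
    """Return the required sub-directories that are absent from ``datadir``.

    Hypothetical helper for illustration; it is not part of mednet.
    """
    return [name for name in REQUIRED_SUBDIRS if not (datadir / name).is_dir()]
```

An empty list means both directories are present; any returned name points at a directory that still has to be created or populated.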

Data specifications:

  • Raw data input (on disk):

    • Original images encoded in proprietary 12-bit RAW format. A PNG-converted set of images is provided at JSRT-Kaggle for your reference. Input resolution is 2048 x 2048 pixels.

    • Masks: encoded as GIF files with separate portions for left and right lungs, with a resolution of 1024 x 1024 pixels

    • Total samples: 247

  • Output sample:

    • Image: The raw image is loaded from the folder All247images/ using numpy.fromfile(), and a simple histogram equalization is then applied to its 8-bit representation, to obtain something along the lines of the (unofficial) PNG version distributed at JSRT-Kaggle. Output images have a size of 1024 x 1024 pixels, achieved by resizing the original input with bilinear interpolation.

    • Labels for each of the lungs are read from the provided GIF files and merged into a single output image.
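The label merging described in the last item can be sketched as a pixel-wise OR of the two single-lung masks. This is an illustration only: the function name `merge_lung_masks` is hypothetical, and treating every non-zero pixel as lung is an assumption about the GIF encoding, not something this page confirms.

```python
import numpy as np
from PIL import Image


def merge_lung_masks(left: Image.Image, right: Image.Image) -> Image.Image:
    """Combine two single-lung masks into one binary mask (pixel-wise OR).

    Assumption: any non-zero pixel marks lung tissue in either input.
    """
    left_arr = np.asarray(left.convert("L")) > 0
    right_arr = np.asarray(right.convert("L")) > 0
    merged = np.logical_or(left_arr, right_arr)
    # Encode the merged mask as an 8-bit grayscale image (0 or 255).
    return Image.fromarray((merged * 255).astype(np.uint8))
```

In practice the two inputs would come from `Image.open()` on the left-lung and right-lung GIF files of the same sample.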

The default split contains 172 samples for training, 25 for validation and 50 for test.
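A quick arithmetic check shows that these three subsets account for all 247 samples and correspond to roughly a 70/10/20 partition. The dictionary keys below are illustrative labels, not necessarily the names used in the split files.

```python
# Sizes quoted above; the key names are illustrative only.
split_sizes = {"train": 172, "validation": 25, "test": 50}

total = sum(split_sizes.values())
# Share of each subset, in percent, rounded to one decimal place.
percentages = {name: round(100 * n / total, 1) for name, n in split_sizes.items()}
```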

This module contains the base declaration of common data modules and raw-data loaders for this database. All configured splits inherit from this definition.

Module Attributes

DATABASE_SLUG

Pythonic name to refer to this database.

CONFIGURATION_KEY_DATADIR

Key to search for in the configuration file for the root directory of this database.

Classes

DataModule(split_path)

Japanese Society of Radiological Technology dataset for lung segmentation.

RawDataLoader()

A specialized raw-data-loader for the jsrt dataset.

mednet.data.segment.jsrt.DATABASE_SLUG = 'jsrt'

Pythonic name to refer to this database.

mednet.data.segment.jsrt.CONFIGURATION_KEY_DATADIR = 'datadir.jsrt'

Key to search for in the configuration file for the root directory of this database.
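The dotted form of this key suggests a nested lookup. The sketch below assumes the configuration file has already been parsed into nested dictionaries (for instance a `datadir` table holding a `jsrt` entry); both the helper `lookup_dotted` and the example path are hypothetical.

```python
from pathlib import Path
from typing import Any

CONFIGURATION_KEY_DATADIR = "datadir.jsrt"


def lookup_dotted(config: dict[str, Any], dotted_key: str) -> Any:
    """Walk a nested mapping one dot-separated component at a time."""
    node: Any = config
    for part in dotted_key.split("."):
        node = node[part]  # raises KeyError if a component is missing
    return node


# Hypothetical parsed configuration, for illustration only.
example_config = {"datadir": {"jsrt": "/path/to/jsrt"}}
datadir = Path(lookup_dotted(example_config, CONFIGURATION_KEY_DATADIR))
```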

class mednet.data.segment.jsrt.RawDataLoader[source]

Bases: RawDataLoader

A specialized raw-data-loader for the jsrt dataset.

datadir: Path

This variable contains the base directory where the database raw data is stored.

load_pil_raw_12bit_jsrt(path)[source]

Load a raw sample holding 12-bit data stored in 16-bit words.

This method was designed to handle the raw images from the JSRT dataset. It reads the data file and applies a simple histogram equalization to the 8-bit representation of the image to obtain something along the lines of the PNG (unofficial) version distributed at JSRT-Kaggle.

Parameters:

path (Path) – The full path leading to the image to be loaded.

Return type:

Image

Returns:

A PIL image in RGB mode, with width x width pixels.
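The steps described for this method (read the raw file, reduce to 8 bits, equalize the histogram, resize, convert to RGB) can be sketched as follows. This is not the library's implementation: the function name `load_raw_jsrt`, the big-endian byte order, and the plain CDF-based equalization are all assumptions made for illustration. The default sizes come from the data specification on this page.

```python
from pathlib import Path

import numpy as np
from PIL import Image


def load_raw_jsrt(path: Path, raw_size: int = 2048, out_size: int = 1024) -> Image.Image:
    """Load a square raw image and return an equalized RGB PIL image.

    Assumption: pixels are big-endian unsigned 16-bit words that hold
    12-bit values.
    """
    raw = np.fromfile(path, dtype=">u2").astype(np.uint16)
    raw = raw.reshape(raw_size, raw_size)

    # Clamp to the 12-bit range, then drop the 4 lowest bits to get 8 bits.
    img8 = (np.minimum(raw, 4095) >> 4).astype(np.uint8)

    # Simple histogram equalization through the cumulative distribution.
    hist = np.bincount(img8.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    denom = max(int(cdf[-1] - cdf_min), 1)  # avoid dividing by zero
    lut = np.round((cdf - cdf_min) / denom * 255).clip(0, 255).astype(np.uint8)
    equalized = lut[img8]

    # Resize with bilinear interpolation and convert to RGB.
    image = Image.fromarray(equalized)
    return image.resize((out_size, out_size), Image.BILINEAR).convert("RGB")
```

The size parameters are exposed only so the sketch can be tried on small synthetic files; for actual JSRT images the defaults apply.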

sample(sample)[source]

Load a single image sample from the disk.

Parameters:

sample (Any) – A tuple containing the path suffix, within the dataset root folder, where to find the image to be loaded, and an integer, representing the sample label.

Return type:

Mapping[str, Any]

Returns:

The sample representation.

target(sample)[source]

Load only the target of a sample from its raw representation.

Parameters:

sample (Any) – A tuple containing the path suffix, within the dataset root folder, where to find the image to be loaded, and an integer, representing the sample target.

Return type:

Tensor

Returns:

The label corresponding to the specified sample, encapsulated as a torch float tensor.

class mednet.data.segment.jsrt.DataModule(split_path)[source]

Bases: CachingDataModule

Japanese Society of Radiological Technology dataset for lung segmentation.

Parameters:

split_path (Path | Traversable) – Path or traversable (resource) with the JSON split description to load.
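Because both `pathlib.Path` and traversable resources expose an `open()` method, a split description can be read the same way from either. The sketch below only illustrates that point; the helper `read_split` is hypothetical, and the structure of the JSON content is not specified on this page, so the result is returned as parsed.

```python
import json
from typing import Any


def read_split(split_path: Any) -> Any:
    """Parse a JSON split description.

    ``split_path`` may be a ``pathlib.Path`` or any traversable resource,
    since both provide ``open()`` in text mode. The JSON schema is not
    assumed here; the parsed object is returned unchanged.
    """
    with split_path.open("r") as handle:
        return json.load(handle)
```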