mednet.data.segment.shenzhen
Shenzhen DataModule for computer-aided semantic segmentation of lungs.
The standard digital image database for Tuberculosis was created by the National Library of Medicine, Maryland, USA in collaboration with Shenzhen No.3 People’s Hospital, Guangdong Medical College, Shenzhen, China. The Chest X-rays are from out-patient clinics, and were captured as part of the daily routine using Philips DR Digital Diagnose systems.
The database contains 662 images in total: 336 cases with manifestation of tuberculosis and 326 normal cases. Image size varies from one X-ray to the next, but is approximately 3K x 3K pixels. One set of ground-truth lung annotations is available for 566 of the 662 images.
Important
Raw data organization
The Shenzhen base datadir, which you should configure following the Setup instructions, must contain at least these two subdirectories:
CXR_png/ (directory containing the CXR images)
mask/ (contains masks downloaded from Shenzhen Annotations)
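As a quick sanity check of the layout above, the following minimal sketch resolves the base datadir from an already-loaded configuration and reports which required subdirectories are missing. The helper names `resolve_key` and `missing_subdirs` are hypothetical and not part of mednet. Treating the configuration as a nested mapping addressed by the dotted key 'datadir.shenzhen' is an assumption made for illustration only.

```python
from pathlib import Path

# Subdirectory names taken from the raw data organization listed above.
REQUIRED_SUBDIRS = ("CXR_png", "mask")


def resolve_key(config: dict, dotted_key: str):
    """Walk a nested mapping using a dotted key such as 'datadir.shenzhen'.

    Assumption: the configuration has already been parsed into nested dicts.
    """
    node = config
    for part in dotted_key.split("."):
        node = node[part]  # raises KeyError if the key is not configured
    return node


def missing_subdirs(datadir: Path) -> list[str]:
    """Return the required subdirectories that are absent from ``datadir``."""
    return [name for name in REQUIRED_SUBDIRS if not (datadir / name).is_dir()]
```

An empty list returned by `missing_subdirs(Path(resolve_key(config, "datadir.shenzhen")))` would indicate that both subdirectories are present. It does not verify the files inside them.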
Data specifications:
Raw data input (on disk):
PNG 8-bit RGB images issued from digital radiography machines (grayscale, but encoded as RGB images with an “inverted” gray scale that requires special treatment).
Original resolution: variable width and height, up to 3000 x 3000 pixels
Samples: 566 images and associated labels
Output image:
Transforms:
Load raw PNG with PIL
Torch center cropping to get square image
Final specifications:
Grayscale, encoded as a 3-plane tensor, 32-bit floats, square with varying resolutions, depending on the input image
Labels: Binary mask with annotated lungs (1 where lungs are; 0 otherwise)
Mask: Binary mask with all ones
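To illustrate why the output resolution depends on the input image, here is a minimal sketch of the geometry behind a center crop to a square. It is not the module's actual implementation. The function name `center_square_box` is hypothetical, and the rounding for odd margins is an assumption that may differ from the Torch transform by one pixel.

```python
def center_square_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square.

    The box follows the (left, upper, right, lower) convention used by
    ``PIL.Image.Image.crop``. Odd margins are floored (an assumption).
    """
    side = min(width, height)  # the square side is the shorter dimension
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side
```

For a 3000 x 2900 input this yields the box (50, 0, 2950, 2900), a 2900 x 2900 square, so the final size follows the shorter side of each raw image.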
This module contains the base declaration of common data modules and raw-data loaders for this database. All configured splits inherit from this definition.
Module Attributes
DATABASE_SLUG | Pythonic name to refer to this database.
CONFIGURATION_KEY_DATADIR | Key to search for in the configuration file for the root directory of this database.
Classes
DataModule(split_path) | Shenzhen database for lung segmentation.
RawDataLoader() | A specialized raw-data-loader for the shenzhen dataset.
- mednet.data.segment.shenzhen.DATABASE_SLUG = 'shenzhen'
Pythonic name to refer to this database.
- mednet.data.segment.shenzhen.CONFIGURATION_KEY_DATADIR = 'datadir.shenzhen'
Key to search for in the configuration file for the root directory of this database.
- class mednet.data.segment.shenzhen.RawDataLoader
Bases: RawDataLoader
A specialized raw-data-loader for the shenzhen dataset.
- class mednet.data.segment.shenzhen.DataModule(split_path)
Bases: CachingDataModule
Shenzhen database for lung segmentation.
- Parameters:
split_path (Path | Traversable) – Path or traversable (resource) with the JSON split description to load.
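Because `split_path` may be either a filesystem path or a packaged resource, code that inspects a split before passing it to `DataModule(split_path)` can rely on the one method both kinds of object share, `.open()`. The sketch below uses a hypothetical helper, `read_split`, which is not part of mednet, and it makes no assumption about the keys stored inside the JSON file.

```python
import json


def read_split(split_path):
    """Load a JSON split description from a Path or Traversable-like object.

    Only ``.open()`` is required, which both ``pathlib.Path`` and
    ``importlib.resources`` traversables provide.
    """
    with split_path.open("r") as handle:
        return json.load(handle)
```

Parsing the file this way is only a pre-flight check that the split description is valid JSON. How the data module itself consumes the file is not shown here.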