Getting Started

System Requirements

Training requires at least one NVIDIA GPU and sufficient memory to handle 3D medical images.

Inference (mist_predict) runs on any machine, including CPU-only systems and Macs, and does not require an NVIDIA GPU.

Install

Inference only (CPU-compatible)

To run mist_predict on any machine — including laptops and Macs without an NVIDIA GPU — install the base package:

pip install mist-medical

Training (NVIDIA GPU required)

To train models, install the train extra, which includes NVIDIA DALI for GPU-accelerated data loading:

pip install "mist-medical[train]"

Development install

To install MIST and customize the underlying code (e.g., add a loss function or new architecture), clone the repo and install in editable mode. Add [train] if you need to run training:

git clone https://github.com/mist-medical/MIST.git
cd MIST
pip install -e .          # inference only
pip install -e ".[train]" # or, instead, with the training extra

Data Format

The MIST pipeline assumes that your train and test data directories are set up in the following structure.

data/
    patient_1/
        image_1.nii.gz
        image_2.nii.gz
        ...
        image_n.nii.gz
        mask.nii.gz
    patient_2/
        image_1.nii.gz
        image_2.nii.gz
        ...
        image_n.nii.gz
        mask.nii.gz

Note

The file names above are only an example. MIST does not enforce a naming convention for the files in your dataset. It requires only that filenames are consistent across patient directories and that each file can be identified by at least one unique substring (e.g., "t1n.nii.gz" to match all T1 images, or "seg.nii.gz" to match all mask files).
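Before writing the dataset JSON, it can help to confirm that every patient directory really contains one file per identifying substring. The following is a minimal sketch, not part of MIST: the helper names find_matches and check_dataset are hypothetical, and it assumes that a file "matches" when an identifying string appears anywhere in its filename.

```python
from pathlib import Path


def find_matches(patient_dir, identifiers):
    """Return sorted names of files in patient_dir containing any identifier."""
    return sorted(
        p.name
        for p in Path(patient_dir).iterdir()
        if p.is_file() and any(s in p.name for s in identifiers)
    )


def check_dataset(data_dir, images, mask=None):
    """Map each problematic patient directory to a list of issues.

    `images` is a dict of image type -> identifying strings and `mask` is a
    list of identifying strings (pass None for data without masks).
    Assumption: exactly one file per image type and mask is expected.
    """
    expected = list(images.items())
    if mask is not None:
        expected.append(("mask", mask))

    problems = {}
    for patient in sorted(Path(data_dir).iterdir()):
        if not patient.is_dir():
            continue
        issues = []
        for name, identifiers in expected:
            matches = find_matches(patient, identifiers)
            if len(matches) != 1:
                issues.append(f"{name}: expected 1 file, found {len(matches)}")
        if issues:
            problems[patient.name] = issues
    return problems
```

Calling check_dataset("data", {"t1": ["t1n.nii.gz"]}, mask=["seg.nii.gz"]) returns an empty dictionary when every patient directory has exactly one match per entry. A substring that matches two files in the same directory is reported as well, which is a quick way to spot identifiers that are not unique.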

MIST supports Medical Segmentation Decathlon (MSD) and CSV-formatted datasets via the mist_convert_msd and mist_convert_csv commands. For more details, please see Converting CSV and MSD Data.

Once your dataset is in the correct format, the final step is to prepare a small JSON file that describes it. The file contains the following key-value pairs.

| Key | Value |
| --- | --- |
| task | Name of the task (e.g., brats, lits). |
| modality | Imaging modality. Options are ct, mr, or other. |
| train-data | Path to the training data directory. Can be absolute or relative to the dataset JSON file. |
| test-data | Optional. Path to the test data directory. Can be absolute or relative to the dataset JSON file. |
| mask | List containing identifying strings for the segmentation mask (ground truth) files. |
| images | Dictionary where each key is an image type (e.g., T1, T2, CT) and each value is a list containing identifying strings for that image type. |
| labels | List of labels in the dataset (starting with 0). |
| final_classes | Optional. Dictionary where each key is the name of a final segmentation class (e.g., WT, ET, TC for BraTS) and each value is a list of the labels in that class. If omitted, each label is evaluated as its own class. |
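The key-value pairs above can be sanity-checked before handing the file to MIST. The sketch below is illustrative only and is not MIST's own validation: the function name validate_dataset_config is hypothetical, and the specific checks (lowercase modality values, labels listed with 0 first, final_classes drawing only from labels) are assumptions read off the table.

```python
REQUIRED_KEYS = ("task", "modality", "train-data", "mask", "images", "labels")
MODALITIES = ("ct", "mr", "other")


def validate_dataset_config(config):
    """Return a list of problems found in a dataset description dict."""
    errors = [f"missing required key: {k}" for k in REQUIRED_KEYS if k not in config]
    if errors:
        return errors  # later checks need every required key to be present

    if config["modality"] not in MODALITIES:
        errors.append(f"modality must be one of {MODALITIES}")
    if not isinstance(config["mask"], list) or not config["mask"]:
        errors.append("mask must be a non-empty list of identifying strings")
    if not isinstance(config["images"], dict) or not config["images"]:
        errors.append("images must be a non-empty dictionary")

    labels = config["labels"]
    if not isinstance(labels, list) or not labels or labels[0] != 0:
        errors.append("labels must be a list starting with 0")
    else:
        # final_classes is optional; when present, it may only use known labels.
        for name, members in config.get("final_classes", {}).items():
            unknown = [m for m in members if m not in labels]
            if unknown:
                errors.append(f"final_classes[{name!r}] uses unknown labels {unknown}")
    return errors
```

To use it, load the file with json.load and pass the resulting dictionary to the function. An empty list means none of these checks failed.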

Here is an example for the BraTS 2023 dataset using absolute paths.

{
    "task": "brats2023",
    "modality": "mr",
    "train-data": "/full/path/to/raw/data/train",
    "test-data": "/full/path/to/raw/data/validation",
    "mask": ["seg.nii.gz"],
    "images": {"t1": ["t1n.nii.gz"],
               "t2": ["t2w.nii.gz"],
               "tc": ["t1c.nii.gz"],
               "fl": ["t2f.nii.gz"]},
    "labels": [0, 1, 2, 3],
    "final_classes": {"WT": [1, 2, 3],
                      "TC": [1, 3],
                      "ET": [3]}
}
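To make the final_classes entry in this example concrete, here is a small sketch of how labels can be grouped into classes. It assumes that a class mask is simply the union of the labels listed for that class. The function class_masks is hypothetical and works on a flat list of label values rather than a real 3D volume.

```python
def class_masks(label_values, final_classes):
    """Build one binary mask per final class from a flat sequence of labels."""
    return {
        name: [1 if value in members else 0 for value in label_values]
        for name, members in final_classes.items()
    }


# Six example voxels from a segmentation with labels 0 to 3.
voxels = [0, 1, 2, 3, 0, 2]
final_classes = {"WT": [1, 2, 3], "TC": [1, 3], "ET": [3]}

masks = class_masks(voxels, final_classes)
print(masks["WT"])  # [0, 1, 1, 1, 0, 1]
print(masks["TC"])  # [0, 1, 0, 1, 0, 0]
print(masks["ET"])  # [0, 0, 0, 1, 0, 0]
```

Under this reading, a voxel with label 3 belongs to all three classes, so the classes may overlap even though each voxel carries a single label.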

The same dataset JSON using relative paths:

{
    "task": "brats2023",
    "modality": "mr",
    "train-data": "relative/to/dataset/json/train",
    "test-data": "relative/to/dataset/json/validation",
    "mask": ["seg.nii.gz"],
    "images": {"t1": ["t1n.nii.gz"],
               "t2": ["t2w.nii.gz"],
               "tc": ["t1c.nii.gz"],
               "fl": ["t2f.nii.gz"]},
    "labels": [0, 1, 2, 3],
    "final_classes": {"WT": [1, 2, 3],
                      "TC": [1, 3],
                      "ET": [3]}
}

Note

Relative paths in the dataset JSON are resolved relative to the location of the JSON file itself, not the working directory from which you run MIST. This means the JSON and its data directories can be moved together to a new location (or a different machine) without needing to edit the paths, as long as the relative structure between the JSON file and the data directories is preserved.
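The resolution rule described in this note can be sketched with pathlib. This is an illustration of the behavior, not MIST's actual implementation, and resolve_data_path is a hypothetical helper name.

```python
from pathlib import Path


def resolve_data_path(dataset_json, value):
    """Resolve a train-data or test-data entry against the JSON file's folder."""
    path = Path(value)
    if path.is_absolute():
        return path  # absolute entries are used as given
    # Relative entries are anchored at the directory containing the JSON file,
    # not at the current working directory.
    return (Path(dataset_json).resolve().parent / path).resolve()
```

For example, with a dataset file at /projects/brats/dataset.json, the entry "raw/train" resolves to /projects/brats/raw/train no matter which directory the command is run from.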