Python Client Library

Warning

This library is in active development and currently pre-alpha: the API may change significantly in future releases.

Getting Started

Import the mdai library

import mdai

# check version
print(mdai.__version__)

Create an mdai client

The mdai client requires an access token, which authenticates you as the user. To create a new token or select an existing token, navigate to the "Personal Access Tokens" tab on your user settings page at the specified MD.ai domain (e.g., public.md.ai).

Important: keep your access tokens safe. Do not ever share your tokens.

mdai_client = mdai.Client(domain='public.md.ai', access_token='$YOUR-PERSONAL-TOKEN')

Example output:

    Successfully authenticated to public.md.ai.

Define a project

Define a project you have access to by passing in the project id. The project id can be found in the URL in the following format: https://public.md.ai/annotator/project/{project_id}. For example, project_id would be LxR6zdR2 for https://public.md.ai/annotator/project/LxR6zdR2. The optional path argument specifies the data directory; if omitted, it defaults to the current working directory.

Example:

p = mdai_client.project('LxR6zdR2', path='./lesson3-data')
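Since the project id is simply the last path segment of the Annotator URL, it can be extracted programmatically. The helper below is hypothetical (not part of the mdai library), just a stdlib sketch:

```python
from urllib.parse import urlparse

def project_id_from_url(url):
    """Return the trailing project id segment of an Annotator project URL.

    Hypothetical helper -- not part of the mdai library.
    """
    path = urlparse(url).path.rstrip('/')
    return path.rsplit('/', 1)[-1]

print(project_id_from_url('https://public.md.ai/annotator/project/LxR6zdR2'))
# LxR6zdR2
```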

The project object downloads the dataset (images and annotations separately) and extracts both to the specified path. If the latest version of the images or annotations has already been downloaded and extracted, the cached data is used instead.

    Using path './lesson3-data' for data.
    Preparing annotations export for project LxR6zdR2...
    Preparing images export for project LxR6zdR2...
    Using cached images data for project LxR6zdR2.
    Using cached annotations data for project LxR6zdR2.

Create Project if annotations_only=True

If you are unable to export images from within the Annotator, this feature was turned off to reduce costs for your institution/company. To continue using the MD.ai API, two things are needed:

  • First, your images must be organized on disk in the StudyInstanceUID/SeriesInstanceUID/SOPInstanceUID.dcm format, which the Project class requires.
  • Second, download the annotations and create the project:
PATH_TO_IMAGES = 'path to my data goes here'
PROJECT_ID = 'get the project id from the Annotator'
mdai_client.project(PROJECT_ID, path=PATH_TO_IMAGES, annotations_only=True)

This downloads only the annotations JSON. Use the path of the downloaded JSON file to create the project:

p = mdai.preprocess.Project(
    annotations_fp=json_path,
    images_dir=PATH_TO_IMAGES,
)

Important

The images in PATH_TO_IMAGES must be organized in the StudyInstanceUID/SeriesInstanceUID/SOPInstanceUID.dcm format to work with the rest of the API. Convert your images to this layout first.
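Building that layout amounts to nesting each file under its three UIDs. The sketch below constructs the target path and copies a file into place; it is a hypothetical stdlib-only helper (in practice the UIDs would be read from each file's DICOM header, e.g. with pydicom), not part of the mdai library:

```python
import os
import shutil

def dicom_target_path(root, study_uid, series_uid, sop_uid):
    """Build the StudyInstanceUID/SeriesInstanceUID/SOPInstanceUID.dcm path."""
    return os.path.join(root, study_uid, series_uid, sop_uid + '.dcm')

def place_dicom(src_file, root, study_uid, series_uid, sop_uid):
    """Copy one DICOM file into the expected directory layout.

    Hypothetical helper: the UIDs are passed in explicitly here, but in
    practice they would be read from the file's DICOM header.
    """
    dst = dicom_target_path(root, study_uid, series_uid, sop_uid)
    os.makedirs(os.path.dirname(dst), exist_ok=True)  # create nested UID dirs
    shutil.copy2(src_file, dst)
    return dst
```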

Prepare data

Show available label groups

p.show_label_groups()

Set label ids and their corresponding class ids

Label ids and their corresponding class ids must be set explicitly via the Project#set_labels_dict method before datasets can be prepared.

Example:

# this maps label ids to class ids
labels_dict = {
    'L_ylR0L8': 0, # background
    'L_DlqEAl': 1, # lung opacity
}
p.set_labels_dict(labels_dict)

Show available datasets

p.show_datasets()

Get a dataset by id or by name

dataset = p.get_dataset_by_id('D_ao3XWQ')
dataset.prepare()

Show each label's label id, class id, and class text

dataset.show_classes()

Example output:

    Label id: L_ylR0L8, Class id: 0, Class text: No Lung Opacity
    Label id: L_DlqEAl, Class id: 1, Class text: Lung Opacity

Split data into training and validation datasets

train_dataset, valid_dataset = mdai.common_utils.train_test_split(dataset)

By default, the dataset is shuffled, and the train/validation split ratio is 0.9 to 0.1.
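Conceptually, a shuffled 90/10 split works like the sketch below. This is an illustrative stdlib-only version operating on a plain list; the actual mdai.common_utils.train_test_split operates on dataset objects:

```python
import random

def split_train_valid(items, train_ratio=0.9, seed=42):
    """Shuffle a list and split it into train/validation portions.

    Illustrative sketch only -- the real mdai helper takes a dataset object.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # deterministic shuffle for the example
    cut = int(len(items) * train_ratio)
    return items[:cut], items[cut:]

train, valid = split_train_valid(range(100))
print(len(train), len(valid))
# 90 10
```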

Visualization

Visualization of different annotation modes

Display images