Python Client Library

Warning

This library is in active development and currently pre-alpha -- the API may change significantly in future releases.

How-To Guides

Quick start: setup project

import mdai

# Get these variables from the project info tab and your user settings
DOMAIN = 'public.md.ai'
YOUR_PERSONAL_TOKEN = 'a1s2d3f4g4h5h59797kllh8vk'
PROJECT_ID = 'LxR6zdR2'  # from project info tab
DATASET_ID = 'D_ao3XWQ'  # from project info tab
PATH_FOR_DATA = '.'
PATH_TO_IMAGES = './mydata'  # location of images, if not downloaded from the project

mdai_client = mdai.Client(domain=DOMAIN, access_token=YOUR_PERSONAL_TOKEN)

# download both images and annotation data
p = mdai_client.project(PROJECT_ID, path=PATH_FOR_DATA)
# or, if the images are already available locally, download only the annotation data
p = mdai_client.project(PROJECT_ID, path=PATH_TO_IMAGES, annotations_only=True)
# then create the project from the downloaded annotations JSON and the local images
p = mdai.preprocess.Project(annotations_fp=JSONPATH_FROM_FUNCTION_ABOVE, images_dir=PATH_TO_IMAGES)

# show label groups to find the desired label ids for the project
p.show_label_groups()
# create a labels_dict mapping the desired label ids to class values
labels_dict = {
    'L_ylR0L8': 0,  # background
    'L_DlqEAl': 1,  # lung opacity
}
# initialize the project with the labels_dict
p.set_labels_dict(labels_dict)
# prepare the dataset to instantiate annotations and image ids
dataset = p.get_dataset_by_id(DATASET_ID)
dataset.prepare()

Display label classes

dataset.show_classes()

Example output:

Label id: L_ylR0L8, Class id: 0, Class text: No Lung Opacity
Label id: L_DlqEAl, Class id: 1, Class text: Lung Opacity

Display images

mdai.visualize.display_images(image_ids)

# additional arguments
mdai.visualize.display_images(image_ids, titles=None, cols=3, cmap="gray", norm=None, interpolation=None)
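
For example, after dataset.prepare() you can pull the image ids straight from the dataset. A minimal sketch (dataset.get_image_ids() follows the quick start pattern; adjust if your version exposes the ids differently):

# display the first few images from the prepared dataset
image_ids = dataset.get_image_ids()
mdai.visualize.display_images(image_ids[:3], cols=3, cmap="gray")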

Get DICOM pixel array

pixel_array = mdai.visualize.load_dicom_image(image_id, to_RGB=False, rescale=True)

Setting to_RGB=True returns a 3-channel RGB array; rescale=True rescales pixel values to uint8 in the 0-255 range.
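
A quick sketch comparing the effect of these flags (the shapes shown are illustrative, for a single-frame radiograph):

# compare outputs with and without the flags
gray = mdai.visualize.load_dicom_image(image_id, to_RGB=False, rescale=True)
rgb = mdai.visualize.load_dicom_image(image_id, to_RGB=True, rescale=True)
print(gray.shape, gray.dtype)  # e.g. (1024, 1024) uint8
print(rgb.shape)               # e.g. (1024, 1024, 3)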

Get image mask

mask = mdai.visualize.load_mask(image_id, dataset)
image_plus_mask = mdai.visualize.apply_mask(image, mask, color, alpha=0.3)
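
A sketch combining the two calls above to overlay a mask on its image; the red color triple is an assumption, and if your version of load_mask returns multiple masks, apply them one at a time:

import matplotlib.pyplot as plt

# overlay a dataset mask on the corresponding image
image = mdai.visualize.load_dicom_image(image_id, to_RGB=True, rescale=True)
mask = mdai.visualize.load_mask(image_id, dataset)
image_plus_mask = mdai.visualize.apply_mask(image, mask, color=(1.0, 0.0, 0.0), alpha=0.3)
plt.imshow(image_plus_mask)
plt.show()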

Get image with all annotations and masks

image, class_ids, bboxes, masks = mdai.visualize.get_image_ground_truth(image_id, dataset)

Display image and masks

mdai.visualize.display_annotations(
    image,
    boxes,
    masks,
    class_ids,
    scores=None,
    title="",
    figsize=(16, 16),
    ax=None,
    show_mask=True,
    show_bbox=True,
    colors=None,
    captions=None,
)
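
Tying this together with get_image_ground_truth above (the parameter is named boxes, but the bboxes returned above are passed in the same position):

# display an image with its ground-truth boxes and masks
image, class_ids, bboxes, masks = mdai.visualize.get_image_ground_truth(image_id, dataset)
mdai.visualize.display_annotations(image, bboxes, masks, class_ids, title=str(image_id))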

Import Annotations

We can use the mdai python client to quickly import a list of annotations into a project. For example, we may wish to load the output results of a machine learning model as annotations, or quickly populate project metadata labels.

Import the mdai library and create a client mdai_client as in the quick start section above. Make sure the domain is correct and provide your personal access token.

import mdai
mdai_client = mdai.Client(domain='$DOMAIN.md.ai', access_token="$YOUR-PERSONAL-TOKEN")

Resource IDs

We will be using the import_annotations(annotations, project_id, dataset_id, model_id=None, chunk_size=100000) method on the client. To obtain the project and dataset IDs, click on the "Info" button in the bottom left sidebar. When more than one dataset is available within the project, select the dataset you will be loading your annotations into.

Imported annotations can be attached to a machine learning model. To create a new model, click the "Models" button in the top left sidebar, and click the "Create Model" button. You will then find the model ID in the newly created model card.

For annotations from metadata labels, model_id does not need to be provided; you will be the creator of these annotations. However, only project admins are allowed to import annotations from metadata labels.

Example IDs:

project_id = 'LxR6zdR2'
dataset_id = 'D_vBnWb1'
model_id = 'M_qvlG84'

Annotations format

The annotations variable must be a list of dicts, with the required fields depending on the label type. Every annotation must contain its corresponding labelId. For exam-scoped labels, StudyInstanceUID is required. For series-scoped labels, SeriesInstanceUID is required. For image-scoped labels, SOPInstanceUID is required. For local labels (which are always image-scoped), an additional data field is required, whose format depends on the label's annotation mode, as shown in the example below:

annotations = [
  # global label (exam-scoped)
  {
    'labelId': 'L_jZRPN2',
    'StudyInstanceUID': '1.2.276.0.7230010.3.1.2.8323329.19529.1513659878.548505',
    'note': 'This is an annotation note that can be imported as well',
  },
  # global label (series-scoped)
  {
    'labelId': 'L_nZYWNg',
    'SeriesInstanceUID': '1.2.276.0.7230010.3.1.3.8323329.19529.1513659878.548504',
  },
  # global label (image-scoped)
  {
    'labelId': 'L_3EMmEY',
    'SOPInstanceUID': '1.2.276.0.7230010.3.1.4.8323329.19529.1513659878.548506',
  },
  # local label (must always be image-scoped) where annotation mode is 'bbox' (Bounding Box)
  {
    'labelId': 'L_xZgpEM',
    'SOPInstanceUID': '1.2.276.0.7230010.3.1.4.8323329.25053.1513659982.870619',
    'data': {'x': 200, 'y': 200, 'width': 200, 'height': 400}
  },
  # local label where annotation mode is 'polygon' (Polygon)
  {
    'labelId': 'L_nEK9AV',
    'SOPInstanceUID': '1.2.276.0.7230010.3.1.4.8323329.14861.1513659866.150292',
    'data': {'vertices': [[300,300],[500,500],[600,700],[200,800]]}
  },
  # local label where annotation mode is 'freeform' (Freeform)
  {
    'labelId': 'L_9ZymZx',
    'SOPInstanceUID': '1.2.276.0.7230010.3.1.4.8323329.20913.1513659971.911318',
    'data': {'vertices': [[300,300],[500,500],[600,700],[200,800]]}
  },
  # local label where annotation mode is 'location' (Location)
  {
    'labelId': 'L_BA3JN0',
    'SOPInstanceUID': '1.2.276.0.7230010.3.1.4.8323329.2010.1513660008.830854',
    'data': {'x': 400, 'y': 400}
  },
  # local label where annotation mode is 'line' (Line)
  {
    'labelId': 'L_WN98Eg',
    'SOPInstanceUID': '1.2.276.0.7230010.3.1.4.8323329.29889.1513659996.912413',
    'data': {'vertices': [[300,300],[500,500],[600,700],[200,800]]}
  },
]
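
In practice, annotations are often generated from model output rather than written by hand. A hypothetical sketch converting bounding-box detections into this format (detections, its keys, and the reuse of the label id are all assumptions standing in for your model's output):

# hypothetical detections produced by your model, one dict per predicted box
detections = [
    {'sop_uid': '1.2.276.0.7230010.3.1.4.8323329.25053.1513659982.870619',
     'x': 120, 'y': 80, 'width': 150, 'height': 220},
]

annotations = [
    {
        'labelId': 'L_xZgpEM',  # bbox-mode label id from the example above
        'SOPInstanceUID': det['sop_uid'],
        'data': {'x': det['x'], 'y': det['y'],
                 'width': det['width'], 'height': det['height']},
    }
    for det in detections
]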

annotations will also be validated during processing and importing. If annotations is very large, it is recommended to specify a reasonable chunk_size so the import proceeds iteratively in chunks.

mdai_client.import_annotations(annotations, project_id, dataset_id, model_id)

Example output:

    Importing 8 annotations into project LxR6zdR2, dataset D_vBnWb1, model M_qvlG84...
    Successfully imported annotations into project LxR6zdR2.
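
For very large imports, pass chunk_size (see the method signature above) explicitly, for example:

# import iteratively in chunks of 50,000 annotations
mdai_client.import_annotations(annotations, project_id, dataset_id, model_id, chunk_size=50000)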

Metadata format

Similarly, to import annotations from metadata labels:

metadata_annotations = [
  # metadata label (exam-scoped)
  {
    'labelId': 'L_7E5pMZ',
    'StudyInstanceUID': '1.2.276.0.7230010.3.1.2.8323329.19529.1513659878.548505',
    'note': 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Enim sed faucibus turpis in eu mi bibendum neque egestas. Mi quis hendrerit dolor magna. Euismod nisi porta lorem mollis aliquam ut. Massa sed elementum tempus egestas sed. Nunc aliquet bibendum enim facilisis. Est placerat in egestas erat. Quis commodo odio aenean sed adipiscing. Sed vulputate odio ut enim blandit volutpat maecenas volutpat. Fames ac turpis egestas sed. Adipiscing enim eu turpis egestas pretium aenean pharetra magna. Non consectetur a erat nam at lectus urna. Fringilla ut morbi tincidunt augue. Nunc lobortis mattis aliquam faucibus purus. Enim eu turpis egestas pretium aenean pharetra magna. In tellus integer feugiat scelerisque. Tristique et egestas quis ipsum suspendisse ultrices gravida. Praesent elementum facilisis leo vel. Donec enim diam vulputate ut pharetra sit. Vel pretium lectus quam id leo in vitae turpis mass',
  },
  # metadata label (series-scoped)
  {
    'labelId': 'L_pEnelZ',
    'SeriesInstanceUID': '1.2.276.0.7230010.3.1.3.8323329.19529.1513659878.548504',
    'note': 'Sit amet purus gravida quis blandit turpis. Luctus accumsan tortor posuere ac ut consequat semper. Viverra mauris in aliquam sem. Placerat in egestas erat imperdiet sed euismod. Bibendum est ultricies integer quis auctor elit sed vulputate. Lacinia quis vel eros donec. Convallis tellus id interdum velit laoreet id. Morbi quis commodo odio aenean sed adipiscing. Ut aliquam purus sit amet luctus. Leo integer malesuada nunc vel risus commodo viverra maecenas accumsan. Velit scelerisque in dictum non consectetur a erat. Id faucibus nisl tincidunt eget nullam non nisi. Ac turpis egestas maecenas pharetra. Congue mauris rhoncus aenean vel. Massa tincidunt dui ut ornare lectus sit amet est placerat. Fringilla phasellus faucibus scelerisque eleifend donec. Magna eget est lorem ipsum dolor sit amet. Quam vulputate dignissim suspendisse in. Orci ac auctor augue mauris augue neque gravida.',
  },
  # metadata label (image-scoped)
  {
    'labelId': 'L_yZBWwE',
    'SOPInstanceUID': '1.2.276.0.7230010.3.1.4.8323329.19529.1513659878.548506',
    'note': 'Faucibus interdum posuere lorem ipsum dolor sit amet. Risus in hendrerit gravida rutrum. Faucibus in ornare quam viverra orci sagittis eu volutpat odio. Nisl rhoncus mattis rhoncus urna neque viverra justo. Quis varius quam quisque id diam vel. Ullamcorper a lacus vestibulum sed arcu non odio euismod. In mollis nunc sed id semper risus. Tempor id eu nisl nunc mi ipsum faucibus vitae. Justo donec enim diam vulputate ut. Neque convallis a cras semper. Tellus orci ac auctor augue mauris augue neque gravida. Ullamcorper velit sed ullamcorper morbi tincidunt ornare massa eget egestas. Pellentesque nec nam aliquam sem et tortor. Gravida arcu ac tortor dignissim convallis aenean et tortor. Ac tortor vitae purus faucibus ornare suspendisse sed nisi lacus. Sit amet consectetur adipiscing elit ut aliquam purus. Dui vivamus arcu felis bibendum ut tristique.',
  },
]
mdai_client.import_annotations(metadata_annotations, project_id, dataset_id)

Example output:

    Importing 3 annotations into project LxR6zdR2, dataset D_vBnWb1...
    Successfully imported annotations into project LxR6zdR2.

Getting UIDs from your original files

Use the following code on your original data to build dictionaries mapping each image file to its SOP, series, and study instance UIDs.

from pathlib import Path
import pydicom as py

images_path = Path('MY_PATH')
original_fn = list(images_path.glob('**/*.dcm'))

# map each file path to its DICOM UIDs
file_dict_sop = dict()
file_dict_series = dict()
file_dict_study = dict()

for f in original_fn:
    d = py.dcmread(str(f))
    file_dict_sop[f] = d.SOPInstanceUID
    file_dict_series[f] = d.SeriesInstanceUID
    file_dict_study[f] = d.StudyInstanceUID
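
These dictionaries can then feed the annotations format above. A sketch attaching an image-scoped label (the label id here is hypothetical) to one of the original files:

# build an image-scoped annotation for one of the original files
fn = original_fn[0]
annotations = [{
    'labelId': 'L_3EMmEY',  # hypothetical image-scoped label id
    'SOPInstanceUID': file_dict_sop[fn],
}]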

Convert json file to dataframe

Obtain the JSON file either from the export tab in the Annotator tool or by using mdai_client.project(PROJECT_ID, annotations_only=True).

json_path below is the path to the resulting JSON file. You can restrict the result to specific datasets; by default, all datasets are included.

Simple

results = mdai.common_utils.json_to_dataframe(json_path)
anno_df = results['annotations']
studies_df = results['studies']
labels_df = results['labels']

or

Optional: choose datasets

results = mdai.common_utils.json_to_dataframe(json_path, datasets=[NAMES_OF_DATASETS])
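
Either way, the returned dataframes are ordinary pandas objects, so a quick inspection works as usual:

# inspect the returned annotations dataframe
anno_df = results['annotations']
print(anno_df.columns)
print('number of annotations:', len(anno_df))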

Custom annotations dataset

  1. Create a project and dataset using the quick start section
  2. Get the annotation data
  3. Edit the annotations and feed them back into the dataset data
  4. Initialize a custom dataset with the edited data, as in the code below

annotations = dataset.all_annotations

# edit annotations...
annotations_edited = annotations

# load the edited annotations back into the dataset data
dataset.dataset_data['annotations'] = annotations_edited
dataset_custom = mdai.preprocess.Dataset(dataset.dataset_data, PATH_TO_IMAGES)  # images dir from the quick start

# now use this new dataset for creating training/validation datasets with train_test_split

Split data into training and validation datasets

train_test_split(dataset, shuffle=True, validation_split=0.1)

train_dataset, valid_dataset = mdai.common_utils.train_test_split(dataset)

DataGenerator

DataGenerator(dataset, batch_size=32, dim=(32, 32), n_channels=1, n_classes=10, shuffle=True, to_RGB=True, rescale=False)

mdai.utils.keras_utils.DataGenerator(dataset)
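
A sketch wiring the generators into Keras training, assuming DataGenerator behaves as a Keras Sequence, that train_dataset and valid_dataset come from the split above, and that model is your own compiled Keras model (all assumptions); dim, n_channels, and n_classes must match your data:

# train a Keras model with MD.ai data generators
train_gen = mdai.utils.keras_utils.DataGenerator(
    train_dataset, batch_size=32, dim=(256, 256), n_channels=1, n_classes=2)
valid_gen = mdai.utils.keras_utils.DataGenerator(
    valid_dataset, batch_size=32, dim=(256, 256), n_channels=1, n_classes=2)
model.fit(train_gen, validation_data=valid_gen, epochs=10)  # model: your compiled Keras model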

Write to TFRecords

mdai.utils.tensorflow_utils.write_to_tfrecords(output_path, dataset)
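
For example, writing the train/validation split from above to separate record files (the output paths are illustrative):

# write each split to its own TFRecord file
mdai.utils.tensorflow_utils.write_to_tfrecords('./train.tfrecords', train_dataset)
mdai.utils.tensorflow_utils.write_to_tfrecords('./valid.tfrecords', valid_dataset)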