MD.ai

Create Dataset

Every project should contain at least one dataset. A common reason to use more than one dataset in a project is to separate training and test sets. After creating a new project, click the New Dataset button to create a dataset. This dataset serves as the container into which your data is loaded. Note the project and dataset IDs.

For each dataset, you can also choose whether dataset access is Project-Wide or Restricted.

We currently support DICOM, as well as images (JPEG/PNG/BMP) and videos (MP4/MPG/MOV/AVI/WEBM/OGG/H264). Learn more about annotating videos here.

Supported DICOM modalities

  • AR: Autorefraction
  • CR: Computed Radiography
  • CT: Computed Tomography
  • DX: Digital Radiography
  • ES: Endoscopy
  • IO: Intra-oral Radiography
  • IOL: Intraocular Lens Data
  • IVOCT: Intravascular Optical Coherence Tomography
  • KER: Keratometry
  • LEN: Lensometry
  • MG: Mammography
  • MR: Magnetic Resonance Imaging (MRI)
  • NM: Nuclear Medicine
  • OAM: Ophthalmic Axial Measurements
  • OCT: Optical Coherence Tomography (non-Ophthalmic)
  • OP: Ophthalmic Photography
  • OPM: Ophthalmic Mapping
  • OPT: Ophthalmic Tomography
  • OPV: Ophthalmic Visual Field
  • OT: Other
  • PT: Positron Emission Tomography
  • RF: Radio Fluoroscopy
  • RG: Radiographic imaging (conventional film/screen)
  • SRF: Subjective Refraction
  • US: Ultrasound
  • XA: X-Ray Angiography
  • XC: External-camera Photography
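
Before uploading, it can help to check locally that the Modality values in your data appear in the list above. The following is a minimal sketch, not part of MD.ai's tooling: the set is copied from the list above, and the helper names `is_supported_modality` and `unsupported` are hypothetical.

```python
# Modality codes copied from the supported-modalities list above.
SUPPORTED_MODALITIES = frozenset({
    "AR", "CR", "CT", "DX", "ES", "IO", "IOL", "IVOCT", "KER", "LEN",
    "MG", "MR", "NM", "OAM", "OCT", "OP", "OPM", "OPT", "OPV", "OT",
    "PT", "RF", "RG", "SRF", "US", "XA", "XC",
})


def is_supported_modality(code: str) -> bool:
    """Return True if a DICOM Modality (0008,0060) value is in the list above."""
    return code.strip().upper() in SUPPORTED_MODALITIES


def unsupported(codes):
    """Return the sorted, de-duplicated codes that are not in the list above."""
    return sorted({c.strip().upper() for c in codes} - SUPPORTED_MODALITIES)
```

For example, `unsupported(["CT", "SR", "MR"])` reports the codes in your data that are absent from the list, so you can review those series before loading them.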

Load external data to a dataset

Choose a dataset type, e.g., DICOM. There are multiple ways (Dataset Sources) to add external data to your project's dataset.

In addition to these methods for creating new datasets from external data, you can also create new datasets from subsets of existing exams, series, or images in your dataset after applying label filters. Read more about it here.

Loading images and videos

Because our platform is built primarily for DICOM, images and videos are wrapped as virtual DICOM objects. To group images or videos together within the DICOM hierarchy, upload them as a folder organized in the following way:

  • videos: [root_folder/]<study_folder_name>/<video_file_name>.ext
  • images: [root_folder/]<study_folder_name>/<series_folder_name>/<image_file_name>.ext

Note: [root_folder] is optional. Without this folder structure, each image or video will be created as a separate study.
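
The folder layout above can be prepared locally before uploading. The sketch below is one way to do it with the Python standard library; the helper names (`image_path`, `video_path`, `stage_images`) and the example study and series names are hypothetical, and the script only arranges files on disk. It does not upload anything.

```python
import shutil
from pathlib import Path


def image_path(study: str, series: str, filename: str, root: str = "") -> Path:
    """Build [root/]<study>/<series>/<filename>, the layout for images."""
    return Path(root, study, series, filename)


def video_path(study: str, filename: str, root: str = "") -> Path:
    """Build [root/]<study>/<filename>, the layout for videos."""
    return Path(root, study, filename)


def stage_images(files, dest_root):
    """Copy images into dest_root/<study>/<series>/.

    `files` is an iterable of (source_path, study_name, series_name) tuples.
    Returns the list of staged paths.
    """
    staged = []
    for src, study, series in files:
        target = image_path(study, series, Path(src).name, root=str(dest_root))
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)
        staged.append(target)
    return staged
```

Images staged under the same study and series folder names are then grouped together, while files uploaded without this structure each become a separate study, as noted above.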

Preprocessing options

For all dataset types, we support additional preprocessing options, such as ignoring secondary captures, setting a filename preset, and defining a default ordering of the exams by Patient ID, Study Date/Time, Study Description, or Random.

Ignore secondary captures

For every new dataset, you can choose to ignore secondary captures so that they are not uploaded to MD.ai for viewing. This can be useful for de-identification when burned-in PHI is a concern.
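
To review which files might count as secondary captures before uploading, one common approach is to compare each instance's SOP Class UID against the DICOM Secondary Capture storage classes. The sketch below assumes that criterion; this page does not state how MD.ai itself identifies secondary captures, and the function names and the `sop_class_uid` dictionary key are hypothetical.

```python
# Secondary Capture storage SOP Class UIDs from the DICOM standard.
# Assumption: MD.ai's own detection rule is not documented here and may differ.
SECONDARY_CAPTURE_SOP_CLASSES = frozenset({
    "1.2.840.10008.5.1.4.1.1.7",    # Secondary Capture Image Storage
    "1.2.840.10008.5.1.4.1.1.7.1",  # Multi-frame Single Bit SC Image Storage
    "1.2.840.10008.5.1.4.1.1.7.2",  # Multi-frame Grayscale Byte SC Image Storage
    "1.2.840.10008.5.1.4.1.1.7.3",  # Multi-frame Grayscale Word SC Image Storage
    "1.2.840.10008.5.1.4.1.1.7.4",  # Multi-frame True Color SC Image Storage
})


def is_secondary_capture(sop_class_uid: str) -> bool:
    """Return True if the SOP Class UID is a Secondary Capture storage class."""
    return sop_class_uid.strip() in SECONDARY_CAPTURE_SOP_CLASSES


def split_secondary_captures(instances):
    """Split instance records into (kept, secondary_captures).

    Each record is a dict with at least a "sop_class_uid" key.
    """
    kept, captures = [], []
    for inst in instances:
        bucket = captures if is_secondary_capture(inst["sop_class_uid"]) else kept
        bucket.append(inst)
    return kept, captures
```

The SOP Class UID for each file would come from a DICOM reader of your choice; the sketch deliberately leaves that step out.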

Video frame rate

For videos, the frame rate can be adjusted during processing.
