Add Datasets
Every project should contain at least one dataset. A common reason to have more than one dataset in a project is to separate training and test sets. After creating a new project, click the New Dataset button to create a dataset. This dataset serves as the container into which your data will be loaded. Note the project and dataset IDs.
For each dataset, you can also choose to keep dataset access either Project-Wide or Restricted.
Supported Medical Modalities
- CR: Computed Radiography
- CT: Computed Tomography
- DX: Digital Radiography
- IVOCT: Intravascular Optical Coherence Tomography
- MG: Mammography
- MR: Magnetic Resonance Imaging (MRI)
- NM: Nuclear Medicine
- OCT: Optical Coherence Tomography (non-Ophthalmic)
- OPT: Ophthalmic Tomography
- OT: Other
- PT: Positron Emission Tomography
- RF: Radio Fluoroscopy
- RG: Radiographic imaging (conventional film/screen)
- US: Ultrasound
- XA: X-Ray Angiography
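As a quick check before loading data, the modality list above can be expressed as a lookup table. The following is a minimal sketch: `SUPPORTED_MODALITIES` and `is_supported_modality` are hypothetical names used for illustration, not part of any MD.ai tool. The codes are the values normally found in the DICOM Modality attribute (0008,0060).

```python
# Hypothetical helper: codes and descriptions are taken from the list above.
SUPPORTED_MODALITIES = {
    "CR": "Computed Radiography",
    "CT": "Computed Tomography",
    "DX": "Digital Radiography",
    "IVOCT": "Intravascular Optical Coherence Tomography",
    "MG": "Mammography",
    "MR": "Magnetic Resonance Imaging (MRI)",
    "NM": "Nuclear Medicine",
    "OCT": "Optical Coherence Tomography (non-Ophthalmic)",
    "OPT": "Ophthalmic Tomography",
    "OT": "Other",
    "PT": "Positron Emission Tomography",
    "RF": "Radio Fluoroscopy",
    "RG": "Radiographic imaging (conventional film/screen)",
    "US": "Ultrasound",
    "XA": "X-Ray Angiography",
}


def is_supported_modality(code: str) -> bool:
    """Return True if the modality code appears in the supported list."""
    return code.strip().upper() in SUPPORTED_MODALITIES
```

For example, `is_supported_modality("ct")` returns `True`, while a code that is not in the list, such as `"SR"`, returns `False`.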
Load external data to a dataset
Choose a dataset type, e.g. DICOM. There are multiple ways to add external data to your project's dataset.
Upload
If your dataset source is set to Upload, there are two ways to load external data:
- Use the web UI directly (drag and drop a folder containing your DICOM images, or use the upload files/upload folder buttons). Files are detected recursively within the folders.
- Use the MD.ai CLI tool. This is highly recommended for larger datasets (>100 GB). See the CLI Usage page for command descriptions.
We also support uploading zip files using this method.
Once uploading is complete, the uploaded data will be processed in the background and the DICOM series thumbnails will appear on completion.
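To decide between the web UI and the CLI, it can help to estimate how much DICOM data a folder holds. The sketch below walks a folder recursively and counts files that carry the `DICM` marker after the 128-byte preamble of a DICOM Part 10 file. It is only a local estimate under stated assumptions: the function names are hypothetical, the check is not how MD.ai itself detects files, and DICOM files written without a preamble will be missed.

```python
import os

GB = 1000 ** 3
CLI_THRESHOLD_BYTES = 100 * GB  # the ">100 GB" guideline mentioned above


def looks_like_dicom(path):
    """Check for the 'DICM' marker that follows the 128-byte preamble."""
    try:
        with open(path, "rb") as f:
            f.seek(128)
            return f.read(4) == b"DICM"
    except OSError:
        return False


def summarize_folder(root):
    """Return (file_count, total_bytes) for DICOM-looking files under root."""
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if looks_like_dicom(path):
                count += 1
                total += os.path.getsize(path)
    return count, total


def suggest_upload_method(total_bytes):
    """Apply the size guideline: prefer the CLI above the threshold."""
    return "CLI" if total_bytes > CLI_THRESHOLD_BYTES else "web UI or CLI"
```

Calling `summarize_folder("/path/to/images")` (a placeholder path) and passing the byte total to `suggest_upload_method` gives a rough indication of which upload route fits.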
DICOM Push (C-STORE)
You can choose to stream images to the project via the C-STORE (DICOM Push) protocol. The Hostname, Port, and Remote AE values are provided for each dataset.
By default, the dataset is Unlocked and accepts all incoming DICOM pushes. You can Lock the dataset to stop incoming connections.
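Before configuring a sending system with the Hostname and Port values, it can be useful to confirm that the endpoint is reachable from your network. The sketch below only opens and closes a TCP connection. It does not perform a DICOM association, C-ECHO, or C-STORE, and it cannot tell whether the dataset is Locked. The function name `can_reach` is hypothetical.

```python
import socket


def can_reach(hostname: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to hostname:port can be opened."""
    try:
        # create_connection resolves the name and connects; the context
        # manager closes the socket again immediately.
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

Call it with the Hostname and Port shown for your dataset. A `False` result usually points to a firewall or network issue rather than a DICOM configuration problem.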
Google Cloud Storage
You can attach a Google Cloud Storage bucket to your project:
- Add the bucket name.
- Add a folder prefix (optional).
- Add permissions in your GCS bucket as outlined.
- Confirm the bucket permissions once added.
- Press Connect.
Google Healthcare API
You can also connect to a Google Cloud Healthcare API DICOM store to add data to your project.
- Add the GCP Project ID, GCP Region, Dataset ID, and DICOM Store ID.
- Optionally add a GCP Annotation Store by activating and adding the Annotation Store ID. If the annotation store already exists, we will attempt to import annotation records within the store. Otherwise, the annotation store will be created.
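The four identifiers above together name a single DICOM store in the Cloud Healthcare API. As an illustration, the sketch below assembles them into the resource path format that the Healthcare API uses. The function name and the example values are hypothetical, and MD.ai may combine the fields differently internally.

```python
def dicom_store_path(project_id, region, dataset_id, dicom_store_id):
    """Build the Cloud Healthcare API resource path for a DICOM store."""
    fields = {
        "GCP Project ID": project_id,
        "GCP Region": region,
        "Dataset ID": dataset_id,
        "DICOM Store ID": dicom_store_id,
    }
    # Reject empty values early so a missing field is easy to spot.
    for label, value in fields.items():
        if not value or not value.strip():
            raise ValueError(f"{label} must not be empty")
    return (
        f"projects/{project_id}/locations/{region}"
        f"/datasets/{dataset_id}/dicomStores/{dicom_store_id}"
    )
```

With placeholder values, `dicom_store_path("my-project", "us-central1", "my-dataset", "my-store")` yields `projects/my-project/locations/us-central1/datasets/my-dataset/dicomStores/my-store`. Comparing this against the path shown in the Google Cloud console is a simple way to catch a mistyped ID.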
Amazon S3
You can attach an Amazon S3 bucket to your project:
- Add the bucket name.
- Add a folder prefix (optional).
- Add permissions in your Amazon S3 bucket as outlined.
- Confirm the bucket permissions once added.
- Press Connect.
Troubleshooting: Processing is Stuck
If file processing gets stuck at a number divisible by 1000, the likely cause is an unstable internet connection. You can cancel the current processing task and reload your images.
Cancel Processing
Go to the Project Card and click the three horizontal dots on the dataset card for which you want to cancel and restart processing. Choose Cancel Processing, then try again with either the CLI tool or the UI.
Turn off sleep mode
Temporarily turn off your computer's sleep mode when loading large datasets. If the computer goes to sleep, the processing connection is dropped. If that happens, choose Cancel Processing on the Edit page of the project and reload the data.