Instance Segmentation pipeline reference
Pipeline root properties
classes
type: integer
Number of classes that should be segmented.
Example:
classes: 3
configPath
type: string
Path to MMDetection config file. Should be absolute or relative to the musket config file.
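Example (the config file name below is illustrative):
configPath: ./configs/mask_rcnn_r50_fpn.py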
dataset
type: complex object
Key is the name of a python function in scope that returns the training dataset. Value is an array of parameters to pass to the function.
Example:
dataset:
  getTrain: [false,false]
datasets
type: map containing complex objects
Sets up a list of available data sets to be referred by other entities.
For each entry, the key is the name of a python function in scope that returns a dataset. The value is an array of parameters to pass to the function.
Example:
datasets:
  test:
    getTest: [false,false]
folds_count
type: integer
Number of folds to train. Default is 5.
Example:
folds_count: 5
holdout
imagesPerGpu
type: integer
Number of images in a batch to be processed by a single GPU.
MMDetection does not allow specifying the batch size directly; it only allows setting how many images are processed by each GPU at a time. Thus, the effective batch size is imagesPerGpu multiplied by gpus_per_net.
Example:
imagesPerGpu: 2
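For instance, imagesPerGpu: 2 combined with gpus_per_net set to 2 results in an effective batch size of 4.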
imports
type: array of strings
Imports python files from the modules folder of the project and makes their properly annotated contents available to be referred to from YAML.
Example:
imports: [ layers, preprocessors ]
This will import layers.py and preprocessors.py.
multiscaleMode
type: string
Can be range or value (default). Setting it to range allows using two-dimensional integer arrays as shape values to specify possible ranges of train shapes.
num_seeds
type: integer
If set, the training process (for all folds) will be executed num_seeds times, each time resetting the random seeds. Respective folders (like metrics) will obtain subfolders 0, 1 etc. for each seed.
Example:
num_seeds: 3
preprocessing
type: complex
Preprocessors are custom python functions that transform the dataset.
Such functions should be defined in python files located in the project's modules folder and imported. Preprocessing functions should also be marked with the @preprocessing.dataset_preprocessor annotation.
The preprocessing instruction can then be used to chain preprocessors as needed for a particular experiment, and even to cache the result on disk for reuse between experiments.
Preprocessing chains may also contain the preprocessor utility instructions described in the Preprocessors section below.
Example:
preprocessing:
  - binarize_target:
  - tokenize:
  - tokens_to_indexes:
      maxLen: 160
  - disk-cache:
random_state
type: integer
The seed of randomness.
Example:
random_state: 42
shape
type: one or two dimensional array of integers
Shape of the model input. All images are automatically scaled to this shape before being processed by the model. The exact meaning of the parameter can be:
- A one-dimensional array is simply understood as [height, width] for train, validation and inference shapes.
- A two-dimensional array is understood as an array of shapes. The train shape is chosen randomly from the array for each train sample, and the first shape is always taken for validation and inference.
- With the multiscaleMode parameter set to range, a two-element two-dimensional array [[h1, w1], [h2, w2]] is understood as a possible range of train shapes: train height and width are randomly chosen from the [h1, h2] and [w1, w2] intervals respectively. As in the previous case, the first shape is always taken for validation and inference.
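For instance, a config fragment using the range mode (the dimensions below are illustrative):
multiscaleMode: range
shape: [[512, 512], [768, 768]]
Here train heights and widths are sampled from the [512, 768] intervals, while validation and inference always use the first shape, 512x512.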
stages
type: complex
Sets up training process stages. Contains YAML array of stages, where each stage is a complex type that may contain properties described in the Stage properties section.
Example:
stages:
  - epochs: 6
  - epochs: 6
    lr: 0.01
stratified
type: boolean
Whether to use stratified strategy when splitting training set.
Example:
stratified: true
testSplit
type: float 0-1
Splits the train set into two parts, using one part for training and leaving the other untouched for later testing. The split is shuffled.
Example:
testSplit: 0.4
testSplitSeed
type: integer
Seed of randomness for the split of the training set.
Example:
testSplitSeed: 42
validationSplit
type: float
Float 0-1 setting what fraction of the training set (after the holdout is cut off) to allocate for validation.
Example:
validationSplit: 0.2
weightsPath
type: string
Path to the model's pretrained weights. Should be absolute or relative to the musket config file.
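Example (the file name below is illustrative):
weightsPath: ./weights/pretrained.pth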
Stage properties
epochs
type: integer
Number of epochs to train for this stage.
Example:
stages:
  - epochs: 6
Preprocessors
type: complex
Preprocessors are custom python functions that transform the dataset.
Such functions should be defined in python files located in the project's modules folder and imported. Preprocessing functions should also be marked with the @preprocessing.dataset_preprocessor annotation.
The preprocessing instruction can then be used to chain preprocessors as needed for a particular experiment, and even to cache the result on disk for reuse between experiments.
Example:
preprocessing:
  - binarize_target:
  - tokenize:
  - tokens_to_indexes:
      maxLen: 160
  - disk-cache:
cache
Caches its input.
Properties:
- name - string; optionally sets up the layer name so it can be referred to from other layers.
- inputs - array of strings; lists the layer inputs.
Example:
preprocessing:
  - tokenize:
  - cache:
disk-cache
Caches its input on disk, including the full flow. On subsequent launches, if nothing has changed in the flow, it takes its output from disk instead of re-running the previous operations.
Properties:
- name - string; optionally sets up the layer name so it can be referred to from other layers.
- inputs - array of strings; lists the layer inputs.
Example:
preprocessing:
  - binarize_target:
  - tokenize:
  - tokens_to_indexes:
      maxLen: 160
  - disk-cache:
split-preprocessor
An analogue of split for preprocessor operations.
split-concat-preprocessor
An analogue of split-concat for preprocessor operations.
seq-preprocessor
An analogue of seq for preprocessor operations.
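The exact invocation syntax is assumed to mirror the generic seq/split instructions; a minimal sketch, assuming preprocessors named tokenize and binarize_target are in scope:
preprocessing:
  - seq-preprocessor:
    - tokenize:
    - binarize_target: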
augmentation
Preprocessor instruction whose body only runs during training and is skipped during inference.
augmentation:
  Fliplr: 0.5
  Affine:
    translate_px:
      x:
        - -50
        - +50
      y:
        - -50
        - +50
In this example, the Fliplr key is automatically mapped onto the Fliplr augmenter, and its 0.5 value is mapped onto the first parameter of the augmenter, p. Named parameters are also mapped: for example, the translate_px key of Affine is mapped onto the translate_px parameter of the Affine augmenter.
fit script arguments
fit.py project
type: string
Folder to search for experiments in; typically the project root.
Example:
-m musket_core.fit --project "path/to/project"
fit.py name
type: string or comma-separated list of strings
Name of the experiment to launch, or a list of names.
Example:
-m musket_core.fit --name "experiment_name"
-m musket_core.fit --name "experiment_name1, experiment_name2"
fit.py num_gpus
type: integer
Default: 1
Number of GPUs to use during experiment launch.
Example:
-m musket_core.fit --num_gpus=1
fit.py gpus_per_net
type: integer
Default: 1
Maximum number of GPUs to use per single experiment.
Example:
-m musket_core.fit --gpus_per_net=1
fit.py num_workers
type: integer
Default: 1
Number of workers to use.
Example:
-m musket_core.fit --num_workers=1
fit.py allow_resume
type: boolean
Default: False
Whether to allow resuming of experiments, which will cause unfinished experiments to start from the best saved weights.
Example:
-m musket_core.fit --allow_resume True
fit.py force_recalc
type: boolean
Default: False
Whether to force rebuilding of reports and predictions.
Example:
-m musket_core.fit --force_recalc True
fit.py launch_tasks
type: boolean
Default: False
Whether to launch associated tasks.
Example:
-m musket_core.fit --launch_tasks True
fit.py only_report
type: boolean
Default: False
Whether to only generate reports for cached data, no training occurs.
Example:
-m musket_core.fit --only_report True
fit.py cache
type: string
Path to the cache folder. Cache folder will contain temporary cached data for executed experiments.
Example:
-m musket_core.fit --cache "path/to/cache/folder"
fit.py folds
type: integer or comma-separated list of integers
Folds to launch. By default, all folds of the experiment are executed; this argument allows launching only some of them.
Example:
-m musket_core.fit --folds 1,2
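Several of the arguments above can be combined in a single invocation (paths and names are illustrative):
-m musket_core.fit --project "path/to/project" --name "experiment_name" --num_gpus=2 --gpus_per_net=1 --folds 1,2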
task script arguments
task.py project
type: string
Folder to search for experiments in; typically the project root.
Example:
task.py --project "path/to/project"
task.py name
type: string or comma-separated list of strings
Name of the experiment to launch, or a list of names.
Example:
task.py --name "experiment_name"
task.py --name "experiment_name1, experiment_name2"
task.py task
type: string or comma-separated list of strings
Default: all tasks.
Name of the task to launch, or a list of names.
Example:
task.py --task "task_name"
task.py --task "task_name1, task_name2"
task.py --task "all"
task.py num_gpus
type: integer
Default: 1
Number of GPUs to use during experiment launch.
Example:
task.py --num_gpus=1
task.py gpus_per_net
type: integer
Default: 1
Maximum number of GPUs to use per single experiment.
Example:
task.py --gpus_per_net=1
task.py num_workers
type: integer
Default: 1
Number of workers to use.
Example:
task.py --num_workers=1
task.py allow_resume
type: boolean
Default: False
Whether to allow resuming of experiments, which will cause unfinished experiments to start from the best saved weights.
Example:
task.py --allow_resume True
task.py force_recalc
type: boolean
Default: False
Whether to force rebuilding of reports and predictions.
Example:
task.py --force_recalc True
task.py launch_tasks
type: boolean
Default: False
Whether to launch associated tasks.
Example:
task.py --launch_tasks True
task.py cache
type: string
Path to the cache folder. Cache folder will contain temporary cached data for executed experiments.
Example:
task.py --cache "path/to/cache/folder"
analyze script arguments
analyze.py inputFolder
type: string
Folder to search for finished experiments in. Typically, the project root.
Example:
analyze.py --inputFolder "path/to/project"
analyze.py output
type: string
Default: report.csv in the project root.
Output report file path.
Example:
analyze.py --output "path/to/project/report/report.csv"
analyze.py onlyMetric
type: string
Name of the single metric to take into account.
Example:
analyze.py --onlyMetric "metric_name"
analyze.py sortBy
type: string
Name of the metric to sort result by.
Example:
analyze.py --sortBy "metric_name"
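The arguments above can be combined in a single invocation (paths and metric names are illustrative):
analyze.py --inputFolder "path/to/project" --onlyMetric "metric_name" --sortBy "metric_name" --output "path/to/report.csv"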