Handling of cross-validation folds
Decide how to perform cross-validation. There are two options. Either we create 5 separate experiments for each of the 4 models (20 configurations in total) and launch each one separately by hand, or we extend the training script so that a single job runs the 5 folds in a row, each followed by its testing step. In the latter case, job duration and number of epochs must be calibrated carefully, since one job then has to fit five training and testing runs instead of one.
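To make the two options concrete, here is a minimal sketch in Python using only the standard library. Everything named in it is an assumption made for illustration: the model names, the `kfold_indices` helper, and the `train_and_test` placeholder (which returns a dummy value rather than a real metric) are hypothetical and do not reflect the actual training script.

```python
import itertools
import random

MODELS = ["model_a", "model_b", "model_c", "model_d"]  # hypothetical model names
N_FOLDS = 5


def kfold_indices(n_samples, n_folds, seed=0):
    """Shuffle the sample indices once, then yield (fold, train, test) index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Strided slicing gives n_folds disjoint subsets that cover every index.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield k, train, test


# Option 1: enumerate the 4 x 5 = 20 configurations, to be launched one by one.
configs = [
    {"model": model, "fold": fold}
    for model, fold in itertools.product(MODELS, range(N_FOLDS))
]


def train_and_test(model, train_idx, test_idx):
    """Placeholder for the real train + test step (assumption).

    Returns the test fraction as a dummy score so the sketch runs end to end.
    """
    return len(test_idx) / (len(train_idx) + len(test_idx))


# Option 2: a single job per model that runs all folds in a row.
def run_all_folds(model, n_samples):
    scores = []
    for fold, train_idx, test_idx in kfold_indices(n_samples, N_FOLDS):
        scores.append(train_and_test(model, train_idx, test_idx))
    return scores
```

With option 1, `configs` is the list to turn into individual job submissions. With option 2, `run_all_folds` would be called once per model, so the time limit of each job has to cover all five folds.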