Creating a Parameterized Training Script


1h 23m

Learn how to operate machine learning solutions at cloud scale using the Azure Machine Learning SDK. This course teaches you to leverage your existing knowledge of Python and machine learning to manage data ingestion, data preparation, model training, and model deployment in Microsoft Azure.

If you have any feedback related to this course, please contact us at

Learning Objectives

  • Create an Azure Machine Learning workspace using the SDK
  • Run experiments and train models using the SDK
  • Optimize and manage models using the SDK
  • Deploy and consume models using the SDK

Intended Audience

This course is designed for data scientists with existing knowledge of Python and machine learning frameworks, such as scikit-learn, PyTorch, and TensorFlow, who want to build and operate machine learning solutions in the cloud.

Prerequisites


  • Fundamental knowledge of Microsoft Azure
  • Experience writing Python code to work with data using libraries such as NumPy, pandas, and Matplotlib
  • Understanding of data science, including how to prepare data and train machine learning models using common machine learning libraries, such as scikit-learn, PyTorch, or TensorFlow


The GitHub repo for this course, containing the code and datasets used, can be found here: 


We can increase the flexibility of our training experiment by adding parameters to our script, which allows us to repeat the same training experiment with different settings. In this case, we'll add a parameter for the regularization rate used by the logistic regression algorithm when we're training the model.

So again, let's start by creating a folder for the parameterized script and the training data. We need the os module and the shutil module. As before, we create a folder for the experiment and then, using shutil, we copy the diabetes CSV file into the folder.
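The folder setup described above can be sketched roughly as follows. This is a minimal sketch, not the course's actual notebook cell: the folder name, the CSV file name, and the CSV contents are hypothetical stand-ins, and everything happens in a temporary directory so the example runs anywhere.

```python
import os
import shutil
import tempfile

# Work inside a temporary directory so the sketch leaves nothing behind.
base_dir = tempfile.mkdtemp()

# Stand-in for the course's diabetes CSV (hypothetical name and contents).
source_csv = os.path.join(base_dir, "diabetes.csv")
with open(source_csv, "w") as f:
    f.write("PatientID,Age,Diabetic\n1,34,0\n2,51,1\n")

# Create the experiment folder; exist_ok avoids an error if it already exists.
experiment_folder = os.path.join(base_dir, "diabetes-training-params")
os.makedirs(experiment_folder, exist_ok=True)

# Copy the training data into the folder that will also hold the script.
copied_csv = shutil.copy(source_csv, os.path.join(experiment_folder, "diabetes.csv"))
print(os.path.basename(copied_csv))
```

Keeping the script and its data in one folder means the whole folder can be handed to the experiment as a single unit.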

So now let's create a script that takes the regularization rate, a hyperparameter, as a parameter. Using the writefile magic command, we specify the folder and the script we're writing to. We need to import the following libraries: Run, pandas, numpy, joblib, and os. We also need argparse, which allows us to pass arguments into a script from the command line. Then we need train test split, our logistic regression algorithm, and our metrics: area under the curve and accuracy.

With our imports done, let's get the experiment context. To set our regularization hyperparameter, we invoke ArgumentParser from argparse, which creates a parser object. We then invoke add_argument, passing in the argument name, its type, the name of the attribute that will hold the value, and a default value. We then load the diabetes dataset, separate the features and labels, and split the data into training and test sets. And then we train our logistic regression model and log the regularization rate.
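The argument parsing step might look something like the sketch below. The flag name `--reg_rate`, the attribute name `reg`, and the default of 0.01 are assumptions made for illustration; the transcript only says that a name, a type, an attribute name, and a default value are passed in. The helper function `parse_reg_rate` is also hypothetical, added so the parser can be exercised without a real command line.

```python
import argparse


def parse_reg_rate(argv=None):
    """Parse the regularization rate from a list of command-line arguments."""
    parser = argparse.ArgumentParser()
    # Flag name, attribute name (dest), and default are illustrative assumptions.
    parser.add_argument("--reg_rate", type=float, dest="reg", default=0.01)
    args = parser.parse_args(argv)
    return args.reg


# With no arguments, the default value is used.
default_reg = parse_reg_rate([])

# Passing the flag overrides the default.
custom_reg = parse_reg_rate(["--reg_rate", "0.1"])

print(default_reg, custom_reg)
```

In a real script, calling `parse_args()` with no list reads the arguments from the command line, which is what lets the same script be rerun with different settings.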

After training, we can make predictions and then measure the accuracy of those predictions, logging that information using the log function of our run. With predict probability, we get the predicted probabilities, and we can use them to calculate our area under the curve and store that as well using the log function of our run object. And finally, we save our model and then complete our run.
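As a rough illustration of the training and evaluation steps, here is a sketch that runs without an Azure workspace. Several things in it are assumptions rather than the course's code: the data is synthetic instead of the diabetes CSV, `log_metric` is a hypothetical stand-in that prints where the real script would call the run's log function, the metric names and model file name are made up, and setting `C=1 / reg` relies on scikit-learn treating `C` as the inverse of regularization strength.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def log_metric(name, value):
    # Hypothetical stand-in for the run's log function; it only prints.
    print(f"{name}: {value}")


reg = 0.01  # in the real script this would come from the parsed argument

# Synthetic stand-in for the diabetes features and labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# Split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0
)

# Train the model and record the regularization rate that was used.
log_metric("Regularization Rate", reg)
model = LogisticRegression(C=1 / reg, solver="liblinear").fit(X_train, y_train)

# Accuracy: the fraction of test labels predicted correctly.
y_hat = model.predict(X_test)
acc = float(np.mean(y_hat == y_test))
log_metric("Accuracy", acc)

# AUC: computed from the predicted probability of the positive class.
y_scores = model.predict_proba(X_test)
auc = float(roc_auc_score(y_test, y_scores[:, 1]))
log_metric("AUC", auc)

# Save the trained model to a file.
output_dir = tempfile.mkdtemp()
model_path = os.path.join(output_dir, "diabetes_model.pkl")
joblib.dump(model, model_path)
```

Logging the regularization rate alongside the accuracy and AUC is what makes runs with different settings comparable afterwards.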

About the Author

Kofi is a digital technology specialist in a variety of business applications. He stays up to date on business trends and technology and is an early adopter of powerful and creative ideas.
His experience covers a wide range of topics including data science, machine learning, deep learning, reinforcement learning, DevOps, software engineering, cloud computing, business & technology strategy, design & delivery of flipped/social learning experiences, blended learning curriculum design and delivery, and training consultancy.