Forecast with Azure AutoML

Rahul Agarwal
3 min read · Dec 23, 2021


Photo by Lance Asper on Unsplash

I have previously used the AutoML offerings on AWS and GCP for forecasting, and since I am now trying to understand Azure, I thought I would run a similar experiment: using the same dataset as before to forecast the API request rate for a hypothetical service.

After logging into the Azure Portal, go to “Machine Learning”. The first step is to create an ML workspace, which enables Azure Machine Learning Studio, the equivalent of SageMaker Studio in AWS.

Create machine learning workspace
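For anyone who prefers code over clicking around the portal, here is a minimal sketch of the same step with the azureml-core Python SDK (v1, current at the time of writing). The workspace name, resource group, region, and subscription ID are placeholders for my setup.

```python
from azureml.core import Workspace

# Create a workspace (and optionally its resource group) in one call.
# All names and IDs below are placeholders.
ws = Workspace.create(
    name="forecast-ws",
    subscription_id="<subscription-id>",
    resource_group="forecast-rg",
    create_resource_group=True,
    location="eastus",
)

# Save a config.json locally so later scripts can call Workspace.from_config().
ws.write_config()
```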

Once the workspace is deployed, launch Azure Machine Learning Studio and use the “+” to add a Dataset (I uploaded the old request-rate CSV file from my local machine). The wizard is very nice and, once done, even shows a data profile similar to pandas describe().
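A rough SDK equivalent of the upload wizard, assuming the file is named request-rate.csv; the dataset name is likewise a placeholder:

```python
from azureml.core import Workspace, Dataset

ws = Workspace.from_config()

# Upload the local CSV to the workspace's default datastore, then
# register it as a tabular dataset so experiments can reference it.
datastore = ws.get_default_datastore()
datastore.upload_files(
    files=["request-rate.csv"], target_path="data/", overwrite=True
)
dataset = Dataset.Tabular.from_delimited_files(
    path=(datastore, "data/request-rate.csv")
)
dataset = dataset.register(workspace=ws, name="request-rate")
```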

Next, create an AutoML experiment; it is fairly self-explanatory, and the tooltips answer most questions. Set it up as a time-series experiment. I have hourly data and want a 7-day forecast, so the forecast horizon is 7 * 24 = 168. Start the run.
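Here is a sketch of the same configuration via the SDK. The time and target column names are assumptions about my CSV schema, and compute_target refers to the cluster created in the next step:

```python
from azureml.train.automl import AutoMLConfig
from azureml.automl.core.forecasting_parameters import ForecastingParameters

# Hourly data, 7-day horizon: 7 * 24 = 168 steps ahead.
forecasting_parameters = ForecastingParameters(
    time_column_name="timestamp",  # assumed column name
    forecast_horizon=168,
)

automl_config = AutoMLConfig(
    task="forecasting",
    training_data=dataset,             # registered dataset from earlier
    label_column_name="request_rate",  # assumed target column
    primary_metric="normalized_root_mean_squared_error",
    compute_target=compute_target,     # cluster created in the next step
    forecasting_parameters=forecasting_parameters,
    experiment_timeout_hours=1,
)
```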

I set up a new one-node compute cluster (note that it shows up only under “Compute” in ML Studio, not under the regular Azure resources). Make sure to stop it when you are done (more on cleanup later).
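A sketch of creating that cluster in code; the VM size and cluster name are placeholders. Setting min_nodes=0 lets it scale down to zero on its own, which helps with the stopping concern:

```python
from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget

ws = Workspace.from_config()

# min_nodes=0 means the cluster releases its node when idle,
# so you are not paying for a VM that is just sitting there.
compute_config = AmlCompute.provisioning_configuration(
    vm_size="STANDARD_DS3_V2",  # placeholder size
    min_nodes=0,
    max_nodes=1,
)
compute_target = ComputeTarget.create(ws, "cpu-cluster", compute_config)
compute_target.wait_for_completion(show_output=True)
```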

The run takes a while to get going, and under “Child runs” you can see the various jobs queued up. Each one has plenty of detail: the model being tried, various metrics, all the logs, and so on. After a while it is all done, and VotingEnsemble is the best model. I had not come across that one before; it turns out to be an ensemble that combines the predictions of several of the earlier models by weighted averaging. There are plenty of other new names in the Models list too, so it is a good learning opportunity.

Model training result
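For reference, submitting the experiment and retrieving the winner via the SDK would look roughly like this (assuming the automl_config and workspace from the earlier sketches; the experiment name is a placeholder):

```python
from azureml.core import Experiment, Workspace

ws = Workspace.from_config()

# Submit the AutoML config from earlier and block until it finishes.
experiment = Experiment(ws, "request-rate-forecast")
run = experiment.submit(automl_config, show_output=True)

# get_output() returns the best child run and its fitted model
# (the VotingEnsemble, in my case).
best_run, fitted_model = run.get_output()
print(best_run)
```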

Now let us use this model to run a test and generate predictions we can compare with the observed values. Click on the model you want to use (VotingEnsemble in this case), choose “Test model (preview)”, and point it at the test dataset. Here, affable_scooter is the run name of the best model and icy_ring is the test run.

Prediction result
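Outside the Studio UI, roughly the same scoring can be done against the fitted model directly. This is only a sketch: the test file name and column names are assumptions about my data, and fitted_model comes from the earlier get_output() call:

```python
import pandas as pd

# Test data with the same schema as the training set (assumed names).
test_df = pd.read_csv("request-rate-test.csv", parse_dates=["timestamp"])
X_test = test_df.drop(columns=["request_rate"])

# For forecasting models, forecast() returns the point predictions
# along with the transformed input frame.
y_pred, X_trans = fitted_model.forecast(X_test)
```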

Clicking on the output dataset takes you to the blob store, where you get a predictions.csv file containing a Point_orig and a Point_predicted value for each hour. Plugging these into my previous notebook gives the following plot, which looks similar to the AWS result.

Observed vs forecast values
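The notebook step amounts to something like the following. I am assuming the downloaded file also carries a timestamp column; adjust to whatever it actually contains:

```python
import pandas as pd
import matplotlib.pyplot as plt

# predictions.csv downloaded from the blob store.
preds = pd.read_csv("predictions.csv", parse_dates=["timestamp"])

plt.figure(figsize=(12, 4))
plt.plot(preds["timestamp"], preds["Point_orig"], label="observed")
plt.plot(preds["timestamp"], preds["Point_predicted"], label="forecast")
plt.xlabel("hour")
plt.ylabel("request rate")
plt.legend()
plt.show()
```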

For cleanup, the simplest option is to delete the entire resource group, which takes the storage account and the ML workspace with it. I don’t see the cost in Cost Management yet (it shows up the next day), but hopefully it is not much.
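In code, the same cleanup can be done with the azure-mgmt-resource package; the subscription ID and resource group name are placeholders for my setup:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

# Deleting the resource group removes the workspace, storage account,
# and compute cluster in one go.
client = ResourceManagementClient(
    DefaultAzureCredential(), "<subscription-id>"
)
poller = client.resource_groups.begin_delete("forecast-rg")
poller.wait()
```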

If these topics interest you, reach out to me; I would appreciate any feedback. If you would like to work on such problems, you will generally find open roles as well! Please refer to LinkedIn.
