This page is no longer being updated. It will be replaced when the Second Tier pd4castr subscription, which will include multiple models, is released.

Model updated 5th February 2024

Summary

The retraining process resulted in two new models, one for the PD price forecast and one for the 7day price forecast. After experimentation we retrained the price forecasting models using the existing features and hyperparameter settings. The increase in accuracy (using Absolute Mean Error, AME) was approximately 5% (see the validation results tables below, which compare the previous models with the new models).

Training period

Both PD and 7day price forecast models were trained on data between 01-01-2021 and 15-01-2024.

The market suspension period and a period after the Callide explosion were removed from the training data.
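
As a rough illustration of how such periods can be dropped from a training set, the sketch below filters timestamped rows against a list of excluded windows. The window dates, the row layout and the function names are all assumptions made for this example; the exact dates used in the actual training are not stated on this page.

```python
from datetime import datetime

# Hypothetical exclusion windows. The page does not give exact dates,
# so these are placeholders to be replaced with the real periods.
EXCLUDED_PERIODS = [
    (datetime(2021, 5, 25), datetime(2021, 6, 8)),   # placeholder: period after the Callide explosion
    (datetime(2022, 6, 15), datetime(2022, 6, 24)),  # placeholder: market suspension period
]


def is_excluded(ts, periods=EXCLUDED_PERIODS):
    """Return True if timestamp ts falls inside any excluded window (inclusive)."""
    return any(start <= ts <= end for start, end in periods)


def drop_excluded(rows, periods=EXCLUDED_PERIODS):
    """Keep only (timestamp, price) rows that fall outside every excluded window."""
    return [row for row in rows if not is_excluded(row[0], periods)]


# Illustrative rows only, not real market data.
rows = [
    (datetime(2021, 5, 24, 12, 0), 45.0),
    (datetime(2021, 5, 26, 12, 0), 900.0),  # inside the first placeholder window
    (datetime(2022, 6, 20, 12, 0), 300.0),  # inside the second placeholder window
    (datetime(2022, 7, 1, 12, 0), 120.0),
]
training_rows = drop_excluded(rows)
```

Keeping the windows in a single list makes it straightforward to add or adjust an excluded period before the next retraining.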

Validation set

The model was validated by predicting the price between 16-01-2024 and 30-01-2024.
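
The training and validation windows above are adjacent and do not overlap. A minimal sketch of a date-based split follows, assuming both end dates are inclusive (the page does not say whether they are); the function name is hypothetical.

```python
from datetime import date

# Date boundaries taken from the text above (written DD-MM-YYYY on this page).
TRAIN_START, TRAIN_END = date(2021, 1, 1), date(2024, 1, 15)
VALID_START, VALID_END = date(2024, 1, 16), date(2024, 1, 30)


def assign_split(day):
    """Label a calendar day as 'train', 'validation', or None if outside both windows."""
    if TRAIN_START <= day <= TRAIN_END:
        return "train"
    if VALID_START <= day <= VALID_END:
        return "validation"
    return None


# Number of days in the validation window, counting both end dates (assumption).
validation_days = (VALID_END - VALID_START).days + 1
```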

New features (input data) exceptions

The feature set is the same as the previous model. The main benefit of the new model is that more recent market data are consumed in the training process. The training does not weight the most recent market data.
- A number of experiments were conducted by changing the feature list and some hyperparameters, however this did not achieve a significant benefit in the model tests. Hence, for consistency, we have not made changes to the model other than extending the training period to the most recent market data. This is unlike the August model update, which included significant changes to both the feature set and the hyperparameters.
Validation set is from the 1st of November to the 7th of November. Oddly enough, the Melbourne Cup did not have a significant impact on the performance metrics.

	Predispatch model	p7day model
Model_id	20231108_142510_CB_2	20231108_140809_CB_7D_2
Previous model release date	9th November 2023	9th November 2023
Latest release date (after retraining)	5th February 2024	5th February 2024
Machine Learning Algorithm	Ensemble predictor: gradient boosting	Ensemble predictor: gradient boosting
Training Period	1st January 2021 to 15th January 2024	1st January 2021 to 15th January 2024
Hyper-parameter type	A	A
Hyper-parameter notes	No weighting across time periods	No weighting across time periods
Target	Dispatchprice (30min resolution)	Dispatchprice (30min resolution)
Input (and training) variables	PREDISPATCHPRICE rrp DISPATCHPRICE last 3 day average actual RRP for same time PREDISPATCHREGIONSUM (intervention = 0) hours_out time_of_day day_type (weekday_or_weekend) dispatchableload netinterchage (availablegeneration - uigf) (dispatchablegeneration - scheduled_clearedmw) ss_solar_clearedmw ss_solar_uigf ss_wind_clearedmw ss_wind_uigf totaldemand totalintermittentgeneration	STPASA_REGIONSOLUTION hours_out time_of_day day_type (weekday_or_weekend) aggregatepasaavailability aggregatescheduledload netinterchangeunderscarcity aggregatecapacityavailable ss_solar_cleared ss_wind_uigf ss_wind_cleared ss_solar_uigf demand10 demand50 totalintermittentgeneration demand_and_nonschedgen

Validation Results

These are the results of the best model generated through our experimentation process, compared with the results of the previous model across our validation set (our test period).

Note that the errors are considerably higher than in previous releases because the validation period was unusually volatile.

Average error is equal to the predicted price minus the actual price.

All error metrics are in $/MWh.
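
To make the two metrics concrete, here is a minimal sketch of how they can be computed from paired predictions and actual prices. The function names and sample values are illustrative only and are not taken from the pd4castr code base.

```python
def average_error(predicted, actual):
    """Mean of (prediction - actual); positive means over-forecasting on average."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and the same length")
    return sum(p - a for p, a in zip(predicted, actual)) / len(actual)


def absolute_mean_error(predicted, actual):
    """Mean of |prediction - actual|; errors of opposite sign do not cancel out."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Illustrative prices, not real market data.
predicted = [50.0, 80.0, 120.0]
actual = [60.0, 70.0, 100.0]

avg_err = average_error(predicted, actual)        # errors are -10, +10, +20
abs_err = absolute_mean_error(predicted, actual)  # magnitudes are 10, 10, 20
```

Average error shows the direction of any bias, while the absolute mean error shows the typical size of a miss, which is why a model can have an average error near zero and still a large absolute mean error.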

PD Results

Metric/Region	Actual Mean Price	Mean Prediction New Model	Mean Prediction Previous Model	Absolute Mean Error New Model	Absolute Mean Error Previous Model	Average Error New Model	Average Error Previous Model
NSW1	92.22	92.37	91.70	83.87	85.02	0.15	-0.52
QLD1	214.73	238.09	202.77	225.16	206.22	23.36	-11.95
SA1	36.86	51.16	46.38	46.37	45.49	14.30	9.52
TAS1	37.43	42.49	39.79	19.24	19.92	5.06	2.36
VIC1	17.34	25.25	21.75	25.92	27.50	7.91	4.41
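
One way to express the change in Absolute Mean Error between the two models is as a percentage of the previous model's error. The sketch below applies that formula to the PD values in the table above. The formula is an assumption; the summary figure may have been calculated differently.

```python
# Absolute Mean Error values ($/MWh) copied from the PD results table above:
# region -> (new model, previous model)
PD_AME = {
    "NSW1": (83.87, 85.02),
    "QLD1": (225.16, 206.22),
    "SA1": (46.37, 45.49),
    "TAS1": (19.24, 19.92),
    "VIC1": (25.92, 27.50),
}


def ame_reduction_pct(new, previous):
    """Percentage reduction in AME relative to the previous model (negative = AME increased)."""
    return 100.0 * (previous - new) / previous


reductions = {
    region: ame_reduction_pct(new, prev) for region, (new, prev) in PD_AME.items()
}
for region, pct in reductions.items():
    print(f"{region}: {pct:+.1f}%")
```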

7 Day Results

Metric/ Region	Actual Mean Price	Mean Prediction New Model	Mean Prediction Previous Model	Absolute Mean Error New Model	Absolute Mean Error Previous Model	Average Error New Model	Average Error Previous Model
NSW1	89.0	70.8	60.6	72.2	74.6	-18.2	-28.4
QLD1	203.4	175.6	424.8	202.6	391.5	-27.8	221.3
SA1	31.1	45.4	79.0	51.8	83.0	14.3	47.9
TAS1	35.4	44.5	32.2	22.6	23.3	9.1	-3.1
VIC1	13.0	30.5	24.8	37.1	38.8	17.4	11.7

Variables that imply binding interconnectors (such as limits and marginal values) are currently not included as features.

Model updated 9th November 2023

Summary

The retraining process resulted in two new models, one for the PD price forecast and one for the 7day price forecast. After experimentation we retrained the price forecasting models using the existing features and hyperparameter settings. The increase in accuracy (using Absolute Mean Error, AME) was between 10% and 15%, with the exception of SA, which was 50% (see the validation results tables below, which compare the previous models with the new models).

Training period

Both PD and 7day price forecast models were trained on data between 01-11-2020 and 01-11-2023.
- The market suspension period and a period after the Callide explosion were removed from the training data.

New features (input data) exceptions

The feature set is the same as the previous model. The main benefit of the new model is that more recent market data are consumed in the training process. The training does not weight the most recent market data.
- A number of experiments were conducted by changing the feature list and some hyperparameters, however this did not achieve a significant benefit in the model tests. Hence, for consistency, we have not made changes to the model other than extending the training period to the most recent market data. This is unlike the August model update, which included significant changes to both the feature set and the hyperparameters.
Validation set is from the 1st of November to the 7th of November. Oddly enough, the Melbourne Cup did not have a significant impact on the performance metrics.

	Predispatch model	p7day model
Model_id	20231108_142510_CB_2	20231108_140809_CB_7D_2
Previous model release date	8th August 2023	8th August 2023
Latest release date (after retraining)	9th November 2023	9th November 2023
Machine Learning Algorithm	Ensemble predictor: gradient boosting	Ensemble predictor: gradient boosting
Training Period	1st October 2020 to 31st October 2023	1st October 2020 to 31st October 2023
Hyper-parameter type	A	A
Hyper-parameter notes	No weighting across time periods	No weighting across time periods
Target	Dispatchprice (30min resolution)	Dispatchprice (30min resolution)
Input (and training) variables	PREDISPATCHPRICE rrp DISPATCHPRICE last 3 day average actual RRP for same time PREDISPATCHREGIONSUM (intervention = 0) hours_out time_of_day day_type (weekday_or_weekend) dispatchableload netinterchage (availablegeneration - uigf) (dispatchablegeneration - scheduled_clearedmw) ss_solar_clearedmw ss_solar_uigf ss_wind_clearedmw ss_wind_uigf totaldemand totalintermittentgeneration	STPASA_REGIONSOLUTION hours_out time_of_day day_type (weekday_or_weekend) aggregatepasaavailability aggregatescheduledload netinterchangeunderscarcity aggregatecapacityavailable ss_solar_cleared ss_wind_uigf ss_wind_cleared ss_solar_uigf demand10 demand50 totalintermittentgeneration demand_and_nonschedgen

Validation Results

These are the results of the best model generated through our experimentation process, compared with the results of the previous model across our validation set (our test period).

Average error is equal to the predicted price minus the actual price.

All error metrics are in $/MWh.

PD Results

Metric/Region	Absolute Mean Error New Model	Absolute Mean Error Previous Model	Average Error New Model	Average Error Previous Model
NSW1	18.83028	19.57689	0.89392379	1.70619021
VIC1	23.46394	24.8628	-0.865354636	-2.249786354
QLD1	19.29253	19.63574	-2.421301777	-0.030032336
SA1	26.06217	27.25936	0.876947057	-1.019642282
TAS1	12.37019	13.42848	-1.026951774	2.360821585

7 Day Results

Metric/Region	Absolute Mean Error New Model	Absolute Mean Error Previous Model	Average Error New Model	Average Error Previous Model
NSW1	20.45870012	22.08112427	-8.94232	0.67945
VIC1	28.11448622	29.9156995	-3.37011	1.569229
QLD1	22.74275858	30.79893723	-13.7703	15.93677
SA1	31.05123392	69.518694	5.905758	48.31805
TAS1	18.28810034	21.29027034	-2.81489	-2.97437

8th August 2023 release

Summary

The retraining process resulted in two new models, one for the PD price forecast and one for the 7day price forecast. After considerable experimentation we retrained the price forecasting models, with a considerable increase in accuracy (see the validation results tables below, which compare the previous models with the new models).

Training

Both PD and 7day price forecast models were trained on data between 01-10-2020 and 20-08-2023.
- The market suspension period and a period after the Callide explosion were removed from the training data.

New features (input data) exceptions

A new “recent prices feature” was added to give the model some context for the current pricing environment. This feature is the median price for a window around the forecast time of day over the most recent three days. For example, for a forecast of prices in 2 days' time at 11am, the feature is the median of the prices between 10am and 12pm on each of the three days before the forecast was run.
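
The recent prices feature described above could be computed along the following lines. This is a sketch under stated assumptions: the function name, the one-hour half window with inclusive end points, and the reading of "three days before the forecast was run" as the three calendar days preceding the run date are all choices made for this example.

```python
from datetime import datetime, timedelta
from statistics import median


def recent_price_feature(history, run_time, target_time,
                         half_window=timedelta(hours=1), days=3):
    """Median of historical prices near the target's time of day on the days before run_time.

    history: list of (timestamp, price) pairs of actual prices.
    """
    selected = []
    for back in range(1, days + 1):
        day = (run_time - timedelta(days=back)).date()
        centre = datetime.combine(day, target_time.time())
        lo, hi = centre - half_window, centre + half_window
        selected.extend(price for ts, price in history if lo <= ts <= hi)
    return median(selected) if selected else None


# Illustrative prices, not real market data.
run_time = datetime(2023, 8, 1, 9, 0)      # when the forecast is run
target_time = datetime(2023, 8, 3, 11, 0)  # forecast for 11am in 2 days' time
prices_by_day = {
    29: [40.0, 50.0, 60.0],
    30: [70.0, 80.0, 90.0],
    31: [100.0, 110.0, 120.0],
}
history = []
for day, prices in prices_by_day.items():
    for hour, price in zip((10, 11, 12), prices):
        history.append((datetime(2023, 7, day, hour, 0), price))
history.append((datetime(2023, 7, 31, 15, 0), 500.0))  # outside the 10am-12pm window

feature = recent_price_feature(history, run_time, target_time)
```

Using the median rather than the mean keeps a single price spike inside the window from dominating the feature.
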
A number of additional experiments were conducted on the 7 day forecast model, with the major modifications including:
- The 50th percent probability of demand features were augmented with 10th percent probability of demand data
- The new recent prices features mentioned previously
- A new time of day feature
- A new weekday/weekend feature.

	Predispatch model	p7day model
Model_id	20230729_213406_CB_2	20230213_012319_CB_7D_1
Previous model release date	20th April 2023	20th April 2023
Latest release date (after retraining)	8th August 2023	8th August 2023
Machine Learning Algorithm	Ensemble predictor: gradient boosting	Ensemble predictor: gradient boosting
Training Period	1st October 2020 to 20th July 2023	1st October 2020 to 20th July 2023
Hyper-parameter type	A	A
Hyper-parameter notes	No weighting across time periods	No weighting across time periods
Target	Dispatchprice (30min resolution)	Dispatchprice (30min resolution)
Input (and training) variables	PREDISPATCHPRICE rrp DISPATCHPRICE last 3 day average actual RRP for same time PREDISPATCHREGIONSUM (intervention = 0) hours_out time_of_day day_type (weekday_or_weekend) dispatchableload netinterchage (availablegeneration - uigf) (dispatchablegeneration - scheduled_clearedmw) ss_solar_clearedmw ss_solar_uigf ss_wind_clearedmw ss_wind_uigf totaldemand totalintermittentgeneration	STPASA_REGIONSOLUTION hours_out time_of_day day_type (weekday_or_weekend) aggregatepasaavailability aggregatescheduledload netinterchangeunderscarcity aggregatecapacityavailable ss_solar_cleared ss_wind_uigf ss_wind_cleared ss_solar_uigf demand10 demand50 totalintermittentgeneration demand_and_nonschedgen

Validation Results

These are the results of the best model generated through our experimentation process, compared with the results of the previous model across our validation set (our test period).

Average error is equal to Predicted Prices minus Actual Prices.

All error metrics are in $/MWh.

PD Results

Metric/Region	Absolute Mean Error New Model	Absolute Mean Error Previous Model	Average Error New Model	Average Error Previous Model
NSW1	37.16954	63.55691	19.63544	38.2957
VIC1	29.91769	37.21212	6.468355	2.998254
QLD1	36.46522	132.3778	15.79482	112.1593
SA1	45.14167	65.13762	19.56986	35.5676
TAS1	19.53866	17.71054	9.396361	6.877853

7 Day Results

Metric/Region	Absolute Mean Error New Model	Absolute Mean Error Previous Model	Average Error New Model	Average Error Previous Model
NSW1	34.06431	48.10104	15.01693	37.08693
VIC1	31.66891	44.83715	1.215206	19.20936
QLD1	39.10014	53.91976	17.32682	35.32128
SA1	45.46252	104.0136	21.62486	78.54663
TAS1	24.07958	45.70449	14.92562	39.71673

Variables that imply binding interconnectors (such as limits and marginal values) are currently not included as features.

Previous release history

6th June 2023

Summary

The current model was trained using data up until 30th March 2023. Retraining the model included a month's data after the retirement of Liddell. We also conducted many experiments by changing the feature set (input variables) as well as boosting (weighting) more recent training periods.

Much to our surprise, the original model performed better than, or roughly equal to, all retrained and new models when measured against the validation set (a week of recent data not used in the training set). As a result we felt it best to keep the current model, and we did not change the model as scheduled for the 6th of June.
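
For illustration, the boosting (weighting) of more recent training periods mentioned in this summary could take a form like the sketch below, where each training sample gets a weight that decays with its age. The exponential form, the 180 day half-life and the function name are assumptions; the weighting scheme used in the experiments is not described on this page. Many gradient boosting libraries accept per-sample weights of this kind when fitting.

```python
from datetime import date


def recency_weights(sample_dates, reference_date, half_life_days=180.0):
    """Exponential-decay weights: a sample half_life_days older than reference_date gets weight 0.5."""
    return [
        0.5 ** ((reference_date - d).days / half_life_days)
        for d in sample_dates
    ]


# Illustrative sample dates: 0, 180 and 360 days before the end of training.
dates = [date(2023, 3, 30), date(2022, 10, 1), date(2022, 4, 4)]
weights = recency_weights(dates, reference_date=date(2023, 3, 30))
```

Setting every weight to 1 recovers the unweighted training used in the released models ("No weighting across time periods").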

Why retraining did not result in a better model (for 6th June)

There are a number of reasons why the original model performed better than all other retrained models, however we believe that this was due to the volatile market dynamics following the Liddell retirement, and that market behaviour has since settled. Models that included the weeks after the retirement of Liddell are less representative of current market dynamics and now overstate forecast prices relative to actual market outcomes.
