Summary

pd4castr 7-day price forecasts are significantly more accurate than AEMO’s forecasts. This is primarily because participants are only obliged to submit accurate available capacity in their initial bids; price band volumes are not restructured until within, or just prior to, the predispatch time horizon.

pd4castr PD price forecasts are generally more accurate than AEMO forecasts when the forecast period is many hours into the future. AEMO price forecasts change considerably from the first predispatch run to later predispatch runs as the forecast period approaches. This is primarily caused by participants' unit commitments and by the repricing of volume to best match each participant's contract/portfolio position.

pd4castr forecasts are produced in a matter of seconds by a model formulated by a Machine Learning Algorithm (MLA) trained on historical data. This model effectively anticipates changes in participant behaviour from known market conditions, before those changes actually occur in real time.

Current model details

Machine learning algorithm details

Algo type:

Random Forest Regression

Hyperparameters:

No weighting is currently applied across training time periods. We are likely to add weighting that favours more recent training periods.

Hyperparameter values will be disclosed in later release stages.
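If weighting of recent training periods is added, one simple option is exponential decay by sample age. The sketch below is illustrative only: the helper name `recency_weights`, the exponential form and the one-year half-life are all assumptions, not a scheme pd4castr has committed to.

```python
from datetime import date

def recency_weights(sample_dates, reference_date, half_life_days=365.0):
    """Return one weight per sample, halving for every `half_life_days` of age.

    Hypothetical helper: the exponential form and the default half-life
    are illustrative assumptions, not pd4castr's weighting.
    """
    weights = []
    for d in sample_dates:
        age_days = (reference_date - d).days
        if age_days < 0:
            raise ValueError("sample is dated after the reference date")
        weights.append(0.5 ** (age_days / half_life_days))
    return weights

# End of the training window as stated on this page; sample dates are made up.
end = date(2022, 11, 27)
samples = [date(2018, 3, 1), date(2021, 11, 27), end]
weights = recency_weights(samples, end)
```

Here the newest sample gets a weight of 1.0, the sample exactly one half-life (365 days) older gets 0.5, and the 2018 sample gets a much smaller weight. Weights like these could then be passed to any training routine that accepts per-sample weights.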

Input variables (features) for pd4castr predispatch

Note that the model uses more input variables than are shown in the Model Input Chart on the Home page of pd4castr. The chart variables are representative of the model's key variables, but changes in other variables will also affect the model forecast.

Variables that imply binding interconnectors (such as limits and marginal values) are currently not included as features.

Target:

dispatchprice by region

Training period

From March 2018 to 27th of November 2022.

List of features:

From Predispatch tables

HOURS_OUT,
PD_RRP_NSW1, PD_RRP_QLD1, PD_RRP_SA1, PD_RRP_TAS1, PD_RRP_VIC1,
DISPATCHABLELOAD_NSW1, DISPATCHABLELOAD_QLD1, DISPATCHABLELOAD_SA1, DISPATCHABLELOAD_TAS1, DISPATCHABLELOAD_VIC1,
NETINTERCHANGE_NSW1, NETINTERCHANGE_QLD1, NETINTERCHANGE_SA1, NETINTERCHANGE_TAS1, NETINTERCHANGE_VIC1,
RRP_ROLLING_NSW1, RRP_ROLLING_QLD1, RRP_ROLLING_SA1, RRP_ROLLING_TAS1, RRP_ROLLING_VIC1,
SCHEDULED_AVAIL_NSW1, SCHEDULED_AVAIL_QLD1, SCHEDULED_AVAIL_SA1, SCHEDULED_AVAIL_TAS1, SCHEDULED_AVAIL_VIC1,
SCHEDULED_GEN_NSW1, SCHEDULED_GEN_QLD1, SCHEDULED_GEN_SA1, SCHEDULED_GEN_TAS1, SCHEDULED_GEN_VIC1,
SS_SOLAR_CLEAREDMW_NSW1, SS_SOLAR_CLEAREDMW_QLD1, SS_SOLAR_CLEAREDMW_SA1, SS_SOLAR_CLEAREDMW_TAS1, SS_SOLAR_CLEAREDMW_VIC1,
SS_SOLAR_UIGF_NSW1, SS_SOLAR_UIGF_QLD1, SS_SOLAR_UIGF_SA1, SS_SOLAR_UIGF_TAS1, SS_SOLAR_UIGF_VIC1,
SS_WIND_CLEAREDMW_NSW1, SS_WIND_CLEAREDMW_QLD1, SS_WIND_CLEAREDMW_SA1, SS_WIND_CLEAREDMW_TAS1, SS_WIND_CLEAREDMW_VIC1,
SS_WIND_UIGF_NSW1, SS_WIND_UIGF_QLD1, SS_WIND_UIGF_SA1, SS_WIND_UIGF_TAS1, SS_WIND_UIGF_VIC1,
TOTALDEMAND_NSW1, TOTALDEMAND_QLD1, TOTALDEMAND_SA1, TOTALDEMAND_TAS1, TOTALDEMAND_VIC1,
TOTALINTERMITTENTGENERATION_NSW1, TOTALINTERMITTENTGENERATION_QLD1, TOTALINTERMITTENTGENERATION_SA1, TOTALINTERMITTENTGENERATION_TAS1, TOTALINTERMITTENTGENERATION_VIC1,
IS_5MS
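The list above follows a regular pattern: twelve per-region variables repeated across the five regions, plus HOURS_OUT and IS_5MS. A minimal sketch that rebuilds the same column names is shown below; the constant and function names (`REGIONS`, `REGIONAL_STEMS`, `feature_names`) are hypothetical and chosen for illustration.

```python
REGIONS = ["NSW1", "QLD1", "SA1", "TAS1", "VIC1"]

# Per-region variable stems, in the order they appear in the list above.
REGIONAL_STEMS = [
    "PD_RRP", "DISPATCHABLELOAD", "NETINTERCHANGE", "RRP_ROLLING",
    "SCHEDULED_AVAIL", "SCHEDULED_GEN",
    "SS_SOLAR_CLEAREDMW", "SS_SOLAR_UIGF",
    "SS_WIND_CLEAREDMW", "SS_WIND_UIGF",
    "TOTALDEMAND", "TOTALINTERMITTENTGENERATION",
]

def feature_names():
    """Build the full feature list: HOURS_OUT, stem x region, IS_5MS."""
    names = ["HOURS_OUT"]
    for stem in REGIONAL_STEMS:
        names.extend(f"{stem}_{region}" for region in REGIONS)
    names.append("IS_5MS")
    return names

FEATURES = feature_names()
print(len(FEATURES))  # 62 = 12 stems x 5 regions + HOURS_OUT + IS_5MS
```

Generating the names this way makes it easy to check that a training data set contains every expected column before a model is trained or run.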

Formulating a model

An MLA is used to formulate a model. A model is essentially a very large equation that consumes real-time data and outputs the price forecast. Formulating a new model takes a long time, and retraining an existing model on new historical information takes many hours. Running a model takes only a few seconds.
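To illustrate the split between slow training and fast running, the toy sketch below follows the spirit of Random Forest Regression, the algorithm type listed earlier: each tree is fitted on a bootstrap sample with a randomly chosen feature, and the forecast is the average of all tree outputs. It is not pd4castr's implementation. Each tree here is a single-split stump to keep the code short, the function names are hypothetical, and the demand and price figures are invented.

```python
import random
from statistics import mean

def fit_stump(rows, targets, feature):
    """Fit a one-split regression tree on one feature (least squares)."""
    best = None
    values = sorted({r[feature] for r in rows})
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
        right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
        l_mean, r_mean = mean(left), mean(right)
        sse = (sum((t - l_mean) ** 2 for t in left)
               + sum((t - r_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, l_mean, r_mean)
    if best is None:  # feature is constant in this bootstrap sample
        m = mean(targets)
        return (feature, float("inf"), m, m)
    return (feature, best[1], best[2], best[3])

def train_forest(rows, targets, n_trees=50, seed=0):
    """Training: the expensive step, repeated for every tree."""
    rng = random.Random(seed)
    features = list(rows[0].keys())
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        feature = rng.choice(features)                  # random feature
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feature))
    return forest

def predict(forest, row):
    """Running the model: evaluate each tree and average the outputs."""
    return mean(l if row[f] <= thr else r for f, thr, l, r in forest)

# Invented training data: two features and a price target in $/MWh.
rows = [
    {"TOTALDEMAND_NSW1": 6500.0, "HOURS_OUT": 30.0},
    {"TOTALDEMAND_NSW1": 7000.0, "HOURS_OUT": 12.0},
    {"TOTALDEMAND_NSW1": 7500.0, "HOURS_OUT": 24.0},
    {"TOTALDEMAND_NSW1": 9500.0, "HOURS_OUT": 6.0},
    {"TOTALDEMAND_NSW1": 10000.0, "HOURS_OUT": 18.0},
    {"TOTALDEMAND_NSW1": 10500.0, "HOURS_OUT": 36.0},
]
prices = [60.0, 65.0, 70.0, 180.0, 200.0, 220.0]

forest = train_forest(rows, prices)
forecast = predict(forest, {"TOTALDEMAND_NSW1": 9800.0, "HOURS_OUT": 12.0})
```

Once trained, the forest is just a list of thresholds and averages, which is why producing a forecast is fast compared with training. A production forest would grow much deeper trees over all the features and would normally come from an established library.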

The steps taken to formulate a model are as follows:

The MLA: First, you must understand the plethora of available MLAs and choose the one that best matches the problem you’re trying to solve.
The Features: Features are the input variables used to predict the target (the price forecast). Once you’ve selected your MLA, you need to experiment with and determine the set of features you will use to train the MLA.
Training periods: Once you’ve selected the MLA and the features, you need to determine the historical time periods used to train the MLA. You can also weight time periods; for example, you may weight recent time periods more heavily than distant ones.
Tuning: There are a number of hyperparameters that you need to experiment with and tune in order to derive a meaningful model. These hyperparameters are part of the MLA.
Performance: Next, you need to assess the performance of each model by testing it and comparing it with other models and, finally, with AEMO predispatch.
Release: Once you’re satisfied with a model, you can release it to production. In time, we expect users to be able to select from more than one model, each producing a price forecast in real time.
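The performance step can be made concrete with an error metric. The sketch below assumes mean absolute error (MAE); this page does not state which metric pd4castr uses. All prices are invented for illustration and say nothing about real forecast performance.

```python
from statistics import mean

def mean_absolute_error(forecast, actual):
    """Average absolute gap between forecast and actual prices."""
    if len(forecast) != len(actual):
        raise ValueError("forecast and actual must be the same length")
    return mean(abs(f - a) for f, a in zip(forecast, actual))

# Made-up prices in $/MWh for four intervals; not real market data.
actual_price = [80.0, 95.0, 300.0, 120.0]
candidate_fcst = [85.0, 90.0, 250.0, 110.0]   # hypothetical candidate model
benchmark_fcst = [70.0, 100.0, 150.0, 100.0]  # hypothetical benchmark forecast

candidate_mae = mean_absolute_error(candidate_fcst, actual_price)
benchmark_mae = mean_absolute_error(benchmark_fcst, actual_price)
print(candidate_mae, benchmark_mae)  # 17.5 46.25
```

In practice the same comparison could be grouped by HOURS_OUT, so that accuracy far ahead of the forecast period is assessed separately from accuracy close to it.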

Retraining the model

Typically, retraining simply adds the most recent historical market information, i.e. market outcomes that have come to pass since the last retraining effort. However, retraining could involve any of the steps listed in “Formulating a model” (above).
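The typical case, adding only the outcomes observed since the last retraining, might look like the sketch below. The row layout (timestamp, features, price), the function name `extend_training_set` and the sample values are assumptions made for illustration.

```python
from datetime import datetime

def extend_training_set(training_rows, market_rows, last_cutoff):
    """Add market rows observed after `last_cutoff` to the training set.

    Rows are (timestamp, features, price) tuples; this layout is a
    hypothetical one chosen for the sketch.
    Returns the extended list and the new cutoff timestamp.
    """
    fresh = [row for row in market_rows if row[0] > last_cutoff]
    extended = training_rows + sorted(fresh, key=lambda row: row[0])
    new_cutoff = max((row[0] for row in fresh), default=last_cutoff)
    return extended, new_cutoff

# Illustrative values only.
cutoff = datetime(2022, 11, 27)
history = [(datetime(2022, 11, 26, 12), {"HOURS_OUT": 4.0}, 90.0)]
incoming = [
    (datetime(2022, 11, 26, 12), {"HOURS_OUT": 4.0}, 90.0),  # already used
    (datetime(2022, 11, 28, 9), {"HOURS_OUT": 2.0}, 110.0),  # new outcome
]
history, cutoff = extend_training_set(history, incoming, cutoff)
```

The model would then be refitted on the extended set, and the returned cutoff recorded for the next retraining run.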

The diagram below shows the retraining process.

[Image: retraining process diagram (image-20221130-023431.png)]