Summary
pd4castr 7-day price forecasts are significantly more accurate than AEMO’s forecasts. This is primarily because participants are only obligated to submit accurate available capacity in their initial bids. Price band volumes are generally not restructured to represent genuine intent until within, or just prior to, the predispatch time horizon.
pd4castr PD price forecasts are generally more accurate than AEMO forecasts when the forecast period is many hours into the future. AEMO price forecasts may change considerably from the first predispatch run to later predispatch runs as the forecast period approaches. This is primarily caused by participants' unit commitments and the repricing of volume to best match a participant's contract/portfolio position.
pd4castr forecasts are derived in a matter of seconds by a model which is formulated by a Machine Learning Algorithm (MLA) trained with historical data. Based on known market conditions, this model effectively anticipates changes in participant behaviour before those changes actually occur in real time.
Current model details
Machine learning algorithm details
Algo type:
Random Forest Regression
Hyper parameters:
No weighting across time periods. It is likely that we’ll add weighting to more recent training periods.
To be disclosed with later release stages.
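As an illustration of what weighting more recent training periods could look like, here is a minimal sketch. It is not the pd4castr implementation: the function name `recency_weights`, the exponential-decay scheme and the one-year half-life are all assumptions chosen for the example.

```python
from datetime import date


def recency_weights(period_ends, as_of, half_life_days=365.0):
    """Return one weight per training period (hypothetical helper).

    A period ending on `as_of` gets weight 1.0, and the weight halves
    for every `half_life_days` of age. The decay scheme is an assumption.
    """
    weights = []
    for end in period_ends:
        age_days = (as_of - end).days
        if age_days < 0:
            raise ValueError("training period ends after the as-of date")
        weights.append(0.5 ** (age_days / half_life_days))
    return weights


# Two example period end dates, exactly one year apart.
weights = recency_weights(
    [date(2021, 11, 27), date(2022, 11, 27)],
    as_of=date(2022, 11, 27),
)
print(weights)
```

Weights of this kind would typically be supplied to the training routine as per-sample weights. For example, scikit-learn's `RandomForestRegressor.fit` accepts a `sample_weight` argument.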
Input variables (features) for pd4castr predispatch
Note that more variables are used as inputs into the model than are presented in the Model Input Chart on the Home page of pd4castr. Chart variables are representative of the key variables of the model, but changes in other variables will also impact the model forecast.
Info |
---|
Variables that imply binding interconnectors (such as limits and marginal values) are currently not included as features. |
Target:
dispatchprice by region
Training period
From March 2018 to 27th of November 2022.
List of features:
From Predispatch tables
HOURS_OUT,
PD_RRP_NSW1, PD_RRP_QLD1, PD_RRP_SA1, PD_RRP_TAS1, PD_RRP_VIC1,
DISPATCHABLELOAD_NSW1, DISPATCHABLELOAD_QLD1, DISPATCHABLELOAD_SA1, DISPATCHABLELOAD_TAS1, DISPATCHABLELOAD_VIC1,
NETINTERCHANGE_NSW1, NETINTERCHANGE_QLD1, NETINTERCHANGE_SA1, NETINTERCHANGE_TAS1, NETINTERCHANGE_VIC1,
RRP_ROLLING_NSW1, RRP_ROLLING_QLD1, RRP_ROLLING_SA1, RRP_ROLLING_TAS1, RRP_ROLLING_VIC1,
SCHEDULED_AVAIL_NSW1, SCHEDULED_AVAIL_QLD1, SCHEDULED_AVAIL_SA1, SCHEDULED_AVAIL_TAS1, SCHEDULED_AVAIL_VIC1,
SCHEDULED_GEN_NSW1, SCHEDULED_GEN_QLD1, SCHEDULED_GEN_SA1, SCHEDULED_GEN_TAS1, SCHEDULED_GEN_VIC1,
SS_SOLAR_CLEAREDMW_NSW1, SS_SOLAR_CLEAREDMW_QLD1, SS_SOLAR_CLEAREDMW_SA1, SS_SOLAR_CLEAREDMW_TAS1, SS_SOLAR_CLEAREDMW_VIC1,
SS_SOLAR_UIGF_NSW1, SS_SOLAR_UIGF_QLD1, SS_SOLAR_UIGF_SA1, SS_SOLAR_UIGF_TAS1, SS_SOLAR_UIGF_VIC1,
SS_WIND_CLEAREDMW_NSW1, SS_WIND_CLEAREDMW_QLD1, SS_WIND_CLEAREDMW_SA1, SS_WIND_CLEAREDMW_TAS1, SS_WIND_CLEAREDMW_VIC1,
SS_WIND_UIGF_NSW1, SS_WIND_UIGF_QLD1, SS_WIND_UIGF_SA1, SS_WIND_UIGF_TAS1, SS_WIND_UIGF_VIC1,
TOTALDEMAND_NSW1, TOTALDEMAND_QLD1, TOTALDEMAND_SA1, TOTALDEMAND_TAS1, TOTALDEMAND_VIC1,
TOTALINTERMITTENTGENERATION_NSW1, TOTALINTERMITTENTGENERATION_QLD1, TOTALINTERMITTENTGENERATION_SA1, TOTALINTERMITTENTGENERATION_TAS1, TOTALINTERMITTENTGENERATION_VIC1, IS_5MS
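The feature names listed above follow a regular `<VARIABLE>_<REGION>` pattern, so they can be generated rather than typed out by hand. The sketch below covers only the names shown in this list. The function name `build_feature_names` is hypothetical, and the ordering is an assumption.

```python
from itertools import product

# Region identifiers as they appear in the feature names above.
REGIONS = ["NSW1", "QLD1", "SA1", "TAS1", "VIC1"]

# Per-region variable stems, copied from the list above.
REGIONAL_STEMS = [
    "PD_RRP", "DISPATCHABLELOAD", "NETINTERCHANGE", "RRP_ROLLING",
    "SCHEDULED_AVAIL", "SCHEDULED_GEN",
    "SS_SOLAR_CLEAREDMW", "SS_SOLAR_UIGF",
    "SS_WIND_CLEAREDMW", "SS_WIND_UIGF",
    "TOTALDEMAND", "TOTALINTERMITTENTGENERATION",
]


def build_feature_names():
    """Return HOURS_OUT, every stem/region combination, then IS_5MS."""
    regional = [
        f"{stem}_{region}" for stem, region in product(REGIONAL_STEMS, REGIONS)
    ]
    return ["HOURS_OUT"] + regional + ["IS_5MS"]


feature_names = build_feature_names()
print(len(feature_names), feature_names[:3])
```

Generating the names this way makes it harder for a region to be missed when a new variable is added.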
...
Formulating a model
An MLA is used to formulate a model. A model is effectively a big equation that consumes real-time data (predispatch and STPASA) and outputs the price forecast. Formulating a model takes a long time, and retraining a known model (using new historical information) takes many hours. However, once a model has been formulated, running it takes only a few seconds.
The steps taken to formulate a model are as follows:
The MLA: First you must understand the plethora of available MLAs and choose one that best matches the problem you’re trying to solve. For example, the MLA “random forest regression” is great for forecasting a value for a system that follows a well-behaved distribution. However, spot prices do not follow a well-behaved distribution, hence the selected MLA must be flexible enough to predict relatively frequent outlier outcomes.
The Features: Features is the name given to the input variables used to predict the target (price forecast). Once you’ve selected your MLA, you need to experiment with various sets of features and determine the set you will use to train the MLA.
Training periods: Once you’ve selected the MLA and Features then you need to determine the historical time periods to train the MLA. You can also weight time periods, for example you may weight recent time periods more than distant periods.
Tuning: There are a number of Hyper Parameters that you need to experiment with and tune in order to derive a meaningful model. These hyper parameters are part of the MLA.
Performance: Next you need to assess the performance of each model by testing it and comparing it with other models and, finally, with AEMO predispatch.
Release: Once you’re satisfied with a model you can release it to production. In time, we expect there to be more than one model that the user may select which will produce a price forecast in real time.
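The Performance step above can be sketched as ranking candidate forecasts against actual prices by an error metric. This is only an illustration: mean absolute error is an assumed choice of metric, the helper names are hypothetical, and the prices are made-up placeholders, not market data or pd4castr results.

```python
def mean_absolute_error(forecast, actual):
    """Average absolute difference between forecast and actual prices."""
    if len(forecast) != len(actual) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)


def compare_forecasts(candidates, actual):
    """Return (name, MAE) pairs sorted from most to least accurate."""
    scores = {
        name: mean_absolute_error(series, actual)
        for name, series in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Made-up placeholder prices ($/MWh), purely to exercise the functions.
actual = [80.0, 120.0, 300.0, 95.0]
candidates = {
    "model_a": [85.0, 110.0, 250.0, 100.0],
    "model_b": [90.0, 100.0, 280.0, 95.0],
    "benchmark": [70.0, 150.0, 150.0, 90.0],
}

ranking = compare_forecasts(candidates, actual)
for name, mae in ranking:
    print(f"{name}: MAE = {mae:.2f}")
```

In practice the comparison would be made on historical periods held out from training, and could be broken down by how many hours ahead each forecast was made.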
...
Retraining the model
Retraining improves the price forecast because it includes the most recent historical market information, i.e. market outcomes that have come to pass since the last retraining effort. Typically, retraining will simply add this new information. However, retraining could involve any of the steps listed in “Formulating a model” (above).
...