
Dear all,
I worked on this as my semester project in econometrics. I know predicting the market sucks, but that's what econometrics is all about, so I could not avoid it :o) So here we go!
Quant Trading: Averaging Point Estimates of Linear Regressions as a Way for More Accurate Predictions
The goal of this article is to find out whether averaging the point estimates of several different linear regressions leads to a smaller RMSE (root-mean-square error; the lower the value, the better the predictions). Everything is programmed in Python 3.5 and can be shared upon request. I used AUD/JPY 1-minute data for these estimations, chosen at random to avoid any bias in the research.
My first approach was to multiply the probability density functions of the individual predictions (strictly these should follow Student's t-distribution, but for bigger data samples they can be approximated by the normal distribution). It did not succeed, because the data are heteroskedastic, so the model would no longer be BLUE (just LUE). That is why point estimates are used instead.
Classical Assumptions:
Most of the classical assumptions are not met, but as long as we use only point estimates we are fine, because the estimates do not become biased. If anyone is interested in this topic, I can add more. :)
The Economic Model:
There is not much to write about here, because if you are reading this you are probably an experienced forexmospherian (if not, start reading all the great articles around here), so you know the background ideas and the fxmospherians' philosophy.
The calculation explained in one simple picture:

Econometric Model:
A simple regression with a deterministic trend is used for every prediction.
Y_t = α + δ·t + ε_t
RMSE is used to decide which model is the best one.
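A minimal sketch of these two building blocks, not the original project code: the trend regression is fitted with numpy.polyfit and forecasts are scored with RMSE. The function names fit_trend, forecast_trend and rmse are hypothetical, and the toy series is made up.

```python
import numpy as np

def fit_trend(y):
    """OLS fit of y_t = alpha + delta * t, with t = 0 .. len(y) - 1."""
    t = np.arange(len(y))
    delta, alpha = np.polyfit(t, y, deg=1)  # coefficients come highest power first
    return alpha, delta

def forecast_trend(y, steps):
    """Extrapolate the fitted line to T+1 .. T+steps (T is the last observation)."""
    alpha, delta = fit_trend(y)
    future_t = np.arange(len(y), len(y) + steps)
    return alpha + delta * future_t

def rmse(predicted, actual):
    """Root-mean-square error between two equally long sequences."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

# Toy, noise-free series y_t = 2 + 0.5 * t (illustration only)
y = 2.0 + 0.5 * np.arange(10)
print(forecast_trend(y, 3))
print(rmse([1.0, 2.0], [1.0, 4.0]))  # sqrt(2), about 1.414
```

On real price data the residuals are of course not zero, so the fitted line only gives a point estimate for each future step.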

We move every model forward n times (n equals number_of_data - max_lin_reg_len). Before each move, p predictions are made (here for times T+1 to T+5) and their errors are accumulated into RMSE1 ... RMSEp. In every data window, predictions are made by m models with different linear regression lengths, i.e.
Mm1: Model with the longest linear regression. Referred to as T = 100 in the 1st pic.
...
Mmm: Model with the shortest linear regression. Referred to as T = 50 in the 1st pic.
M0: Naive model, where the predicted value equals the last close value of the given time series.
M1: Model with the average of two predictions. Its p-th prediction equals the mean of the p-th predictions of the two regressions.
M2: Model with the average of all m predictions, one from each linear regression model.
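The procedure above can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the names trend_forecast and evaluate, the pair argument and the synthetic random walk are all hypothetical, and the loop stops a few bars before the end of the data so that every forecast can be scored, which is a detail the description leaves open.

```python
import numpy as np

def trend_forecast(window, horizon):
    """Fit y_t = alpha + delta * t by OLS and extrapolate to T+1 .. T+horizon."""
    t = np.arange(len(window))
    delta, alpha = np.polyfit(t, window, deg=1)
    return alpha + delta * (len(window) - 1 + np.arange(1, horizon + 1))

def evaluate(prices, lengths, pair=(0, -1), horizon=5):
    """Per-horizon RMSE for each regression length and for M0, M1 and M2."""
    prices = np.asarray(prices, dtype=float)
    max_len = max(lengths)
    # Assumption: stop `horizon` bars early so each forecast has realised values.
    n_moves = len(prices) - max_len - horizon + 1
    if n_moves < 1:
        raise ValueError("not enough data for the longest regression")
    names = ["len_%d" % length for length in lengths] + ["M0", "M1", "M2"]
    sq_err = {name: np.zeros(horizon) for name in names}
    for start in range(n_moves):
        end = start + max_len                  # index one past the current bar T
        actual = prices[end:end + horizon]     # realised values at T+1 .. T+horizon
        preds = np.array([trend_forecast(prices[end - length:end], horizon)
                          for length in lengths])
        candidates = dict(zip(names[:len(lengths)], preds))
        candidates["M0"] = np.full(horizon, prices[end - 1])         # last close
        candidates["M1"] = (preds[pair[0]] + preds[pair[1]]) / 2.0   # two regressions
        candidates["M2"] = preds.mean(axis=0)                        # all m regressions
        for name, pred in candidates.items():
            sq_err[name] += (pred - actual) ** 2
    return {name: np.sqrt(total / n_moves) for name, total in sq_err.items()}

# Synthetic random walk for illustration only, NOT the AUD/JPY data.
rng = np.random.default_rng(0)
prices = 80.0 + np.cumsum(rng.normal(0.0, 0.01, size=400))
result = evaluate(prices, lengths=[100, 75, 50])
for name, values in result.items():
    print(name, np.round(values, 4))
```

Each returned array holds RMSE1 ... RMSEp for one model, so the models can be compared horizon by horizon.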

Data Tested:
AUD/JPY, from 18th to 23rd December.

Testing model defined as M1
It is described in more detail in my project, unfortunately in Czech, so if you are interested I can send the original version. Here it is just briefly in English.
This graph shows how robust the average of two models is as a better point estimator. It goes through all combinations of the linear regression length parameters. If M1 is better than both predictions it is calculated from, it wins and is counted in the graph below.

In conclusion, averaging the estimates of two regressions is reasonable when we want to predict market behaviour for time T+3 and further. For T+3 it is the better estimator in more than 50% of combinations, for T+4 in more than 80% of combinations, and for T+5 in over 90%.
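The win counting described above could be sketched like this for a single horizon. It assumes the forecasts of every regression length have already been collected into one array. The name m1_win_share and the tiny input are hypothetical and made up for illustration.

```python
from itertools import combinations

import numpy as np

def rmse(pred, actual):
    """Root-mean-square error between two equally long arrays."""
    return float(np.sqrt(np.mean((pred - actual) ** 2)))

def m1_win_share(forecasts, actual):
    """Share of regression pairs whose averaged forecast beats both members.

    forecasts: shape (m, n), row i holds the forecasts of regression i
               at one fixed horizon over n window positions.
    actual:    shape (n,), the realised values for the same positions.
    """
    forecasts = np.asarray(forecasts, dtype=float)
    actual = np.asarray(actual, dtype=float)
    pairs = list(combinations(range(len(forecasts)), 2))
    if not pairs:
        raise ValueError("need at least two regressions")
    wins = 0
    for i, j in pairs:
        averaged = (forecasts[i] + forecasts[j]) / 2.0
        best_single = min(rmse(forecasts[i], actual), rmse(forecasts[j], actual))
        if rmse(averaged, actual) < best_single:
            wins += 1
    return wins / len(pairs)

# Made-up numbers: three "models" forecasting a series that is always 0.
actual = np.zeros(4)
forecasts = [[1.0] * 4, [-1.0] * 4, [2.0] * 4]
print(m1_win_share(forecasts, actual))  # 2 of 3 pairs win
```

In the toy input the average only wins when the two forecasts err in opposite directions or when one of them is far off, which is the intuition behind averaging point estimates.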
Testing model defined as M2
The same approach is used as in the previous chapter, but instead of 2 linear regressions, m of them are used and their predictions averaged.

The conclusion here is that averaging m models as a prediction method does not pay off. Two was optimal.
Comparison with naive Model
What we expect from all this work is to beat the RMSE of the naive model.

To my big surprise, I was not able to beat the naive model :o) Functions trying to do so were implemented in the code (check_if_beta_stat_signif, check_all_betas_direction, check_for_mean_rev_trading) and tested, but without any success.
Conclusion
We found out that for predicting the behaviour of currency pairs at time T+3 and further, it is better to use the average of the point predictions of two linear regressions with different lengths than either regression alone.
Estimates based on the average of more than two linear regressions turned out to be useless.
Naive model rulez. hehe
I hope you enjoyed this article. You can also check my latest one, about interventions on EUR/CZK: Everything You Need to Know About Peg on EUR/CZK and How to Trade it Afterwards
Regards,
Daniel
EDIT: Some fonts are not drawn correctly, so sorry for that.
Comments
But I doubt how much this will work in real-life trading; for me the market is random.
Regards
My pleasure and I am glad you like it :)
Surely, there is randomness in prices but simultaneously there is none, that's the beauty of it ;))
I would be very surprised if it did, and it would certainly indicate a good model, but only if the naive model were beaten, and unfortunately it was not. So the overall conclusion is what you say: not much chance of it working in real trading.
But for further research it has brought a useful message for others, because it could be a way to reach a better confidence interval. By multiplying two normal probability density functions with sigma_1 and sigma_2, your new sigma_12 will be smaller than both sigma_1 and sigma_2, which leads to narrower confidence intervals. But I found it could be a good path only for 2 different regressions and only for time T+3 (and further), which is a little bit limiting.
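A small sketch of why sigma_12 shrinks, using the standard result for the normalised product of two normal densities. The function names are hypothetical and the numbers are made up. Note that this treats the two predictions as independent, which forecasts from overlapping regression windows are not.

```python
import math

def product_sigma(sigma_1, sigma_2):
    """Standard deviation of the normalised product of two normal densities."""
    return math.sqrt((sigma_1 ** 2 * sigma_2 ** 2) / (sigma_1 ** 2 + sigma_2 ** 2))

def product_mean(mu_1, sigma_1, mu_2, sigma_2):
    """Mean of the product density: a precision-weighted average of the means."""
    w_1, w_2 = 1.0 / sigma_1 ** 2, 1.0 / sigma_2 ** 2
    return (w_1 * mu_1 + w_2 * mu_2) / (w_1 + w_2)

sigma_12 = product_sigma(3.0, 4.0)
print(sigma_12)                            # about 2.4, below both 3 and 4
print(product_mean(0.0, 1.0, 2.0, 1.0))    # equal sigmas give the plain average, 1.0
```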
nice work, regression will always differ depending on the starting point in time
Thanks Viktor :)