Hi, I just read this post and, as a fellow author, I hope my comments are helpful and constructive.
This post is poorly structured and doesn’t appear to have been reviewed (or even re-read by the author before posting):
- The table of contents doesn’t properly show what the reader is about to go through (e.g. the introduction isn’t listed anywhere)
- Titles are inconsistent (i.e. why do the 3rd and 5th sections have a number but the 4th doesn’t?)
- Content is repeated (the same code snippets appear more than once)
Also, I think it provides 0 value to the reader. While the theory explained may make sense and is written in an understandable manner, that doesn’t mean the post teaches how to use machine learning and statistical arbitrage on pairs (as ANYONE would expect when coming across it). The code itself is worthless: we’re only taught to get the data, transform it and build a random forest (RF). Many other posts do so with better explanations.
If you’re talking about mean reversion strategies, stationarity, cointegration… why don’t you use Python to compute those? “Live-test” the pair you’ve used and show the results (even if they’re bad; we expect to learn, not to get the perfect trading pair).
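To make this concrete, here is a minimal sketch of the kind of check I mean: a two-step Engle-Granger style test that estimates a hedge ratio by OLS and then computes a Dickey-Fuller t-statistic on the resulting spread. Everything in it is my own assumption rather than anything from the post: the helper names (`ols`, `df_tstat`, `engle_granger`) are hypothetical, the prices are synthetic, and no lags or critical values are included. In practice I would reach for `adfuller` and `coint` from `statsmodels.tsa.stattools`, which also return p-values.

```python
import random
from statistics import mean

def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def df_tstat(series):
    """Dickey-Fuller t-statistic: regress the first difference on the lagged
    level (with a constant, no extra lags) and return the slope's t-ratio."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    a, rho = ols(lagged, diffs)
    resid = [d - (a + rho * l) for l, d in zip(lagged, diffs)]
    sigma2 = sum(r * r for r in resid) / (len(diffs) - 2)
    ml = mean(lagged)
    sxx = sum((l - ml) ** 2 for l in lagged)
    return rho / (sigma2 / sxx) ** 0.5

def engle_granger(y, x):
    """Step 1: hedge ratio from OLS of y on x. Step 2: DF statistic of the spread."""
    a, beta = ols(x, y)
    spread = [yi - (a + beta * xi) for xi, yi in zip(x, y)]
    return beta, df_tstat(spread)

def random_walk(rng, start, n):
    """Synthetic price path (an assumption, not real market data)."""
    path = [start]
    for _ in range(n - 1):
        path.append(path[-1] + rng.gauss(0, 1))
    return path

rng = random.Random(0)
x = random_walk(rng, 100.0, 500)
y = [2.0 * xi + 5.0 + rng.gauss(0, 1) for xi in x]  # cointegrated with x by construction
unrelated = random_walk(rng, 50.0, 500)             # independent random walk

beta, t_coint = engle_granger(y, x)
_, t_unrel = engle_granger(unrelated, x)
print(f"hedge ratio: {beta:.3f}")
print(f"DF statistic, cointegrated pair: {t_coint:.2f}")
print(f"DF statistic, unrelated pair:    {t_unrel:.2f}")
```

A strongly negative statistic suggests the spread is stationary and the pair is a mean reversion candidate. The statistic has to be compared against Engle-Granger critical values rather than a normal table, which is one more reason to use the library functions on real data.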
Also, if you build a model that’s money-related, you must ALWAYS compute at least one USEFUL metric to assess its performance. Accuracy is nice and insightful, but the purpose is to make money, and that’s what should be optimized IMO.
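As a rough illustration of why accuracy alone can mislead, here is a small sketch that scores a set of directional predictions on both accuracy and money-related metrics (total return, annualized Sharpe ratio, maximum drawdown). The function name `evaluate`, the toy signals and returns, and the 252-periods-per-year convention are all my own assumptions; transaction costs and slippage are ignored.

```python
from statistics import mean, stdev

def evaluate(signals, returns, periods_per_year=252):
    """signals[t] in {-1, 0, 1} is the position held over period t;
    returns[t] is the simple return of the traded asset (or spread) over
    that period. A zero return is counted as a 'down' move for accuracy."""
    strat = [s * r for s, r in zip(signals, returns)]

    # Directional accuracy over the periods where a position was taken.
    hits = [(s > 0) == (r > 0) for s, r in zip(signals, returns) if s != 0]
    accuracy = sum(hits) / len(hits)

    # Compound the strategy returns and track the worst peak-to-trough loss.
    value, peak, max_drawdown = 1.0, 1.0, 0.0
    for r in strat:
        value *= 1 + r
        peak = max(peak, value)
        max_drawdown = max(max_drawdown, 1 - value / peak)

    sd = stdev(strat)
    sharpe = mean(strat) / sd * periods_per_year ** 0.5 if sd > 0 else float("nan")
    return {
        "accuracy": accuracy,
        "total_return": value - 1,
        "sharpe": sharpe,
        "max_drawdown": max_drawdown,
    }

# Toy data: the model is right five times out of six,
# but the single miss is far larger than the wins.
signals = [1, 1, -1, 1, -1, 1]
returns = [0.001, 0.002, -0.001, -0.03, -0.002, 0.001]

metrics = evaluate(signals, returns)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this toy case the directional accuracy is high while the total return is negative, which is exactly the situation an accuracy-only evaluation would hide.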
I hope my humble opinion helps make future content better.