Tuesday, 27 November 2018

Notes on Reinforcement Learning and Machine Learning for Cryptocurrency Trading

Last week I gave a talk at the Google Developer Group in Reading about some work I've been doing investigating the possibilities around using machine learning to drive an automated trading platform.

The talk walked through the basics of reinforcement learning and technical analysis, and the basic ideas behind trading for a profit. It also covered using technical analysis in feature engineering for the model, and how to architect a system as a collection of microservices that process messages from PubSub to predict on live data and execute trades.

One of the key problems touched on was that, with an unfiltered stream of data from the trading platform, trying to classify trends as 'good' or 'bad' means that small mistakes incur the costs of buying and selling, and the bot can end up thrashing its account balance away.
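To put a rough number on the thrashing problem, here is a minimal sketch of how fees alone erode a balance over repeated round trips. The 0.1% fee per side is an assumed figure for illustration, not the rate of any particular exchange, and `balance_after_round_trips` is a hypothetical helper.

```python
def balance_after_round_trips(balance, n_trips, fee_rate, move_per_trip=0.0):
    """Balance after n buy-then-sell round trips.

    fee_rate is charged on each side (buy and sell); move_per_trip is the
    fractional price change captured between the buy and the sell.
    """
    for _ in range(n_trips):
        balance *= (1 - fee_rate)        # fee on the buy
        balance *= (1 + move_per_trip)   # price move while holding
        balance *= (1 - fee_rate)        # fee on the sell
    return balance


# Assumed fee of 0.1% per side, and trades that on average capture no move.
flat = balance_after_round_trips(1.0, n_trips=100, fee_rate=0.001)
print(f"Balance after 100 break-even round trips: {flat:.4f}")
```

Under these assumptions, a bot that flips in and out 100 times on signals that turn out to be noise gives up roughly 18% of its balance without the price ever moving against it.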
But the main problem that makes reinforcement learning ineffective when applied in a straightforward manner to automated trading is that RL learns that an action causes a transition from one state to another, which is why memories are stored as (State, Action, Next State, Reward). The agent, however, never actually affects the state of the system. The state is the same whether we buy, sell, do nothing, or close a trade, successful or otherwise.
As such, and because the number of possible actions is limited, the training can be unrolled into a standard supervised learning problem in which you predict the 'reward' of an action in a specific state.
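A minimal sketch of that unrolling follows. Because the price path does not depend on what the agent does, every action can be scored at every time step from the same historical data. The three actions, the fixed look-ahead `horizon`, the fee model, and the treatment of 'sell' as a short position are all assumptions made for illustration, not the setup used in the talk.

```python
ACTIONS = ("buy", "sell", "hold")


def action_rewards(prices, t, horizon, fee_rate):
    """Reward of each action taken at time t, judged `horizon` steps later.

    The price path is the same whichever action is chosen, so all
    actions can be scored from the same historical series.
    """
    change = prices[t + horizon] / prices[t] - 1.0
    cost = 2 * fee_rate  # approximate cost of opening and closing a position
    return {
        "buy": change - cost,    # long position
        "sell": -change - cost,  # short position (an assumption)
        "hold": 0.0,             # no position, no cost
    }


def build_dataset(prices, features, horizon, fee_rate):
    """Return (state, action, reward) rows for a supervised regressor."""
    rows = []
    for t in range(len(prices) - horizon):
        rewards = action_rewards(prices, t, horizon, fee_rate)
        for action in ACTIONS:
            rows.append((features[t], action, rewards[action]))
    return rows


prices = [100, 101, 103, 102, 105]
features = [[p] for p in prices]  # placeholder state: just the price
rows = build_dataset(prices, features, horizon=2, fee_rate=0.001)
```

Each row is an ordinary labelled example, so any regressor can be fitted to predict the reward from the state and action, with no need for a replay memory or a next-state term.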

After picking apart the data and using principal component analysis, I saw that there was so much overlap between the states where a trade was profitable and those where it was unprofitable that the two were indistinguishable in the dataset I examined (ETHBTC for the previous 12 months at 1H candles, using a variant of crossover as the entry points and looking to sell around 10 hours later).
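For reference, the labelling step behind that kind of dataset might look like the sketch below: enter when a fast moving average crosses above a slow one, exit a fixed number of candles later, and record whether the trade made money. The simple moving averages, the window lengths, and the function names are assumptions. The post only says "a variant of crossover", so this is not the exact rule used.

```python
def sma(values, window):
    """Simple moving average; None until `window` values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def label_crossover_trades(closes, fast=3, slow=5, hold=10):
    """Return (entry_index, was_profitable) for each upward crossover.

    Assumes fast < slow. A trade is entered at the close of the candle
    where the fast average moves above the slow one, and is judged
    `hold` candles later (10 one-hour candles is about 10 hours).
    """
    fast_ma, slow_ma = sma(closes, fast), sma(closes, slow)
    trades = []
    for i in range(slow, len(closes) - hold):
        was_below = fast_ma[i - 1] <= slow_ma[i - 1]
        is_above = fast_ma[i] > slow_ma[i]
        if was_below and is_above:
            trades.append((i, closes[i + hold] > closes[i]))
    return trades


# Toy series: a decline followed by a steady rise.
closes = [10, 9, 8, 7, 6, 5] + list(range(6, 21))
trades = label_crossover_trades(closes)
```

The feature vector at each entry index, paired with its profitable/unprofitable label, is what would then be projected with PCA to see whether the two classes separate.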

In retrospect, it makes more sense to approach this problem by finding a passable simple strategy and then using machine learning to try to reduce the number of unsuccessful trades. However, the process was useful for learning how to engineer features, construct models, run them in production, and assess their accuracy and suitability.
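One simple way to check whether such a filter earns its keep is to compare the strategy's win rate with and without it on held-out trades. The sketch below assumes a model that gives each candidate trade a score between 0 and 1. The scores, outcomes, and threshold are made-up numbers that only show the mechanics.

```python
def win_rate(outcomes):
    """Fraction of trades that were profitable (0.0 if there are none)."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def filter_trades(scores, outcomes, threshold):
    """Keep only the trades whose model score clears the threshold."""
    return [won for score, won in zip(scores, outcomes) if score >= threshold]


# Made-up example: model scores and the actual result of each trade.
scores = [0.9, 0.2, 0.7, 0.4, 0.8, 0.3]
outcomes = [True, False, True, True, False, False]

baseline = win_rate(outcomes)
kept = filter_trades(scores, outcomes, threshold=0.5)
filtered = win_rate(kept)
print(f"baseline {baseline:.2f}, filtered {filtered:.2f}, kept {len(kept)} trades")
```

A higher win rate on fewer trades is only worthwhile if the trades that remain still cover their fees, so the number of trades kept matters as much as the rate.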



Google Developer Group MK at Bletchley Park

The Milton Keynes GDG hosted their December meetup at The National Museum of Computing inside Bletchley Park. We had a detailed demons...