There’s a lot of hype around bitcoin and blockchain, so I decided to try to make my own predictor. The first step is understanding the data available.
Here’s a graph of the value from 2010 to now (2018), with the price on the y axis. You can see it go from being worth less than a dollar to being worth thousands of dollars: the continued upward trend, the bubble bursting, and then regular peaks and lows.
Viewing the same data using pandas...
The next step is to start work on a predictor. The first thing I need to do is compare today’s price to yesterday’s, so I create a new column from the value, shifted up by 1, so that each row holds a day’s price next to the following day’s price.
df['Value 2'] = df['Value USD'].shift(-1)
df.head()
I decided not to go with pct_change because it didn’t predict an increase from yesterday’s price well. Instead I went with the approach below, which combines the day’s change, the mean change for the past 7 days, and a count of how many times the price dropped or increased in the same period.
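As a rough illustration of the features described above, here is a minimal sketch on toy prices. The column names `change`, `mean_change_7`, `up_7` and `drop_7` are assumptions chosen to match the wording in this post, and the exact definitions in the real notebook may differ.

```python
import pandas as pd

# Toy prices; the real data would come from the bitcoin price dataframe.
df = pd.DataFrame(
    {"Value USD": [10.0, 11.0, 10.5, 10.0, 12.0, 13.0, 12.5, 14.0, 13.0]}
)

# Next day's price aligned with today's row, as in the post.
df["Value 2"] = df["Value USD"].shift(-1)

# Absolute day-over-day change (assumed definition).
df["change"] = df["Value 2"] - df["Value USD"]

window = 7
# Mean change over the trailing 7 rows; NaN until 7 values are available.
df["mean_change_7"] = df["change"].rolling(window).mean()
# How many of the trailing 7 changes were increases or drops.
df["up_7"] = (df["change"] > 0).astype(int).rolling(window).sum()
df["drop_7"] = (df["change"] < 0).astype(int).rolling(window).sum()

print(df.tail(3))
```

The rolling columns stay empty for the first six rows because a full 7-day window is not available yet.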
With this I could make a column that predicts what is likely happening. 0 means things are going OK and you should invest hard, or keep your investments if you’ve invested enough. 50 means there’s some sign of something bad coming, alerting the program to be prepared. -50 means there’s a clear sign things are going down and you should start withdrawing some of your investment. -75 means things are clearly going “really” badly and you should pull out hard before your stock loses too much value.
The code that creates this is below; it uses drop 7, up 7, mean change 7 and change to determine the classification. It’s a bit crude but does the job (I will prove this further down).
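To give a feel for how four features can be turned into those labels, here is a hypothetical rule set. The function name `classify` and every threshold in it are made up for illustration; they are not the rules from the actual notebook, only the four label values come from this post.

```python
def classify(change, mean_change_7, up_7, drop_7):
    """Map the four features to one of the post's labels.

    All thresholds here are illustrative assumptions.
    """
    falling_week = mean_change_7 < 0 and drop_7 > up_7
    if falling_week and change < 0:
        # A bad week that is still falling today: strong or very strong warning.
        return -75 if drop_7 >= 6 else -50
    if change < 0 or drop_7 > up_7:
        # Some sign of trouble, but not a clear downtrend yet.
        return 50
    # Nothing worrying: keep investing or hold.
    return 0


print(classify(change=1.0, mean_change_7=0.5, up_7=5, drop_7=2))
print(classify(change=-1.0, mean_change_7=-0.5, up_7=1, drop_7=6))
```

Applied row by row (for example with `DataFrame.apply`), a function like this would produce the predict column.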
With that, here’s the prediction column:
You can also preview the latest version of this code in a Python notebook.
Using this I can start work on a machine learning program that predicts with it. But first, to satisfy my curiosity before any model gives me an accuracy I can live with, I built a game. I’m interested in reinforcement learning, like AI that gets really good at a game after playing it multiple times. You can get games like this from OpenAI Gym, which is conveniently in Python, so I got a good idea of how one of those games works and designed my own bitcoin game.
The code might seem intimidating but it shouldn’t be. It basically uses the data from the pandas dataframe to supply some info, receives an action to take for the designated player, and returns the results of the next day’s trade with the new actions applied, like buying more bitcoins or converting money held as bitcoins back to cash.
The initial amount given to each player is 1000. I’ve not seen the need to increase or decrease this amount, since if you take the same actions the profit scales linearly.
The inputs to the bitcoin game are the amount (default 1000) and user_key. If no user_key is given, it is treated as starting a new game: the game creates a new player and returns their user key and the first day’s trade info. Based on the amount invested and the next day’s value I can calculate how much profit or loss you got. The game returns this along with the amount invested, the amount the player still has, and whether the game is still on; if that flag is False the player should not ask for the next day’s info. Using the dataframe’s index I can tell when I’ve reached the last index and report that the game has ended. The game also ends if you somehow go broke on both bitcoins and amount in hand, or if you go 5 days with 0 invested in bitcoins. I also keep a record of investment history and withdrawal history for graphing purposes, so I can see how it all breaks down. That’s better than reviewing thousands of printed lines to see how things went.
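The rules above can be sketched in a few dozen lines. This is a simplified stand-in, not the real game: the class name `BitcoinGame`, the `step` method, the dictionary keys and the use of a plain price list instead of the dataframe are all assumptions made for the example.

```python
import uuid


class BitcoinGame:
    """Toy day-by-day trading game over a list of at least two prices."""

    def __init__(self, prices, amount=1000.0, max_idle_days=5):
        self.prices = list(prices)
        self.amount = amount
        self.max_idle_days = max_idle_days
        self.players = {}

    def step(self, user_key=None, invest=0.0, withdraw=0.0):
        if user_key is None:
            # No key means a new game: register a player, return day-one info.
            user_key = uuid.uuid4().hex
            self.players[user_key] = {
                "day": 0, "amount": self.amount, "invested": 0.0,
                "idle": 0, "history": [],
            }
            return self._report(user_key, 0.0, True)

        p = self.players[user_key]
        # Move cash into bitcoin, then bitcoin back to cash, clamped to funds.
        invest = min(max(invest, 0.0), p["amount"])
        p["amount"] -= invest
        p["invested"] += invest
        withdraw = min(max(withdraw, 0.0), p["invested"])
        p["invested"] -= withdraw
        p["amount"] += withdraw

        # Revalue the holding at the next day's price.
        today, tomorrow = self.prices[p["day"]], self.prices[p["day"] + 1]
        before = p["invested"]
        p["invested"] = before * tomorrow / today
        delta = p["invested"] - before

        p["day"] += 1
        p["idle"] = p["idle"] + 1 if p["invested"] == 0 else 0
        p["history"].append((p["amount"], p["invested"]))  # kept for plotting

        # Game ends on the last day, when broke, or after too many idle days.
        # Callers must stop calling step() once game_on is False.
        game_on = (
            p["day"] < len(self.prices) - 1
            and p["amount"] + p["invested"] > 0
            and p["idle"] < self.max_idle_days
        )
        return self._report(user_key, delta, game_on)

    def _report(self, user_key, delta, game_on):
        p = self.players[user_key]
        return {
            "user_key": user_key, "day": p["day"],
            "profit": max(0.0, delta), "loss": max(0.0, -delta),
            "amount": p["amount"], "invested": p["invested"],
            "game_on": game_on,
        }


game = BitcoinGame([100.0, 110.0, 99.0])
key = game.step()["user_key"]          # start a new game
print(game.step(key, invest=500.0))    # price rises 10%
print(game.step(key))                  # price falls 10%, last day reached
```

A player is then just a loop that reads the returned info, picks `invest` and `withdraw` amounts, and stops when `game_on` is False.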
The above is my design for player_1 who, following some of the investing advice I’ve heard, is quick to withdraw when he makes a profit beyond his initial amount of 1000, and then plays this back and forth with the game. This is not the wisest way to make huge winnings, though.
The graph below shows three lines; the red one is the amount invested (bitcoins bought), and the blue line is the amount the user has (the initial amount plus withdrawals throughout, minus investments in bitcoins).
As you can tell, the player does make gradual profits, but the chart for investment in bitcoins never peaks because once it goes beyond a point he’s quick to withdraw and “make a killing”. I’m also going to point out one key thing: from 2017 to early 2018 the chart skyrockets. This was a good chance to make a killing, but with little invested in bitcoin and its price climbing ever higher, the returns kept dwindling, and as a result the player never makes that much. On the other hand, from 2018 onwards, as it crashes, he remains immune to the effect.
Note that if you invest from 2018 you’re likely in a chart with a downward trend, so however you invest you’re still going to make losses. The only logical reason to invest is if you think it’s going to pick up. If it were like stocks, you’d be buying up what others are selling off, but they’re not the same thing. It would be wiser to start investing only when there’s an upward trend, and to be cautious of a dip too far below the initial investment amount.
To put this into perspective, here’s investing from the start of 2017:
The profit margin is smaller. The code also outputs the final amount and the invested bitcoin amount if you’re curious, but here I’m sticking to the chart.
Then in 2018, it’s hard to notice from the chart, but the player ends up making a loss of around $150.
I’ve also tried some variations with other withdraw_max and invest_max values, but the end result doesn’t deviate much from the original. For example:
player_1(plot=True, hold_max=10000, output=False, withdraw_max=2000)
The next version is the lazy investor, who invests the amount given minus $100 and then does nothing else for the rest of the game, to show the maximum possible profit if you really went all in.
One more key thing is that I can get back information on where the prediction was off, to know whether a program relying on the prediction is getting bad advice or is itself making wrong decisions.
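The lazy strategy reduces to simple arithmetic: whatever is invested on day one grows by the ratio of the last price to the first. Here is a small sketch of that calculation; the function name `buy_and_hold` and its parameters are hypothetical, and the prices are toy values rather than real bitcoin data.

```python
def buy_and_hold(prices, start_amount=1000.0, keep_back=100.0):
    """Invest everything except `keep_back` on day one and never trade again.

    Returns (cash kept in hand, final value of the bitcoin holding).
    """
    invested = start_amount - keep_back
    final_value = invested * prices[-1] / prices[0]
    return keep_back, final_value


toy_prices = [2.0, 4.0, 8.0]
print(buy_and_hold(toy_prices))                # 900 invested, price x4
print(buy_and_hold(toy_prices, 2000.0, 200.0)) # double the stake
```

Doubling the starting amount doubles the result, which is the linear scaling that makes the choice of 1000 as the initial amount unimportant.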
player_1(miss_plot=True, miss_output=False, output=False)
I silenced output, which returns each day’s trading info, and miss_output, which may be useful since it returns only the days where the predict info was misleading. The same info goes into the graph via miss_plot, which plots the chart with red dots at the points where the predict value didn’t match the calculated profit/loss, i.e. it predicted a loss but in fact there was a profit, or vice versa. There’s no need to plot this again for any other player, since it depends only on the predict value and the profit/loss value, which stay the same regardless of player moves (whether a player would make a profit or loss if they had any amount invested).
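Finding those mismatches can be done directly on the dataframe. The sketch below assumes that a predict value of 0 means "expect a gain" and any other label means "expect a loss", and it uses hypothetical column names `change`, `predict` and `miss`; the real miss_plot logic may be defined differently.

```python
import pandas as pd

# Toy rows: next-day change and the label that was predicted for that day.
df = pd.DataFrame(
    {
        "change": [1.0, -2.0, 0.5, -1.0],
        "predict": [0, 0, -50, -75],
    }
)

predicted_gain = df["predict"] == 0   # assumption: 0 is the only "good" label
actual_gain = df["change"] > 0
actual_loss = df["change"] < 0

# A miss is a predicted gain that lost money, or a predicted loss that gained.
df["miss"] = (predicted_gain & actual_loss) | (~predicted_gain & actual_gain)

print(df[df["miss"]])
```

Because the mask depends only on the price data and the predict column, it is the same for every player, which is why plotting it once is enough.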
The code below is for the lazy_investor
The red line is the amount invested and the blue is the value, which looks almost flatlined compared to the millions in invested bitcoin value. The last output:
day: 7/22/2018 profit: 0 loss: 384595.5799999982 amount: 100 invested: 37813472.26
That’s basically $37 million, with a loss of $384k that is minuscule compared to the amount earned. Still, there was a peak, which I estimate from the graph at around $100 million from an initial investment of $900. Not bad!
The lazy_investor suffers the same way if he invests later on. For example, starting in 2017 his chart and profit are almost in line with the value of bitcoin
and from 2018 no one is immune from losses
If you’ve been keeping track you’ll get the gist: no, I wouldn’t advise you to invest in the market right now. It’s volatile, and until it starts to pick up you’re likely to make losses regardless.
My final player, for now, is player_2, who takes the mistakes of player_1 and the good principles of lazy_investor and combines them into a cautious investor.
The player keeps a record of the maximum value ever traded and, if the value drops, uses that maximum as a reference for loss from then on, on top of the loss provided by the bitcoin_game. It also uses other things, like the number of negative and positive predictions in the week. Note that even though there are some wrong predictions, they don’t affect the peaks or the troughs much and are few compared to the good predictions. Using these, it makes investments or withdrawals relative to the amount on hand or invested, i.e. withdraw 1/4 or 1/10 of the amount invested, or invest 1/4 or 1/10 of the amount in the bank. This way there’s a gradual increase of both the investment and the amount withdrawn, relative to the state of things in the chart. This is the output of running it from day 1:
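A decision rule in this spirit might look like the following. Only the 1/4 and 1/10 fractions come from the description above; the function name `cautious_move`, its parameters and the thresholds that choose between the fractions are assumptions for illustration, not player_2's actual logic.

```python
def cautious_move(amount, invested, peak, negatives_7, positives_7):
    """Return (invest, withdraw) for one day.

    amount      -- cash in the bank
    invested    -- current value held in bitcoin
    peak        -- highest invested value seen so far
    negatives_7 -- negative predictions in the past week
    positives_7 -- positive predictions in the past week
    """
    # Loss relative to the best value ever held, as a fraction of that peak.
    drawdown = 0.0 if peak <= 0 else (peak - invested) / peak

    if negatives_7 > positives_7:
        # Bad week: pull out a larger share once the drop from peak is big.
        fraction = 0.25 if drawdown > 0.10 else 0.10
        return 0.0, invested * fraction
    if positives_7 > negatives_7:
        # Good week: commit more of the bank when the signal is strong.
        fraction = 0.25 if positives_7 >= 5 else 0.10
        return amount * fraction, 0.0
    # Mixed signals: do nothing.
    return 0.0, 0.0


print(cautious_move(1000.0, 0.0, 0.0, negatives_7=1, positives_7=6))
print(cautious_move(1000.0, 800.0, 1000.0, negatives_7=5, positives_7=2))
```

Because every move is a fraction of what is currently available, the player never goes fully in or fully out in a single day, which gives the gradual behaviour described above.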
Reached the final day, game ending
day: 7/22/2018 profit: 0 loss: 97605.4140491 amount: 10956981.2764 invested: 9596573.45405
About $20 million in total: a good amount, not too far from the lazy investor, and far better than player_1, who ends up with roughly $6,400 in total.
player_2(plot=True, show_exception=False, output=False)
You can see a clear correlation between the two with a balance between “greed” and “smarts”.
Still putting it through the same future scenarios
player_2(plot=True, show_exception=False, output=False, start_index=2064)
You can see the player going all in as things pick up, until they have practically nothing left to invest. In the same vein, when things start to go down it starts withdrawing, not all at once but a sizable portion depending on the stats, until things pick up, which they do before it has a chance to withdraw everything.
A final analysis of 2018, the year no one can make a profit: the player struggles to make a valid move as the total amount continues to dwindle.
player_2(show_exception=False, start_index=2422, output=False, plot=True)
The next step is making an ML predictor that doesn’t play the game or predict the graph, but predicts the predict column given a sample.
Starting with SGD from scikit-learn:
The accuracy is pretty low. The highest I got was 66% (0.66), which is not good enough.
Next try is a random forest:
With all the columns it gets 79% accuracy; with only the columns I used to work out predict I get 97% accuracy. Amazing! Note that SGD performed even worse with just the trimmed columns, with a best of 60% accuracy, but it varies greatly per run, unlike Random Forest, which returns the exact same accuracy down to the decimals even when rerun.
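For readers who want to try the same comparison, here is a minimal sketch using scikit-learn on synthetic data. The features and labels are randomly generated stand-ins, not the bitcoin columns, so the accuracies it prints say nothing about the numbers above. Fixing `random_state` is one way to get identical accuracy on every rerun; whether that explains the stable Random Forest results here is an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the feature columns (4 features, 600 days).
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 4))
# Synthetic label: -50 when the second feature is strongly negative, else 0.
y = np.where(X[:, 1] < -0.5, -50, 0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def fit_and_score(model):
    """Train on the training split and return accuracy on the test split."""
    return model.fit(X_train, y_train).score(X_test, y_test)


# Same seed twice: the forest is rebuilt identically, so accuracy matches.
rf_acc = fit_and_score(RandomForestClassifier(n_estimators=100, random_state=0))
rf_acc_again = fit_and_score(
    RandomForestClassifier(n_estimators=100, random_state=0)
)
sgd_acc = fit_and_score(SGDClassifier(random_state=0))

print("random forest:", rf_acc, rf_acc_again)
print("sgd:", sgd_acc)
```

Swapping `X` and `y` for the real feature columns and the predict column would reproduce the experiment in this post.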
I will likely keep updating this post instead of writing a new one for any further improvements in this project, unless something I find is good enough to stand alone as its own article. The likely candidate for that would be an ML player with Reinforcement Learning.
That’s what I have so far. You can check out the most recent code here.