Thursday, 12 October 2017

Can we predict the future price of the Bitcoin?

Hello guys, 



I hope you are doing well. 

In this new post I would like to explain the aim of my research by using a real-life example. Whenever my friends or family (who don't know much about machine learning, statistics, or data analysis) ask me to explain what I am doing with mathematical models and methods in my PhD, I often reply by saying 

"I am a kind of medical doctor.... I analyse batches of data, which represent the behavior of a system, which can be a bridge, a tunnel, a nuclear power plant component, a steam generator, a train, a car, a sensor, a financial product, or whatever you like (also a human being if you can provide me with some data), and based on the characteristics of these batches, I assess the health condition of the system, and consequently infer if the system is healthy or needs to take some drugs". 

Although the explanation is a bit vague, it does the job and people sort of get an idea about my research. However, as you know, people are usually skeptical and do not believe that similar mathematical approaches (such as clustering techniques, machine learning methods, etc.) can be used in such a wide range of areas... from bridges, to nuclear power plant components, finance, etc. 

Therefore, in what follows I am going to perform a data analysis of the price of the Bitcoin, by using the Empirical Mode Decomposition (EMD) method, which is used to analyse the behavior of a system. This way, I am going to try to predict the price of the Bitcoin by relying on its historical price and trend. 
Bitcoin is the most famous cryptocurrency, and it could be the currency of the future. Nowadays, a single Bitcoin is worth more than $4000!! 



Do not worry, I am not going to explain the mathematical background of the method, and I promise that you won't see any formulas!! 



  • EMD description



The idea of the EMD method, which was initially developed and proposed by Norden Huang et al., is to decompose a signal into a set of basic modes, which are different frequency components of the signal. Essentially, it breaks down one signal into a sum of different "simpler" signals with different frequencies. Any non-stationary and nonlinear time-varying signal can be represented by a series of signals, where each individual signal has a different frequency. Each frequency component is called an Intrinsic Mode Function (IMF). The EMD also provides a residual function, which is monotonic or constant and represents the overall behavior of the signal during the time of the analysis. You can find a good discussion here.
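For readers who prefer code to formulas, here is a minimal sketch of the decomposition idea. It is NOT the code used for the figures in this post, and it makes two simplifying assumptions: the envelopes are piecewise-linear (EMD implementations normally use cubic splines), and each IMF is obtained with a fixed number of sifting passes instead of a proper stopping criterion. All function names are hypothetical.

```python
import bisect
import math

def local_extrema(x):
    """Indices of strict interior local maxima and minima."""
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] < x[i + 1]]
    return maxima, minima

def envelope(x, knots):
    """Piecewise-linear curve through x at the knot indices (held flat at both ends).

    Assumption: a real EMD would use a cubic spline here.
    """
    env = []
    for i in range(len(x)):
        if i <= knots[0]:
            env.append(x[knots[0]])
        elif i >= knots[-1]:
            env.append(x[knots[-1]])
        else:
            j = bisect.bisect_right(knots, i)
            left, right = knots[j - 1], knots[j]
            t = (i - left) / (right - left)
            env.append(x[left] + t * (x[right] - x[left]))
    return env

def sift_once(x):
    """Subtract the mean of the upper and lower envelopes; None if too few extrema."""
    maxima, minima = local_extrema(x)
    if len(maxima) < 2 or len(minima) < 2:
        return None
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    return [v - (u + l) / 2 for v, u, l in zip(x, upper, lower)]

def emd(signal, max_imfs=4, sift_iters=10):
    """Return (imfs, residual) such that sum(imfs) + residual equals the signal."""
    residual = list(signal)
    imfs = []
    while len(imfs) < max_imfs:
        candidate = sift_once(residual)
        if candidate is None:
            break  # the residual no longer oscillates enough
        for _ in range(sift_iters - 1):
            refined = sift_once(candidate)
            if refined is None:
                break
            candidate = refined
        imfs.append(candidate)
        residual = [r - c for r, c in zip(residual, candidate)]
    return imfs, residual

# Synthetic test signal (slow upward trend plus a fast oscillation), not Bitcoin data.
signal = [0.05 * t + math.sin(2 * math.pi * t / 10) for t in range(200)]
imfs, residual = emd(signal)

# The components add back up to the original signal.
rebuilt = [sum(parts) for parts in zip(residual, *imfs)]
```

The first IMF returned is the fastest component, the later ones are slower, and whatever is left over is the residual, which is the same ordering used in the figures of this post.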




  • Analysis of the Bitcoin price

Disclaimer: this post was drafted on Tuesday the 3rd of October, and therefore the analysis of the Bitcoin price is based on the data available at that time!! 



Let's consider the closing price of the Bitcoin in 2017. We are first going to analyse the period from January 1st, 2017 to August 28th, 2017. This way, we hope to draw some conclusions regarding the future price of the Bitcoin, and then we can verify our conclusions with the Bitcoin price from August the 29th to today (3rd of October). 


The price of the Bitcoin is available at coinmarketcap.com


It is worth noting that during this step of the analysis, the price of Bitcoin from January the 1st, 2017 to August the 28th, 2017 is used as the ONLY input to the EMD method!! This means that during the first step of the analysis, we DO NOT KNOW the Bitcoin price after the 28th of August. 
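In code, this "pretend we do not know the future" setup is simply a split of the daily series at a cutoff date. The sketch below is only an illustration: `split_by_date` is a hypothetical helper, and the prices are placeholder numbers (just a day counter), NOT real Bitcoin prices.

```python
from datetime import date, timedelta

def split_by_date(rows, cutoff):
    """Split (date, price) rows into those up to the cutoff and those after it."""
    train = [(d, p) for d, p in rows if d <= cutoff]
    holdout = [(d, p) for d, p in rows if d > cutoff]
    return train, holdout

start = date(2017, 1, 1)
end = date(2017, 10, 3)

# Placeholder series: one row per day, with the day counter standing in for the price.
rows = [(start + timedelta(days=k), float(k)) for k in range((end - start).days + 1)]

# Only `train` would be passed to the EMD; `holdout` is kept aside for verification.
train, holdout = split_by_date(rows, cutoff=date(2017, 8, 28))
print(train[-1][0], holdout[0][0])
```

Keeping the split in one place makes it harder to accidentally leak the later prices into the first step of the analysis.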




In Figure 1, you can see the price of a Bitcoin in the interval between January the 1st, 2017 (Day 1) and August the 28th, 2017 (Day 240). Figure 1 shows that the price of the Bitcoin has increased a lot during the year, as you all may know. 

Figure 1. Bitcoin price from January to August

If the Bitcoin price of Figure 1 is used as input to the EMD method, we obtain Figure 2, which shows all the frequency components of the Bitcoin price. In particular, the first IMF (IMF 1), which has the highest frequency, depicts a variation that is very similar to the actual variation of the price of the Bitcoin (as shown in Figure 1). Indeed, IMF 1 represents the fastest variations of the Bitcoin price, and consequently it represents the daily variation of the price. The "slower" IMFs, such as IMF 2 and IMF 3, show a trend analogous to IMF 1, but at a lower frequency. The analysis of the slowest frequency component of the Bitcoin price, which is IMF 4, is very interesting. In fact, Figure 2 shows that IMF 4 (fourth plot from the left end of Figure 2) has 6 extreme values (3 peaks and 3 valleys), which are almost equally spaced in time, and therefore a more detailed analysis may lead to some interesting conclusions. Finally, the analysis of the residual shows that the price of the Bitcoin has definitely increased during the first 8 months of the year.


Figure 2. EMD analysis of the Bitcoin price in the period January to August

A deeper analysis of IMF 4, which shows the slowest variation of the Bitcoin price, is shown in Figure 3. Figure 3 shows the price of the Bitcoin on the top plot, whilst IMF 4 is depicted on the bottom plot. The red dotted vertical lines mark the times of the extreme values of IMF 4. It is straightforward to note that the peak values of IMF 4 usually occur earlier in time than the corresponding peak values of the Bitcoin price. For example, IMF 4 shows a peak at day 159, which corresponds to the peak of the Bitcoin price at day 162, and consequently by analyzing the behavior of IMF 4 we might predict a peak value of the Bitcoin price!! A similar behavior is also found for the peak at day 54, although it is hard to see due to the large time scale of Figure 3. By following this reasoning, we can claim that the final peak at day 238 (August the 26th) is anticipating a peak value of the Bitcoin price, which will then decrease in the near future.

It should be noted that while IMF 4 decreases towards a valley, the Bitcoin price stays lower than the previous peak value. This means that the Bitcoin price usually reaches a peak value, and then starts to decrease. Finally, it is worth noting that the valleys of IMF 4, which are its minimum values, are usually reached a couple of days later than the actual local minimum of the Bitcoin price, as shown in Figure 3 around day 200.





Figure 3. Bitcoin price vs IMF4
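The "IMF 4 peaks a few days before the price" reading of Figure 3 can be written down as a small calculation. This is a rough sketch with hypothetical helper names; the only numbers taken from the post are the IMF 4 peak at day 159 and the price peak at day 162, and the 10-day matching window is my own assumption.

```python
def local_peaks(series):
    """Indices of interior points that are higher than their left neighbour
    and at least as high as their right neighbour."""
    return [
        i for i in range(1, len(series) - 1)
        if series[i - 1] < series[i] >= series[i + 1]
    ]

def lead_times(imf_peak_days, price_peak_days, max_lag=10):
    """For each IMF peak, days until the first price peak within max_lag days after it.

    max_lag is an assumed matching window, not a value from the analysis.
    """
    leads = {}
    for day in imf_peak_days:
        later = [p for p in price_peak_days if 0 <= p - day <= max_lag]
        if later:
            leads[day] = min(later) - day
    return leads

# Toy series, only to show how the peak finder behaves.
toy = [0, 1, 3, 2, 1, 4, 2]
print(local_peaks(toy))

# Peak days quoted in the text: IMF 4 at day 159, price at day 162.
print(lead_times([159], [162]))
```

With more peaks, the same function gives one lead time per IMF peak, which makes it easy to check whether the "peak first, price later" pattern is consistent or only happens once.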


Before moving to the analysis of the Bitcoin price from August the 29th to October the 3rd, which has NOT been used so far (the EMD analysis has been carried out by using only the Bitcoin price from January to August), we can summarize the results of the EMD analysis as follows: 

1.     The EMD analysis has shown that the Bitcoin price can be decomposed into several frequency components.
2.     The fastest frequency components follow the daily/weekly variability of the Bitcoin price.
3.     The slowest frequency component (IMF 4) seems able to predict a peak in the Bitcoin price, which then decreases for a period of time that on average lasts 35 to 40 days from the peak value of IMF 4.
4.     Given these results, we can expect that the Bitcoin price will have a peak during the first days of September, followed by a period of time where the Bitcoin price decreases, i.e. the Bitcoin price will be lower than the peak value reached at the beginning of September. After this period, we can expect a new increase of the Bitcoin price following the local minimum of IMF 4, which should be reached around 35 to 40 days after the peak at day 238.

At this point, we can verify our results by analyzing the Bitcoin price from August the 29th. Figure 4 shows the Bitcoin price from January the 1st to October the 3rd, where the red line depicts the Bitcoin price from August the 29th to October the 3rd. Again, it should be noted that the price of the Bitcoin during this period (red line in Figure 4) was not used in the previous EMD analysis, and consequently the results and conclusions of the EMD analysis have been obtained WITHOUT knowing the Bitcoin price during September. 



Figure 4 shows that the prediction of the EMD analysis holds: a peak value of the Bitcoin price, followed by a decreasing trend, is indeed observed. The red line, which shows the previously unseen Bitcoin price, reaches a peak value at day 244 (September the 1st), and then decreases. In particular, it is worth noting that the peak value is not reached again during September. 

Figure 4. Price of Bitcoin from January to October. Red line is the price of the Bitcoin that is not used during the EMD analysis



Finally, we can use the Bitcoin price during the whole period, from January to October, as input to the EMD method with the aim of analyzing the Bitcoin price during the year so far, and trying to predict the future Bitcoin price. 

Figure 5 shows the results of the EMD analysis of the price of the Bitcoin during the whole year. We can see that again the Bitcoin price can be decomposed into 4 frequency components: the fastest components (IMF 1, 2 and 3) follow the variability of the Bitcoin price, whilst the slowest component (IMF 4) shows the variation of the Bitcoin price over a long period of time. IMF 4 shows 7 extreme values: a) the first 5 extreme values are those of Figures 2 and 3, which represent the maximum and minimum values of the Bitcoin price during the period of time that has been analysed previously; b) the 6th extreme value, which is the peak at day 238, represents the maximum value of the Bitcoin price that has been predicted by the EMD analysis; c) the 7th extreme value, which is the minimum value at day 268, represents the drop of the Bitcoin price at day 257 (September the 14th). It should be noted that this last minimum value occurs 30 days after the peak value at day 238, and therefore a bit earlier than the expected time obtained by the EMD during the first step of the analysis (35 to 40 days of average decreasing time).
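If you want to double-check the day numbers used in this post, the conversion between "Day N" and calendar dates is a one-liner. The sketch below assumes, as in Figure 1, that Day 1 is January the 1st, 2017; the helper names are hypothetical.

```python
from datetime import date, timedelta

YEAR_START = date(2017, 1, 1)  # Day 1 in this post

def day_to_date(day):
    """Calendar date of a 1-based day number."""
    return YEAR_START + timedelta(days=day - 1)

def date_to_day(d):
    """1-based day number of a calendar date."""
    return (d - YEAR_START).days + 1

peak_day = 238           # last IMF 4 peak found in the first step
observed_min_day = 268   # IMF 4 minimum found with the full data set

# Window in which the first step expected the minimum (35 to 40 days after the peak).
expected_window = (peak_day + 35, peak_day + 40)

print(day_to_date(peak_day))                        # 2017-08-26
print(date_to_day(date(2017, 9, 14)))               # 257
print(observed_min_day - peak_day)                  # 30
print([day_to_date(d) for d in expected_window])
```

So the observed minimum of IMF 4 falls a few days before the start of the window that was expected from the first step.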

The EMD analysis of the Bitcoin price during the period from January to October seems to confirm the discussion presented above, and consequently we can expect that the Bitcoin price will stay above the value of September the 14th and continue to grow during the next weeks... 


Figure 5. EMD analysis of the Bitcoin price in the period January to October


The brief analysis of the Bitcoin price presented above is a good example of data analysis of a system. Indeed, consider a system for which we continuously collect evidence of its behavior, for example, measurements of the acceleration of a bridge, the temperature of a water supply system, the current of an electrical component, etc. We can analyse those data by computing their frequency components, and consequently, if the frequency components of the system change suddenly, we can spot an unexpected behavior of the system!! However, a more detailed and complex data analysis is usually required in order to adequately assess the health state of a system.
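To give a flavour of what "spotting a sudden change in the frequency components" can look like, here is a deliberately simple sketch. It does NOT use the EMD: as a stand-in, it measures how often the signal crosses its own mean in each window (a crude proxy for frequency) and flags the windows that differ a lot from the first one. The function names, window size, and tolerance are all assumptions, and the signal is synthetic.

```python
import math

def zero_crossing_rate(window):
    """Fraction of consecutive sample pairs where the mean-centred signal changes sign."""
    mean = sum(window) / len(window)
    centred = [v - mean for v in window]
    crossings = sum(1 for a, b in zip(centred, centred[1:]) if a * b < 0)
    return crossings / (len(window) - 1)

def flag_frequency_changes(signal, window=50, tolerance=0.5):
    """Indices of non-overlapping windows whose crossing rate deviates from the
    first window's rate by more than `tolerance` (relative)."""
    rates = [
        zero_crossing_rate(signal[s:s + window])
        for s in range(0, len(signal) - window + 1, window)
    ]
    baseline = rates[0]
    return [i for i, r in enumerate(rates) if abs(r - baseline) > tolerance * baseline]

# Synthetic "sensor": slow oscillation for 200 samples, then a much faster one.
sensor = [
    math.sin(2 * math.pi * t / 25 + 0.3) if t < 200
    else math.sin(2 * math.pi * t / 5 + 0.3)
    for t in range(400)
]

flagged = flag_frequency_changes(sensor)
print(flagged)  # [4, 5, 6, 7]
```

The flagged windows are exactly those in the second half of the signal, where the oscillation becomes faster. On real measurements one would replace the crossing rate with a proper decomposition, and the baseline would come from a period where the system is known to be healthy.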



Finally, I would like to mention that the analysis of the Bitcoin price is only a brief and superficial analysis, carried out for a bit of fun and as an interesting example of how my work can be used in many different aspects of real life, and consequently it is not advice to invest money in Bitcoin. In fact, I have only considered the closing price of the Bitcoin, without considering other variables (such as the market cap, the daily news from newspapers or governments, events, holidays, market decisions of companies, daily trade volume, etc.) that can influence the Bitcoin price. A more detailed analysis could be done by considering those variables, and at the same time, a more reliable prediction of the price of the Bitcoin could potentially be achieved by using machine learning methods, such as artificial neural networks (ANN), or forecasting algorithms. 




I hope you enjoyed this post!! 

Stay tuned, 



Matteo