Thursday, 12 October 2017

Can we predict the future price of the Bitcoin?

Hello guys, 



I hope you are doing well. 

In this new post I would like to explain the aim of my research by using a real-life example. Whenever my friends or family (who don't know much about machine learning, statistics, or data analysis) ask me to explain what I am doing with mathematical models and methods in my PhD, I often reply by saying 

"I am a kind of medical doctor.... I analyse batches of data, which represent the behavior of a system, which can be a bridge, a tunnel, a nuclear power plant component, a steam generator, a train, a car, a sensor, a financial product, or whatever you like (also a human being, if you can provide me with some data), and based on the characteristics of these batches, I assess the health condition of the system, and consequently infer if the system is healthy or needs to take some drugs". 

Although the explanation is a bit vague, it does the job and people sort of get an idea about my research. However, as you know, people are usually skeptical and do not believe that similar mathematical approaches (such as clustering techniques, machine learning methods, etc.) can be used in such wide areas... from bridges, to nuclear power plant components, finance, etc. 

Therefore, in what follows I am going to perform a data analysis of the price of the Bitcoin, by using the Empirical Mode Decomposition (EMD) method, which is used to analyse the behavior of a system. This way, I am going to try to predict the price of the Bitcoin by relying on its historical price and trend. 
Bitcoin is the most famous cryptocurrency, and it could be the currency of the future. Nowadays, a single Bitcoin is worth more than $4000!! 



Do not worry, I am not going to explain the mathematical background of the method, and I promise that you won't see any formulas!! 



  • EMD description



The idea of the EMD method, which was initially developed and proposed by Norden Huang et al., is to decompose a signal into a set of basic modes, which are different frequency components of the signal. Essentially, it breaks one signal down into a sum of "simpler" signals with different frequencies. Any non-stationary and nonlinear time-varying signal can be represented by such a series of signals, where each individual signal has a different frequency. Each frequency component is called an Intrinsic Mode Function (IMF). The EMD also provides a residual function, which is monotonic or constant and represents the overall trend of the signal during the time of the analysis. You can find a good discussion here.
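For the curious readers, the "sifting" idea behind the decomposition can be sketched in a few lines of Python. This is only a simplified illustration and should not be read as the implementation behind the figures of this post: the envelopes are piecewise-linear instead of the cubic splines of the original method, the number of sifting passes is fixed, and all the names (local_extrema, envelope, sift, emd) and the synthetic signal are assumptions made for the example.

```python
import math


def local_extrema(x):
    """Return (maxima, minima): indices of the interior local extrema of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear curve through the points (i, x[i]) for i in idx.

    Simplification: the original EMD uses cubic splines. Outside the first
    and last extremum the envelope is simply held constant.
    """
    env = [0.0] * len(x)
    for t in range(len(x)):
        if t <= idx[0]:
            env[t] = x[idx[0]]
        elif t >= idx[-1]:
            env[t] = x[idx[-1]]
    for a, b in zip(idx, idx[1:]):
        for t in range(a, b + 1):
            w = (t - a) / (b - a)
            env[t] = (1 - w) * x[a] + w * x[b]
    return env


def sift(x, passes=10):
    """Extract one IMF candidate by repeatedly removing the envelope mean."""
    h = list(x)
    for _ in range(passes):
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few extrema left to build both envelopes
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h


def emd(x, max_imfs=4):
    """Split x into a list of IMFs (in extraction order) plus a residual."""
    residual = list(x)
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few extrema left: what remains is the residual
        imf = sift(residual)
        imfs.append(imf)
        residual = [r - c for r, c in zip(residual, imf)]
    return imfs, residual


# Synthetic example: slow trend + slow oscillation + fast oscillation.
signal = [0.05 * t + 0.5 * math.sin(0.2 * t) + math.sin(0.9 * t) for t in range(200)]
imfs, residual = emd(signal)
print(len(imfs), "IMFs extracted")
```

By construction, adding all the IMFs and the residual together gives back the original signal, which is exactly the "sum of simpler signals" idea described above.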




  • Analysis of the Bitcoin price

Disclaimer: this post was drafted on Tuesday the 3rd of October, and therefore the analysis of the Bitcoin price is based on the data available at that time!! 



Let's consider the closing price of the Bitcoin in 2017. We are going to first analyse the period from January 1st, 2017 to August 28th, 2017. This way, we hope to draw some conclusions regarding the future price of the Bitcoin, and then we can verify our conclusions with the Bitcoin price from August the 29th to today (3rd of October). 


The price of the Bitcoin is available at coinmarketcap.com


It is worth noting that during this step of the analysis, the price of Bitcoin from January the 1st, 2017 to August the 28th, 2017 is used as the ONLY input to the EMD method!! This means that during the first step of the analysis, we DO NOT KNOW the Bitcoin price after the 28th of August. 




In Figure 1, you can see the price of a Bitcoin in the interval between January the 1st, 2017 (Day 1) and August the 28th, 2017 (Day 240). Figure 1 shows that the price of the Bitcoin has increased a lot during the year, as you all may know. 

Figure 1. Bitcoin price from January to August 
If the Bitcoin price of Figure 1 is used as input to the EMD method, we obtain Figure 2, which shows all the frequency components of the Bitcoin price. In particular, the first IMF (IMF 1), which has the highest frequency, depicts a variation of the Bitcoin price that is very similar to the actual variation of the price of the Bitcoin (as shown in Figure 1). Indeed, the first IMF (IMF 1) represents the fastest variations of the Bitcoin price, and consequently it represents the daily variation of the Bitcoin price. The "slower" IMFs, such as IMF 2 and IMF 3, show a trend analogous to IMF 1, but slower in frequency. The analysis of the slowest frequency component of the Bitcoin price, which is IMF 4, is very interesting. In fact, Figure 2 shows that IMF 4 (fourth plot from the left end of Figure 2) has 6 extreme values (3 peaks and 3 valleys), which are almost equally spaced in time, and therefore a more detailed analysis may lead to some interesting conclusions. Finally, the analysis of the residual shows that the price of the Bitcoin has definitely increased during the first 8 months of the year.


Figure 2. EMD analysis of the Bitcoin price in the period January to August
A deeper analysis of IMF 4, which shows the slowest variation of the Bitcoin price, is shown in Figure 3. Figure 3 shows the price of the Bitcoin on the top plot, whilst IMF 4 is depicted on the bottom plot. The red dotted vertical lines are drawn at the times of the extreme values of IMF 4. It is straightforward to note that the peak values of IMF 4 usually occur earlier in time than the corresponding peak values of the Bitcoin price. For example, IMF 4 shows a peak at day 159, which corresponds to the peak of the Bitcoin price at day 162, and consequently by analyzing the behavior of IMF 4 we might predict a peak value of the Bitcoin price!! A similar behavior is also observed for the peak at day 54, although it is hard to see due to the large time scale of Figure 3. By following this reasoning, we can claim that the final peak at day 238 (August the 26th) is anticipating a peak value of the Bitcoin price, which will then decrease in the near future.
It should be noted that when IMF 4 decreases towards a valley, the Bitcoin price is lower than the previous peak value. This means that the Bitcoin price usually reaches a peak value, and then starts to decrease. Finally, it is worth noting that the valleys of IMF 4, which are its minimum values, are usually reached a couple of days later than the actual local minimum value of the Bitcoin price, as shown in Figure 3 around day 200.





Figure 3. Bitcoin price vs IMF4
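The day numbers mentioned above can be turned into calendar dates with a few lines of Python. This is just a small sketch, assuming that Day 1 is January the 1st, 2017, as in Figure 1. The function names day_to_date and lead_days are chosen for this example only.

```python
from datetime import date, timedelta

YEAR_START = date(2017, 1, 1)  # Day 1 in the figures


def day_to_date(day):
    """Convert a day index (1 = January the 1st, 2017) to a calendar date."""
    return YEAR_START + timedelta(days=day - 1)


def lead_days(imf_peak_day, price_peak_day):
    """Number of days by which the IMF 4 peak anticipates the price peak."""
    return price_peak_day - imf_peak_day


print(day_to_date(159), day_to_date(162), lead_days(159, 162))
# 2017-06-08 2017-06-11 3
print(day_to_date(238))
# 2017-08-26
```

So the IMF 4 peak at day 159 comes 3 days before the price peak at day 162, and day 238 is indeed August the 26th.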


Before moving to the analysis of the Bitcoin price from August the 29th to October the 3rd, which has NOT been used so far (the EMD analysis has been carried out by using only the Bitcoin price from January to August), we can summarize the results of the EMD analysis as follows: 

1.     The EMD analysis has shown that the Bitcoin price can be decomposed into several frequency components
2.     The fastest frequency components follow the daily/weekly variability of the Bitcoin price
3.     The slowest frequency component (IMF 4) seems able to predict a peak in the Bitcoin price, which would then decrease for a period of time that on average lasts for 35 to 40 days from the peak value of IMF 4.
4.     Given these results, we can expect that the Bitcoin price will have a peak during the first days of September, which will then be followed by a period of time where the Bitcoin price decreases, i.e. the Bitcoin price will be lower than the peak value reached at the beginning of September. After the period where the Bitcoin price decreases, we can expect a new increase of the Bitcoin price after the local minimum of IMF 4, which should be reached around 35 to 40 days after the peak at day 238.
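To see what point 4 means in terms of the calendar, here is a small Python sketch that adds the average decreasing time (35 to 40 days) to the IMF 4 peak at day 238. It assumes that Day 1 is January the 1st, 2017, and the function name expected_valley_window is made up for this example.

```python
from datetime import date, timedelta


def expected_valley_window(peak_day, min_days=35, max_days=40, year=2017):
    """Return ((day, date), (day, date)) bounding the expected IMF 4 valley."""
    first_day_of_year = date(year, 1, 1)
    window = []
    for offset in (min_days, max_days):
        day = peak_day + offset
        window.append((day, first_day_of_year + timedelta(days=day - 1)))
    return tuple(window)


earliest, latest = expected_valley_window(238)
print(earliest[0], earliest[1])  # 273 2017-09-30
print(latest[0], latest[1])      # 278 2017-10-05
```

In other words, the local minimum of IMF 4 would be expected between day 273 and day 278, that is between the 30th of September and the 5th of October.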

At this point, we can verify our results by analyzing the Bitcoin price from August the 29th. Figure 4 shows the Bitcoin price from January the 1st to October the 3rd, where the red line depicts the Bitcoin price from August the 29th to October the 3rd. Again, it should be noted that the price of the Bitcoin during this period (red line in Figure 4) is not used in the previous EMD analysis, and consequently the results and conclusions of the EMD analysis have been achieved WITHOUT knowing the Bitcoin price during September. 



Figure 4 shows that the EMD analysis results are correct: a peak value of the Bitcoin price, which is followed by a decreasing trend, is verified. Indeed, the red line, which shows the previously unknown Bitcoin price, shows that the Bitcoin price reaches a peak value at day 244 (September the 1st), and then decreases. In particular, it is worth noting that the peak value is not reached again during September. 

Figure 4. Price of Bitcoin from January to October. Red line is the price of the Bitcoin that is not used during the EMD analysis



Finally, we can use the Bitcoin price during the whole period, from January to October, as input to the EMD method with the aim of analyzing the Bitcoin price during the year so far, and trying to predict the future Bitcoin price. 

Figure 5 shows the results of the EMD analysis of the price of the Bitcoin during the whole year. We can see that again the Bitcoin price can be decomposed into 4 frequency components: the fastest components (IMF 1, 2 and 3) follow the variability of the Bitcoin price, whilst the slowest component (IMF 4) shows the variation of the Bitcoin price over a long period of time. IMF 4 shows 7 extreme values: a) the first 5 extreme values are those of Figures 2 and 3, which represent the maximum and minimum values of the Bitcoin price during the period of time that has been analysed previously; b) the 6th extreme value, which is the peak at day 238, represents the maximum value of the Bitcoin price that has been predicted by the EMD analysis; c) the 7th extreme value, which is the minimum value at day 268, represents the drop of the Bitcoin price at day 257 (September the 14th). It should be noted that this last minimum value occurs 30 days after the peak value at day 238, and therefore a bit earlier than the expected time obtained by the EMD during the first step of the analysis (35 to 40 days of average decreasing time).

The EMD analysis of the Bitcoin price during the period from January to October seems to confirm the discussion presented above, and consequently we can expect that the Bitcoin price will stay above the value of September the 14th and continue to grow during the next weeks... 


Figure 5. EMD analysis of the Bitcoin price in the period January to October


The brief analysis of the Bitcoin price presented above is a good example of data analysis of a system. Indeed, consider a system for which we continuously collect evidence of its behavior, for example, measurements of the acceleration of a bridge, the temperature of a water supply system, the current of an electrical component, etc. We can analyse those data by computing their frequency components, and consequently if the frequency components of the system change suddenly, we can spot an unexpected behavior of the system!! However, a more detailed and complex data analysis is usually required in order to adequately assess the health state of a system.
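As a toy illustration of "spotting a change in the frequency components", here is a minimal Python sketch. Note that it uses a plain discrete Fourier transform instead of the EMD, that the two signals are synthetic, and that the function names and the tolerance are assumptions made for this example. It is not one of the monitoring methods developed in my PhD.

```python
import cmath
import math


def dominant_bin(window):
    """Index of the strongest non-constant DFT component of a window."""
    n = len(window)
    best_k, best_mag = 0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(window))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k


def frequency_changed(baseline, current, tolerance_bins=1):
    """True if the dominant frequency moved by more than tolerance_bins."""
    return abs(dominant_bin(baseline) - dominant_bin(current)) > tolerance_bins


n = 64
healthy = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]  # 4 cycles per window
changed = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]  # 8 cycles per window
print(frequency_changed(healthy, healthy), frequency_changed(healthy, changed))
# False True
```

The idea is simply to compare a window of new measurements with a reference window recorded when the system was known to be healthy.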



Finally, I would like to mention that the analysis of the Bitcoin price is only a brief and superficial analysis, carried out for a bit of fun and as an interesting example of how my work can be used in many different aspects of real life, and consequently it is not advice to invest money in Bitcoin. In fact, I have only considered the closing price of the Bitcoin, without considering other variables (such as the market cap, the daily news from newspapers or governments, events, holidays, market decisions of companies, daily trade volume, etc.) that can influence the Bitcoin price. A more detailed analysis of the Bitcoin price could be done by considering those variables, and at the same time, a more reliable prediction of the price of the Bitcoin could potentially be achieved by using machine learning methods, such as Artificial Neural Networks (ANN), or forecasting algorithms. 




I hope you enjoyed this post!! 

Stay tuned, 



Matteo







Tuesday, 19 September 2017

A busy schedule of a Marie Curie PhD

Hi Fellows,

Have you enjoyed the summer? I hope you had fun!!

In this post I would like to talk a bit about the Marie Curie PhD life. Indeed, I am often asked to answer the following questions:

How is the life of a Marie Curie PhD student? Do you really work?

These questions are asked both by master's students, who are thinking about applying for a Marie Curie PhD, and by people who work in industry and have no idea about PhD academic life. 

For this reason, I am going to talk about the last three months of my PhD life, in order to explain how the life of a Marie Curie PhD student can be.

Firstly, on a daily basis, I have to combine technical and non-technical work, such as coding for data analysis and drafting conference and journal papers, respectively. The technical work is the core of the research, since it aims to push the current state of the art of the research topic further, and it requires a strong and wide background in statistics, engineering problem-solving, data analysis techniques, physics, coding in different languages, etc. Nevertheless, the non-technical work is absolutely crucial. Indeed, the non-technical work allows me to: i) investigate possible new directions for the research, by reading the work of other researchers; ii) describe and discuss the results of my own research, by drafting journal and conference papers.

However, the daily work, which has to be carried out accurately by adequately scheduling all the work activities, can be jeopardized by all the side activities that a research programme might have, for example, conferences, training courses and meetings.

In this respect, as I mentioned before, during the last three months I worked daily on my research by continuing the development of innovative data-driven fault detection methods to monitor the health state of civil infrastructure, and by drafting and submitting journal and conference papers (here you can find the list of all my publications). At the same time, I traveled quite a lot in order to present the results of my research at three international conferences and to attend an awesome training course.

The three conferences where I have presented and discussed my work are:

  1. the 52nd European Safety, Reliability & Data Association (ESReDA) conference, which was held in Kaunas, Lithuania, 30-31 May 2017. Here, I presented a paper titled "Towards a real-time structural health monitoring of railway bridges", which aimed to discuss how the real-time monitoring of bridges can improve the safety, reliability and availability of the whole transportation network, by showing an example of a real-time monitoring method that I have developed during my PhD (Figure 1). 
  2. the European Safety and Reliability (ESREL) conference, which was held in Portorož, Slovenia, 18-22 June 2017. During this conference, I presented a paper titled "A fuzzy-based Bayesian Belief Network approach for railway bridge condition monitoring and fault detection". The paper discussed a method to assess the health state of a bridge by relying on the analysis of the measurements of the bridge behavior, which are provided by the sensors installed on the bridge, and by considering the knowledge of the bridge managers and engineers (Figure 2).
  3. the 11th International Workshop on Structural Health Monitoring (IWSHM), which was held at Stanford University, California, USA, 12-14 September 2017. At Stanford, I presented the results of a work that I carried out for AECOM, which is one of the most important engineering firms in the world, in a paper titled "A data mining tool for detecting and predicting abnormal behavior of railway tunnels". In this paper, a data mining method was developed in order to analyse a vast database of measurements of the behavior of a railway tunnel. The aim of the data analysis was to point out the most critical area of the tunnel and to predict the future behavior of the tunnel by means of an Artificial Neural Network (ANN) (Figure 3). 


Figure 1. Presenting my research at the ESReDA conference.  
Figure 2. Presenting at the ESREL conference.



Figure 3. Discussing the results of my research at the IWSHM conference in Stanford.

In order to improve my technical skills, I attended a summer school in Yokohama, Japan, for three weeks, from mid-July to the beginning of August. The summer school, which is called "Asia-Pacific-Euro Summer School (APESS)", aimed to discuss the most recent advances in smart structure technology while giving the researchers a huge networking opportunity. In fact, more than 60 researchers from Europe, Asia and America participated in the summer school, and lecturers from all around the world gave talks and classes on the most advanced techniques for structural health monitoring and data analysis. Figure 4 shows the picture of the APESS class of 2017!!

Figure 4. APESS class of 2017.

Finally, in the last months I participated in the Open Day of the University of Nottingham, where ESR13, Federico, and I talked to prospective students of the University, showed them some interesting experiments, and gave them some useful tips about university life. You can find more information about the Open Day here.

That's all Folks!

Hope you enjoyed this post.
Matteo





Monday, 29 May 2017

A conference, a workshop and one seminar: some experiences presenting the results of our research

Hi Fellows, 

welcome back! Summer is almost here, finally! Do you feel it? =) 

During April and May, I had the opportunity to attend a conference, a workshop and a seminar. 
On these occasions, I presented the results of our research project to experts and decision-makers. Indeed, I attended the Stephenson conference in London, at the end of April, where I met both experts from industry (such as railway companies, e.g. Bombardier, Network Rail, etc. and consulting firms, e.g. Amey, Mott MacDonald Group, etc.), and professors and researchers from international universities. There, I gave a 20-minute presentation to discuss the first results of the project, which were achieved during the first year of the PhD (Figure 1). 



Figure 1. Presentation at the Stephenson conference.

In a similar way, on the 24th and 25th of May, I attended the first TRUSS workshop, where I discussed my research project with international attendees. During the first day of the workshop, each ESR gave a poster presentation to explain the objectives of their PhD and discuss possible methods and collaborations with the attendees (Figure 2). Then, during the second day of the workshop, each ESR gave a presentation with the aim of discussing the latest achievements of the research (Figure 3). 



Figure 2. The poster session at the TRUSS workshop


Figure 3. Research progress presentation at the TRUSS workshop.
Finally, in the next days, the 30th and 31st of May, I will attend the 52nd ESReDA seminar in Kaunas, Lithuania. There, I am going to present my research in front of an international audience made up of professors, decision-makers and critical infrastructure experts. 

I will keep you updated with the new adventure of the project!! 

Ciao!! 
Matteo

Friday, 10 March 2017

Data-mining and future prediction of railway tunnel behaviours

Hi fellows, 

Here we are again. 
I hope this post finds you well!! Are you ready for the spring? =) 

In this post, I am going to explain why I worked on the monitoring of the health state of a tunnel during my secondment, which was carried out from September to December 2016 at AECOM. Firstly, the secondment is an important part of the Marie Curie programme, as it gives each Marie Curie fellow the possibility to experience new work activities in different environments (industry, new academic groups, etc.). 
In particular, the goals of my secondment were defined with the aim of applying the mathematical methods that I have developed at the university to the real world. 

Mathematical methods? Yes, guys, the aim of my PhD is the development of mathematical methods, which are able to automatically monitor the health state of railway bridges by analyzing the data provided by a measuring system (that is, sensors) installed on the bridge!! Do you remember? 

However, during the first month of the secondment, the company was monitoring the health state of a railway tunnel in real time, because the tunnel required some works. Consequently, it was an ideal situation to try my mathematical methods in a real case study by analyzing and monitoring the tunnel behavior! 
However, before working on it, I had to convince my bosses and ask the project coordinator of the Marie Curie scholarship for the authorization to switch the topic of the secondment... and fortunately, during the last week of October, I got the green light!! (Thank you Mr. project-coordinator)

Anyway, AECOM monitored a railway tunnel in real time (for example, see the figure below) for more than 4 months by using a measurement system made of more than 300 sensors, as the monitoring process started in August. Each sensor provided a value of the tunnel behavior, for example the displacement of the tunnel walls, or the strain, etc., basically every second, 24/7. Therefore, you can easily understand that the first problem was the data analysis of such a big database.


Example of railway tunnel (property of Community Rail Lancashire)
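Just to give a feeling of the size of the database described above, here is a back-of-the-envelope estimate in Python. The figures are lower bounds taken from the text (300 sensors, 4 months, one reading per second), whereas the 30-day months and the 8 bytes per reading are my own assumptions for the example, not properties of the real monitoring system.

```python
def reading_count(sensors, days, rate_hz=1):
    """Number of readings collected by all sensors over the given days."""
    seconds_per_day = 24 * 60 * 60
    return sensors * rate_hz * seconds_per_day * days


sensors = 300            # "more than 300 sensors": lower bound
days = 4 * 30            # "more than 4 months": assuming 30-day months
bytes_per_reading = 8    # assumption: one 64-bit number per reading

readings = reading_count(sensors, days)
print(f"{readings:,} readings")                        # 3,110,400,000 readings
print(f"{readings * bytes_per_reading / 1e9:.1f} GB")  # 24.9 GB
```

So we are talking about more than 3 billion values, which is why a data-mining step is needed before any further analysis.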

I would like to give you as much information as possible regarding the method that we applied in order to identify the typical behavior of the tunnel and, more importantly, to point out the unexpected tunnel behavior, but as we are drafting research articles on it, I cannot. I am sorry. 
I can say that we (TRUSS people) developed and applied a data-mining algorithm, followed by a machine-learning method that is able to predict the behavior of the tunnel in the future, and, as luckily this worked pretty well, AECOM asked us for the copyright of the codes in order to embed them into their analysis methods. Not too bad, is it?

Finally, yes, I know what you are thinking, and I agree, 100%. However, you have to seek your fortune sometimes... =) 

See you soon!! 





Monday, 30 January 2017

Introduction of the TRUSS project to the University of Nottingham students

Hello fellows!

I hope you are doing well.
In the last post, we discussed the new challenges of the new year, and we have some great news!!

Firstly, I have spent the last months at AECOM for my secondment. During these months, some really interesting results were achieved and I am going to post them soon. Trust me, very soon!

Then, in November 2016 a new research group was launched by the University of Nottingham, the Resilience Engineering Research Group (RERG), which is the group where I am working right now. In order to introduce the RERG to the university staff members and students, a workshop was organized on the 9th of November.

4 academics, 11 research fellows and 16 PhD students compose the operative brain of the RERG, which aims to develop innovative and efficient methods for asset management, system monitoring (fault detection and diagnosis, prognostic methods), reliability, safety and risk analysis of systems.

The presentation of the group was held at the conference center of the University of Nottingham on the 9th of November 2016. There, I gave a 20-minute talk introducing the TRUSS project by explaining its goals, partners, beneficiaries and research programs around Europe. Then, I explained the goals of my PhD (http://trussitn.eu/research/rail-and-road-infrastructure/esr9/) by showing a case study, which was developed during the first year of the PhD and will be presented at two conferences during the next months. The audience was mainly made up of students and academics of the University of Nottingham.

Presenting at the RERG workshop

Finally, I would like to give you a quick preview of the next posts:

1. one post will discuss the results of the secondment, explaining the reason why I worked on the monitoring of a railway tunnel. Yes, I know, a tunnel is different from a bridge, and my PhD analyses railway bridges. However, we develop mathematical methods, and Mathematics does not care about the nature of the data: She (meaning her majesty the Mathematics) can assess the health state of every kind of system (or infrastructure) by simply analyzing a continuous flow of data, which is provided by a monitoring system installed on the infrastructure of interest.

2. in another post, I am going to talk about my experience as an inspector of a railway bridge during a visual inspection program of a 170-year-old railway bridge!

That's all folks!!
Thanks for reading!