Abstract: The outbreak of the new coronavirus disease (COVID-19) has been spreading throughout the world since December 2019, when the first case was reported in Wuhan, China. To help policymakers make vital decisions against the spread of the virus, data scientists have been continuously analyzing the trends and providing prediction models for different places around the globe. Using regression methods to model the spread of the disease is one of many approaches to predicting the future of the outbreak. In this work, we focus on the globe as a whole, Iran, Canada, and Canada's provinces over the time window from 22 January to 7 April 2020. Using data provided by CSSE at Johns Hopkins University, we first present visualizations, with corresponding explanations, for each of the regions. The aim is to extract information about the overall trends of factors such as the cumulative number of infected, recovered, and dead cases (both in absolute numbers and relative to population), as well as the mortality rate and recovery rate over time. In the second part, we use three well-known regression models, Polynomial Regression, Bayesian Ridge Regression, and Support Vector Regression, to model the total number of infected cases separately for the globe, Iran, and Canada. Apart from predicting the future situation for each area, we compare the performance of the three models with each other to show the strengths of each model on different data.
Project's Full Paper: Click Here
Project's Source Code (in the form of a Jupyter Notebook): Click Here
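
The abstract mentions per-population numbers and mortality and recovery rates over time without giving their definitions. The following is a minimal sketch of one common way to derive them from cumulative counts, assuming the mortality rate is cumulative deaths divided by cumulative confirmed cases, the recovery rate is cumulative recoveries divided by cumulative confirmed cases, and per-population figures are reported per million inhabitants. These definitions, the function name `rates_over_time`, and the toy numbers are illustrative assumptions, not taken from the paper or the CSSE dataset.

```python
def rates_over_time(confirmed, recovered, deaths, population):
    """Derive per-day metrics from cumulative daily counts.

    Assumed definitions (not confirmed by the paper):
      mortality rate = cumulative deaths / cumulative confirmed
      recovery rate  = cumulative recovered / cumulative confirmed
      per-population = cumulative confirmed per one million inhabitants
    """
    rows = []
    for c, r, d in zip(confirmed, recovered, deaths):
        rows.append({
            # Guard against division by zero before the first reported case.
            "mortality_rate": d / c if c else 0.0,
            "recovery_rate": r / c if c else 0.0,
            "confirmed_per_million": c / population * 1_000_000,
        })
    return rows


# Made-up toy series for a hypothetical region of two million people;
# these are NOT real CSSE figures.
confirmed = [0, 100, 200, 400]
recovered = [0, 10, 40, 100]
deaths = [0, 2, 5, 12]

metrics = rates_over_time(confirmed, recovered, deaths, population=2_000_000)
for day, row in enumerate(metrics):
    print(day, row)
```

Because the inputs are cumulative, these rates describe the outbreak up to each day rather than that day alone; other definitions (for example, dividing by closed cases only) would give different curves.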