Statistics-Based Predictions of Coronavirus Epidemic Spreading in Mainland China




Coronavirus epidemic in China, Coronavirus COVID-19, Coronavirus 2019-nCoV, Mathematical modeling of infectious diseases, SIR model, Parameter identification, Statistical methods


Background. The epidemic outbreak caused by coronavirus COVID-19 is of great interest to researchers because of the high rate of infection spread and the significant number of fatalities. A detailed scientific analysis of the phenomenon is yet to come, but the public is already asking how long the epidemic will last and how many patients and deaths are to be expected. Long-term predictions require complicated mathematical models, and identifying and calculating their unknown parameters takes considerable effort. This article presents some preliminary estimates.

Objective. Since long-term data are available only for mainland China, we will try to predict the epidemic characteristics only in this area. We will estimate some of the epidemic characteristics and present the numbers of victims, infected and removed persons as functions of time.

Methods. In this study we use the known SIR model for the dynamics of an epidemic, the known exact solution of the linear differential equations, and a statistical approach developed earlier to investigate the children's disease that occurred in Chernivtsi (Ukraine) in 1988–1989.
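For readers unfamiliar with the SIR model named above, the following minimal sketch integrates its standard form, dS/dt = -alpha*S*I, dI/dt = alpha*S*I - rho*I, dR/dt = rho*I, with a classical fourth-order Runge-Kutta scheme. It is an illustration only and not the procedure used in the study, which relies on an exact solution and a statistical fit. The function name `simulate_sir`, the symbols `alpha` and `rho`, and all numerical values are assumptions chosen for the example, not parameters identified for mainland China.

```python
def simulate_sir(s0, i0, r0, alpha, rho, days, steps_per_day=100):
    """Integrate the SIR equations with RK4; return one (t, S, I, R) tuple per day.

    alpha: infection rate per susceptible-infected contact (hypothetical value)
    rho:   removal rate of infected persons (hypothetical value)
    """
    def deriv(s, i):
        new_infections = alpha * s * i   # flow from S to I
        removals = rho * i               # flow from I to R
        return (-new_infections, new_infections - removals, removals)

    dt = 1.0 / steps_per_day
    s, i, r = float(s0), float(i0), float(r0)
    history = [(0.0, s, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            # The right-hand side depends only on S and I, so the RK4
            # stages only need intermediate values of those two.
            k1 = deriv(s, i)
            k2 = deriv(s + 0.5 * dt * k1[0], i + 0.5 * dt * k1[1])
            k3 = deriv(s + 0.5 * dt * k2[0], i + 0.5 * dt * k2[1])
            k4 = deriv(s + dt * k3[0], i + dt * k3[1])
            s += dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
            i += dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
            r += dt / 6.0 * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2])
        history.append((float(day), s, i, r))
    return history


# Illustrative run with made-up values: 1000 susceptible, 1 infected.
history = simulate_sir(s0=1000, i0=1, r0=0, alpha=0.0005, rho=0.1, days=100)
peak_day, _, peak_infected, _ = max(history, key=lambda row: row[2])
print(f"peak of infected persons on day {peak_day:.0f}: {peak_infected:.1f}")
```

In this form the total S + I + R stays constant, so a fit to observed case counts only has to determine the initial susceptible population and the two rates.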

Results. The optimal values of the SIR model parameters were identified using a statistical approach. The numbers of infected, susceptible and removed persons versus time were predicted and compared with the new data obtained after February 10, 2020, when the calculations were completed.

Conclusions. A simple mathematical model was used to predict the characteristics of the epidemic caused by coronavirus in mainland China. Unfortunately, the number of coronavirus victims is expected to be much higher than that predicted on February 10, 2020, since 12289 new cases (not previously included in official counts) were added two days later. Further research should focus on updating the predictions with up-to-date data and on using more complicated mathematical models.






How to Cite

Nesteruk I. Statistics-Based Predictions of Coronavirus Epidemic Spreading in Mainland China. Innov Biosyst Bioeng [Internet]. 2020 Feb 18 [cited 2024 May 28];4(1):13-8. Available from: