COVID-19 vaccine dosages and government factors role on the global variation in COVID-19 mortality: A statistical and regression analysis

The objective of our study was to explore the influence of the current vaccination program and other relevant government factors on the variation in COVID-19 mortality across the world. The study involves a cross-sectional survey of COVID-19-related and government factors from 161 countries. We retrieved and processed publicly available coronavirus pandemic data (July 17, 2021) from several online databases, excluding data from countries that violated the assumptions of correlation and regression analysis. Partial correlation studies and multivariate analyses were then performed to explore the influence of the current vaccination program and other relevant government factors on the relationship between the explanatory variables and the total deaths due to COVID-19. The partial correlation studies revealed that controlling for a complete dosage of COVID-19 vaccine per 100 people in the population had a significant (P < 0.001) impact on the strength of the relationship between some explanatory variables and the response variable (total COVID-19 mortality). Furthermore, the Stepwise Linear Regression (SLR) model shows that the covariates, namely total_cases, hospital patients per million, hospital beds per thousand, male smokers, and people fully vaccinated per hundred, added significantly (P < 0.001) to the prediction of the response variable. Our SLR model validation study revealed that the observed total COVID-19 mortality was highly correlated with the predicted total COVID-19 mortality in various countries (r = 0.977, P < 0.001). With an R-squared value of 0.958 and an adjusted R-squared value of 0.956, our SLR model performs considerably better than other related regression models built to predict COVID-19 mortality. Based on our current findings, we conclude that countries with better hospital infrastructure and populations that have received complete dosages of the COVID-19 vaccine will have minimal COVID-19 fatalities. © 2022 The Authors.


Introduction
The highly infectious COVID-19 has infected over 190,565,973 people, and more than 4,095,485 people have died across the globe as of July 17, 2021 (Lipsitch et al., 2020). However, the total number of deaths from COVID-19 differs significantly at the country level. For instance, in July 2021, the total deaths attributed [...] illnesses, hospital care, clinical symptoms, prior immunity, and mutations in the virus (Huang et al., 2020; Rubino et al., 2020; Wang et al., 2020; Zhou et al., 2020). Moreover, the clinical factors mentioned above help clinicians classify people at high risk of COVID-19 infection.
Nevertheless, the factors mentioned earlier for explaining the mortality of individuals from COVID-19 are insufficient to support effective policymaking by governments of countries where COVID-19-related mortality is very high (Liang et al., 2020). Several studies have addressed this gap in COVID-19 research. A few academics have discussed the effect of government policies, namely lockdown or quarantine, on controlling the spread of COVID-19 (Hou et al., 2020; Iacobucci, 2020).
Additionally, an increase in mass COVID-19 testing and vaccination has been promoted by various countries to decrease the spread of COVID-19 (Dagan et al., 2021; Peto, 2020). A few researchers have also favored better utilization of hospital resources during the COVID-19 pandemic, as it ensures that sufficient means are available to treat the many patients suffering from COVID-19 (Moghadas et al., 2020). Scholars have recently analyzed the correlation between the availability of healthcare resources and mortality due to COVID-19 (Ji et al., 2020). Yet, the available evidence in the literature on curtailing the spread of COVID-19 has not been applied to accurately explain the significant country-wise variation in the deaths attributed to COVID-19. Moreover, countries differ extensively in their capabilities to detect, prevent, and respond to pandemics (Kandel et al., 2020).
Therefore, Liang et al. (2020) aimed to explore factors associated with COVID-19 mortality at the country level. However, their assessment of total COVID-19 mortality at the country level had several limitations. One of the significant limitations of the study conducted by Liang et al. (2020) was the selection of a limited number of COVID-19-related and country-based factors that possibly determine the COVID-19 mortality rate in a country. Therefore, we aim to explore and explain the influence of several new government and COVID-19-related factors on COVID-19 mortality across the world. In this context, in the current study, cross-sectional data comprising various COVID-19-related and country-based attributes of 181 countries were retrieved and analyzed to screen the most informative factors explaining the variation in COVID-19 mortality across the world. Furthermore, partial correlation studies with the partial and complete dosages of COVID-19 vaccines as controlling attributes were performed to understand the possible impact of the partial and complete dosages of the COVID-19 vaccine on the total deaths attributed to COVID-19 in different countries. Specifically, the study examined the association of covariates such as critical cases, new COVID-19 cases per million, aged 65 older, aged 70 older, and total confirmed COVID-19 cases with the response variable (total deaths attributed to COVID-19), while controlling for partial and complete dosages of the COVID-19 vaccine. In addition, we have examined and discussed the role of attributes, namely total confirmed cases of COVID-19, hospital patients per million, hospital beds per thousand, male smokers, and people fully vaccinated per hundred, in explaining the discrepancy in total COVID-19 mortality across the globe. Finally, the factors screened in the present study might help countries severely affected by COVID-19 formulate policies to reduce the high fatality of COVID-19.

Materials and methods
The flow diagram for implementing the machine learning-based approaches used to explain the attributes related to discrepancies in total COVID-19 fatality across countries is shown in Fig. 1.

Data source and variable description
The COVID-19 cross-sectional dataset is a collection of attributes, namely vaccinations, tests and positivity, hospital and ICU, confirmed cases, confirmed deaths, policy responses, and other variables of interest, retrieved from https://ourworldindata.org/coronavirus, an open-access database for the coronavirus pandemic. Furthermore, the data regarding the variables "the number of variant sequences" and "total sequences since the first variant sequence" were retrieved from https://cov-lineages.org/ (O'Toole et al., 2021).

Multiple imputations
Missing data are a common occurrence in cross-sectional datasets. Multiple imputation (Chang et al., 2020; Buuren, 2018), owing to its ease of use, is possibly the most popular approach for addressing missing data in sample data. Multiple Imputation (MI) techniques can be used where the data are missing at random, missing completely at random, or even when the data are not missing at random. In the present study, the cross-sectional data of 181 countries comprising COVID-19-related and country-specific attributes were processed for missing values using the Multiple Imputation by Chained Equations (MICE) package in Azure Machine Learning Studio, assuming that the data are missing at random (MAR). MICE has been widely accepted for data imputation and has displayed good performance in practice (Buuren and Groothuis-Oudshoorn, 2011). The imputed cross-sectional data were further tested to satisfy the five regression analysis assumptions (Casson and Farmer, 2014).

Descriptive statistics for sample data
Descriptive and inferential statistics are both employed in scientific data analysis. In the present study, we used descriptive statistics to describe the data and popular methods to test the normality of the data (Kim, 2013; Mishra et al., 2019). Three primary types of descriptive statistics, namely 1) measures of frequency (e.g., number of occurrences, percent), 2) measures of central tendency (e.g., mean), and 3) measures of variation (variance, standard error, and standard deviation), were assessed to provide an understanding of the simple statistical measures of the cross-sectional data. An evaluation of the normality of data is essential because normally distributed data is a basic assumption of parametric testing. The most popular normality testing methods for continuous data are the Kolmogorov-Smirnov test (Stephens, 1974), the Shapiro-Wilk test (Shapiro and Wilk, 1965), kurtosis and skewness (Kim, 2017), the z-score (Ghasemi and Zahediasl, 2012), and the mean with SD (Davis, 2008). The factors under study were analyzed to determine their distribution patterns. Normality tests (Shapiro-Wilk test and Kolmogorov-Smirnov test) were performed using the statistical "SPSS" software package (analyze→descriptive statistics→explore).
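As an illustration of the skewness- and kurtosis-based normality check mentioned above, the following minimal Python sketch computes the bias-adjusted sample skewness and excess kurtosis together with their z-scores (statistic divided by its standard error). The function name and the example values are hypothetical and are not taken from the study data; the study itself used SPSS, so this is only an approximate stand-in for that output.

```python
import math
import statistics


def skewness_kurtosis_z(values):
    """Return sample skewness, excess kurtosis, and their z-scores."""
    n = len(values)
    if n < 4:
        raise ValueError("at least four observations are required")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    z = [(v - mean) / sd for v in values]

    # Bias-adjusted sample skewness and excess kurtosis.
    skew = n / ((n - 1) * (n - 2)) * sum(t ** 3 for t in z)
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(t ** 4 for t in z)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

    # Standard errors of skewness and kurtosis under normality.
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))

    return {
        "skewness": skew,
        "kurtosis": kurt,
        "z_skewness": skew / se_skew,
        "z_kurtosis": kurt / se_kurt,
    }


# Hypothetical right-skewed values (illustration only).
example = [1, 1, 1, 2, 2, 3, 10, 50]
print(skewness_kurtosis_z(example))
```

A large positive z-score for skewness, as in this made-up example, would point to a right-skewed variable that may need transformation before parametric testing.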

Probability-based inverse distribution function transformations
The Inverse Distribution Function Normal (IDF.NORMAL) transformation makes a continuous variable's sample distribution appear more normally distributed (Beasley et al., 2009). After checking the normality of the sample distribution, the IDF-based transformation was performed using SPSS on the cross-sectional data of 181 countries to control the skewness and satisfy the normality assumption.
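The idea behind such a transformation can be sketched in Python as a rank-based inverse normal transformation: each value is replaced by the standard normal quantile of its fractional rank. The exact SPSS steps are not described here, so the rank offset c = 3/8 (Blom's formula), the tie handling by average ranks, and the function name are assumptions made for illustration.

```python
from statistics import NormalDist


def inverse_normal_transform(values, c=3 / 8):
    """Map values to standard normal quantiles of their fractional ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n

    # Assign 1-based ranks, giving tied values their average rank.
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Convert ranks to probabilities strictly inside (0, 1), then to quantiles.
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


# Hypothetical, strongly skewed values (illustration only).
raw = [120, 4500, 87, 150000, 930]
print(inverse_normal_transform(raw))
```

Because only the ranks enter the transformation, the ordering of the countries on a variable is preserved while extreme values are pulled toward the bulk of the distribution.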

Multivariate outlier detection, multicollinearity, homoscedasticity, and normality of the error distribution
In cross-sectional data with multiple factors, the chance of unusual observations increases, and only a few outliers are enough to distort summary statistics such as the mean, and thereby the results. In multivariate statistics, the Mahalanobis distance is one of the most widely used methods for detecting outliers in multivariate data (Maesschalck et al., 2000; Grentzelos et al., 2021). All the outliers detected were removed, and a pruned cross-sectional dataset was created for further analysis.
Multicollinearity occurs when there is a high correlation between the independent attributes in the dataset (Vatcheva et al., 2016). Multicollinear features need to be removed because multicollinearity undermines the statistical significance of a predictor variable in a model. Multicollinearity between the variables in the cross-sectional data was assessed using the following methods (James et al., 2013): (1) the correlation between two independent variables, with a cutoff of 0.7, where a higher correlation (|r| ≥ 0.7), i.e., closer to positive or negative one, between the predictor variables indicates multicollinearity; (2) the Variance Inflation Factor (VIF), with values below five considered acceptable; and (3) the Tolerance (TOL) of each variable, with values above 0.2 considered acceptable.
The Breusch-Pagan test (Breusch and Pagan, 1979) was performed to test the assumption that the variance of the residuals is constant and does not depend on the predictor variables (homoscedasticity). The Breusch-Pagan test uses the following hypotheses:
Null hypothesis (H0): the residuals are distributed with equal variance (homoscedasticity is present).
Alternative hypothesis (HA): the residuals are not distributed with equal variance (heteroscedasticity is present).
The normality of the error distribution (unstandardized residuals) was tested by performing Kolmogorov-Smirnov and Shapiro-Wilk tests (Ghasemi and Zahediasl, 2012). The Kolmogorov-Smirnov and Shapiro-Wilk tests use the following hypotheses:
Null hypothesis (H0): the unstandardized residuals are normally distributed.
Alternative hypothesis (HA): the unstandardized residuals are not normally distributed.
If the p-values of the Kolmogorov-Smirnov and Shapiro-Wilk normality tests exceed the significance level of 0.05, we fail to reject the null hypothesis and conclude that the unstandardized residuals are normally distributed.
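To make the outlier-screening step concrete, the sketch below computes squared Mahalanobis distances for a two-variable dataset and flags observations whose chi-square p-value falls below a threshold. The restriction to two variables (so that the covariance matrix can be inverted by hand), the p < 0.001 threshold, the function names, and the data are all assumptions for illustration; the cutoff used in the study is not stated.

```python
import math
import statistics


def mahalanobis_squared_2d(points):
    """Squared Mahalanobis distance of each (x, y) point from the centroid."""
    n = len(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)

    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2

    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] S^-1 [dx dy]^T with the closed-form 2x2 inverse.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def chi2_sf_two_df(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2)


# Hypothetical data: 20 points near the line y = x plus one off-line point.
points = [(float(i), i + (0.5 if i % 2 else -0.5)) for i in range(1, 21)]
points.append((10.0, 20.0))

d2 = mahalanobis_squared_2d(points)
flagged = [i for i, d in enumerate(d2) if chi2_sf_two_df(d) < 0.001]
print(flagged)
```

With more than two variables the same quantity requires a general matrix inverse, and the reference chi-square distribution has as many degrees of freedom as there are variables.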

Partial correlation: The relationship between total deaths and the independent variable
Partial correlation was used to estimate the strength of the relationship between the predictors and the response variable (total deaths attributed to COVID-19), keeping dosages of the COVID-19 vaccines as controlling factors. The COVID-19 vaccines approved in the United States of America and European countries have shown effectiveness against hospitalization and death in trials and in real-world use across the globe (Breusch and Pagan, 1979; Chodcik et al., 2021). Thus, there is consensus that COVID-19 vaccination may end the COVID-19 pandemic by 2022. However, in the current setting, the influence of immunization on the relationship between total deaths attributed to COVID-19 (response variable) and government factors (predictor variables) is still not apparent. Therefore, in this study, we intend to explore the influence of the vaccination dosages (single and complete dosages of COVID-19 vaccine) on the relationship between the total deaths attributed to COVID-19 and the COVID-19-related government factors. The partial correlation between the predictors and the response variable while controlling for dosages of the COVID-19 vaccine was performed using the statistical "SPSS" software package (Correlate→partial correlation).
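For a single controlling variable, the first-order partial correlation can be written in terms of the three zero-order correlations as r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The short Python sketch below applies this formula; the function names and the toy data are hypothetical, with z standing in for a controlling factor such as vaccine dosage.

```python
import math


def pearson_r(x, y):
    """Zero-order Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data: x and y are related only through the shared component z.
z = [-1, -1, -1, -1, 1, 1, 1, 1]
noise_x = [1, -1, 1, -1, 1, -1, 1, -1]
noise_y = [1, 1, -1, -1, 1, 1, -1, -1]
x = [a + b for a, b in zip(z, noise_x)]
y = [a + b for a, b in zip(z, noise_y)]

print(pearson_r(x, y))               # zero-order correlation
print(partial_correlation(x, y, z))  # correlation after controlling for z
```

In this constructed example the zero-order correlation is moderate, while the partial correlation vanishes once z is controlled for. Comparing the two coefficients in the same way shows how much of an observed relationship is attributable to the controlling factor.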

The stepwise linear regression model
Stepwise Linear regression analyses (Hocking, 1976) were performed to study the relationship between the total deaths attributed to COVID-19 and seventeen normally distributed and non-collinear predictors as tabulated in Table 3.

Validation study of the stepwise regression model
The validity of our final stepwise regression model was assessed by comparing the observed total COVID-19 deaths against the predicted total deaths attributed to COVID-19 for each country. First, we drew a two-dimensional graph with the observed and predicted total deaths due to COVID-19 on its axes. Provided the model fits well, we anticipated seeing the sample points distributed around the 45-degree line on the two-dimensional chart.
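The 45-degree-line check can also be expressed numerically: when observed values are regressed on predicted values, a well-fitting model gives a slope near one, an intercept near zero, and a correlation near one. The following minimal Python sketch computes these three quantities; the function name and the numbers are hypothetical and are not the study's results.

```python
import math


def identity_line_check(observed, predicted):
    """Return slope, intercept, and Pearson r of observed versus predicted."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    s_pp = sum((p - mean_pred) ** 2 for p in predicted)
    s_oo = sum((o - mean_obs) ** 2 for o in observed)
    s_po = sum((p - mean_pred) * (o - mean_obs)
               for p, o in zip(predicted, observed))

    slope = s_po / s_pp                      # least-squares slope
    intercept = mean_obs - slope * mean_pred
    r = s_po / math.sqrt(s_pp * s_oo)        # Pearson correlation
    return slope, intercept, r


# Hypothetical observed and predicted totals (illustration only).
observed = [10, 20, 30, 40]
predicted = [12, 18, 33, 37]
print(identity_line_check(observed, predicted))
```

Note that the slope-and-intercept reading applies when predictions are on the same scale as the observations; if standardized predicted values are plotted instead, only the correlation is directly comparable.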

Multiple imputations using MICE
The missing values in our cross-sectional dataset were imputed using MICE, and the imputed cross-sectional dataset was tested for the five assumptions of the regression analysis.

Normality tests and descriptive statistics for sample data
The cross-sectional data of 181 countries comprising COVID-19-related and country-based factors exhibited marked skewness. Therefore, the skewed cross-sectional data of 181 countries were transformed using the inverse distribution function normal transformation to correct the skewness and fulfill the normality assumption. In addition, the normality of the transformed cross-sectional data was assessed using the different normality testing methods. Table 1 summarizes the sample characteristics of the predictors (mean, standard error, 95% confidence interval for the mean with lower and upper bounds, and standard deviation). In addition, Table 2 summarizes the kurtosis, skewness, and normality test values of the transformed cross-sectional data. The results in Tables 1 and 2 showed that the transformed data of the various COVID-19-related and country-based variables were normally distributed.

Regression assumptions analysis of the cross-sectional data
The multivariate outliers detected using the Mahalanobis distances were removed, and the pruned dataset was processed further for regression assumption analysis. The pruned cross-sectional dataset with thirty-one variables was further checked for multicollinearity, homoscedasticity, and normality of the error distribution of the residuals. As summarized in Table 3, the values of the VIF and Tolerance for the seventeen explanatory variables are below five (1 < VIF < 5) and above 0.2, respectively; this shows that the features in Table 3 are low to moderately correlated. Upon performing the Breusch-Pagan test on the pruned cross-sectional data, a p-value of 0.165954, corresponding to a chi-square (χ2) statistic of 21.336 with 16 degrees of freedom, was obtained. Since the p-value is larger than the significance level (α = 0.05), we fail to reject the null hypothesis, which states that the residuals are distributed with equal variance (homoscedasticity is present in the cross-sectional data with 16 predictor variables). The homoscedasticity of the cross-sectional data is also represented graphically in Fig. 2. As depicted in Fig. 2, the variance of the residuals is constant across all the standardized predicted values. Thus, the constant residual variance signifies the homoscedastic nature of the cross-sectional data. Finally, the normality of the error distribution (unstandardized residuals) was tested by performing Kolmogorov-Smirnov and Shapiro-Wilk tests. The p-values for the unstandardized residuals were 0.200 and 0.892, respectively. Thus, the p-values calculated for the Kolmogorov-Smirnov and Shapiro-Wilk tests were greater than the significance level of 0.05. Therefore, we fail to reject the null hypothesis, which states that "the unstandardized residuals are normally distributed."
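The reported Breusch-Pagan p-value can be cross-checked from the chi-square statistic alone. For an even number of degrees of freedom, the upper-tail probability of the chi-square distribution has a closed form, which the following minimal Python sketch uses; the function name is hypothetical, and the closed form applies only when the degrees of freedom are even, as they are here.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square with even df.

    Uses the closed form exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Values reported for the Breusch-Pagan test in the text above.
p_value = chi2_sf_even_df(21.336, 16)
print(round(p_value, 4))
```

The result agrees with the reported p-value of about 0.166, which is above the 0.05 significance level.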

Partial correlation: The relationship between total deaths and the explanatory variables
The relationship between the total deaths attributed to COVID-19 and the seventeen independent variables in the pruned cross-sectional data was explored using partial and zero-order correlation while controlling for partial and complete dosages of the COVID-19 vaccine. The partial correlations of the independent variables, namely critical cases, new COVID-19 cases per million, aged_65_older, aged_70_older, and total confirmed COVID-19 cases, with a partial and a complete dosage of COVID-19 vaccine, respectively, as the controlling factor, showed a significant relationship with the total deaths attributed to COVID-19 (dependent variable), as shown in Fig. 3a-3j and Fig. 4a-4j, respectively.
Fig. 3 (a-j): Fig. 3a represents the zero-order and Fig. 3b the partial correlation between total deaths attributed to COVID-19 and critical cases while controlling for a partial dosage of COVID-19 vaccine; Fig. 3c shows the zero-order and Fig. 3d the partial correlation between total deaths attributed to COVID-19 and new COVID-19 cases per million while controlling for a partial dosage of COVID-19 vaccine; Fig. 3e depicts the zero-order and Fig. 3f the partial correlation between total deaths attributed to COVID-19 and aged_65_older while controlling for a partial dosage of COVID-19 vaccine; Fig. 3g represents the zero-order and Fig. 3h the partial correlation between total deaths attributed to COVID-19 and aged_70_older while controlling for a partial dosage of COVID-19 vaccine and total confirmed COVID-19 cases; Fig. 3i displays the zero-order and Fig. 3j the partial correlation between total deaths attributed to COVID-19 and total cases while controlling for a partial dosage of COVID-19 vaccine. The straight lines in Fig. 3 (a-j) are linear predictions of the dependent variable from the specific independent variable. The area between the two red fitted lines represents the 95% confidence interval of the fitted values across the cross-sectional data (r: correlation coefficient).
Fig. 4 (a-j): [...] Fig. 4j illustrates the partial correlation between total deaths attributed to COVID-19 and total cases while controlling for the complete dosage of the COVID-19 vaccine. The straight lines in Fig. 4 (a-j) are linear predictions of the dependent variable from the specific independent variable. The area between the two red fitted lines represents the 95% confidence interval of the fitted values across the cross-sectional data (r: correlation coefficient).

Stepwise linear regression analysis
Stepwise linear regression was performed using a set of attributes tabulated in Table 4 to build a regression model with variables that significantly predict the response variable (total deaths attributed to COVID-19). Table 4 summarizes the findings of the stepwise linear regression model. The variables total_cases, hospital patients per million, hospital beds per thousand, male smokers, and people fully vaccinated per hundred in combination significantly predicted the total deaths attributed to COVID-19 (F(6, 161) = 617.691, p < 0.001, R-square = 0.9584). Table 5 summarizes the predictors of the final stepwise regression model for predicting the total deaths attributed to COVID-19. The absolute values of the standardized regression coefficients can be compared, thereby providing a rough indication of the importance of the variables. Among the COVID-19-related factors, the largest absolute standardized value is 0.9503 for the total confirmed cases of COVID-19 (95% CI 0.0178 to 0.0193, P < 0.001), suggesting that the total confirmed cases of COVID-19 are the most important of the five model predictors in predicting the total deaths attributed to COVID-19. Next is hospital patients per million, with a positive and statistically significant standardized coefficient value of 0.307 (95% CI 51.165510 to 72.318343, P < 0.001). Among the government-related factors, hospital beds per thousand is the most critical factor, with a negative standardized coefficient value of -0.194 (95% CI -6010.567860 to -3431.576739, P < 0.001), suggesting that a decrease of one standard deviation in hospital beds per thousand will result in an expected increase of total deaths attributed to COVID-19 by 0.194 standard deviations.
The next most contributing government-related factor is people fully vaccinated per hundred, with a negative standardized coefficient value of -0.114297 (95% CI -1381.021441 to -579.197698, P < 0.001), suggesting that a decrease of one standard deviation in people fully vaccinated per hundred will result in an expected rise of total deaths attributed to COVID-19 by 0.114297 standard deviations. Finally, the smallest absolute value is |-0.112518| ≈ 0.113 for male smokers (95% CI -647.070996 to -315.907098, P < 0.001), suggesting that male smokers is the least important of the five predictors in predicting the total deaths attributed to COVID-19 for an individual country.
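The "one standard deviation" interpretation used above follows from how a standardized coefficient is obtained: the unstandardized slope b is rescaled as beta = b × SD(x) / SD(y). The Python sketch below illustrates the conversion for a single-predictor regression, which is a simplification assumed here for brevity; in the multiple regression model each slope is estimated jointly with the other predictors. The function names and the data are hypothetical.

```python
import statistics


def simple_ols_slope(x, y):
    """Unstandardized least-squares slope of y on a single predictor x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def standardized_coefficient(b, x, y):
    """Rescale slope b to standard-deviation units of x and y."""
    return b * statistics.stdev(x) / statistics.stdev(y)


# Hypothetical predictor and response values (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b = simple_ols_slope(x, y)
beta = standardized_coefficient(b, x, y)
print(b, beta)
```

Because standardized coefficients are unit-free, their absolute values can be compared across predictors measured on very different scales, such as case counts and beds per thousand.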

Validation of the stepwise regression model
The validation of the current stepwise regression model is pictorially represented in Fig. 5 by plotting the standardized predicted values of the total death on the x-axis and the observed total deaths due to COVID-19 on the y-axis. The standardized predicted value of the total deaths was positively and significantly correlated with the observed total deaths due to COVID-19 (r = 0.977, P < 0.001).

Comparison with similar studies
The performance of our stepwise linear regression model in predicting the total mortality of countries was compared with the model performance from a related study conducted by Liang et al. (2020), shown in Table 5. We can observe from Table 6 that the R-squared value and the adjusted R-squared value of our stepwise MLR model are 0.958 and 0.9568, respectively. Both values are considerably higher than those of the regression model proposed by Liang et al. (2020) (R-squared value was 0.58; adjusted R-squared value was 0.54). Moreover, the correlation coefficient (r = 0.977, P < 0.001) between the predicted and the observed total deaths of our model was higher than that of the model proposed by Liang et al. (2020) (r = 0.77; P < 0.001).
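The relationship between R-squared, adjusted R-squared, and the F-statistic can be illustrated with a short calculation. The sketch below takes the model and residual degrees of freedom from the reported F(6, 161) as an assumption and uses the reported R-squared of 0.9584; the function names are hypothetical.

```python
def f_statistic(r_squared, df_model, df_residual):
    """F = (R^2 / df_model) / ((1 - R^2) / df_residual)."""
    return (r_squared / df_model) / ((1 - r_squared) / df_residual)


def adjusted_r_squared(r_squared, df_model, df_residual):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    Here n - 1 = df_model + df_residual and n - p - 1 = df_residual.
    """
    return 1 - (1 - r_squared) * (df_model + df_residual) / df_residual


# Reported R-squared; degrees of freedom assumed from the reported F(6, 161).
r2 = 0.9584
print(round(f_statistic(r2, 6, 161), 1))
print(round(adjusted_r_squared(r2, 6, 161), 4))
```

Under these assumptions the calculation gives an F-statistic close to the reported 617.691 (the small difference is consistent with rounding of R-squared) and an adjusted R-squared close to the reported 0.9568.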

Discussion
Cross-sectional data from 181 countries with various COVID-19 and government-related factors were preprocessed for missing values, which were present in the cross-sectional data of 18 countries. First, the missing values were imputed using the MICE package in Azure Machine Learning Studio Classic. Then, the imputed cross-sectional data were further processed to satisfy the correlation and regression analysis assumptions, namely normality, absence of multivariate outliers, absence of multicollinearity, linearity, homoscedasticity, and normality of the error distribution. Finally, the pruned preprocessed cross-sectional dataset comprising sixteen attributes from 161 countries was used for the correlation and regression studies.
Notably, the zero-order and partial correlation analyses as illustrated in Fig. 3a-3j and Fig. 4a-4j showed that controlling for a complete dosage of COVID-19 vaccine over a partial dosage of COVID-19 vaccine per 100 people in the population had a moderate but significant influence on the strength of correlation between the total death attributed to COVID-19 (response variable) and the independent variables, namely critical cases, new COVID-19 cases per million, aged_65_older, and aged_70_older.
However, controlling for either the partial or the complete dosage of the COVID-19 vaccine had a statistically significant but negligible effect on the relationship between the total deaths due to COVID-19 (dependent variable) and the total confirmed COVID-19 cases (independent variable). The negligible impact of the partial and complete dosages of the COVID-19 vaccine can be attributed to the emergence of novel variants of COVID-19 (Grubaugh et al., 2021; Winblad et al., 2004) that have the selective advantage of enhanced transmission dynamics (Davies et al., 2021a; Dorp et al., 2021) and the capability to reduce prompt neutralization by the host (Planas et al., 2021; Singh et al., 2021). For example, the present restrictions in India and Europe (Priesemann et al., 2021; Singh et al., 2021) are in place because of the more communicable (Davies et al., 2021a; Dorp et al., 2021) and potentially more pathogenic (Challen et al., 2021; Davies et al., 2021b) B.1.1.7 variant, which originated in the United Kingdom and is rapidly gaining dominance in countries such as India (Singh et al., 2021).
Therefore, based on the overall results of the partial correlation, we recommend that governments place more emphasis on immunizing their residents with all doses prescribed by the vaccination protocol, rather than with a partial vaccine dose per 100 people in the total population, to counter the surge in the total deaths attributed to COVID-19.
Additionally, the current country-level study comprehensively examines the association of various COVID-19-related and country-specific factors with COVID-19 mortality. The stepwise multiple regression model analysis reveals that the estimated regression line (predictors) of the model explains 95.8% (R-square = 0.958) of the variation in the response variable (total COVID-19 mortality). The statistically significant and high F-statistic of the final stepwise regression model shows that the independent variables help explain the total deaths attributed to COVID-19 (response variable) with a confidence greater than 99.999%. The detailed analysis of the attributes contributing significantly to the stepwise multiple regression model has shown that total deaths attributed to COVID-19 are positively associated with the total confirmed cases of COVID-19 and the number of COVID-19 patients in the hospital on a given day per 1,000,000 people. Therefore, more confirmed cases of COVID-19 are associated with an increase in COVID-19 mortality (Sarkodie and Owusu, 2020).
Moreover, more COVID-19 patients admitted to a hospital on a given day are associated with increased total deaths attributed to COVID-19. The increase in total deaths of patients admitted to hospital with COVID-19 infection may be attributed to a rapid emergence of new cases, leading to rapidly increasing demand for patient facilities, and to the multiple negative factors and symptoms (end-stage renal disease, age, diabetes, heart disease, etc.) associated with hospitalized COVID-19 patients (Alwafi et al., 2021; Rossman et al., 2021). In contrast, our stepwise regression model showed a negative association between total deaths attributed to COVID-19 (response variable) and the independent variables hospital beds per thousand, male smokers, and people fully vaccinated.
The people fully vaccinated variable has a significant negative unstandardized coefficient, which means that the total deaths due to COVID-19 are significantly lower in countries where more people per 100 of the total population have received all doses of the COVID-19 vaccine prescribed by the vaccination protocol (Haas et al., 2021). This offers hope that COVID-19 vaccination will ultimately stem the COVID-19 pandemic as vaccination programs expand across the globe. In addition, the negative coefficient for male smokers signifies a negative association between smoking and COVID-19 infection, suggesting a reduced risk of mortality due to COVID-19 in smokers. Several recent studies have shown that smoking reduces the risk of people getting infected with COVID-19 (Ward et al., 2021). However, the current finding that smoking is inversely associated with COVID-19 infection conflicts with the evidence that smokers are more susceptible to respiratory illnesses, including COVID-19 infection (Hopkinson et al., 2021; Westen-Lagerweij et al., 2021). Therefore, further investigations are needed to clarify the role of cigarette smoking in COVID-19 disease and the reported low prevalence of smokers among patients currently diagnosed with COVID-19 infection.
Moreover, the multiple regression analyses indicated a negative association between COVID-19 mortality and hospital beds per thousand people. This negative association suggests that increasing hospital beds per thousand people might serve as a practical way to decrease deaths attributed to COVID-19 for governments that were less effective in controlling disease outbreaks when hospital beds were not sufficient to provide proper healthcare to the many patients with COVID-19 (Liang et al., 2020; Sen-Crowe et al., 2021). Additionally, from the absolute t-values of the predictors, as shown in Table 2, we can presume that countries should focus more on hospital beds per thousand, people fully vaccinated per hundred, and male smokers, in that order, to counter the increase in total mortality attributed to COVID-19.
Hence, we can conclude from the comprehensive analysis of the stepwise multiple regression that the predictors total cases, hospital patients per million, hospital beds per thousand, male smokers, and people fully vaccinated per hundred added statistically significantly to the prediction of the total deaths attributed to COVID-19. Furthermore, the final stepwise regression model was validated by plotting the standardized predicted values of the total deaths due to COVID-19 on the x-axis and the observed total deaths due to COVID-19 on the y-axis, as shown in Fig. 5. A very strong positive and significant correlation (r = 0.977, P < 0.001) was observed between the standardized predicted values and the observed values of the response variable (total deaths attributed to COVID-19). This strong correlation is reflected in the alignment of the sample points along the 45° line between the observed and predicted total deaths due to COVID-19. Moreover, the comparative model validation study revealed that the correlation between the observed and the predicted total COVID-19 mortality in various countries was considerably higher for our Stepwise Linear Regression model (r = 0.977, P < 0.001) than for the regression model of Liang et al. (2020) (r = 0.77; P < 0.001). With an R-squared value of 0.958 and an adjusted R-squared value of 0.956, our Stepwise Linear Regression model also performs considerably better in predicting the total COVID-19 mortality of countries than the regression model proposed by Liang et al. (2020) (R-squared value was 0.58; adjusted R-squared value was 0.54).
However, there are certain limitations to the present study. First, the presence of multivariate outliers limits our analyses to cross-sectional data from 161 countries instead of 181 countries. Second, the countries' data are incomplete (missing values). The missing values in the cross-sectional data were imputed using the MICE module, assuming that the data are missing at random. However, observed data instead of imputed data would have provided a more realistic representation of the factor(s) influencing the total deaths attributed to COVID-19 (response variable). Third, a sudden surge in COVID-19-related deaths and inaccurate reporting of fatalities attributed to COVID-19 by different countries may significantly influence the predictive nature of our model. However, the prognostic factors and their trends for predicting the total deaths attributed to COVID-19 might not change. Finally, the immunity acquired after the global spread of COVID-19 might influence the prediction accuracy of our stepwise regression model.
This study also has its strengths. It analyzes many COVID-19 and government factors that positively or negatively affect the total deaths attributed to COVID-19 in a country. The results of our research may contribute toward effective policymaking at the country level to counter a sudden surge in the number of deaths attributed to COVID-19 and its variants. Based on our current findings, we conclude that countries with better hospital infrastructure and populations that have received complete dosages of the COVID-19 vaccine will have minimal COVID-19 fatalities.

Conclusion
In conclusion, a higher number of fatalities attributed to COVID-19 and its variants is positively associated with the total confirmed cases of COVID-19 and the number of COVID-19 patients in the hospital on a given day per 1,000,000 people, while the total COVID-19 mortality is negatively associated with hospital beds per thousand, male smokers, and people fully vaccinated per 100 people in the total population. Based on the current regression and partial correlation studies, we presume that countries that improve hospital infrastructure by providing more hospital beds and that adopt better governance policies to counter the spread of COVID-19 and its variants will reduce the load of patients in hospitals. Moreover, providing a complete dosage of the COVID-19 vaccine to the entire population will undoubtedly lower COVID-19-related mortality.

Acknowledgment
This work was supported by the Deanship of Scientific Research (DSR), King Abdulaziz University, Jeddah, under Grant G:-186-132-1442. The authors, therefore, gratefully acknowledge DSR's technical and financial support.

Conflict of interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.