Application of grey system theory and ARIMA model to forecast factors of tourism: A case of Binh Thuan Province in Vietnam

Tourism is becoming increasingly popular, and the industry continues to grow strongly around the world. Forecasting tourism demand therefore plays an important role in development planning. The purpose of this study is to identify appropriate models for predicting tourism demand in Binh Thuan Province, Vietnam. Five models are applied, namely GM (1, 1), DGM (1, 1), DGM (2, 1), Verhulst and ARIMA, and they are tested to find which concise and accurate models best predict tourism demand. To ensure precision, the authors collected data on total revenue, domestic visitors, international tourists and the six countries sending the most visitors (Russia, Germany, France, Korea, China and the USA) over ten years (2008 to 2017) from the Binh Thuan Department of Culture, Sports and Tourism. MAPE, MSE, RMSE, and MAD are used to compare the results of the forecasting models. GM (1, 1), DGM (1, 1), Verhulst and ARIMA yield excellent results with minimal forecasting errors. For total revenue, ARIMA is the best choice for prediction. For domestic visitors and international tourists, GM (1, 1), DGM (1, 1) and Verhulst perform better than the other models. In addition, GM (1, 1), DGM (1, 1), Verhulst and ARIMA give good results when forecasting the number of visitors from the top six markets (Russia, Germany, France, Korea, China, and the USA). For all factors, DGM (2, 1) is rejected because of its poor results. Since the tourism industry has recently developed rapidly in Binh Thuan, the government needs to propose suitable policies to develop the local tourism industry.


Introduction
Since the late 1980s, thanks to the state's policy of reform and opening up, tourism in Vietnam in general, and in Binh Thuan in particular, has developed strongly and achieved considerable success. Located in the South Central and Southern tourism area, Binh Thuan province has strong tourism potential. In recent years, the number of tourists traveling to Binh Thuan has increased rapidly, so this "industry without a chimney" contributes more and more to the growth of the local economy.
Binh Thuan steadily maintained innovation and improvement in the province's tourism over a ten-year period (2008-2017), during which its tourism indicators increased. Furthermore, the six countries sending the most visitors are shown in Fig. 4. Russia was consistently the leading source of travelers to Binh Thuan province, but it was equaled in 2016 and overtaken in 2017 by the Chinese market; the markets that follow are Germany, Korea, France, and the USA, respectively. Binh Thuan province needs policies that promote tourism as effectively as possible in order to attract tourists and secure a position on the tourism map of Vietnam in particular and of the world in general. To obtain a good strategic vision, Binh Thuan should forecast future tourism demand accurately. Tourism experts acknowledge that improving the accuracy of tourism forecasting is a necessary research topic (Chandra and Menezes, 2001). Hence, the GM (1, 1), DGM (1, 1), DGM (2, 1) and Verhulst models are examined to find which of them forecast the situation accurately. Song and Li (2008), for instance, stated that tourism demand forecasters gather data from governments or other agencies. Similarly, two Vietnamese researchers, Nguyen and Tran (2019), collected data from the Vietnamese Ministry of Tourism. Such research therefore requires all the necessary figures, such as the numbers of domestic visitors and foreign arrivals in a country or location, as well as tourist expenditure. In this study, the authors collected data from the Binh Thuan Department of Culture, Sports and Tourism.
Researchers apply different methods to forecast tourism demand. Common choices include time-series models (such as GARCH), econometric models (such as ECM and VAR), the SES model, the logistic growth model and neural networks. Combinations of methods are also considered. According to Nguyen and Tran (2019), the appropriate approach depends on the determinants and on whether monthly, quarterly or annual demand is being forecast. Nguyen and Tran (2019) also found that tourism demand forecasting helps a nation to anticipate the number of domestic visitors, international arrivals and total tourism revenue; these are the data that help in proposing appropriate policies. Quantitative methods are the common technique applied to forecasting tourism demand.
In most previous papers, time-series models such as ARIMA and GARCH (Condratov and Stanciu, 2012; Hadavandi et al., 2011; Radha and Thenmozhi, 2006) and econometric models such as the error correction model (ECM) and vector autoregressive (VAR) models (Song and Witt, 2006) have been the popular techniques for tourism demand forecasting. Chang and Liao (2010) used a SARIMA model to forecast monthly outbound Taiwanese tourists traveling to Hong Kong, Japan, and the USA. Furthermore, Lin and Lee (2013) adopted Multivariate Adaptive Regression Splines (MARS), Artificial Neural Networks (ANN) and Support Vector Regression (SVR) to forecast monthly total arrivals in Taiwan. Huang (2012) sought an appropriate model for improving forecasts of the demand for health tourism in Asian nations using GM (1, 1). Nhu Ty Nguyen used Grey System Theory to test concise models able to predict the number of visitors to Vietnam. In another study, ARIMA showed better forecasting performance in predicting international tourism demand from four European nations to Seychelles (Du Preez and Witt, 2003).
Researchers have to apply the most appropriate model to obtain the best forecasts, because forecasting directly affects future policy and decision-making. In this study, the authors put the GM (1, 1), Verhulst, DGM (1, 1), DGM (2, 1) and ARIMA models into practice. The goal is to determine which models are most appropriate for forecasting tourism demand in Binh Thuan province.

Data collection and description
The research analyzes four determinants for the forecasting: the total number of domestic visitors, international arrivals, total revenue and the six countries sending the most tourists to Binh Thuan (Russia, China, Germany, Korea, France, and the USA).
We collected data for 2008 to 2017 from the Binh Thuan Department of Culture, Sports and Tourism and the Statistics Office of Binh Thuan.
The data comprise the Total Revenue Index, Domestic Arrivals, International Tourists and the Top Six Countries sending Visitors, etc. (Figs. 1, 2, 3 and 4).
In terms of the number of arrivals, we also obtained four variable datasets. They consist of reference sources for the decision, purposes of visiting, forms of trip and means of transportation. In the context of Binh Thuan, the group of reference sources for the decision (Fig. 5) answers the question of why visitors decide to come to Binh Thuan province; for example, the province is recommended to them by others who have already been there. The purposes of visiting (Fig. 6) relate to free time, economic and social conditions, etc. Moreover, visitors also consider the form of the trip (Fig. 7), which can save them money on their tours. Finally, the means of transportation variable indicates that tourists choose the transportation that is most convenient for them (Fig. 8).

[Figure legend: Leisure, Visit Relatives, Business, Others]
[...] of the total revenue index, the number of domestic arrivals and the number of international visitors are 227.74, 3.007 × 10^6 and 366,380, respectively. For the top six countries (Russia, Germany, France, Korea, China, and the USA), the corresponding values are 104,629.5, 31,443.5, 15,377.5, 25,375.1, 50,352.8 and 15,166.3, respectively. It can be seen that Russia is the biggest market sending tourists to Binh Thuan.

Data analysis and result
The exactness of the information and data sets significantly influences the accuracy of the forecasting process. In this paper, the data were collected from the Binh Thuan Department of Culture, Sports and Tourism and the Statistics Office of Binh Thuan over a period of ten years (2008-2017), and these data sets were never revised. It is easy to see that tourism demand in Binh Thuan had an upward trend during the surveyed years.
In this section, we use the data gathered from 2008 to 2017 to apply GM (1, 1), DGM (1, 1), DGM (2, 1), Verhulst and ARIMA and to test how accurately each forecasts tourism demand in Binh Thuan.
Domestic tourism showed an upward trend year by year (Fig. 10). For all models, the numbers went up from 1,805,535 in 2008 to more than 4,500,000 in 2017.
Similarly, Fig. 11 shows that the numbers of international visitors for all models rose steadily during the examined years, from 195,156 to more than 590,000. [...] 657 to 127,474). The others kept increasing over the ten years. Fig. 13 shows that the DGM (2,1) model gives negative numbers, which are errors; the actual number for the German market fluctuated over the entire period shown, while the other models climbed slowly during the surveyed period.
For the French market, described in Fig. 14, only DGM (2,1) had an upward tendency, with the number increasing from 17,323 to 2,554,016. All the other lines fluctuated from year to year over the period. Fig. 15 shows an upward trend in DGM (2,1), with the number of Korean visitors surging from 15,349 to 76,121 (2008-2017). The Verhulst result for 2017 is an error. The others, namely the actual series, GM (1,1), DGM (1,1) and ARIMA, oscillated over the ten years.
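As a point of reference for the grey-model curves discussed above, the following is a minimal sketch of the standard textbook GM (1, 1) procedure: accumulate the series, estimate the development coefficient a and the grey input b by least squares on x0(k) + a·z(k) = b, and difference the fitted accumulated series. It is not the authors' implementation. The function names `gm11_fit` and `gm11_predict` are hypothetical, and the input series is synthetic (10% growth per step), not Binh Thuan data.

```python
import math


def gm11_fit(x0):
    """Estimate the GM(1,1) parameters (a, b) from a positive series x0."""
    n = len(x0)
    if n < 4:
        raise ValueError("GM(1,1) is usually fitted on at least four points")
    # 1-AGO: accumulated generating operation (cumulative sums).
    x1, total = [], 0.0
    for value in x0:
        total += float(value)
        x1.append(total)
    # Background values z(k) = 0.5 * (x1(k) + x1(k-1)) for k = 2..n.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = [float(v) for v in x0[1:]]
    # Ordinary least squares for y = -a * z + b (closed form, two unknowns).
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(zi * zi for zi in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det
    return a, b


def gm11_predict(x0, steps_ahead=0):
    """Return fitted values for x0 plus `steps_ahead` out-of-sample forecasts."""
    a, b = gm11_fit(x0)  # note: a == 0 (a perfectly flat series) is not handled
    n = len(x0) + steps_ahead
    c = float(x0[0]) - b / a
    # Time response of the accumulated series: x1_hat(k) = c * exp(-a k) + b / a.
    x1_hat = [c * math.exp(-a * k) + b / a for k in range(n)]
    # Inverse AGO: difference the accumulated series to recover the original scale.
    return [float(x0[0])] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, n)]


if __name__ == "__main__":
    # Synthetic illustration only: a series growing by 10% per period.
    series = [100 * 1.1 ** k for k in range(8)]
    print(gm11_fit(series))
    print(gm11_predict(series, steps_ahead=2)[-2:])
```

A negative a corresponds to a growing fitted series. The closed-form least squares keeps the sketch free of third-party dependencies; a matrix formulation would be equivalent.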

Analyzing the ability of forecasting models by MAPE, MSE, RMSE and MAD methods
A variety of measures can be used to evaluate the accuracy of forecasting models. First, the Mean Absolute Percentage Error (MAPE) is applied as a figure of merit to judge whether a forecasting method performs well. The lower the MAPE, the better the method performs:
$$\mathrm{MAPE} = \frac{1}{n}\sum_{k=1}^{n}\left|\frac{x(k)-\hat{x}(k)}{x(k)}\right| \times 100\%$$
where $x(k)$ is the actual value, $\hat{x}(k)$ is the forecast value and $n$ is the number of forecasting steps.
The evaluation of a model then follows from its MAPE, using the grades applied in the tables below (excellent, good, reasonable and poor).
Next, the Mean Squared Error (MSE) measures how close a regression line is to a set of points. The distances from the points to the regression line are the errors, which are squared and then averaged:
$$\mathrm{MSE} = \frac{1}{n}\sum_{k=1}^{n}\left(x(k)-\hat{x}(k)\right)^{2}$$
The Root Mean Squared Error (RMSE) is estimated by taking the square root of the MSE:
$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}}$$
Last, the Mean Absolute Deviation (MAD) is the average absolute distance between the actual and the forecast data:
$$\mathrm{MAD} = \frac{1}{n}\sum_{k=1}^{n}\left|x(k)-\hat{x}(k)\right|$$
Here $x(k)$ is the actual value, $\hat{x}(k)$ is the forecast value and $n$ is the number of forecasting steps. The forecasting model is more accurate when the MAD is lower.
Table 7 indicates the efficiency of the five models GM (1, 1), DGM (1, 1), DGM (2, 1), Verhulst and ARIMA in forecasting tourism revenue. GM (1, 1), DGM (1, 1) and ARIMA forecast total revenue well, with MAPE values lower than 10% and low MSE, RMSE, and MAD. Verhulst is only at the reasonable level. According to the results, the evaluation of DGM (2, 1) is poor, so it is not chosen.
Table 8 applies the same method. Because the values of MAPE, MSE, RMSE, and MAD are lower than 10%, GM (1, 1), DGM (1, 1), Verhulst and ARIMA perform well in the forecasting; therefore, they are efficient models for this process. DGM (2, 1) gives a poor result, so it is not chosen to forecast this factor.
Table 9 follows the same method. GM (1, 1), DGM (1, 1), Verhulst and ARIMA are again the most appropriate models, since the values of MAPE, MSE, RMSE, and MAD are lower than 10%. DGM (2, 1) is likewise rejected for forecasting international visitors.
Table 10 also applies the same method. In contrast to Table 9, Verhulst has an excellent evaluation with low MAPE, MSE, RMSE, and MAD (lower than 10%), and it is chosen for forecasting. GM (1, 1), DGM (1, 1), and ARIMA are also useful here, with low MAPE, MSE, RMSE, and MAD. DGM (2, 1) is not accepted for forecasting.
Table 11 compares the same five models. Four models are good in this case, viz. GM (1, 1), DGM (1, 1), Verhulst and ARIMA; all of them are accepted for forecasting German visitors because their MAPE, MSE, RMSE, and MAD are low. Only DGM (2, 1) is rejected, with poor results.
Table 12 follows the same method. GM (1, 1), DGM (1, 1), Verhulst and ARIMA have low MAPE, MSE, RMSE and MAD (lower than 10%), so they are accepted because they give the most accurate results. With its poor result, DGM (2, 1) is not accepted for the prediction.
Similarly, Table 14 shows that only GM (1, 1) gives a good result, with acceptable MAPE, MSE, RMSE, and MAD. DGM (1, 1) is at the reasonable level. The other three models are evaluated as poor, so they are rejected in this case.
Finally, Table 15 gives information on the ability to forecast USA visitors. GM (1, 1) and DGM (1, 1) are chosen for their excellent and accurate results, with low MAPE, MSE, RMSE, and MAD (lower than 10%). Verhulst and ARIMA give good results, so they are also accepted. DGM (2, 1), however, is rejected because of its poor forecasting result.
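The four accuracy measures used throughout Tables 7-15 can be sketched in a few lines. This is an illustrative helper using the standard definitions, not the authors' code. The function name `forecast_errors` and the three-point example series are hypothetical and are not taken from the Binh Thuan data.

```python
import math


def forecast_errors(actual, forecast):
    """Return MAPE (in percent), MSE, RMSE and MAD for two equal-length series."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    # MAPE: mean of |error / actual|, expressed as a percentage
    # (undefined when an actual value is zero).
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    # MSE: mean of the squared errors.
    mse = sum(e * e for e in errors) / n
    # RMSE: square root of the MSE, in the same unit as the data.
    rmse = math.sqrt(mse)
    # MAD: mean of the absolute errors.
    mad = sum(abs(e) for e in errors) / n
    return {"MAPE": mape, "MSE": mse, "RMSE": rmse, "MAD": mad}


if __name__ == "__main__":
    # Hypothetical numbers chosen only to make the arithmetic easy to check.
    print(forecast_errors([100, 200, 400], [110, 190, 400]))
```

Only MAPE is a percentage. MSE, RMSE and MAD carry the units of the series (squared units for MSE), so they are comparable across models for the same series but not across series of different scale.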

Conclusion and discussion
Tourism is defined as an important integrated economic sector with deep cultural content and an interdisciplinary, socialized character. Developing tourism means responding to the needs of domestic citizens and international tourists for sightseeing, recreation, and relaxation, which contributes to raising the intellectual standards of the people, to job creation and to socio-economic development. Moreover, this topic supports the study of current tourism trends and the proposal of the best long-term solutions for the local tourism industry. Tourism is among the most strongly developing industries in the world, and it also plays a significant role in economic growth (Akama and Kieti, 2007; Cortez, 2008). Vietnam is among the leading Asian nations with a developed tourism market, so Binh Thuan, one of the provinces of Vietnam, considers tourism a key economic sector of the province. Recently, Binh Thuan has attracted a large number of both domestic visitors and international tourists, and these numbers are predicted to keep rising considerably.
Therefore, this study focuses on finding the methods that most accurately and simply forecast tourism demand. In this research, we applied five models, namely GM (1, 1), DGM (1, 1), DGM (2, 1), Verhulst and ARIMA, to test and identify the models that yield the best results and the smallest forecasting errors. As can be seen from the above tables (Tables 7-15), GM (1, 1), DGM (1, 1), Verhulst and ARIMA are better at predicting all the factors, viz. the tourism revenue and the numbers of tourists (both domestic visitors and international arrivals), because their MAPE, MSE, RMSE, and MAD values are acceptable. DGM (2, 1), by contrast, is a poor model for forecasting tourism demand in Binh Thuan Province (cf. Chia-Nan and Ty, 2013; Nguyen et al., 2015; Nguyen and Tran, 2018).
The results have clear practical implications. ARIMA is the best choice for predicting total revenue. For domestic visitors and international tourists, GM (1, 1), DGM (1, 1) and Verhulst perform better than the other models. In addition, GM (1, 1), DGM (1, 1), Verhulst and ARIMA give good results when forecasting the number of visitors from the top six markets (Russia, Germany, France, Korea, China and the USA), and these numbers are forecast to go up in the next 5 years. Over the forecasting period, the number of Chinese tourists has the strongest upward trend, the numbers of Russian and Korean arrivals also increase, and the numbers for the other markets fluctuate from year to year. For all the factors, DGM (2, 1) is rejected because of its poor results. In general, GM (1, 1), DGM (1, 1), Verhulst and ARIMA are concise and accurate models for forecasting tourism demand in Binh Thuan.
In conclusion, there is no doubt that the tourism industry has developed rapidly in Binh Thuan in recent years. Hence, the government needs to propose suitable policies to develop the local tourism industry so that it can serve a large number of tourists, attract investors and invest in the construction of promising projects.