BOOTSTRAP BASED MULTI-STEP AHEAD JOINT FORECAST DENSITIES FOR FINANCIAL INTERVAL-VALUED TIME SERIES

This study presents two interval-valued time series approaches for constructing multivariate multi-step ahead joint forecast regions based on two bootstrap algorithms. The first approach fits a dynamic bivariate system via a VAR process to the minimum and maximum of the intervals, while the second applies to the mid-points and half-ranges of interval-valued time series. As a novel perspective, we adopt two bootstrap techniques into the proposed interval-valued time series approaches to obtain joint forecast regions of the lower/upper bounds of the intervals. The forecasting performances of the proposed approaches are evaluated by extensive Monte Carlo simulations and two real-world examples: (i) monthly S&P 500 stock indices; (ii) monthly USD/SEK exchange rates. Our results demonstrate that the proposed approaches are capable of producing valid multivariate forecast regions for interval-valued time series.


Introduction
In classical time series analysis, most quantitative forecasting techniques have concentrated on single-valued time series in which each variable takes one value at each point in time. Although single-valued data are useful for modelling and forecasting in some practical cases, data expressed in interval format contain more information than classic single-valued data in many situations where precision is required. One important source of interval-valued data is the aggregation of large databases into granules such as groups, clusters and classes when the data are observed with lower and upper bounds. Since gathering data has become much easier with the advancement of information technology and systems, large and complex data sets have been created in many application areas. As a consequence, interval-valued data representing uncertainty and variability arise in the aggregation of large databases to make the data easily manageable by providing an effective summary for practitioners.
Interval-valued data have been examined within the symbolic data analysis framework. References providing a comprehensive account of the technical details on this field include [12], [15], [16], [24], and [49]. Examples of symbolic data cover intervals, lists, histograms, modal-valued data and trees. As pointed out by [46], this area is related to multivariate analysis, pattern recognition and artificial intelligence, and deals with the extension of traditional exploratory data analysis and statistical methods to symbolic data. The authors also note that symbolic data allow multiple values for each variable to tackle variability in the data by introducing new types of variables (multi-valued, set-valued, interval-valued, histogram-valued etc.). In practice, one of the most commonly encountered types of symbolic data is interval-valued data, which has received remarkable attention in the context of regression, multivariate analysis and time series. However, analysis of these types of data using classic statistical methods may yield unsatisfactory results. Thus, various approaches have been proposed for modelling and analyzing certain attributes (such as lower and upper bounds, centers, ranges etc.) of interval-valued data; see, for instance, [8], [14] and [52]. Since the paper of [12], various regression models have been developed for fitting interval-valued data. [12] proposed fitting a linear regression model to the centers of the intervals of the dependent and explanatory variables under the assumption of uniformly distributed intervals. Later, two separate regression models for the lower and upper bounds of the intervals were proposed by [13] and [17]. Also, [42] considered a new method by fitting two separate regression models on the mid-points and ranges of interval-valued data. Furthermore, [43] developed a modified version of the linear regression model for the range of the intervals with non-negative constraints on the regression coefficients.
When interval-valued time series, a particular type of interval-valued data gathered in an ordered sequence over time (daily, weekly, monthly etc.), are considered, appropriate time series approaches are also needed for modelling and forecasting, especially in economic and financial applications. In this context, the first approach was introduced by [55], who used univariate ARIMA processes for the bounds of intervals. Also, [56] proposed space-time autoregressive models considering the dependence between the lower and upper bounds of intervals, and obtained a model for the center and radius of intervals using the appropriate process of the interval bounds. [46] proposed a hybrid methodology combining ARIMA and artificial neural network (ANN) models to forecast time series of interval-valued data and achieved better results by improving forecasting accuracy. Additionally, [20] built an empirical model for the daily highs and lows of three US stock indexes based on vector error correction models (VECM), an extension of vector autoregressive (VAR) models. Recently, a few different approaches within the interval time series forecasting framework have been developed by implementing VAR and VECM; see, for instance, [1], [2] and [29]. The main idea behind applying VAR or VECM is to construct a system consisting of two variables by using the extreme nature of the interval bounds or the center and radius of interval-valued time series, so that interval forecasts can be generated with traditional multivariate time series methods.
In financial applications, classical time series models for single-valued data have frequently been used to describe the temporal evolution of price movements of financial assets and the fluctuations of stock indices, equities and exchange rates. However, using single-valued time series disregards the intra-day variability, or variability even at lower frequencies. On the other hand, if the highest and lowest price series of a financial sequence are considered, then multivariate time series techniques such as VAR or VAR with moving average components (VARMA) can be used for forecasting and data description. As emphasized by [53], vector autoregressions are among the most reliable and powerful tools for describing data and forecasting future values. For some recent examples of VAR forecasting in the context of financial time series, see [5], [6], [22], [33], [35], [37], [40], [50] and [54]. Most of the early studies in the literature regarding forecasts from VAR models consider marginal point forecasts; see [26], [44] and [47]. On the other hand, recent studies on VAR forecasting have focused on forecast densities, which capture the uncertainty around a point forecast; see, for example, [3], [7], [21], [34] and [48]. As stated by [38], the use of point forecasts is seriously flawed and meaningless when the extent of the associated uncertainty is unknown. However, interval forecasts allow for more informative inferences considering the future uncertainty associated with each forecast; see [19]. Within the multivariate time series framework, one important aim is to construct joint forecast regions from VAR models. Such forecast regions are obtained in the shape of ellipsoids and Bonferroni cubes by using marginal interval forecasts; see [45].
However, conventional techniques for obtaining multivariate forecast densities for each horizon impose strong assumptions, such as Gaussianity of forecast errors and known lag order and model parameters, which may not hold, especially in financial time series. Also, forecast densities constructed from VAR processes can be affected by any departure from these assumptions, leading to unreliable results. One alternative computational approach for generating such forecast densities, free of distributional assumptions and of the uncertainties caused by the model parameters and lag order, is to use well-known resampling techniques such as the bootstrap. In this context, the first bootstrap technique, based on the backward representation of the VAR model, was introduced by [38]. Later, [39] proposed bias-corrected forecast regions by combining the bootstrap-after-bootstrap approach with the bias-corrected least-squares estimators of VAR parameters. On the other hand, using the backward form of the model increases computational complexity and limits the implementation of the bootstrap procedure when generating bootstrap resamples. To overcome these restrictions, a bootstrap method (abbreviated as "FRP") using the forward representation of the VAR model has been suggested by [28].
In this paper, we suggest an extension of the ordered non-overlapping block bootstrap (ONBB) proposed by [9] to construct multivariate forecast regions for financial interval-valued time series. This paper also presents a well-designed comparison between the FRP and ONBB procedures that takes the dynamic structure of financial time series into account. The main difference between the two bootstrap algorithms is that FRP uses forward residuals to generate bootstrap resamples, whereas the ONBB based algorithm combines ordered non-overlapping blocks with the residual based bootstrap. Additionally, two interval-valued time series approaches are used to compare both methods in terms of their forecasting performances. The first approach (Min-Max) is based on the minimum and maximum values of financial interval-valued time series, while the second (Center-Range) uses the mid-points and radii of the intervals. Both approaches can be used to model interval time series by specifying a dynamic bivariate system via VAR processes, so that joint forecast regions for the lower and upper bounds of the intervals can be obtained using bootstrap techniques. The novelty of our work is that we adopt two bootstrap techniques into the interval-valued time series setting to obtain joint forecast densities of the lower/upper bounds of the intervals.
The rest of the paper is organized as follows. In Section 2, we present detailed information on the interval time series approaches and describe the bootstrap procedures considered in this study to obtain multivariate forecast regions of interval-valued time series. Extensive Monte Carlo simulations under different data generating processes are conducted, and the results are presented in Section 3. Two real-world examples, monthly S&P 500 stock indices and USD/SEK exchange rates, are studied to evaluate the finite sample performances of the proposed methods, and the results are reported in Section 4. Finally, some concluding remarks are given in Section 5.

Methodology
An interval-valued time series $\{Y_t\}_{t=1}^{T}$ is a realization of interval-valued variables collected at consecutive time points and described as a two-dimensional vector of time series, $Y_t = (L_t, U_t)'$, where $L_t$ and $U_t$, respectively, denote the lower and upper boundaries of the interval process. Thus, for $t = 1, \ldots, T$, the interval-valued time series is expressed as $Y_t = [L_t, U_t]$, where $T$ denotes the sample size. Then, the interval stochastic process $\{Y_t\}$ is represented in the form of a bivariate VAR model of order $p$ (VAR($p$)) as follows:

$$Y_t = \nu_Y + \Phi_1 Y_{t-1} + \cdots + \Phi_p Y_{t-p} + \varepsilon_{Y,t},$$

where $\Phi_1, \ldots, \Phi_p$ denote $2 \times 2$ matrices of parameters satisfying the stationarity restrictions, $\nu_Y = (\nu^L, \nu^U)'$ is a vector of intercept terms, and $\varepsilon_{Y,t} = (\varepsilon^L_t, \varepsilon^U_t)'$ is a white noise sequence of $2 \times 1$ independent vectors with non-singular covariance matrix $\Sigma^Y_\varepsilon$. The non-singularity assumption of the covariance matrix has been discussed by [18], [51], [30] and [57]. [30] investigated the pitfalls in cointegration tests in systems with a singular error covariance matrix. Also, [57] examined the characterization of Granger non-causality tests in VAR models when the assumption of non-singularity of the covariance matrix of the VAR innovations is violated. As noted by [57], although the non-singularity assumption is considered a mild requirement, dynamic systems with a singular covariance matrix are frequently encountered in economic settings when the number of endogenous variables is greater than the number of unobserved shocks. However, as pointed out by [18] and [51], ensuring a non-singular covariance matrix in time series analysis requires only a modest condition on the autocovariance function of the process $\{X_t\}$. In this study, we assume a non-singular covariance matrix of the innovations for stationary VAR models, since an easily verifiable and realistic condition is needed for ensuring the non-singular covariance matrix for time series data, as noted in [18] and [51].
Similarly, $Z_t = (C_t, R_t)'$ represents the two-dimensional vector of time series consisting of the center $C_t = (L_t + U_t)/2$ and half-range $R_t = (U_t - L_t)/2$ of the intervals, respectively. Throughout this paper, we assume that the bounds of the interval process are weakly stationary, which also implies the weak stationarity of the center (mid-point) and half-range (radius) of the interval time series as linear combinations of stationary components. Similar to Equation 2, $\{Z_t\}$ follows the VAR($p$) model

$$Z_t = \nu_Z + \Psi_1 Z_{t-1} + \cdots + \Psi_p Z_{t-p} + \varepsilon_{Z,t},$$

with $2 \times 2$ coefficient matrices $\Psi_1, \ldots, \Psi_p$. The method of Ordinary Least Squares (OLS) is applied to estimate the unknown parameters under the stationarity conditions given by [36]. Then, the $h$-step ahead forecasts of the interval bounds, where $h$ is the forecast horizon, can be approximated by using the components $\hat{C}_T(h)$ and $\hat{R}_T(h)$ as follows: $\hat{L}_T(h) = \hat{C}_T(h) - \hat{R}_T(h)$ and $\hat{U}_T(h) = \hat{C}_T(h) + \hat{R}_T(h)$.
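To make the estimation step above concrete, the following sketch (a minimal numpy implementation, not the authors' code) transforms interval bounds into the center/half-range series and fits a bivariate VAR($p$) by OLS; the function names and design-matrix layout are our own choices.

```python
import numpy as np

def interval_to_center_range(L, U):
    """Transform interval bounds into the center and half-range series."""
    return (L + U) / 2.0, (U - L) / 2.0

def fit_var_ols(Y, p):
    """OLS estimation of a k-variate VAR(p): Y_t = nu + Phi_1 Y_{t-1} + ... + eps_t.
    Y is a (T, k) array; returns the intercept vector nu, the list of (k, k)
    coefficient matrices Phi_1, ..., Phi_p, and the (T - p, k) residual matrix."""
    T, k = Y.shape
    # Design matrix: [1, Y_{t-1}, ..., Y_{t-p}] for t = p, ..., T-1
    X = np.column_stack([np.ones(T - p)] +
                        [Y[p - j - 1:T - j - 1] for j in range(p)])
    B, *_ = np.linalg.lstsq(X, Y[p:], rcond=None)   # shape (1 + k*p, k)
    nu = B[0]
    Phi = [B[1 + j * k:1 + (j + 1) * k].T for j in range(p)]
    resid = Y[p:] - X @ B
    return nu, Phi, resid
```

With the fitted parameters, $h$-step point forecasts of the bounds follow by iterating the recursion and converting center/half-range forecasts back via $\hat{L} = \hat{C} - \hat{R}$, $\hat{U} = \hat{C} + \hat{R}$.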
Multivariate time series forecasting methods that construct joint forecast densities for a given horizon create regions of elliptical form and/or Bonferroni cubes ([45]). On the other hand, obtaining these ellipsoids becomes intractable when the number of variables in the system and/or the number of prediction steps is greater than two. Additionally, prediction ellipsoids constructed from prediction densities have an unknown form for non-Gaussian errors, leading to unreliable results. Therefore, the Bonferroni forecast cube was proposed by [45] to provide more appropriate forecast regions when dealing with asymmetric error distributions. Thus, we obtain only Bonferroni forecast regions to evaluate the finite sample properties of our proposed procedures. Construction of such forecast regions may be affected by departures from the assumptions and lead to incorrect results; see [25] and [28]. Alternatively, the bootstrap method can be used to obtain multivariate forecast regions since it does not require full knowledge of the underlying data and distributional assumptions. In this study, the bootstrap methods proposed by [28] and [9] are used to construct joint forecast regions for interval-valued time series. The complete algorithm of the ONBB based forecast regions for the VAR($p$) models defined in Equations 2 and 3 is described as follows:

Step 1. For a realization of the VAR($p$) process, determine the optimal lag order $p$ by using an information criterion such as the Akaike information criterion (AIC).

Step 2. Obtain the corresponding vectors of residuals $\hat{\varepsilon}_{Y,t}$ and $\hat{\varepsilon}_{Z,t}$ for $t = p+1, \ldots, T$. Let $\hat{F}_{\varepsilon_Y}$ and $\hat{F}_{\varepsilon_Z}$ be the empirical distribution functions of the centered and re-scaled residuals.
Steps 3-4. Generate the bootstrap series from ordered non-overlapping blocks combined with the residual based bootstrap, where $\varepsilon^*_{Y,t}$ and $\varepsilon^*_{Z,t}$ are random samples from $\hat{F}_{\varepsilon_Y}$ and $\hat{F}_{\varepsilon_Z}$, respectively, and re-estimate the VAR parameters from the bootstrap series.
Step 5. Compute the $h = 1, 2, \ldots$ step ahead bootstrap replicates of future observations, $\hat{Y}^*_T(h)$ and $\hat{Z}^*_T(h)$, with the following recursions:

$$\hat{Y}^*_T(h) = \hat{\nu}^*_Y + \hat{\Phi}^*_1 \hat{Y}^*_T(h-1) + \cdots + \hat{\Phi}^*_p \hat{Y}^*_T(h-p) + \varepsilon^*_{Y,T+h},$$
$$\hat{Z}^*_T(h) = \hat{\nu}^*_Z + \hat{\Psi}^*_1 \hat{Z}^*_T(h-1) + \cdots + \hat{\Psi}^*_p \hat{Z}^*_T(h-p) + \varepsilon^*_{Z,T+h},$$

where $\hat{Y}^*_T(j) = Y_{T+j}$ and $\hat{Z}^*_T(j) = Z_{T+j}$ for $j \le 0$. Note that $\varepsilon^*_{Y,T+h}$ and $\varepsilon^*_{Z,T+h}$ are randomly drawn residuals from $\hat{F}_{\varepsilon_Y}$ and $\hat{F}_{\varepsilon_Z}$, respectively. As stated before, the main difference between the ONBB and FRP procedures is that FRP utilizes the residuals obtained from the forward representation of the model, whereas the ONBB based algorithm uses the ONBB and residual based bootstrap methods to obtain bootstrap replicates. Hence, for the FRP method, we present only Steps 3-4 of the algorithm described above.
Step 3. Generate the bootstrap series $\{Y^*_1, \ldots, Y^*_T\}$ and $\{Z^*_1, \ldots, Z^*_T\}$ for $t = 1, \ldots, T$ as follows:

$$Y^*_t = \hat{\nu}_Y + \hat{\Phi}_1 Y^*_{t-1} + \cdots + \hat{\Phi}_p Y^*_{t-p} + \varepsilon^*_{Y,t},$$
$$Z^*_t = \hat{\nu}_Z + \hat{\Psi}_1 Z^*_{t-1} + \cdots + \hat{\Psi}_p Z^*_{t-p} + \varepsilon^*_{Z,t},$$

where $Y^*_t = Y_t$ and $Z^*_t = Z_t$ for $t = -p+1, \ldots, 0$. Also, $\varepsilon^*_{Y,t}$ and $\varepsilon^*_{Z,t}$ are randomly drawn residuals from $\hat{F}_{\varepsilon_Y}$ and $\hat{F}_{\varepsilon_Z}$, respectively. Then, the bootstrap Bonferroni cube (BBC) with at least $100(1-\alpha)\%$ nominal coverage is obtained as

$$BBC = \prod_{j=1}^{2} \left[ q^*_j(\alpha/4),\; q^*_j(1-\alpha/4) \right],$$

where the $q^*_j(\cdot)$'s are the corresponding percentiles of the generated bootstrap distributions of the $j$th elements of $\hat{Y}^*_T(h)$ and $\hat{Z}^*_T(h)$.
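The forward-recursion bootstrap in Steps 3-5 can be sketched for a bivariate VAR(1) as follows. This is a simplified illustration, not the FRP or ONBB implementation of [28] or [9]: it uses plain i.i.d. residual resampling, fixes the lag order at one, and splits the significance level equally across the two marginal percentile intervals (the Bonferroni device).

```python
import numpy as np

def bootstrap_bonferroni_cube(Y, h=1, B=500, alpha=0.05, seed=0):
    """Residual-bootstrap Bonferroni forecast region for a bivariate VAR(1).
    Returns a (2, 2) array: row j holds the lower/upper limits for variable j,
    taken at the alpha/4 and 1 - alpha/4 bootstrap percentiles."""
    rng = np.random.default_rng(seed)
    T, k = Y.shape
    # OLS fit of the VAR(1) on the original series
    X = np.column_stack([np.ones(T - 1), Y[:-1]])
    Bhat, *_ = np.linalg.lstsq(X, Y[1:], rcond=None)
    nu, Phi = Bhat[0], Bhat[1:].T
    resid = Y[1:] - X @ Bhat
    resid = resid - resid.mean(axis=0)          # centre the residuals
    fc = np.empty((B, k))
    for b in range(B):
        # Step 3: generate a bootstrap series via the forward recursion
        Yb = np.empty_like(Y)
        Yb[0] = Y[0]
        eb = resid[rng.integers(0, T - 1, size=T)]
        for t in range(1, T):
            Yb[t] = nu + Phi @ Yb[t - 1] + eb[t]
        # Step 4: re-estimate the parameters on the bootstrap series
        Xb = np.column_stack([np.ones(T - 1), Yb[:-1]])
        Bb, *_ = np.linalg.lstsq(Xb, Yb[1:], rcond=None)
        nub, Phib = Bb[0], Bb[1:].T
        # Step 5: h-step ahead bootstrap future value, conditional on Y_T
        y = Y[-1].copy()
        for _ in range(h):
            y = nub + Phib @ y + resid[rng.integers(0, T - 1)]
        fc[b] = y
    lo, hi = 100 * alpha / 4, 100 * (1 - alpha / 4)
    return np.stack([np.percentile(fc[:, j], [lo, hi]) for j in range(k)])
```

Applied to the Min-Max system, the two rows of the returned array give the marginal percentile intervals whose Cartesian product forms the Bonferroni forecast region for one horizon.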

Numerical Results
In order to assess the finite sample performances of the proposed approaches in obtaining joint forecast regions, we conduct a simulation study under six different experimental designs, as presented in Table 1. The bivariate data generating processes, consisting of center and half-range processes, are designed considering the characteristics of financial time series. For this purpose, the center processes are generated from a known structure, an autoregressive model of order 2 (AR(2)), with different choices of parameter values, and the half-range interval time series are assumed to be uniformly distributed. For each of these configurations, we consider three error distributions, N(0, 1), t(5) and χ²(4), and two sample sizes, T = 100, 300. The sample size T should be sufficiently large for estimating the many parameters of unrestricted VAR models, since the number of parameters to be estimated ($k^2 p + k$ including intercepts) increases seriously with the number of equations, k (k = 2 for bivariate VAR models), and the number of lags (the order of the VAR model), p ([4]). In addition, in blocking techniques the dependence structure of the original data is preserved within the consecutive observations in each block, and the bootstrap samples do not reflect this dependence structure if the block length ℓ is too small ([41]). Moreover, the block length ℓ is determined based on the sample size T to obtain consistent estimates of the model parameters, and it should increase as the sample size increases. Thus, the sample size cannot be too small for the ONBB method, and we therefore use the aforementioned sample sizes, as in [28] and [11]. For the ONBB method, three block lengths are chosen as ℓ = T^{1/3}, T^{1/4}, T^{1/5}, as proposed by [31], since the performances of block resampling techniques are sensitive to block length selection. For each configuration, MC = 2000 Monte Carlo simulations with B = 2000 bootstrap replications are performed.
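The data generating processes described above can be reproduced with the following sketch, which draws an AR(2) center process and i.i.d. uniform half-ranges and returns the implied interval bounds. Standard normal innovations are used here for simplicity (the study also considers t(5) and χ²(4)), and the burn-in length is our own choice.

```python
import numpy as np

def simulate_configuration(T, phi1, phi2, r_low, r_high, seed=0):
    """Generate an interval-valued series as in the simulation designs:
    AR(2) centre process C_t = phi1*C_{t-1} + phi2*C_{t-2} + e_t and
    i.i.d. Uniform(r_low, r_high) half-ranges R_t.
    Returns the lower and upper bound series (L_t, U_t)."""
    rng = np.random.default_rng(seed)
    burn = 200                                  # burn-in to remove start-up effects
    e = rng.standard_normal(T + burn)
    C = np.zeros(T + burn)
    for t in range(2, T + burn):
        C[t] = phi1 * C[t - 1] + phi2 * C[t - 2] + e[t]
    C = C[burn:]
    R = rng.uniform(r_low, r_high, size=T)      # half-range process
    return C - R, C + R                         # lower and upper bounds

# Configuration-I: C_t = 0.3 C_{t-1} + 0.1 C_{t-2} + e_t, R_t ~ U(0.3, 0.5)
L, U = simulate_configuration(300, 0.3, 0.1, 0.3, 0.5)
```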
For many purposes, such as estimating standard errors and constructing confidence intervals, B = 1000 bootstrap samples are enough, as noted in [27] and [32]. Also, for determining the appropriate number of bootstrap samples, [23] proposed a pretest procedure to minimize experimental randomness when performing bootstrap tests. The authors emphasize that using a finite number of bootstrap replicates may lead to some loss of power and that the power of the test increases as the number of bootstrap repetitions increases. In this study, we use 2000 bootstrap samples, as in [28], since we extend the FRP method to interval-valued data.
The significance level is set to 0.05 to construct 95% Bonferroni forecast cubes for forecast horizons h = 1, 2, …, 10. The performance of the ONBB based procedure is compared with that of the FRP method by means of the coverage probability and the volume of the Bonferroni forecast cube.
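The coverage probability used in this comparison can be estimated empirically: a Monte Carlo replication counts as covered when the realized future vector falls inside the forecast cube in every coordinate. A minimal sketch (our own helper, with an assumed array layout):

```python
import numpy as np

def empirical_coverage(cubes, future_values):
    """Estimate the coverage probability of bivariate forecast cubes.
    cubes: (M, 2, 2) array of [lower, upper] limits per replication and variable;
    future_values: (M, 2) array of the realized future vectors.
    Returns the fraction of replications whose cube contains the realization."""
    inside = ((future_values >= cubes[:, :, 0]) &
              (future_values <= cubes[:, :, 1])).all(axis=1)
    return inside.mean()
```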
To compare the performances of the forecast cubes, bivariate VAR(p) processes are generated based on the parameter settings given in Table 1, for h = 1, …, 10 and m = 1, …, B, as follows.

Table 1. Experimental designs.

Configuration       Center process (C_t)                           Half-range process (R_t)
I                   C_t = 0.3 C_{t-1} + 0.1 C_{t-2} + ε_t          U(0.3, 0.5)
II                  C_t = 0.4 C_{t-1} + 0.2 C_{t-2} + ε_t          U(0.3, 0.5)
III                 C_t = 0.8 C_{t-1} + 0.15 C_{t-2} + ε_t         U(0.3, 0.5)
IV                  C_t = 0.3 C_{t-1} + 0.1 C_{t-2} + ε_t          U(0.5, 1)
V                   C_t = 0.4 C_{t-1} + 0.2 C_{t-2} + ε_t          U(0.5, 1)
VI                  C_t = 0.8 C_{t-1} + 0.15 C_{t-2} + ε_t         U(0.5, 1)

Further, the volumes of the bootstrap Bonferroni cubes, V(BBC), for $\hat{Y}^*_T(h)$ and $\hat{Z}^*_T(h)$ are calculated as the product of the lengths of the marginal bootstrap percentile intervals. Figure 1 reports the estimated coverage probabilities of the Bonferroni cubes for the first three configurations under the Center-Range approach, including the case where the innovations follow a t(5) distribution. For Configuration-II, the coverage performances of the two bootstrap methods are similar, with the exception of the right-skewed error distribution. Furthermore, the coverage probabilities calculated by ONBB are closer to the true coverages than those of the FRP method when the innovations follow a χ²(4) distribution (see the second row of Figure 1). Figure 2 illustrates the estimated coverage probabilities of the Bonferroni cubes for the first three configurations when the Min-Max approach is used. The coverages estimated by the two bootstrap methods are rather similar regardless of the error distribution, as shown in the first row of Figure 2. However, for Configuration-II, the FRP method under-estimates the actual coverages for short-term forecasts and over-estimates them as the forecast horizon increases. Moreover, the results for this configuration (Figures 1-2) indicate that the FRP method yields better coverage probabilities than ONBB, especially for the χ²(4) and normally distributed error terms.
Figures 3-4 show the coverage performances of the FRP and ONBB methods when the last three configurations given in Table 1 are considered in the data generation process. For the Center-Range method, compared to FRP, ONBB performs better as the persistence of the data generating process increases when the errors are Gaussian. Under a persistent mid-point process, for both interval time series approaches, the FRP and ONBB procedures generally under-estimate the true coverage probabilities far below the nominal level. Although the coverage probabilities produced by both methods move in the same direction, the extent of under-estimation for the FRP method is considerably smaller than that of the ONBB method for all error distributions. Also, one remarkable result regarding the Center-Range method is that the coverage performances of the bootstrap procedures trend away from the actual coverage probabilities for the persistent process and non-Gaussian errors. Figures 5-8 show the estimated volumes of the bootstrap Bonferroni cubes. In general, for both interval time series approaches and all configurations considered in the study, the ONBB method outperforms FRP by producing considerably smaller forecast cube volumes, except for the Center-Range method under Configuration-V with t(5)-distributed innovations.

Case Studies
In this section, we consider two financial interval-valued time series, monthly S&P 500 index and USD/SEK (US Dollar to Swedish Krona) exchange rate data, to compare the forecasting performances of the bootstrap procedures (FRP and ONBB) and the interval-valued time series methods (Center-Range and Min-Max).
The monthly S&P 500 index data, which consist of a total of 156 observations (intervals), were obtained from November 2005 to October 2018, and the monthly USD/SEK exchange rate data, observed from January 2009 to November 2018, consist of 118 intervals. The lower and upper bounds of these datasets have been transformed into log-returns (calculated as Y_t = log(Y_t / Y_{t-1}) and Z_t = log(Z_t / Z_{t-1})) to remove the trend components. To construct out-of-sample forecast cubes, the datasets were divided into in-sample and out-of-sample sets. The normality test results reported in Tables 2-3 indicate that the log-return series of the exchange rate data are not Gaussian, either individually or jointly. Also, the ADF test results, with small p-values, indicate that the minimum and maximum series of both datasets are stationary. Further, the Ljung-Box (LB) test statistics of order 12 (Q(12)) for autocorrelation in the log-return series show that all the log-return series, excluding the maximum of the USD/SEK exchange rate data, have statistically significant autocorrelations. All of the exploratory analysis presented in Tables 2-3 suggests that the VAR(p) model is a suitable choice for modelling the minimum and maximum series of the S&P 500 Index and USD/SEK exchange rate data. The optimal lag lengths of the VAR processes were determined using the Akaike information criterion (AIC), and the best models were chosen as VAR(6) for the S&P 500 Index data and VAR(2) for the USD/SEK exchange rate data. Also, the dominant roots of the fitted VAR models of order 6 and 2 are calculated as 0.851 (persistent) and 0.550 (stationary), respectively. The h-step-ahead bootstrap forecast cubes, together with the true out-of-sample values (denoted by a dot), are constructed based on B = 2000 bootstrap resamples for both real-world datasets. Also, the significance level is set to 0.05 to obtain 95% Bonferroni cubes for the next h = 1, 2, …, 12 observations.
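The log-return transformation applied to each bound series above amounts to the following minimal sketch; note that the transformed series has one fewer observation than the original.

```python
import numpy as np

def log_returns(x):
    """Log-returns of a positive-valued price series: r_t = log(x_t / x_{t-1})."""
    x = np.asarray(x, dtype=float)
    return np.log(x[1:] / x[:-1])

# Applied separately to the lower-bound and upper-bound series of an
# interval-valued dataset before fitting the VAR models.
```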
Note that the bootstrap forecast cubes are obtained for three block lengths, ℓ = T^{1/3}, T^{1/4}, T^{1/5}, as proposed by [31], when implementing the ONBB method. Since the results obtained for these block lengths are similar, to save space and to make this section more readable we present only the results for ℓ = T^{1/3}. (The results for T^{1/4} and T^{1/5} can be obtained from the author upon request.) The forecast cubes generated by the FRP and ONBB methods for the Min-Max and Center-Range approaches are presented in Figures 9-10 for the S&P 500 Index data and in Figures 11-12 for the USD/SEK exchange rate data. Our results show that the forecast regions obtained by the Center-Range and Min-Max methods are similar and that, in general, all the true values are covered by both methods. However, the forecast cubes calculated by both interval time series methods and both bootstrap methods fail to include the third out-of-sample point for the S&P 500 Index data and the fourth out-of-sample point for the USD/SEK exchange rate data. Also, the forecast region obtained by ONBB for the Min-Max method covers the second out-of-sample point of the USD/SEK exchange rate data, whereas the forecast cube produced by FRP fails to cover this true out-of-sample value for both interval time series methods. Although the FRP and ONBB based procedures generate similar Bonferroni forecast cubes in terms of covering the out-of-sample points, the forecast cubes produced by ONBB clearly have smaller volumes than those of FRP for both short and long-term forecast horizons.

Conclusions
In this study, we propose two interval-valued time series approaches to obtain joint forecast regions of financial interval-valued time series. The forecast regions are obtained using the ONBB method of [9] and the FRP method introduced by [28]. We examine the finite sample performances of the proposed interval time series approaches and bootstrap procedures by extensive Monte Carlo simulations and two real-world examples. Our findings show that, for the Min-Max method and persistent processes, the forecast regions produced by FRP are, in general, slightly better than those obtained from the ONBB method in terms of the estimated coverage probabilities. The prominent result is that the ONBB based procedure exhibits improved performance over the FRP method by generating smaller forecast cube volumes for the Center-Range and Min-Max methods under all error distributions considered in the study. Similar results are obtained for both real datasets, and our conclusions do not vary significantly with the different choices of block length considered in this study.

Declaration of Competing Interests
The author declares no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.