I have seasonal data and I want to mimic weekly data with a moving block bootstrap. I could not find a suitable example, so I pieced the code below together. Is it correct that the mean and variance should come from the AR model? If so, how do I use them in the bootstrap?
I also want to get their distribution.
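proc import datafile='C:\Users\user\Desktop\morgan.csv'
    out=morgan dbms=dlm;
    delimiter=',';
    getnames=yes;
run;
/* FORMAT is not a PROC IMPORT statement, so it is applied here instead.
   This assumes Date was imported as a numeric SAS date. */
data morgan;
    set morgan;
    format Date yymm.;
    PXlag = lag1(PX);
run;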
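proc autoreg data=morgan;
    model PX = PXlag / lagdep=PXlag;
    /* p= saves the fitted values and r= the residuals
       (assumed to be what "mean" and "resid" were meant to request). */
    output out=resid p=pred r=resid;
run;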
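/* Assumption: the residuals from PROC AUTOREG are what gets resampled.
   METHOD=URS draws rows independently, so this is an IID bootstrap and
   does not preserve the time ordering. */
proc surveyselect data=resid out=outboot
    seed=30459584
    method=urs
    samprate=1
    outhits
    rep=1000;
run;
proc univariate data=outboot noprint;
    var resid;
    by Replicate;
    output out=outall kurtosis=curt;
run;
proc univariate data=outall;
    var curt;
    output out=final pctlpts=2.5 97.5 pctlpre=ci;
run;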
1. Do you have a reference for what you are trying to achieve?
2. Do you intend to use PROC IML to implement the bootstrap?
Most examples use a bootstrap to resample IID data. With time series data, you need to be careful to preserve the time dependence in the data. I suggest you do an internet search and read about
> bootstrap "time series"
The relevant page in the file is page 211, which is the 4th page of the file. I want to implement a moving block bootstrap, but I could only find a simple simulation (the other file), so I am here to ask questions. If I posted under the wrong topic, I can move this to another place.
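For orientation, here is a minimal sketch of a moving block bootstrap written as a plain DATA step (no SAS/IML needed). It is not taken from the file mentioned above. The toy series, the dataset and variable names (series, y, mbb, boot_stats, boot_ci) and the block length of 10 are all assumptions made for illustration; the block length in particular is a tuning choice that depends on how strong the serial dependence is.

```sas
/* Toy AR(1) series standing in for the real data (hypothetical). */
data series;
    call streaminit(12345);
    y = 0;
    do t = 1 to 100;
        y = 0.6 * y + rand('normal');
        output;
    end;
run;

%let n        = 100;   /* length of the series                   */
%let blocklen = 10;    /* block length (assumed, a tuning choice) */
%let nrep     = 1000;  /* number of bootstrap replicates          */

/* Each replicate is built by pasting together randomly chosen
   blocks of &blocklen consecutive observations. */
data mbb;
    call streaminit(30459584);
    nblocks = ceil(&n / &blocklen);
    do replicate = 1 to &nrep;
        t_boot = 0;
        do b = 1 to nblocks;
            /* Random block start in 1 .. n - blocklen + 1 */
            start = ceil(rand('uniform') * (&n - &blocklen + 1));
            do j = 0 to &blocklen - 1;
                /* Truncate the last block so each replicate has n rows */
                if t_boot < &n then do;
                    t_boot = t_boot + 1;
                    obs = start + j;
                    set series(keep=y) point=obs;
                    output;
                end;
            end;
        end;
    end;
    stop;  /* required when reading with POINT= */
    keep replicate t_boot y;
run;

/* Statistic of interest per replicate (here the mean, as an example) */
proc means data=mbb noprint;
    by replicate;
    var y;
    output out=boot_stats mean=boot_mean;
run;

/* Percentile interval from the bootstrap distribution */
proc univariate data=boot_stats noprint;
    var boot_mean;
    output out=boot_ci pctlpts=2.5 97.5 pctlpre=ci;
run;
```

Because whole blocks are copied, the dependence inside each block is kept, which is the point of the method; dependence across block boundaries is still broken.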
The topic is very appropriate, but I just wanted to know if you have a license for SAS/IML and can implement a SAS/IML solution if we provide one.
%macro def_spread_bootstrap(input_data =, boot_iter = 250);
    /* Compute the number of observations in the input dataset: */
    %local nobs_input;
    proc sql noprint;
        select count(*) into :nobs_input
        from &input_data
        ;
    quit;
    /* First, fit the models for the original data to obtain residuals */
    /* Excess return is regressed on the 1st lag of default spread. */
    proc autoreg data = &input_data(drop = div_yield vix) noprint;
        model ret_excess = def_spread_lag1 / method = ml;
        output out = areg_out1 r = full_resid1 p = full_pred1;
    run;
    quit;
    /* Fitting AR(2) model for the default spread. */
    proc autoreg data = &input_data(drop = div_yield vix) noprint;
        model def_spread = / nlag = 2 method = ml;
        output out = areg_out2 r = full_resid2 p = full_pred2;
    run;
    quit;
    /* Combine output from the two regressions */
    data combined_output;
        set areg_out1;
        set areg_out2 (keep = full_pred2 full_resid2);
    run;
    /* Predict the excess return out-of-sample. The first observation is going to be the predicted
       value from the original model, and the rest will be filled via bootstrap.
    */
    data bs_result;
        set combined_output (keep = full_pred1 firstobs = &nobs_input
                             rename = (full_pred1 = pred_bs));
    run;
    /* Extract bivariate residuals: */
    data bivar_resid;
        set combined_output (keep = full_resid1 full_resid2
                             rename = (full_resid1 = full_resid1_bs full_resid2 = full_resid2_bs));
    run;
    /* The following loop produces &boot_iter bootstrapped datasets. For each dataset,
       the predicted return is calculated and added to bs_result. */
    %do i = 1 %to &boot_iter;
        /* To perform sampling w/o replacement, shuffle the bivariate residuals: */
        data bivar_resid;
            set bivar_resid;
            rnd = ranuni(0);
            if _n_ eq 1 then rnd = 0;
            if _n_ eq &nobs_input then rnd = 1;
        run;
        proc sort data = bivar_resid;
            by rnd;
        run;
        data input_bs;
            set combined_output;
            /* Add the shuffled residuals to the combined_output: */
            set bivar_resid (drop = rnd);
            /* Create bootstrapped time series by adding the shuffled residuals
               to the fitted values: */
            ret_excess_bs = full_pred1 + full_resid1_bs;
            def_spread_bs = full_pred2 + full_resid2_bs;
            if _n_ eq 1 then ret_excess_bs = ret_excess;
            if _n_ eq 1 then def_spread_bs = def_spread;
            if _n_ eq &nobs_input then def_spread_bs = .;
            def_spread_bs_lag1 = lag(def_spread_bs);
            keep date ret_excess_bs def_spread_bs def_spread_bs_lag1;
        run;
        /* Run the model for the excess return based on the bootstrapped series: */
        proc autoreg data = input_bs noprint;
            model ret_excess_bs = def_spread_bs_lag1 / method = ml;
            output out = out_bs p = pred_bs;
        run;
        quit;
        /* Extract the predicted return and append it to bs_result: */
        proc append base = bs_result
                    data = out_bs(keep = pred_bs firstobs = &nobs_input);
        run;
    %end; /* do cycle */
    /* Plot the predicted returns: */
    proc univariate data = bs_result noprint;
        title 'Histogram of predicted excess returns';
        histogram pred_bs / cfill = pink;
    run;
    /* Compute the bootstrap t-statistic: */
    proc ttest data = bs_result;
        title 'Bootstrap t-value';
    run;
    /* Delete all datasets that were created within the macro: */
    proc sql;
        drop table areg_out1, areg_out2, bivar_resid, bs_result,
                   combined_output, input_bs, out_bs
        ;
    quit;
%mend def_spread_bootstrap;
This multivariate time series bootstrap code is the closest example that I could find, but I don't know how to modify it. That is why I posted the original code: I want to modify it.
http://www.ntuzov.com/Nik_Site/Site_pages/Software_skills/SAS.htm
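As a rough starting point for a single series, the same residual-resampling idea can be sketched with one AR(1) model. This is not a modification of the macro itself but a simplified illustration under stated assumptions: the AR(1) is fit by ordinary least squares with PROC REG (which requires SAS/STAT) rather than PROC AUTOREG, residuals are drawn with replacement, and every name (series, y, ar_in, est, fit, resid_only, ar_boot, curt) plus the toy data are hypothetical. The fitted model supplies the conditional mean, and the resampled residuals supply the noise.

```sas
/* Toy AR(1) series; replace with the real data (hypothetical). */
data series;
    call streaminit(12345);
    y = 0;
    do t = 1 to 200;
        y = 1 + 0.6 * y + rand('normal');
        output;
    end;
run;

%let n    = 200;  /* length of the series             */
%let nrep = 500;  /* number of bootstrap replicates   */

/* Add the first lag so the AR(1) can be fit by least squares. */
data ar_in;
    set series;
    ylag = lag(y);
run;

/* Fit y = b0 + b1*ylag; save the estimates and the residuals. */
proc reg data=ar_in outest=est noprint;
    model y = ylag;
    output out=fit r=resid;
run;
quit;

/* The first row has no lag, hence no residual: drop it. */
data resid_only;
    set fit(keep=resid);
    where resid is not missing;
run;

/* Rebuild each bootstrap series recursively:
   y*_t = b0 + b1 * y*_(t-1) + e*_t, with e*_t drawn with replacement. */
data ar_boot;
    call streaminit(30459584);
    set est(keep=Intercept ylag rename=(Intercept=b0 ylag=b1));
    set series(keep=y obs=1 rename=(y=y_first));
    do replicate = 1 to &nrep;
        do t_boot = 1 to &n;
            if t_boot = 1 then y_boot = y_first;  /* start at the observed value */
            else do;
                pick = ceil(rand('uniform') * nres);
                set resid_only point=pick nobs=nres;
                y_boot = b0 + b1 * y_boot + resid;
            end;
            output;
        end;
    end;
    stop;  /* required when reading with POINT= */
    keep replicate t_boot y_boot;
run;

/* Distribution of a statistic across replicates (kurtosis, as in the question) */
proc univariate data=ar_boot noprint;
    var y_boot;
    by replicate;
    output out=outall kurtosis=curt;
run;

proc univariate data=outall noprint;
    var curt;
    output out=final pctlpts=2.5 97.5 pctlpre=ci;
run;
```

Unlike the macro, which shuffles the residuals (sampling without replacement) and adds them to the original fitted values, this sketch regenerates the whole series from the fitted recursion, so the two approaches are not equivalent.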
Check Example 9.12: Simulations of a Univariate ARMA Process in Chapter 9, General Statistics Examples, of the IML documentation.
There are also a bunch of functions you can use to simulate time series data in IML. Check Chapter 13, Time Series Analysis and Examples; maybe you will find one there.
Sorry, after discussing this with others, I believe I may have chosen the wrong solution. In the beginning, I wanted to use the bootstrap to solve the problem of not having enough data: I wanted to get the distribution of the data and then simulate from it to impute the missing values. Now I believe that I can use imputation directly to handle the problem. I'm sorry for the confusion, and thank you for your help.