I have seasonal data and I want to mimic weekly data using a moving block bootstrap. I could not find a suitable example, so I pieced the code below together from several sources. Is it correct that the mean and variance should come from the AR model? If so, how do I use them in the bootstrap?
I also want to get the distribution of the bootstrapped values.
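/* Read the CSV file. A FORMAT statement is not valid inside PROC IMPORT, */
/* so the date format is applied in the DATA step below.                  */
proc import datafile='C:\Users\user\Desktop\morgan.csv'
    out=morgan dbms=dlm;
    delimiter=',';
    getnames=yes;
run;
/* Add the first lag of PX (the original read from a nonexistent data set a) */
data morgan;
    set morgan;
    format Date yymm.;
    PXlag = lag1(PX);
run;
/* Fit PX on its own lag; keep residuals (resid) and predicted values (pred) */
proc autoreg data=morgan;
    model PX = PXlag / lagdep=PXlag;
    output out=resid r=resid p=pred;
run;
/* Resample with replacement, 1000 replicates.                           */
/* Assumption: the residuals in data set resid are what gets resampled.  */
/* Note that this is an IID resample, not a moving block bootstrap.      */
proc surveyselect data=resid out=outboot
    seed=30459584
    method=urs
    samprate=1
    outhits
    rep=1000;
run;
/* Kurtosis of each replicate (assumption: resid replaces the undefined x) */
proc univariate data=outboot noprint;
    var resid;
    by Replicate;
    output out=outall kurtosis=curt;
run;
/* 95% percentile interval of the bootstrap kurtosis values */
proc univariate data=outall noprint;
    var curt;
    output out=final pctlpts=2.5 97.5 pctlpre=ci;
run;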
1. Do you have a reference for what you are trying to achieve?
2. Do you intend to use PROC IML to implement the bootstrap?
Most examples use a bootstrap to resample IID data. With time series data, you need to be careful to preserve the time dependence in the data. I suggest you do an internet search and read about
> bootstrap "time series"
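To illustrate what preserving the time element can look like, here is a minimal DATA step sketch of a moving block bootstrap. It reuses the data set morgan and the variable PX from the question; the block length of 4 and the 1000 replicates are arbitrary assumptions, not recommendations, and the series is assumed to have at least as many observations as the block length.

```sas
/* Sketch only: moving block bootstrap of PX in data set morgan.         */
/* Assumed values: block length 4, 1000 replicates.                      */
%let blocklen = 4;
%let nrep     = 1000;

data mbb;
    call streaminit(30459584);
    /* n is filled in by NOBS= at compile time, so it is usable here */
    nblocks = ceil(n / &blocklen);
    do Replicate = 1 to &nrep;
        t = 0;                               /* position in the new series */
        do b = 1 to nblocks;
            /* random block start; the whole block stays inside the data */
            start = ceil(rand('uniform') * (n - &blocklen + 1));
            do j = 0 to &blocklen - 1;
                t + 1;
                if t <= n then do;           /* trim the last block to length n */
                    p = start + j;
                    set morgan(keep=PX) point=p nobs=n;
                    output;
                end;
            end;
        end;
    end;
    stop;                                    /* required when using POINT= */
    keep Replicate t PX;
run;
```

Each replicate in mbb has the same length as the original series and is built from consecutive runs of observations, so short-range dependence inside a block is kept. The result can be summarized with BY Replicate in the same way as the PROC SURVEYSELECT output.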
Page 211 is the fourth page of the attached file. I want to simulate a moving block bootstrap, but I could only find a simple simulation (the other file), so I am asking here. If I have posted under the wrong topic, I can move the question elsewhere.
The topic is very appropriate, but I just wanted to know if you have a license for SAS/IML and can implement a SAS/IML solution if we provide one.
%macro def_spread_bootstrap(input_data =, boot_iter = 250);
/* Compute the number of observations in the input dataset: */
%local nobs_input;
proc sql noprint;
select count(*) into :nobs_input
from &input_data
;
quit;
/* First, fit the models for the original data to obtain residuals */
/* Excess return is regressed on the 1st lag of default spread. */
proc autoreg data = &input_data(drop = div_yield vix) noprint;
model ret_excess = def_spread_lag1 / method = ml;
output out = areg_out1 r = full_resid1 p = full_pred1;
run;
quit;
/* Fitting AR(2) model for the default spread. */
proc autoreg data = &input_data(drop = div_yield vix) noprint;
model def_spread = / nlag = 2 method = ml;
output out = areg_out2 r = full_resid2 p = full_pred2;
run;
quit;
/* Combine output from the two regressions */
data combined_output;
set areg_out1;
set areg_out2 (keep = full_pred2 full_resid2);
run;
/* Predict the excess return out-of-sample. The first observation is going to be the predicted
value from the original model, and the rest will be filled via bootstrap.
*/
data bs_result;
set combined_output (keep = full_pred1 firstobs = &nobs_input
rename = (full_pred1 = pred_bs));
run;
/* Extract bivariate residuals: */
data bivar_resid;
set combined_output (keep = full_resid1 full_resid2
rename = (full_resid1 = full_resid1_bs full_resid2 = full_resid2_bs));
run;
/* The following loop produces &boot_iter bootstrapped datasets. For each dataset,
the predicted return is calculated and added to bs_result. */
%do i = 1 %to &boot_iter;
/* To perform sampling w/o replacement, shuffle the bivariate residuals: */
data bivar_resid;
set bivar_resid;
rnd = ranuni(0);
if _n_ eq 1 then rnd = 0;
if _n_ eq &nobs_input then rnd = 1;
run;
proc sort data = bivar_resid;
by rnd;
run;
data input_bs;
set combined_output;
/* Add the shuffled residuals to the combined_output: */
set bivar_resid (drop = rnd);
/* Create bootstrapped time series by adding the shuffled residuals
to the fitted values: */
ret_excess_bs = full_pred1 + full_resid1_bs;
def_spread_bs = full_pred2 + full_resid2_bs;
if _n_ eq 1 then ret_excess_bs = ret_excess;
if _n_ eq 1 then def_spread_bs = def_spread;
if _n_ eq &nobs_input then def_spread_bs = .;
def_spread_bs_lag1 = lag(def_spread_bs);
keep date ret_excess_bs def_spread_bs def_spread_bs_lag1;
run;
/* Run the model for the excess return based on the bootstrapped series: */
proc autoreg data = input_bs noprint;
model ret_excess_bs = def_spread_bs_lag1 / method = ml;
output out = out_bs p = pred_bs;
run;
quit;
/* Extract the predicted return and append it to bs_result: */
proc append base = bs_result
data = out_bs(keep = pred_bs firstobs = &nobs_input);
run;
%end; /* do cycle */
/* Plot the predicted returns: */
proc univariate data = bs_result noprint;
title 'Histogram of predicted excess returns';
histogram pred_bs / cfill = pink;
run;
/* Compute the bootstrap t-statistic: */
proc ttest data = bs_result;
title 'Bootstrap t-value';
run;
/* Delete all datasets that were created within the macro: */
proc sql;
drop table areg_out1, areg_out2, bivar_resid, bs_result,
combined_output, input_bs, out_bs
;
quit;
%mend def_spread_bootstrap;
This multivariate time series bootstrap code is the closest example I could find, but I don't know how to modify it. That is why I posted the original code: I would like help adapting it.
http://www.ntuzov.com/Nik_Site/Site_pages/Software_skills/SAS.htm
Check "Example 9.12: Simulations of a Univariate ARMA Process" in Chapter 9, "General Statistics Examples", of the IML documentation.
There are also a number of functions in IML that you can use to simulate time series data. Check Chapter 13, "Time Series Analysis and Examples"; you may find one that fits.
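If SAS/IML is not available, a simple AR(1) series can also be simulated in a DATA step. The following is only a sketch: the values of mu, phi, and sigma and the series length of 200 are hypothetical placeholders, and in practice they would come from a fitted model such as the PROC AUTOREG fit above.

```sas
/* Sketch only: simulate y(t) = mu + phi*(y(t-1) - mu) + e(t), e ~ N(0, sigma**2) */
data ar1_sim;
    call streaminit(12345);
    mu = 0; phi = 0.5; sigma = 1;            /* hypothetical parameter values */
    /* first value drawn from the stationary distribution (requires |phi| < 1) */
    y = mu + rand('normal', 0, sigma / sqrt(1 - phi**2));
    do t = 1 to 200;
        if t > 1 then y = mu + phi * (y - mu) + rand('normal', 0, sigma);
        output;
    end;
    keep t y;
run;
```

Replacing the normal draws with residuals resampled from the fitted model would turn this into a model-based (residual) bootstrap, which is a different technique from the moving block bootstrap discussed earlier in the thread.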
Sorry, after discussing this with others, I think I may have chosen the wrong solution. In the beginning, I wanted to use the bootstrap to compensate for not having enough data: I wanted to get the distribution of the data and then simulate from it to impute the missing values. Now I believe I can use imputation directly to handle the problem. I'm sorry for the confusion, and thank you for your help.