karen8169
Obsidian | Level 7

I have seasonal data and I want to mimic weekly data with a moving block bootstrap. I could not find a suitable example, so I pieced the code below together from several sources. Is it correct that the mean and variance should come from an AR model? If so, how do I use them in the bootstrap?

I also want to get the distribution of the resulting estimates.

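proc import datafile='C:\Users\user\Desktop\morgan.csv'
   out=morgan dbms=dlm;
   delimiter=',';
   getnames=yes;
run;

/* Format the date and add the first lag of PX */
data morgan;
   set morgan;
   format Date yymm.;
   PXlag = lag1(PX);
run;

/* Regress PX on its own lag and save the residuals */
proc autoreg data=morgan;
   model PX = PXlag / lagdep=PXlag;
   output out=resid r=resid;
run;

/* Resample rows with replacement (plain IID bootstrap; time order is not preserved) */
proc surveyselect data=resid out=outboot
   seed=30459584
   method=urs
   samprate=1
   outhits
   rep=1000;
run;

/* Kurtosis within each replicate; PX is assumed to be the variable of interest */
proc univariate data=outboot;
   var PX;
   by Replicate;
   output out=outall kurtosis=curt;
run;

/* 2.5th and 97.5th percentiles of the bootstrap kurtosis values */
proc univariate data=outall;
   var curt;
   output out=final pctlpts=2.5, 97.5 pctlpre=ci;
run;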

 

6 REPLIES
Rick_SAS
SAS Super FREQ

1. Do you have a reference for what you are trying to achieve? 

2. Do you intend to use PROC IML to implement the bootstrap? 

 

Most examples use a bootstrap to resample IID data. With time series data you need to be careful to preserve the time element (the serial dependence) of the data. I suggest you do an internet search and read about

>  bootstrap "time series"
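
One common way to preserve the time element is a moving block bootstrap, which resamples contiguous blocks of observations instead of single rows. The DATA step below is only a minimal sketch under stated assumptions, not a vetted implementation: it assumes a data set named morgan that is sorted by date and contains the series PX, and the block length (4) and the number of replicates (1000) are illustrative choices. It also assumes the series is at least as long as one block.

```sas
%let blockLen = 4;      /* assumed block length */
%let nRep     = 1000;   /* assumed number of bootstrap replicates */

data mbb;
   call streaminit(30459584);
   /* n is filled in at compile time by the NOBS= option on the SET statement below */
   nBlocks = ceil(n / &blockLen);     /* blocks needed to cover the series */
   nStarts = n - &blockLen + 1;       /* number of admissible block start positions */
   do Replicate = 1 to &nRep;
      t = 0;                          /* position within the resampled series */
      do b = 1 to nBlocks;
         /* pick a block start uniformly from 1..nStarts */
         start = ceil(nStarts * rand("Uniform"));
         do k = 0 to &blockLen - 1;
            if t < n then do;         /* truncate the last block at length n */
               t = t + 1;
               p = start + k;
               set morgan(keep=PX) point=p nobs=n;
               output;
            end;
         end;
      end;
   end;
   stop;                              /* required when reading with POINT= */
   keep Replicate t PX;
run;

/* Per-replicate statistic, here the kurtosis as in the original post */
proc univariate data=mbb noprint;
   by Replicate;
   var PX;
   output out=mbb_stats kurtosis=curt;
run;
```

Each replicate has the same length as the original series, and observations inside a block keep their original order, so short-range dependence is retained within blocks. The block length is a tuning choice that the sketch does not attempt to justify.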

 

karen8169
Obsidian | Level 7

The page in the file, 211, is 4. I want to implement a moving block bootstrap, but I could only find a simple simulation (the other file), so I am asking my questions here. If I posted under the wrong topic, I can move the question to another place.

Rick_SAS
SAS Super FREQ

The topic is very appropriate, but I just wanted to know if you have a license for SAS/IML and can implement a SAS/IML solution if we provide one.

karen8169
Obsidian | Level 7
%macro def_spread_bootstrap(input_data =, boot_iter = 250);
	
	/* Compute the number of observations in the input dataset: */
	%local nobs_input; 
	proc sql noprint;
		select count(*) into :nobs_input
			from &input_data
		;
	quit;
	
	/* First, fit the models for the original data to obtain residuals */

	/* Excess return is regressed on the 1st lag of default spread. */
	proc autoreg data = &input_data(drop = div_yield vix) noprint; 
      	model ret_excess =  def_spread_lag1 / method = ml;
	 	output out = areg_out1 r = full_resid1 p = full_pred1; 
	run;
	quit;

	/* Fitting AR(2) model for the default spread. */
	proc autoreg data = &input_data(drop = div_yield vix) noprint; 
      	model def_spread =  / nlag = 2 method = ml;
	  	output out = areg_out2 r = full_resid2 p = full_pred2; 
	run;
	quit;

	/* Combine output from the two regressions */
	data combined_output;
		set areg_out1;
		set areg_out2 (keep = full_pred2 full_resid2);
	run;

	/* Predict the excess return out-of-sample. The first observation is going to be the predicted 
	   value from the original model, and the rest will be filled via bootstrap.
	*/
	data bs_result;
		set combined_output (keep = full_pred1 firstobs = &nobs_input 
						 rename = (full_pred1 = pred_bs));
	run;

	/* Extract bivariate residuals:  */
	data bivar_resid;
		set combined_output (keep = full_resid1 full_resid2 
					 	 rename = (full_resid1 = full_resid1_bs full_resid2 = full_resid2_bs));	
	run;

	/* The following loop produces &boot_iter bootstrapped datasets. For each dataset,
   	   the predicted return is calculated and added to bs_result. */

	%do i = 1 %to &boot_iter;

		/* To perform sampling w/o replacement, shuffle the bivariate residuals: */ 
		data bivar_resid;
			set bivar_resid;
			rnd =  ranuni(0);
			if _n_ eq 1 then rnd = 0;
			if _n_ eq &nobs_input then rnd = 1;
		run;
		proc sort data = bivar_resid;
			by rnd;
		run;

		data input_bs;
			set combined_output;
			/* Add the shuffled residuals to the combined_output: */
			set bivar_resid (drop = rnd);
			/*  Create bootstrapped time series by adding the shuffled residuals 
	    		to the fitted values: */
			ret_excess_bs = full_pred1 + full_resid1_bs;
			def_spread_bs = full_pred2 + full_resid2_bs;
			if _n_ eq 1 then ret_excess_bs = ret_excess;
			if _n_ eq 1 then def_spread_bs = def_spread;
			if _n_ eq &nobs_input then  def_spread_bs = .;
			def_spread_bs_lag1 = lag(def_spread_bs);
			keep date ret_excess_bs def_spread_bs def_spread_bs_lag1;
		run;

		/* Run the model for the excess return based on the bootstrapped series: */
		proc autoreg data = input_bs noprint; 
      		model ret_excess_bs = def_spread_bs_lag1 / method = ml;
	 		output out = out_bs p = pred_bs;
		run;
		quit;

		/* Extract the predicted return and append it to bs_result: */
		proc append base = bs_result 
				  data = out_bs(keep = pred_bs firstobs = &nobs_input);
		run;	

	%end; /* do cycle */
	
	/* Plot the predicted returns: */
	proc univariate data = bs_result noprint;
		title 'Histogram of predicted excess returns';	
		histogram pred_bs / cfill = pink;
	run;

	/* Compute the bootstrap t-statistic: */		
	proc ttest data = bs_result;		
		title 'Bootstrap t-value';		
	run;

	/* Delete all datasets that were created within the macro: 	*/
	proc sql;
		drop table areg_out1, areg_out2, bivar_resid, bs_result,
			combined_output, input_bs, out_bs
		;
	quit;
	
%mend def_spread_bootstrap;

This multivariate time series bootstrap code is the closest example I could find, but I don't know how to modify it. That is why I posted my original code and asked how to adapt it.

 

http://www.ntuzov.com/Nik_Site/Site_pages/Software_skills/SAS.htm

Ksharp
Super User

Check Example 9.12: Simulations of a Univariate ARMA Process, in Chapter 9, General Statistics Examples, of the IML documentation.

There are also a number of functions you can use to simulate time series data in IML.

Check Chapter 13, Time Series Analysis and Examples, and maybe you can find one there.
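
As a rough illustration of the kind of function meant here, the sketch below uses the SAS/IML ARMASIM function to simulate an AR(1) series. It requires a SAS/IML license. The AR coefficient (0.5), the series length (200), the seed, and the output data set name sim_ar1 are arbitrary assumptions and do not come from this thread. The sign convention for phi follows my reading of the IML documentation, so please check it against the documentation before relying on it.

```sas
proc iml;
   /* AR polynomial 1 - 0.5*B, intended as y[t] = 0.5*y[t-1] + e[t] (assumed coefficient) */
   phi   = {1 -0.5};
   theta = {1};        /* no moving-average terms */
   mu    = 0;          /* mean of the series */
   sigma = 1;          /* standard deviation of the innovations */
   n     = 200;        /* number of simulated observations */
   seed  = 12345;      /* arbitrary seed */

   y = armasim(phi, theta, mu, sigma, n, seed);

   /* Write the simulated series to a data set for further analysis */
   create sim_ar1 var {"y"};
   append;
   close sim_ar1;
quit;
```

In practice the coefficients would be replaced by estimates from a model fitted to the real series, for example with PROC AUTOREG or PROC ARIMA.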

 

 

 

karen8169
Obsidian | Level 7

Sorry, after discussing this with others, I believe I had the wrong solution in mind. In the beginning, I wanted to use bootstrapping to make up for my insufficient data: get the distribution of the data and then simulate from it to impute the missing values. Now I believe that I can use imputation directly to handle the problem. I'm sorry for the confusion, and thank you for your help.
