karen8169
Obsidian | Level 7

I have seasonal data and I want to mimic weekly data using a moving block bootstrap. I could not find a suitable example, so I pieced the code below together from several sources. Is it correct that the mean and variance should come from the AR model? And how do I use them in the bootstrap?

I also want to obtain the distribution of the bootstrapped values.

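proc import datafile='C:\Users\user\Desktop\morgan.csv'
   out=morgan dbms=dlm replace;
   delimiter=',';
   getnames=yes;
run;

/* Add the first lag of PX. The FORMAT statement belongs here, not in PROC IMPORT. */
data morgan;
   set morgan;
   format Date yymm.;   /* assumes Date was imported as a SAS date value */
   PXlag = lag1(PX);
run;

/* Regress PX on its first lag; save predicted values and residuals. */
proc autoreg data=morgan;
   model PX = PXlag / lagdep=PXlag;
   output out=resid p=pred r=resid;
run;

/* Simple bootstrap: resample observations with replacement, 1000 replicates.
   This treats the observations as IID and ignores their time order. */
proc surveyselect data=morgan out=outboot
   seed=30459584
   method=urs
   samprate=1
   outhits
   rep=1000;
run;

/* Kurtosis of PX within each replicate (the pasted example used VAR x). */
proc univariate data=outboot noprint;
   var PX;
   by Replicate;
   output out=outall kurtosis=curt;
run;

/* 2.5th and 97.5th percentiles of the bootstrap kurtosis estimates. */
proc univariate data=outall;
   var curt;
   output out=final pctlpts=2.5, 97.5 pctlpre=ci;
run;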

 

6 REPLIES
Rick_SAS
SAS Super FREQ

1. Do you have a reference for what you are trying to achieve? 

2. Do you intend to use PROC IML to implement the bootstrap? 

 

Most examples use a bootstrap to resample IID data. You need to be careful in time series data to preserve the time-element of the data. I suggest you do an internet search and read about

>  bootstrap "time series"

 

karen8169
Obsidian | Level 7

The reference is in the file: it is page 211, which is page 4 of the file. I want to implement a moving block bootstrap, but I could only find a simple simulation (the other file), so I am asking my questions here. If I posted under the wrong topic, I can move the question elsewhere.

Rick_SAS
SAS Super FREQ

The topic is very appropriate, but I just wanted to know if you have a license for SAS/IML and can implement a SAS/IML solution if we provide one.
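
For readers without a SAS/IML license, a moving block bootstrap can also be sketched in a DATA step. The following is only a minimal illustration, not a vetted solution: the data set `series`, the output names `mbb`, `boot_stats` and `boot_ci`, the block length of 5, and the simulated input are all assumptions made for the example. Each replicate is built by concatenating randomly chosen blocks of consecutive observations, so short-range time dependence inside a block is preserved.

```sas
/* Hypothetical example data: 100 observations from an AR(1)-like series.
   Replace SERIES and PX with your own data set and variable. */
data series;
   call streaminit(12345);
   PX = 0;
   do t = 1 to 100;
      PX = 0.6*PX + rand('normal');
      output;
   end;
run;

%let blocklen = 5;      /* block length: an assumption, tune for your data */
%let nrep     = 1000;   /* number of bootstrap replicates */

/* Moving block bootstrap: concatenate random blocks of &blocklen
   consecutive observations until each replicate has n observations. */
data mbb(keep=Replicate t_boot PX);
   call streaminit(30459584);
   nblocks = ceil(n / &blocklen);     /* n is set at compile time by NOBS= */
   nstarts = n - &blocklen + 1;       /* number of possible block starts */
   do Replicate = 1 to &nrep;
      t_boot = 0;                     /* position within the replicate */
      do b = 1 to nblocks;
         start = floor(rand('uniform') * nstarts) + 1;
         do k = start to start + &blocklen - 1;
            if t_boot < n then do;    /* truncate the last block if needed */
               set series(keep=PX) point=k nobs=n;
               t_boot = t_boot + 1;
               output;
            end;
         end;
      end;
   end;
   stop;                              /* required when using POINT= */
run;

/* Example use: bootstrap distribution of the mean of PX. */
proc means data=mbb noprint;
   by Replicate;
   var PX;
   output out=boot_stats mean=mean_PX;
run;

proc univariate data=boot_stats noprint;
   var mean_PX;
   output out=boot_ci pctlpts=2.5 97.5 pctlpre=ci;
run;
```

The block length controls how much of the dependence structure is kept: longer blocks preserve more of it but leave fewer distinct blocks to sample from, so the value used here should be treated as a placeholder.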

karen8169
Obsidian | Level 7
%macro def_spread_bootstrap(input_data =, boot_iter = 250);
	
	/* Compute the number of observations in the input dataset: */
	%local nobs_input; 
	proc sql noprint;
		select count(*) into :nobs_input
			from &input_data
		;
	quit;
	
	/* First, fit the models for the original data to obtain residuals */

	/* Excess return is regressed on the 1st lag of default spread. */
	proc autoreg data = &input_data(drop = div_yield vix) noprint; 
      	model ret_excess =  def_spread_lag1 / method = ml;
	 	output out = areg_out1 r = full_resid1 p = full_pred1; 
	run;
	quit;

	/* Fitting AR(2) model for the default spread. */
	proc autoreg data = &input_data(drop = div_yield vix) noprint; 
      	model def_spread =  / nlag = 2 method = ml;
	  	output out = areg_out2 r = full_resid2 p = full_pred2; 
	run;
	quit;

	/* Combine output from the two regressions */
	data combined_output;
		set areg_out1;
		set areg_out2 (keep = full_pred2 full_resid2);
	run;

	/* Predict the excess return out-of-sample. The first observation is going to be the predicted 
	   value from the original model, and the rest will be filled via bootstrap.
	*/
	data bs_result;
		set combined_output (keep = full_pred1 firstobs = &nobs_input 
						 rename = (full_pred1 = pred_bs));
	run;

	/* Extract bivariate residuals:  */
	data bivar_resid;
		set combined_output (keep = full_resid1 full_resid2 
					 	 rename = (full_resid1 = full_resid1_bs full_resid2 = full_resid2_bs));	
	run;

	/* The following loop produces &boot_iter bootstrapped datasets. For each dataset,
	   the predicted return is calculated and added to bs_result. */

	%do i = 1 %to &boot_iter;

		/* To perform sampling w/o replacement, shuffle the bivariate residuals: */ 
		data bivar_resid;
			set bivar_resid;
			rnd =  ranuni(0);
			if _n_ eq 1 then rnd = 0;
			if _n_ eq &nobs_input then rnd = 1;
		run;
		proc sort data = bivar_resid;
			by rnd;
		run;

		data input_bs;
			set combined_output;
			/* Add the shuffled residuals to the combined_output: */
			set bivar_resid (drop = rnd);
			/*  Create bootstrapped time series by adding the shuffled residuals 
	    		to the fitted values: */
			ret_excess_bs = full_pred1 + full_resid1_bs;
			def_spread_bs = full_pred2 + full_resid2_bs;
			if _n_ eq 1 then ret_excess_bs = ret_excess;
			if _n_ eq 1 then def_spread_bs = def_spread;
			if _n_ eq &nobs_input then  def_spread_bs = .;
			def_spread_bs_lag1 = lag(def_spread_bs);
			keep date ret_excess_bs def_spread_bs def_spread_bs_lag1;
		run;

		/* Run the model for the excess return based on the bootstrapped series: */
		proc autoreg data = input_bs noprint; 
      		model ret_excess_bs = def_spread_bs_lag1 / method = ml;
	 		output out = out_bs p = pred_bs;
		run;
		quit;

		/* Extract the predicted return and append it to bs_result: */
		proc append base = bs_result 
				  data = out_bs(keep = pred_bs firstobs = &nobs_input);
		run;	

	%end; /* do cycle */
	
	/* Plot the predicted returns: */
	proc univariate data = bs_result noprint;
		title 'Histogram of predicted excess returns';	
		histogram pred_bs / cfill = pink;
	run;

	/* Compute the bootstrap t-statistic: */		
	proc ttest data = bs_result;		
		title 'Bootstrap t-value';		
	run;

	/* Delete all datasets that were created within the macro: 	*/
	proc sql;
		drop table areg_out1, areg_out2, bivar_resid, bs_result,
			combined_output, input_bs, out_bs
		;
	quit;
	
%mend def_spread_bootstrap;

This multivariate time series bootstrap code is the closest example I could find, but I don't know how to modify it. That is why I posted the original code: I would like help adapting it.

 

http://www.ntuzov.com/Nik_Site/Site_pages/Software_skills/SAS.htm

Ksharp
Super User

Check Example 9.12: Simulations of a Univariate ARMA Process, in Chapter 9, General Statistics Examples, of the IML documentation.

There are also a number of functions you can use to simulate time series data in IML. Check Chapter 13, Time Series Analysis and Examples; maybe you can find one there.
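
If SAS/IML is not available, the same idea of simulating from a time series model can be sketched in a DATA step. This is not the documentation example mentioned above, only a rough illustration for an AR(1) process. The parameter values in `phi`, `mu` and `sigma`, the 52-observation length, the 50-observation burn-in, and the data set names `ar1_sim` and `sim_stats` are all assumptions; in practice the parameters would come from a fitted model.

```sas
%let phi   = 0.6;   /* hypothetical AR(1) coefficient */
%let mu    = 10;    /* hypothetical series mean */
%let sigma = 2;     /* hypothetical innovation standard deviation */

/* Simulate 200 independent AR(1) series of 52 observations each:
   y(t) - mu = phi * (y(t-1) - mu) + e(t),  e(t) ~ N(0, sigma**2) */
data ar1_sim(keep=Replicate t y);
   call streaminit(20250101);
   do Replicate = 1 to 200;
      dev = 0;                      /* deviation from the mean */
      do t = -49 to 52;             /* t <= 0 is burn-in and is discarded */
         dev = &phi * dev + rand('normal', 0, &sigma);
         y = &mu + dev;
         if t >= 1 then output;
      end;
   end;
run;

/* Distribution of a statistic (here the mean) across simulated series. */
proc means data=ar1_sim noprint;
   by Replicate;
   var y;
   output out=sim_stats mean=mean_y std=std_y;
run;
```

The burn-in is there so that the retained observations do not depend much on the arbitrary starting value of zero.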

 

 

 

karen8169
Obsidian | Level 7

Sorry, after discussing this with others, I think I chose the wrong approach. Originally I wanted to use bootstrapping to make up for having too little data: I would estimate the distribution of the data and then simulate from it to impute values. I now believe I can use imputation directly to handle the problem. I'm sorry for the confusion, and thank you for your help.
