Bright
Obsidian | Level 7

Hi, 

I have a dataset with dates ranging from Feb2018 to Feb2019. I need to exclude the data for each month in turn and run some procedures on the data from the remaining months. For example, my code for running a logistic regression that excludes the Nov2018 data is shown below. Can anybody help me automate this process (instead of manually changing the dates in the where statement)? Thanks!

proc logistic data=have (where=(Date_r >= '01Dec2018'd  | Date_r <'1Nov2018'd));
model High_level (event='1')= Score_r Length_r;
run;

 

ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Bright 

 

A good approach is to use Call Execute to generate and run all the Proc Logistic steps. The problem can be divided into two parts:

 

First, get a list of all months to exclude, where each month is expressed as its first and last date, so it can be used in a where condition. In this example the outer boundaries (first and last month to exclude) are taken from the first and last date found in the input data set, and the list of months is built in a loop, so all months between the boundaries are written to a data set.
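
As a small illustration of how the month boundaries come out, here is a sketch that writes the results of intck and intnx to the log. The two dates are hardcoded assumptions that match the test data, not values read from a data set.

```sas
/* Sketch: month count and month boundaries for two assumed dates */
data _null_;
	DateFirst = '16Feb2018'd;
	DateLast  = '16Feb2019'd;

	/* intck counts the month boundaries crossed between the two dates */
	n = intck('month', DateFirst, DateLast);
	put n=;

	/* intnx aligns to the beginning of the month by default, 'e' gives the end */
	do i = 0 to 1;
		ExcludeFirst = intnx('month', DateFirst, i);
		ExcludeLast  = intnx('month', DateFirst, i, 'e');
		put i= ExcludeFirst= date9. ExcludeLast= date9.;
	end;
run;
```

Here n is 12, so a loop from 0 to n covers the 13 months from Feb2018 through Feb2019.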

 

Then generate calls to Proc Logistic with an exclude condition for each month. In this example it is done in a data step with the month list from the previous step as input. The outer bounds and the current month to exclude are included in a title statement, so the information will show in the Proc Logistic output.

 

The final data step is included in two versions: one that writes the generated statements to the log, so they can be copied into the program editor and executed there, and one that uses Call Execute to run the generated statements directly. Also included is a first step that creates test input; it works with the code but doesn't make sense as real input to Proc Logistic.

 

/************************************************************************************/
/* Create test data - works in code but makes no sense as input to proc logistic    */
/************************************************************************************/
data have;
	do High_level = 1 to 13;
		Date_r = intnx('month','01jan2018'd,High_level)+15;
		Score_r = 1000;
		Length_r = 1000;
		output;
	end;
run;
	
/************************************************************************************/
/* working code - input data set is specified in the macro variable _INPUT          */
/************************************************************************************/
%let _INPUT = work.have;

* Get first and last date_r from input;
proc sql;
	create table dateminmax as 
		select 
			min(date_r) as DateFirst, 
			max(date_r) as DateLast 
		from &_INPUT;
quit;

* create data set with first and last date in all months;
data monthlist (drop=i);
	format DateFirst DateLast ExcludeFirst ExcludeLast date9.;
	set dateminmax;
	do i = 0 to intck('month',DateFirst,DateLast);
		ExcludeFirst = intnx('month',DateFirst,i);
		ExcludeLast = intnx('month',DateFirst,i,'e');
		output;
	end;
run;

* Generate Proc Logistic steps and write to log;
data _null_;
	set monthlist;
	length u1 u2 u3 $100;
	u1 = "title 'Proc Logistic - interval " || put(DateFirst,monyy.) || " - " || put(DateLast,monyy.) ||
		" - exclude " || put(ExcludeFirst,monyy.) || "';";
	u2 = 'proc logistic data=&_INPUT (where=(Date_r > ' || 
		put(ExcludeLast,5.) || ' or Date_r < ' || put(ExcludeFirst,5.) || '));';
	u3 = "model High_level (event='1')= Score_r Length_r; run;";
	put / u1;
	put u2;
	put u3;
run;

* Generate and execute Proc Logistic steps;
data _null_;
	set monthlist;
	length u1 u2 u3 $100;
	u1 = "title 'Proc Logistic - interval " || put(DateFirst,monyy.) || " - " || put(DateLast,monyy.) ||
		" - exclude " || put(ExcludeFirst,monyy.) || "';";
	u2 = 'proc logistic data=&_INPUT (where=(Date_r > ' || 
		put(ExcludeLast,5.) || ' or Date_r < ' || put(ExcludeFirst,5.) || '));';
	u3 = "model High_level (event='1')= Score_r Length_r; run;";
	call execute (u1);
	call execute (u2);
	call execute (u3);
run;

 

Example from log - steps generated:  

 

title 'Proc Logistic - interval FEB18 - FEB19 - exclude JUN18';
proc logistic data=&_INPUT (where=(Date_r > 21365 or Date_r < 21336));
model High_level (event='1')= Score_r Length_r; run;

title 'Proc Logistic - interval FEB18 - FEB19 - exclude JUL18';
proc logistic data=&_INPUT (where=(Date_r > 21396 or Date_r < 21366));
model High_level (event='1')= Score_r Length_r; run;
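
The numbers in the generated where conditions are unformatted SAS date values (days since 01Jan1960), which is what put(...,5.) writes. If more readable generated steps are preferred, one possible variation is to write date literals instead. The sketch below hardcodes June 2018 as the month to exclude rather than reading monthlist, and the $200 length is an arbitrary choice.

```sas
/* Sketch: build the where condition with date literals instead of numbers */
data _null_;
	ExcludeFirst = '01Jun2018'd;
	ExcludeLast  = '30Jun2018'd;

	/* Without a format the dates print as plain numbers */
	put ExcludeFirst= ExcludeLast=;

	/* date9. plus a trailing d inside double quotes gives a date literal */
	length u2 $200;
	u2 = 'proc logistic data=&_INPUT (where=(Date_r > "' ||
		put(ExcludeLast,date9.) || '"d or Date_r < "' ||
		put(ExcludeFirst,date9.) || '"d));';
	put u2;
run;
```

Both forms select the same rows; the literal form is only easier to read in the log.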

Example from Proc Logistic output - note dynamic title:

 

 

logistic.gif
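
For comparison, the same loop could also be written as a macro with a %do loop instead of Call Execute. This is only a sketch of an alternative, not the approach used above: the macro name logistic_excl and its parameters are made up for illustration, and it assumes work.have exists with the variables from the question.

```sas
/* Sketch: macro %do loop as an alternative to Call Execute */
%macro logistic_excl(input=work.have, datevar=Date_r);
	%local dfirst dlast n i exfirst exlast;

	/* First and last date in the input, stored as plain numbers */
	proc sql noprint;
		select min(&datevar) format=8., max(&datevar) format=8.
			into :dfirst trimmed, :dlast trimmed
		from &input;
	quit;

	/* Number of month boundaries between first and last date */
	%let n = %sysfunc(intck(month, &dfirst, &dlast));

	%do i = 0 %to &n;
		/* First and last day of the month to exclude */
		%let exfirst = %sysfunc(intnx(month, &dfirst, &i, b));
		%let exlast  = %sysfunc(intnx(month, &dfirst, &i, e));

		title "Proc Logistic - exclude %sysfunc(putn(&exfirst, monyy7.))";
		proc logistic data=&input (where=(&datevar < &exfirst or &datevar > &exlast));
			model High_level (event='1') = Score_r Length_r;
		run;
	%end;

	title;
%mend logistic_excl;

%logistic_excl(input=work.have, datevar=Date_r)
```

Call Execute keeps everything driven by a data set, which is handy when the list of months comes from data; the macro version may be easier to extend if more procedures need to run per month.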
