DATA Step, Macro, Functions and more

removing consecutive returns less than X months

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 10
Accepted Solution

removing consecutive returns less than X months

[ Edited ]

Hello SAS programmers!

 

I am working with monthly stock data from Feb 1980 to Dec 2016.

The dataset has roughly 450,000 stock-month

 

What I want to do is to create a variable which represents consecutive runs of months (within stocks and between stocks)

and remove all stock-month observations less than 6months runs.

 

If you check following data structure you'll catch this.

 

stock date run
1 2000JAN  1
1 2000FEB  2
1 2000MAR 3
1 2000APR  4
1 2000AUG 1
1 2000SEP  2
1 2000DEC 1
1 2001JAN  2
1 2001FEB  3
2 2000JAN  1
2 2000FEB  2
2 2000MAR  3
2 2000APR  4
2 2000AUG  1
2 2000SEP  2
2 2000DEC  1
2 2001JAN  2
2 2001FEB  3

 

I want to fix my dataset as following(Deleting stock-month block with maximum run<3).

 

stock date run
1 2000JAN  1
1 2000FEB  2
1 2000MAR 3
1 2000APR  4
1 2000DEC 1
1 2001JAN  2
1 2001FEB  3

 

2 2000JAN  1
2 2000FEB  2
2 2000MAR  3
2 2000APR  4
2 2000DEC  1
2 2001JAN  2
2 2001FEB  3

 

 

In this situation, how can I fix my dataset and variable "run"??

 Please notice that I have 450,000 obs of full dataset so please give me general approach to do this. 

THX.


Accepted Solutions
Solution
‎05-23-2017 02:13 AM
PROC Star
Posts: 283

Re: removing consecutive returns less than X months

If i understood your req:

 

 

data have;

set have;

by stock;

if first.stock then grp=0;

if run=1 then grp+1;

run;

 

 

data want;

do n=1 by 1 until(last.grp);

set have;

by stock grp;

end;

do until(last.grp);

set have;

by stock grp;

if n>=3 then output;

end;

drop n grp;

run;

 

Regards,

Naveen Srinivasan

View solution in original post


All Replies
PROC Star
Posts: 754

Re: removing consecutive returns less than X months

Is your date variable formatted in this way?

Occasional Contributor
Posts: 10

Re: removing consecutive returns less than X months

My dates are the last days of the month in given year,

 

for example, 

30NOV1980

31DEC1980

31JAN1981

28FEB1981

PROC Star
Posts: 754

Re: removing consecutive returns less than X months

Ok. Not sure I understand, eg these lines

 

1 2000OCT 1
1 2000SEP  2

 

Why would your run variable increase in this case?

 

Occasional Contributor
Posts: 10

Re: removing consecutive returns less than X months

[ Edited ]

Oh...sorry... TYPO..
it's August...I corrected it

PROC Star
Posts: 754

Re: removing consecutive returns less than X months

[ Edited ]

Makes more sense. I have implemented your run variable, though I am not certain aboyut the restrictions to put on it. Here I simply output only if run < 3.

 

data have;
input stock$ date:date9.;
format date date9.;
datalines;
1 31JAN2000
1 28FEB2000
1 31MAR2000
1 30APR2000
1 31AUG2000
1 30SEP2000
1 31DEC2000
1 31JAN2001
1 28FEB2001
2 31JAN2000
2 28FEB2000
2 31MAR2000
2 30APR2000
2 31AUG2000
2 30SEP2000
2 31DEC2000
2 31JAN2001
2 28FEB2001
;

data want;
	set have;
	
	if intck('month',lag1(date),date) = 1 then run + 1;
	else run = 1;

	retain run 1;
	if run <3;			/* Subsetting if */
run;
Occasional Contributor
Posts: 10

Re: removing consecutive returns less than X months

This code for generating variable run is superb..
Thank you so much draycut for your advice.
Solution
‎05-23-2017 02:13 AM
PROC Star
Posts: 283

Re: removing consecutive returns less than X months

If i understood your req:

 

 

data have;

set have;

by stock;

if first.stock then grp=0;

if run=1 then grp+1;

run;

 

 

data want;

do n=1 by 1 until(last.grp);

set have;

by stock grp;

end;

do until(last.grp);

set have;

by stock grp;

if n>=3 then output;

end;

drop n grp;

run;

 

Regards,

Naveen Srinivasan

Occasional Contributor
Posts: 10

Re: removing consecutive returns less than X months

Posted in reply to novinosrin

This is exactly what I want.

Based on draycut's argument, this code completes the procedure

 

Thank you very much!!! I really appreciate it!

Trusted Advisor
Posts: 1,137

Re: removing consecutive returns less than X months

data want;
set have;
by stock;
retain count;
if first.stock then count=.;
if run=1 then count+1;
run;

data want2(where=(count2>=3));
do until(last.count);
set want;
by stock count;
retain count2;
if first.count then count2=1;
else count2+1;
end;
do until(last.count);
set want;
by stock count;
output;
end;
drop count count2;
run;
Thanks,
Jag
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 240 views
  • 3 likes
  • 4 in conversation