BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jkim197
Obsidian | Level 7

Hello SAS programmers!

 

I am working with monthly stock data from Feb 1980 to Dec 2016.

The dataset has roughly 450,000 stock-month

 

What I want to do is to create a variable which represents consecutive runs of months (within stocks and between stocks)

and remove all stock-month observations less than 6months runs.

 

If you check following data structure you'll catch this.

 

stock date run
1 2000JAN  1
1 2000FEB  2
1 2000MAR 3
1 2000APR  4
1 2000AUG 1
1 2000SEP  2
1 2000DEC 1
1 2001JAN  2
1 2001FEB  3
2 2000JAN  1
2 2000FEB  2
2 2000MAR  3
2 2000APR  4
2 2000AUG  1
2 2000SEP  2
2 2000DEC  1
2 2001JAN  2
2 2001FEB  3

 

I want to fix my dataset as following(Deleting stock-month block with maximum run<3).

 

stock date run
1 2000JAN  1
1 2000FEB  2
1 2000MAR 3
1 2000APR  4
1 2000DEC 1
1 2001JAN  2
1 2001FEB  3

 

2 2000JAN  1
2 2000FEB  2
2 2000MAR  3
2 2000APR  4
2 2000DEC  1
2 2001JAN  2
2 2001FEB  3

 

 

In this situation, how can I fix my dataset and variable "run"??

 Please notice that I have 450,000 obs of full dataset so please give me general approach to do this. 

THX.

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

If i understood your req:

 

 

data have;

set have;

by stock;

if first.stock then grp=0;

if run=1 then grp+1;

run;

 

 

data want;

do n=1 by 1 until(last.grp);

set have;

by stock grp;

end;

do until(last.grp);

set have;

by stock grp;

if n>=3 then output;

end;

drop n grp;

run;

 

Regards,

Naveen Srinivasan

View solution in original post

9 REPLIES 9
PeterClemmensen
Tourmaline | Level 20

Is your date variable formatted in this way?

jkim197
Obsidian | Level 7

My dates are the last days of the month in given year,

 

for example, 

30NOV1980

31DEC1980

31JAN1981

28FEB1981

PeterClemmensen
Tourmaline | Level 20

Ok. Not sure I understand, eg these lines

 

1 2000OCT 1
1 2000SEP  2

 

Why would your run variable increase in this case?

 

jkim197
Obsidian | Level 7

Oh...sorry... TYPO..
it's August...I corrected it

PeterClemmensen
Tourmaline | Level 20

Makes more sense. I have implemented your run variable, though I am not certain aboyut the restrictions to put on it. Here I simply output only if run < 3.

 

data have;
input stock$ date:date9.;
format date date9.;
datalines;
1 31JAN2000
1 28FEB2000
1 31MAR2000
1 30APR2000
1 31AUG2000
1 30SEP2000
1 31DEC2000
1 31JAN2001
1 28FEB2001
2 31JAN2000
2 28FEB2000
2 31MAR2000
2 30APR2000
2 31AUG2000
2 30SEP2000
2 31DEC2000
2 31JAN2001
2 28FEB2001
;

data want;
	set have;
	
	if intck('month',lag1(date),date) = 1 then run + 1;
	else run = 1;

	retain run 1;
	if run <3;			/* Subsetting if */
run;
jkim197
Obsidian | Level 7
This code for generating variable run is superb..
Thank you so much draycut for your advice.
novinosrin
Tourmaline | Level 20

If i understood your req:

 

 

data have;

set have;

by stock;

if first.stock then grp=0;

if run=1 then grp+1;

run;

 

 

data want;

do n=1 by 1 until(last.grp);

set have;

by stock grp;

end;

do until(last.grp);

set have;

by stock grp;

if n>=3 then output;

end;

drop n grp;

run;

 

Regards,

Naveen Srinivasan

jkim197
Obsidian | Level 7

This is exactly what I want.

Based on draycut's argument, this code completes the procedure

 

Thank you very much!!! I really appreciate it!

Jagadishkatam
Amethyst | Level 16
data want;
set have;
by stock;
retain count;
if first.stock then count=.;
if run=1 then count+1;
run;

data want2(where=(count2>=3));
do until(last.count);
set want;
by stock count;
retain count2;
if first.count then count2=1;
else count2+1;
end;
do until(last.count);
set want;
by stock count;
output;
end;
drop count count2;
run;
Thanks,
Jag

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1300 views
  • 3 likes
  • 4 in conversation