Hi, all,
I always appreciate your expertise and help-out.
Without your help, I couldn't have survived up to this moment.
I have a panel data based on year-months.
I need to check and mark three or more consecutive years in which trades were made in the same months. So, I sorted my data by ID, CUSIP, MONTH, and YEAR.
Then, the data runs as follows:
ID CUSIP YEAR MONTH
13 92939U 1992 2
13 92939U 1993 3
13 92939U 1992 4
13 92939U 1993 4
13 92939U 1994 4
13 92939U 1993 7
13 92939U 1994 7
13 92939U 1992 8
13 92939U 1993 8
13 92939U 1994 8
13 92939U 1993 9
13 92939U 1994 9
13 92939U 1992 10
13 92939U 1993 10
13 92939U 1994 10
13 92939U 1992 11
13 92939U 1993 11
13 92939U 1994 11
13 92939U 1995 11
50 381317 1993 2
50 381317 1994 2
50 381317 1996 2
50 381317 1997 2
Now, based on this dataset, I am going to mark the identical months over equal to or more than three observations. But, at the same time, YEAR should be consecutive.
Thus, my desired results are supposed to look like the following.
ID CUSIP YEAR MON Mark
13 92939U 1992 2 0
13 92939U 1993 3 0
13 92939U 1992 4 1
13 92939U 1993 4 1
13 92939U 1994 4 1
13 92939U 1993 7 0
13 92939U 1994 7 0
13 92939U 1992 8 1
13 92939U 1993 8 1
13 92939U 1994 8 1
13 92939U 1993 9 0
13 92939U 1994 9 0
13 92939U 1992 10 1
13 92939U 1993 10 1
13 92939U 1994 10 1
13 92939U 1992 11 1
13 92939U 1993 11 1
13 92939U 1994 11 1
13 92939U 1995 11 1
50 381317 1993 2 0
50 381317 1994 2 0
50 381317 1996 2 0
50 381317 1997 2 0
Only the same months more than three obs. over consecutive years are marked 1, otherwise zero.
Thank you in advance!
KS -,
Use a double DOW loop.
In the first loop count the length of the run.
Then convert the count to your 0/1 binary flag.
Then in the second loop re-read the data and write out all of the observations for this run.
data have;
input ID $ CUSIP $ YEAR MONTH ;
cards;
13 92939U 1992 2
13 92939U 1993 3
13 92939U 1992 4
13 92939U 1993 4
13 92939U 1994 4
13 92939U 1993 7
13 92939U 1994 7
13 92939U 1992 8
13 92939U 1993 8
13 92939U 1994 8
13 92939U 1993 9
13 92939U 1994 9
13 92939U 1992 10
13 92939U 1993 10
13 92939U 1994 10
13 92939U 1992 11
13 92939U 1993 11
13 92939U 1994 11
13 92939U 1995 11
50 381317 1993 2
50 381317 1994 2
50 381317 1996 2
50 381317 1997 2
;
data want;
do _n_=1 by 1 until (last.month);
set have ;
by id CUSIP month ;
end;
mark = _n_ >=3 ;
do _n_=1 to _n_ ;
set have;
output;
end;
run;
Results
Obs ID CUSIP YEAR MONTH mark 1 13 92939U 1992 2 0 2 13 92939U 1993 3 0 3 13 92939U 1992 4 1 4 13 92939U 1993 4 1 5 13 92939U 1994 4 1 6 13 92939U 1993 7 0 7 13 92939U 1994 7 0 8 13 92939U 1992 8 1 9 13 92939U 1993 8 1 10 13 92939U 1994 8 1 11 13 92939U 1993 9 0 12 13 92939U 1994 9 0 13 13 92939U 1992 10 1 14 13 92939U 1993 10 1 15 13 92939U 1994 10 1 16 13 92939U 1992 11 1 17 13 92939U 1993 11 1 18 13 92939U 1994 11 1 19 13 92939U 1995 11 1 20 50 381317 1993 2 1 21 50 381317 1994 2 1 22 50 381317 1996 2 1 23 50 381317 1997 2 1
Use a double DOW loop.
In the first loop count the length of the run.
Then convert the count to your 0/1 binary flag.
Then in the second loop re-read the data and write out all of the observations for this run.
data have;
input ID $ CUSIP $ YEAR MONTH ;
cards;
13 92939U 1992 2
13 92939U 1993 3
13 92939U 1992 4
13 92939U 1993 4
13 92939U 1994 4
13 92939U 1993 7
13 92939U 1994 7
13 92939U 1992 8
13 92939U 1993 8
13 92939U 1994 8
13 92939U 1993 9
13 92939U 1994 9
13 92939U 1992 10
13 92939U 1993 10
13 92939U 1994 10
13 92939U 1992 11
13 92939U 1993 11
13 92939U 1994 11
13 92939U 1995 11
50 381317 1993 2
50 381317 1994 2
50 381317 1996 2
50 381317 1997 2
;
data want;
do _n_=1 by 1 until (last.month);
set have ;
by id CUSIP month ;
end;
mark = _n_ >=3 ;
do _n_=1 to _n_ ;
set have;
output;
end;
run;
Results
Obs ID CUSIP YEAR MONTH mark 1 13 92939U 1992 2 0 2 13 92939U 1993 3 0 3 13 92939U 1992 4 1 4 13 92939U 1993 4 1 5 13 92939U 1994 4 1 6 13 92939U 1993 7 0 7 13 92939U 1994 7 0 8 13 92939U 1992 8 1 9 13 92939U 1993 8 1 10 13 92939U 1994 8 1 11 13 92939U 1993 9 0 12 13 92939U 1994 9 0 13 13 92939U 1992 10 1 14 13 92939U 1993 10 1 15 13 92939U 1994 10 1 16 13 92939U 1992 11 1 17 13 92939U 1993 11 1 18 13 92939U 1994 11 1 19 13 92939U 1995 11 1 20 50 381317 1993 2 1 21 50 381317 1994 2 1 22 50 381317 1996 2 1 23 50 381317 1997 2 1
data have;
input ID $ CUSIP $ YEAR MONTH ;
cards;
13 92939U 1992 2
13 92939U 1993 3
13 92939U 1992 4
13 92939U 1993 4
13 92939U 1994 4
13 92939U 1993 7
13 92939U 1994 7
13 92939U 1992 8
13 92939U 1993 8
13 92939U 1994 8
13 92939U 1993 9
13 92939U 1994 9
13 92939U 1992 10
13 92939U 1993 10
13 92939U 1994 10
13 92939U 1992 11
13 92939U 1993 11
13 92939U 1994 11
13 92939U 1995 11
50 381317 1993 2
50 381317 1994 2
50 381317 1996 2
50 381317 1997 2
;
proc sql;
create table want as
select *,range(year)+1=count(*) and count(*)>2 as mark
from have
group by id,cusip,month;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.