- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi, all,
I always appreciate your expertise and help-out.
Without your help, I couldn't have survived up to this moment.
I have a panel data based on year-months.
I need to check and mark three or more consecutive years in which trades were made in the same months. So, I sorted my data by ID, CUSIP, MONTH, and YEAR.
Then, the data runs as follows:
ID CUSIP YEAR MONTH
13 92939U 1992 2
13 92939U 1993 3
13 92939U 1992 4
13 92939U 1993 4
13 92939U 1994 4
13 92939U 1993 7
13 92939U 1994 7
13 92939U 1992 8
13 92939U 1993 8
13 92939U 1994 8
13 92939U 1993 9
13 92939U 1994 9
13 92939U 1992 10
13 92939U 1993 10
13 92939U 1994 10
13 92939U 1992 11
13 92939U 1993 11
13 92939U 1994 11
13 92939U 1995 11
50 381317 1993 2
50 381317 1994 2
50 381317 1996 2
50 381317 1997 2
Now, based on this dataset, I am going to mark the identical months over equal to or more than three observations. But, at the same time, YEAR should be consecutive.
Thus, my desired results are supposed to look like the following.
ID CUSIP YEAR MON Mark
13 92939U 1992 2 0
13 92939U 1993 3 0
13 92939U 1992 4 1
13 92939U 1993 4 1
13 92939U 1994 4 1
13 92939U 1993 7 0
13 92939U 1994 7 0
13 92939U 1992 8 1
13 92939U 1993 8 1
13 92939U 1994 8 1
13 92939U 1993 9 0
13 92939U 1994 9 0
13 92939U 1992 10 1
13 92939U 1993 10 1
13 92939U 1994 10 1
13 92939U 1992 11 1
13 92939U 1993 11 1
13 92939U 1994 11 1
13 92939U 1995 11 1
50 381317 1993 2 0
50 381317 1994 2 0
50 381317 1996 2 0
50 381317 1997 2 0
Only the same months more than three obs. over consecutive years are marked 1, otherwise zero.
Thank you in advance!
KS -,
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Use a double DOW loop.
In the first loop count the length of the run.
Then convert the count to your 0/1 binary flag.
Then in the second loop re-read the data and write out all of the observations for this run.
data have;
input ID $ CUSIP $ YEAR MONTH ;
cards;
13 92939U 1992 2
13 92939U 1993 3
13 92939U 1992 4
13 92939U 1993 4
13 92939U 1994 4
13 92939U 1993 7
13 92939U 1994 7
13 92939U 1992 8
13 92939U 1993 8
13 92939U 1994 8
13 92939U 1993 9
13 92939U 1994 9
13 92939U 1992 10
13 92939U 1993 10
13 92939U 1994 10
13 92939U 1992 11
13 92939U 1993 11
13 92939U 1994 11
13 92939U 1995 11
50 381317 1993 2
50 381317 1994 2
50 381317 1996 2
50 381317 1997 2
;
data want;
do _n_=1 by 1 until (last.month);
set have ;
by id CUSIP month ;
end;
mark = _n_ >=3 ;
do _n_=1 to _n_ ;
set have;
output;
end;
run;
Results
Obs ID CUSIP YEAR MONTH mark 1 13 92939U 1992 2 0 2 13 92939U 1993 3 0 3 13 92939U 1992 4 1 4 13 92939U 1993 4 1 5 13 92939U 1994 4 1 6 13 92939U 1993 7 0 7 13 92939U 1994 7 0 8 13 92939U 1992 8 1 9 13 92939U 1993 8 1 10 13 92939U 1994 8 1 11 13 92939U 1993 9 0 12 13 92939U 1994 9 0 13 13 92939U 1992 10 1 14 13 92939U 1993 10 1 15 13 92939U 1994 10 1 16 13 92939U 1992 11 1 17 13 92939U 1993 11 1 18 13 92939U 1994 11 1 19 13 92939U 1995 11 1 20 50 381317 1993 2 1 21 50 381317 1994 2 1 22 50 381317 1996 2 1 23 50 381317 1997 2 1
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Use a double DOW loop.
In the first loop count the length of the run.
Then convert the count to your 0/1 binary flag.
Then in the second loop re-read the data and write out all of the observations for this run.
data have;
input ID $ CUSIP $ YEAR MONTH ;
cards;
13 92939U 1992 2
13 92939U 1993 3
13 92939U 1992 4
13 92939U 1993 4
13 92939U 1994 4
13 92939U 1993 7
13 92939U 1994 7
13 92939U 1992 8
13 92939U 1993 8
13 92939U 1994 8
13 92939U 1993 9
13 92939U 1994 9
13 92939U 1992 10
13 92939U 1993 10
13 92939U 1994 10
13 92939U 1992 11
13 92939U 1993 11
13 92939U 1994 11
13 92939U 1995 11
50 381317 1993 2
50 381317 1994 2
50 381317 1996 2
50 381317 1997 2
;
data want;
do _n_=1 by 1 until (last.month);
set have ;
by id CUSIP month ;
end;
mark = _n_ >=3 ;
do _n_=1 to _n_ ;
set have;
output;
end;
run;
Results
Obs ID CUSIP YEAR MONTH mark 1 13 92939U 1992 2 0 2 13 92939U 1993 3 0 3 13 92939U 1992 4 1 4 13 92939U 1993 4 1 5 13 92939U 1994 4 1 6 13 92939U 1993 7 0 7 13 92939U 1994 7 0 8 13 92939U 1992 8 1 9 13 92939U 1993 8 1 10 13 92939U 1994 8 1 11 13 92939U 1993 9 0 12 13 92939U 1994 9 0 13 13 92939U 1992 10 1 14 13 92939U 1993 10 1 15 13 92939U 1994 10 1 16 13 92939U 1992 11 1 17 13 92939U 1993 11 1 18 13 92939U 1994 11 1 19 13 92939U 1995 11 1 20 50 381317 1993 2 1 21 50 381317 1994 2 1 22 50 381317 1996 2 1 23 50 381317 1997 2 1
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I wrote 20 lines to create this, but you suggested only a few lines!
It is a great help.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
input ID $ CUSIP $ YEAR MONTH ;
cards;
13 92939U 1992 2
13 92939U 1993 3
13 92939U 1992 4
13 92939U 1993 4
13 92939U 1994 4
13 92939U 1993 7
13 92939U 1994 7
13 92939U 1992 8
13 92939U 1993 8
13 92939U 1994 8
13 92939U 1993 9
13 92939U 1994 9
13 92939U 1992 10
13 92939U 1993 10
13 92939U 1994 10
13 92939U 1992 11
13 92939U 1993 11
13 92939U 1994 11
13 92939U 1995 11
50 381317 1993 2
50 381317 1994 2
50 381317 1996 2
50 381317 1997 2
;
proc sql;
create table want as
select *,range(year)+1=count(*) and count(*)>2 as mark
from have
group by id,cusip,month;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Your coding is even shorter. I will keep them.