GVKEY | FY | ID | Join | Start |
1004 | 2000 | 1111 | . | 2000 |
1004 | 2001 | 1111 | . | 2000 |
1004 | 2002 | 1112 | . | 2002 |
1005 | 2000 | 1112 | . | 2000 |
1005 | 2001 | 1112 | . | 2000 |
1005 | 2002 | 1113 | . | 2002 |
1006 | 1995 | 1113 | . | 1995 |
1006 | 1996 | 1113 | . | 1995 |
1006 | 1997 | 1113 | . | 1995 |
1007 | 2005 | 1114 | 2002 | 2002 |
1007 | 2006 | 1114 | 2002 | 2002 |
1007 | 2007 | 1114 | 2002 | 2002 |
My desired result is the last column (start). What I am trying to do is the following:
1. If join (4th column) is given then start=join
2. If "join" is not given then start=fy. However, "start" will be only first FY based on ID. For example,
in the first row for ID 1111, the "start" is 2000 as it is the first year.
In the 2nd row for same ID 1111, the "start" is also 2000 not 2001.
in the 3rd row ID is changed. So, start=fy
in the fourth row ID is same but gvkey is changed. So, for being a new gvkey(another company), for ID 1112, the start=fy and so on.
I don't know how to code this. Can anyone help me, please?
data have;
input GVKEY FY ID Join;* Start ;
cards;
1004 2000 1111 . 2000
1004 2001 1111 . 2000
1004 2002 1112 . 2002
1005 2000 1112 . 2000
1005 2001 1112 . 2000
1005 2002 1113 . 2002
1006 1995 1113 . 1995
1006 1996 1113 . 1995
1006 1997 1113 . 1995
1007 2005 1114 2002 2002
1007 2006 1114 2002 2002
1007 2007 1114 2002 2002
;
data desired;
set have;
by gvkey id;
retain start;
if first.id and join=. then start=fy;
if join then start=join;
run;
data have;
input GVKEY FY ID Join;* Start ;
cards;
1004 2000 1111 . 2000
1004 2001 1111 . 2000
1004 2002 1112 . 2002
1005 2000 1112 . 2000
1005 2001 1112 . 2000
1005 2002 1113 . 2002
1006 1995 1113 . 1995
1006 1996 1113 . 1995
1006 1997 1113 . 1995
1007 2005 1114 2002 2002
1007 2006 1114 2002 2002
1007 2007 1114 2002 2002
;
data desired;
set have;
by gvkey id;
retain start;
if first.id and join=. then start=fy;
if join then start=join;
run;
The logic could also be coded as if/then/else
data desired; set have; by gvkey id; retain start; if join then start = join; else if first.id then start = fy; run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.