| GVKEY | FY | ID | Join | Start | 
| 1004 | 2000 | 1111 | . | 2000 | 
| 1004 | 2001 | 1111 | . | 2000 | 
| 1004 | 2002 | 1112 | . | 2002 | 
| 1005 | 2000 | 1112 | . | 2000 | 
| 1005 | 2001 | 1112 | . | 2000 | 
| 1005 | 2002 | 1113 | . | 2002 | 
| 1006 | 1995 | 1113 | . | 1995 | 
| 1006 | 1996 | 1113 | . | 1995 | 
| 1006 | 1997 | 1113 | . | 1995 | 
| 1007 | 2005 | 1114 | 2002 | 2002 | 
| 1007 | 2006 | 1114 | 2002 | 2002 | 
| 1007 | 2007 | 1114 | 2002 | 2002 | 
My desired result is the last column (start). What I am trying to do is the following:
1. If join (4th column) is given then start=join
2. If "join" is not given then start=fy. However, "start" will be only first FY based on ID. For example,
in the first row for ID 1111, the "start" is 2000 as it is the first year.
In the 2nd row for same ID 1111, the "start" is also 2000 not 2001.
in the 3rd row ID is changed. So, start=fy
in the fourth row ID is same but gvkey is changed. So, for being a new gvkey(another company), for ID 1112, the start=fy and so on.
I don't know how to code this. Can anyone help me, please?
data have;
input GVKEY	FY	ID	Join;*	Start ;
cards;
1004	2000	1111	.	2000
1004	2001	1111	.	2000
1004	2002	1112	.	2002
1005	2000	1112	.	2000
1005	2001	1112	.	2000
1005	2002	1113	.	2002
1006	1995	1113	.	1995
1006	1996	1113	.	1995
1006	1997	1113	.	1995
1007	2005	1114	2002	2002
1007	2006	1114	2002	2002
1007	2007	1114	2002	2002
;
data desired;
 set have;
 by gvkey id;
 retain start;
 if first.id and join=. then start=fy;
 if join then start=join;
run;
data have;
input GVKEY	FY	ID	Join;*	Start ;
cards;
1004	2000	1111	.	2000
1004	2001	1111	.	2000
1004	2002	1112	.	2002
1005	2000	1112	.	2000
1005	2001	1112	.	2000
1005	2002	1113	.	2002
1006	1995	1113	.	1995
1006	1996	1113	.	1995
1006	1997	1113	.	1995
1007	2005	1114	2002	2002
1007	2006	1114	2002	2002
1007	2007	1114	2002	2002
;
data desired;
 set have;
 by gvkey id;
 retain start;
 if first.id and join=. then start=fy;
 if join then start=join;
run;
The logic could also be coded as if/then/else
data desired;
  set have;
  by gvkey id;
  retain start;
  if join then 
    start = join;
  else
  if first.id then
    start = fy;
run;
					
				
			
			
				
			
			
			
			
			
			
			
		It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
