I have a data set where individuals can come in at different times, stay in the data for a few years and then leave. I want to be able to flag when is their last year of data. So my data looks like:
| Obs | in_2010 | in_2011 | in_2012 | in_2013 | in_2014 | in_2015 | in_2016 | 
| 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 
| 2 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 
| 3 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 
| 4 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 
| 5 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 
| 6 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 
| 7 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 
| 8 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 
What I want to be able to create is a new variable that lists the last year of their enrollment in the data. There will be individuals who are in the data for the whole time period, or just arrive in 2016, and those are still considered active. But the ones who go from a 1 to 0, I want to capture. Since there are many possible combinations I don't want to hard code all the possibilities. Is there is a simpler way to do this? Thanks!
I am not sure of the expected output but I assumed that this could be it. So please check the output and if it the one expected then you may try this code.
data want;
set have;
array vars(*) in_2010	in_2011	in_2012	in_2013	in_2014	in_2015	in_2016;
do i = 1 to dim(vars);
if vars(i)=1 then year=compress(vname(vars(i)),,'kd');
end;
run;Define the variables in an array and loop backwards:
data want;
 set have;
      array iny  in_: ;
      do i = dim(iny) to 1 by -1;
           if  iny(i) = 1 then do;
     
              year = substr(vname(iny(i)) , 3,4);
              leave;
          end;
     end;
run;
            Good afternoon @Shmuel, I hope you are having a great weekend.
Just a nit assuming your understanding of what OP wants is desired
I have corrected your code as follows:
data want;
 set have;
 length year $4;
      array iny  in_: ;
      do i = dim(iny) to 1 by -1;
           if  iny(i) = 1 then do;
              year = substr(vname(iny(i)) , 4);
/*			 year = scan(vname(iny(i)) , -1,'_'); you could use scan too*/
              leave;
          end;
     end;
	 drop i;
run;Correction :
1. Starting position to extract from string. Since the values end with number for the year, stop value is not needed
2. VNAME function extract defaults to a vlength of 200 bytes alike scan at compile time and this makes SAS pad with blanks to fill in for those unwanted bytes. So to counter this, a best practice would is to have length statement that assigns vlength at compile time
3. Scan is another option if you wish. Well hardly makes a difference, nevertheless i have included that option in the comment
4. Perhaps there is a possibility of year being used for computation of intervals like differences etc. Well if OP chooses to do so, the code can be tweaked to
data want;
 set have;
/* length year $4;*/
      array iny  in_: ;
      do i = dim(iny) to 1 by -1;
           if  iny(i) = 1 then do;
              year =input( substr(vname(iny(i)) , 4),8.);
/*			 year =input( scan(vname(iny(i)) , -1,'_'),8.);*/
              leave;
          end;
     end;
	 drop i;
run;5. Hi @Jagadishkatam In my humble opinion, your forward loop checking one by one for all the elements of the array and resetting everytime logic, makes your year=compress(vname(vars(i)),,'kd'); execute everytime when vars(i)=1 is true and that makes it not as efficient as the reverse loop and leave expended by @Shmuel
Best Regards!
PS In essence, we do not even need a loop aka linear approach, and whichn/vname alone would suffice. But in my test with 80 to 100 million records , linear does just as well or sometimes eclipses whichn although whichn code is much shorter.
My humble request is to post the desired output sample for the input for some dumb folks like me to figure out as well. Thank you!
I am not sure of the expected output but I assumed that this could be it. So please check the output and if it the one expected then you may try this code.
data want;
set have;
array vars(*) in_2010	in_2011	in_2012	in_2013	in_2014	in_2015	in_2016;
do i = 1 to dim(vars);
if vars(i)=1 then year=compress(vname(vars(i)),,'kd');
end;
run;Thanks, that output is very close to what I want.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
