BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
endofline
Obsidian | Level 7

I  have a data set where individuals can come in at different times, stay in the data for a few years and then leave. I want to be able to flag when is their last year of data. So my data looks like:

 

Obs in_2010 in_2011 in_2012 in_2013 in_2014 in_2015 in_2016
1 0 0 0 0 1 1 1
2 1 1 1 1 0 0 0
3 0 0 0 1 1 0 0
4 0 0 0 0 0 1 1
5 1 1 1 1 1 1 1
6 0 0 0 1 1 1 0
7 1 1 1 1 1 1 0
8 0 0 0 1 1 0 0

 

What I want to be able to create is a new variable that lists the last year of their enrollment in the data. There will be individuals who are in the data for the whole time period, or just arrive in 2016, and those are still considered active. But the ones who go from a 1 to 0, I want to capture. Since there are many possible combinations I don't want to hard code all the possibilities. Is there is a simpler way to do this? Thanks! 

1 ACCEPTED SOLUTION

Accepted Solutions
Jagadishkatam
Amethyst | Level 16

I am not sure of the expected output but I assumed that this could be it. So please check the output and  if it the one expected then you may try this code.

 

image.png

 

data want;
set have;
array vars(*) in_2010	in_2011	in_2012	in_2013	in_2014	in_2015	in_2016;
do i = 1 to dim(vars);
if vars(i)=1 then year=compress(vname(vars(i)),,'kd');
end;
run;
Thanks,
Jag

View solution in original post

6 REPLIES 6
Shmuel
Garnet | Level 18

Define the variables in an array and loop backwards:

data want;
 set have;
      array iny  in_: ;
      do i = dim(iny) to 1 by -1;
           if  iny(i) = 1 then do;
     
              year = substr(vname(iny(i)) , 3,4);
              leave;
          end;
     end;
run;
            
novinosrin
Tourmaline | Level 20

Good afternoon @Shmuel, I hope you are having a great weekend.

Just a nit assuming your understanding of what OP wants is desired

 

I have corrected your code as follows:

 


data want;
 set have;
 length year $4;
      array iny  in_: ;
      do i = dim(iny) to 1 by -1;
           if  iny(i) = 1 then do;
              year = substr(vname(iny(i)) , 4);

/*			 year = scan(vname(iny(i)) , -1,'_'); you could use scan too*/
              leave;
          end;
     end;
	 drop i;
run;

Correction :

1. Starting position to extract from string. Since the values end with number for the year, stop value is not needed

2. VNAME function extract defaults to a vlength of 200 bytes alike scan at compile time and this makes SAS pad with blanks to fill in for those unwanted bytes. So to counter this, a best practice would is to have length statement that assigns vlength at compile time

3. Scan is another option if you wish. Well hardly makes  a difference, nevertheless i have included that option in the comment

4. Perhaps there is a possibility of year being used for computation of intervals like differences etc. Well if OP chooses to do so, the code can be tweaked to 

 


data want;
 set have;
/* length year $4;*/
      array iny  in_: ;
      do i = dim(iny) to 1 by -1;
           if  iny(i) = 1 then do;
              year =input( substr(vname(iny(i)) , 4),8.);

/*			 year =input( scan(vname(iny(i)) , -1,'_'),8.);*/
              leave;
          end;
     end;
	 drop i;
run;

5. Hi @Jagadishkatam In my humble opinion, your forward loop checking  one by one for all the elements of the array and resetting everytime logic, makes your year=compress(vname(vars(i)),,'kd'); execute everytime when vars(i)=1 is true and that makes it not as efficient as the reverse loop and leave expended by @Shmuel

 

Best Regards!

 

PS In essence, we do not even need a loop aka linear approach, and whichn/vname alone would suffice. But in my test with 80 to 100 million records , linear does just as well or sometimes eclipses whichn although whichn code is much shorter.

 

 

 

 

 

 

 

Jagadishkatam
Amethyst | Level 16
Thank you @novinosrin for the explanation.
Thanks,
Jag
novinosrin
Tourmaline | Level 20

My humble request is to post the desired output sample for the input for some dumb folks like me to figure out as well. Thank you!

Jagadishkatam
Amethyst | Level 16

I am not sure of the expected output but I assumed that this could be it. So please check the output and  if it the one expected then you may try this code.

 

image.png

 

data want;
set have;
array vars(*) in_2010	in_2011	in_2012	in_2013	in_2014	in_2015	in_2016;
do i = 1 to dim(vars);
if vars(i)=1 then year=compress(vname(vars(i)),,'kd');
end;
run;
Thanks,
Jag
endofline
Obsidian | Level 7

Thanks, that output is very close to what I want. 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 950 views
  • 3 likes
  • 4 in conversation