Hi there.
I've got stuck with my SAS code.
My aim: To find the average gap of a row (but only a gap where there is either a year before or after the gap) So if someone joins for 2 years, leaves then comes back 2 years later it will count a 2 year gap. However if someone joins and stays for 3 year and then leaves they should not be counted if that's all they did.
I've managed to get to a point the code will either include to much or not enough. I can make it either include all gaps (even if they have never been in before, so essentially just all 0's) or only when there's a specified amount of years either side of it.
data my_data;
input KEY YEAR97 YEAR98 YEAR99 YEAR00 YEAR01 YEAR02 YEAR03 YEAR04 YEAR05 YEAR06 YEAR07 YEAR08 YEAR09;
datalines;
1 0 0 1 0 0 1 0 1 0 0 1 0 0
2 1 1 0 1 1 0 1 0 1 1 0 1 1
3 0 0 0 0 1 0 0 0 1 0 0 1 0
4 1 1 1 1 1 1 1 1 1 1 1 1 1
;
data gaps_in_year;
set my_data;
array year_array YEAR97 YEAR98 YEAR99 YEAR00 YEAR01 YEAR02 YEAR03 YEAR04 YEAR05 YEAR06 YEAR07 YEAR08 YEAR09;
gap_counter = 0;
total_gap_length = 0;
do i = 2 to dim(year_array)-1;
if year_array[i-1] = 1 and year_array[i] = 0 and year_array[i+1] = 1 then do;
gap_counter + 1;
end;
else if gap_counter > 0 then do;
total_gap_length + gap_counter;
gap_counter = 0;
end;
end;
/* Calculate average gap length for the current row */
if gap_counter > 0 then do;
total_gap_length + gap_counter;
end;
drop i ;
run;
You can
data my_data;
input KEY YEAR97 YEAR98 YEAR99 YEAR00 YEAR01 YEAR02 YEAR03 YEAR04 YEAR05 YEAR06 YEAR07 YEAR08 YEAR09;
datalines;
1 0 0 1 0 0 1 0 1 0 0 1 0 0
2 1 1 0 1 1 0 1 0 1 1 0 1 1
3 0 0 0 0 1 0 0 0 1 0 0 1 0
4 1 1 1 1 1 1 1 1 1 1 1 1 1
;
data want (drop=_:);
set my_data;
length _strng $14; *Assign a length at least one byte longer than N of YEAR variables *;
_strng=cats(of year:);
_strng=substr(_strng,findc(_strng,'1')); *trim leading zeroes;
_last1=findc(_strng,'1',-14); *Find RIGHTMOST '1', note negative 3rd parameter;
_strng=substr(_strng,1,_last1); *trim trailing zeroes*;
ngaps=countw(_strng,'1')-1;
total_gap_length=lengthn(compress(_strng,'1'));
if ngaps>0 then average_gap_length=total_gap_length/ngaps;
run;
You can
data my_data;
input KEY YEAR97 YEAR98 YEAR99 YEAR00 YEAR01 YEAR02 YEAR03 YEAR04 YEAR05 YEAR06 YEAR07 YEAR08 YEAR09;
datalines;
1 0 0 1 0 0 1 0 1 0 0 1 0 0
2 1 1 0 1 1 0 1 0 1 1 0 1 1
3 0 0 0 0 1 0 0 0 1 0 0 1 0
4 1 1 1 1 1 1 1 1 1 1 1 1 1
;
data want (drop=_:);
set my_data;
length _strng $14; *Assign a length at least one byte longer than N of YEAR variables *;
_strng=cats(of year:);
_strng=substr(_strng,findc(_strng,'1')); *trim leading zeroes;
_last1=findc(_strng,'1',-14); *Find RIGHTMOST '1', note negative 3rd parameter;
_strng=substr(_strng,1,_last1); *trim trailing zeroes*;
ngaps=countw(_strng,'1')-1;
total_gap_length=lengthn(compress(_strng,'1'));
if ngaps>0 then average_gap_length=total_gap_length/ngaps;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.