Hello,
I have data where I have an indicator for when someone had coverage in a year. I need to add across a row starting at the first 1 a person has and stopping when there is a '.'. The 1 can start not at the first year and can have 1's after the streak ends.
This is what it looks like now:
ID | Yr_2000 | Yr_2001 | Yr_2002 | Yr_2003 | Yr_2004 | Yr_2005 | Yr_2006 | Yr_2007 | Yr_2008 | Yr_2009 | Yr_2010 |
A | 1 | 1 | 1 | . | . | . | 1 | 1 | . | . | . |
B | . | . | . | 1 | 1 | 1 | . | . | . | . | 1 |
C | . | . | . | . | . | . | . | . | . | . | . |
D | . | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | . | . |
E | . | . | . | . | . | . | . | . | . | . | . |
F | 1 | 1 | . | 1 | . | . | 1 | 1 | . | . | . |
G | . | . | . | . | . | 1 | 1 | 1 | 1 | . | 1 |
H | . | . | . | . | . | . | . | . | . | . | 1 |
ID | . | 1 | 1 | 1 | . | . | . | . | . | . | . |
G | 1 | . | . | . | . | . | . | . | 1 | . | . |
K | 1 | 1 | 1 | 1 | 1 | . | . | . | . | . | . |
I want a table like this:
ID | Coverage Length |
A | 3 |
B | 3 |
C | 0 |
D | 8 |
E | 0 |
F | 2 |
G | 4 |
H | 1 |
ID | 3 |
G | 1 |
K | 5 |
Thank you!
UNTESTED CODE
data want;
set have;
array yr(*) yr_2000-yr_2010;
count=0;
do i=1 to dim(yr);
count=sum(count,yr(i));
if count>0 and missing(yr(i)) then leave;
end;
run;
If you want tested code, you need to provide the data as SAS data step code and not as a screen capture.
data have; infile cards expandtabs truncover; input ID $ Yr_2000 Yr_2001 Yr_2002 Yr_2003 Yr_2004 Yr_2005 Yr_2006 Yr_2007 Yr_2008 Yr_2009 Yr_2010 ; cards; A 1 1 1 . . . 1 1 . . . B . . . 1 1 1 . . . . 1 C . . . . . . . . . . . D . 1 1 1 1 1 1 1 1 . . E . . . . . . . . . . . F 1 1 . 1 . . 1 1 . . . G . . . . . 1 1 1 1 . 1 H . . . . . . . . . . 1 ID . 1 1 1 . . . . . . . G 1 . . . . . . . 1 . . K 1 1 1 1 1 . . . . . . ; data want; set have; temp=cats(of Yr_:); pid=prxparse('/1+/o'); s=1;e=length(temp); cover_len=0; call prxnext(pid,s,e,temp,p,l); do while(p>0); cover_len=max(cover_len,l); call prxnext(pid,s,e,temp,p,l); end; drop s e p l temp; run;
data have;
input ID $ Yr_2000 Yr_2001 Yr_2002 Yr_2003 Yr_2004 Yr_2005 Yr_2006 Yr_2007 Yr_2008 Yr_2009 Yr_2010;
cards;
A 1 1 1 . . . 1 1 . . .
B . . . 1 1 1 . . . . 1
C . . . . . . . . . . .
D . 1 1 1 1 1 1 1 1 . .
E . . . . . . . . . . .
F 1 1 . 1 . . 1 1 . . .
G . . . . . 1 1 1 1 . 1
H . . . . . . . . . . 1
ID . 1 1 1 . . . . . . .
G 1 . . . . . . . 1 . .
K 1 1 1 1 1 . . . . . .
;
data want;
set have;
array t Yr_2000-- Yr_2010;
junk1=cats(of t(*));
junk2=index(junk1,'1');
if junk2 then junk3=substr(junk1,junk2);
junk4=verify(junk3,'1');
junk5=coalesce(of t(*));
cover_length=0;
if junk5 then cover_length=junk4-junk5;
drop junk: Yr_2000-- Yr_2010;
run;
Hello @infoneimanhpi,
Maybe it's as easy as this:
data want(keep=id c:);
set have;
Coverage_Length=lengthn(scan(cats(of Yr_2000-Yr_2010),1));
run;
You can do it as follows:
data HAVE;
input ID $ Yr_2000-Yr_2010;
datalines;
A 1 1 1 . . . 1 1 . . .
B . . . 1 1 1 . . . . 1
C . . . . . . . . . . .
D . 1 1 1 1 1 1 1 1 . .
E . . . . . . . . . . .
F 1 1 . 1 . . 1 1 . . .
G . . . . . 1 1 1 1 . 1
H . . . . . . . . . . 1
I . 1 1 1 . . . . . . .
G 1 . . . . . . . 1 . .
K 1 1 1 1 1 . . . . . .
;
data WANT (keep=ID c);
set HAVE;
array y Yr_2000-Yr_2010;
c = 0;
do i=1 to dim(y) until(c>0 and y[i]=.);
c = sum(c,y[i]);
end;
label c = 'Coverage Length';
run;
Hope this helps.
What if the max length is at last , not first ? Like :
A 1 1 1 . . . 1 1 1 1 1
B . . . 1 . . . . . 1 1
Below two options to get what you want.
data have;
infile cards expandtabs truncover;
input ID $ Yr_2000 Yr_2001 Yr_2002 Yr_2003 Yr_2004
Yr_2005 Yr_2006 Yr_2007 Yr_2008 Yr_2009 Yr_2010 ;
cards;
A 1 1 1 . . . 1 1 . . .
B . . . 1 1 1 . . . . 1
C . . . . . . . . . . .
D . 1 1 1 1 1 1 1 1 . .
E . . . . . . . . . . .
F 1 1 . 1 . . 1 1 . . .
G . . . . . 1 1 1 1 . 1
H . . . . . . . . . . 1
ID . 1 1 1 . . . . . . .
G 1 . . . . . . . 1 . .
K 1 1 1 1 1 . . . . . .
;
data want1(drop=_: yr_:);
set have;
_str=catt(of yr_:);
max_len=0;
_stop=countw(_str,'.');
do _i=1 to _stop;
max_len=max(max_len,lengthn(scan(_str,_i,'.')));
end;
run;
proc print data=want1;
run;
data want2(drop=_: yr_:);
set have;
array yrs {*} yr_:;
do _i=1 to dim(yrs);
_len= sum(0,_len,yrs[_i]) * (yrs[_i]=1);
max_len=max(_len,max_len);
end;
run;
proc print data=want2;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.