Hello,
I have data where I have an indicator for when someone had coverage in a year. I need to add across a row starting at the first 1 a person has and stopping when there is a '.'. The 1 can start not at the first year and can have 1's after the streak ends.
This is what it looks like now:
ID | Yr_2000 | Yr_2001 | Yr_2002 | Yr_2003 | Yr_2004 | Yr_2005 | Yr_2006 | Yr_2007 | Yr_2008 | Yr_2009 | Yr_2010 |
A | 1 | 1 | 1 | . | . | . | 1 | 1 | . | . | . |
B | . | . | . | 1 | 1 | 1 | . | . | . | . | 1 |
C | . | . | . | . | . | . | . | . | . | . | . |
D | . | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | . | . |
E | . | . | . | . | . | . | . | . | . | . | . |
F | 1 | 1 | . | 1 | . | . | 1 | 1 | . | . | . |
G | . | . | . | . | . | 1 | 1 | 1 | 1 | . | 1 |
H | . | . | . | . | . | . | . | . | . | . | 1 |
ID | . | 1 | 1 | 1 | . | . | . | . | . | . | . |
G | 1 | . | . | . | . | . | . | . | 1 | . | . |
K | 1 | 1 | 1 | 1 | 1 | . | . | . | . | . | . |
I want a table like this:
ID | Coverage Length |
A | 3 |
B | 3 |
C | 0 |
D | 8 |
E | 0 |
F | 2 |
G | 4 |
H | 1 |
ID | 3 |
G | 1 |
K | 5 |
Thank you!
UNTESTED CODE
data want;
set have;
array yr(*) yr_2000-yr_2010;
count=0;
do i=1 to dim(yr);
count=sum(count,yr(i));
if count>0 and missing(yr(i)) then leave;
end;
run;
If you want tested code, you need to provide the data as SAS data step code and not as a screen capture.
data have; infile cards expandtabs truncover; input ID $ Yr_2000 Yr_2001 Yr_2002 Yr_2003 Yr_2004 Yr_2005 Yr_2006 Yr_2007 Yr_2008 Yr_2009 Yr_2010 ; cards; A 1 1 1 . . . 1 1 . . . B . . . 1 1 1 . . . . 1 C . . . . . . . . . . . D . 1 1 1 1 1 1 1 1 . . E . . . . . . . . . . . F 1 1 . 1 . . 1 1 . . . G . . . . . 1 1 1 1 . 1 H . . . . . . . . . . 1 ID . 1 1 1 . . . . . . . G 1 . . . . . . . 1 . . K 1 1 1 1 1 . . . . . . ; data want; set have; temp=cats(of Yr_:); pid=prxparse('/1+/o'); s=1;e=length(temp); cover_len=0; call prxnext(pid,s,e,temp,p,l); do while(p>0); cover_len=max(cover_len,l); call prxnext(pid,s,e,temp,p,l); end; drop s e p l temp; run;
data have;
input ID $ Yr_2000 Yr_2001 Yr_2002 Yr_2003 Yr_2004 Yr_2005 Yr_2006 Yr_2007 Yr_2008 Yr_2009 Yr_2010;
cards;
A 1 1 1 . . . 1 1 . . .
B . . . 1 1 1 . . . . 1
C . . . . . . . . . . .
D . 1 1 1 1 1 1 1 1 . .
E . . . . . . . . . . .
F 1 1 . 1 . . 1 1 . . .
G . . . . . 1 1 1 1 . 1
H . . . . . . . . . . 1
ID . 1 1 1 . . . . . . .
G 1 . . . . . . . 1 . .
K 1 1 1 1 1 . . . . . .
;
data want;
set have;
array t Yr_2000-- Yr_2010;
junk1=cats(of t(*));
junk2=index(junk1,'1');
if junk2 then junk3=substr(junk1,junk2);
junk4=verify(junk3,'1');
junk5=coalesce(of t(*));
cover_length=0;
if junk5 then cover_length=junk4-junk5;
drop junk: Yr_2000-- Yr_2010;
run;
Hello @infoneimanhpi,
Maybe it's as easy as this:
data want(keep=id c:);
set have;
Coverage_Length=lengthn(scan(cats(of Yr_2000-Yr_2010),1));
run;
You can do it as follows:
data HAVE;
input ID $ Yr_2000-Yr_2010;
datalines;
A 1 1 1 . . . 1 1 . . .
B . . . 1 1 1 . . . . 1
C . . . . . . . . . . .
D . 1 1 1 1 1 1 1 1 . .
E . . . . . . . . . . .
F 1 1 . 1 . . 1 1 . . .
G . . . . . 1 1 1 1 . 1
H . . . . . . . . . . 1
I . 1 1 1 . . . . . . .
G 1 . . . . . . . 1 . .
K 1 1 1 1 1 . . . . . .
;
data WANT (keep=ID c);
set HAVE;
array y Yr_2000-Yr_2010;
c = 0;
do i=1 to dim(y) until(c>0 and y[i]=.);
c = sum(c,y[i]);
end;
label c = 'Coverage Length';
run;
Hope this helps.
What if the max length is at last , not first ? Like :
A 1 1 1 . . . 1 1 1 1 1
B . . . 1 . . . . . 1 1
Below two options to get what you want.
data have;
infile cards expandtabs truncover;
input ID $ Yr_2000 Yr_2001 Yr_2002 Yr_2003 Yr_2004
Yr_2005 Yr_2006 Yr_2007 Yr_2008 Yr_2009 Yr_2010 ;
cards;
A 1 1 1 . . . 1 1 . . .
B . . . 1 1 1 . . . . 1
C . . . . . . . . . . .
D . 1 1 1 1 1 1 1 1 . .
E . . . . . . . . . . .
F 1 1 . 1 . . 1 1 . . .
G . . . . . 1 1 1 1 . 1
H . . . . . . . . . . 1
ID . 1 1 1 . . . . . . .
G 1 . . . . . . . 1 . .
K 1 1 1 1 1 . . . . . .
;
data want1(drop=_: yr_:);
set have;
_str=catt(of yr_:);
max_len=0;
_stop=countw(_str,'.');
do _i=1 to _stop;
max_len=max(max_len,lengthn(scan(_str,_i,'.')));
end;
run;
proc print data=want1;
run;
data want2(drop=_: yr_:);
set have;
array yrs {*} yr_:;
do _i=1 to dim(yrs);
_len= sum(0,_len,yrs[_i]) * (yrs[_i]=1);
max_len=max(_len,max_len);
end;
run;
proc print data=want2;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.