I want to sum 1s for a given key, but only the last ones, if there was a gap, i.e. there was 0, I do not want to sum. example below:
ID | val_202001 | val_202002 | val_202003 | val_202004 | val_202005 | val_202006 | val_202007 | val_202008 | val_202009 | val_202010 | val_202011 | val_202012 | val_202101 | val_202102 | val_202103 | expected | |
1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 15 | |
2 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 11 | |
3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 5 |
Thanks for your help / inspiration
UNTESTED CODE
data want;
set have;
array x val_202001--val_202103;
expected=0;
do i=dim(x) to 1 by -1;
if x(i)=1 then expected=expected+1;
else if x(i)=0 then leave;
end;
drop i;
run;
If you want tested code, please provide (a portion of) the data as SAS data step code, which you can type in yourself, or create it via these instructions.
Do not keep data in structure, it makes working with it more complicated.
So first, transpose:
proc transpose data=have out=long (rename=(_name_=period col1=value));
by id;
var val:;
run;
Then it is easy:
data want;
set long;
by id;
if first.id or value = 0
then expected = 0;
else expected + 1;
if last.id;
run;
data have;
input ID val_202001 val_202002 val_202003 val_202004 val_202005 val_202006 val_202007 val_202008 val_202009 val_202010 val_202011 val_202012 val_202101 val_202102 val_202103 ;
cards;
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
2 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1
3 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1
;
data want;
set have;
array t(*) val_: ;
want = dim(t) - findc(cats(of t(*)),'0','b');
run;
Thank you @average_joe
Could your series end with 0? Could there be multiple groups, not just 1 or 2? What should the result look like for these series of values?
1 1 0 0 1 1 1 1 1 1 1 1 1 0 1
1 1 0 0 1 1 1 1 1 1 1 1 1 0 0
Sir @Astounding Nice one. Indeed a certain possibility that expects the logic to pick just the right most set of 1's if it were to exist -
data have;
input ID val_202001 val_202002 val_202003 val_202004 val_202005 val_202006 val_202007 val_202008 val_202009 val_202010 val_202011 val_202012 val_202101 val_202102 val_202103 ;
cards;
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
2 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1
3 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1
4 1 1 0 0 1 1 1 1 1 1 1 1 1 0 1
5 1 1 0 0 1 1 1 1 1 1 1 1 1 0 0
6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
;
data want;
set have;
array t(*) val_: ;
want = lengthn(scan(cats(of t(*)),-1,'1','k'));
run;
data have;
input ID val_202001 val_202002 val_202003 val_202004 val_202005 val_202006 val_202007 val_202008 val_202009 val_202010 val_202011 val_202012 val_202101 val_202102 val_202103 ;
cards;
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
2 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1
3 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1
4 1 1 0 0 1 1 1 1 1 1 1 1 1 0 1
5 1 1 0 0 1 1 1 1 1 1 1 1 1 0 0
6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
;
data want;
set have;
temp=cats(of val_:);
pid=prxparse('/1+/');
s=1;e=length(temp);
call prxnext(pid,s,e,temp,p,l);
do while(p>0);
want=l;
call prxnext(pid,s,e,temp,p,l);
end;
drop temp pid s e p l;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.