Hi Team,
I have a dataset like this.
Description code cases denom numer triplet
infections 702 277 179 15 702
infections 703 288 280 25 702_703
infections 704 125 122 7 702_703_704
I want as shown below...
Basically to summarize the cases, denom and numer......and to take the last one in the triplet
Description cases denom numer triplet
infections 690 581 47 702_703_704
Regards
Karun,
Here is one way (assuming your data are already organized such that all description records are together and the last record in each group is the one that contains the desired triplet):
data want (drop=_:);
set have;
retain _cases _denom _numer;
by description notsorted;
_cases=ifn(first.description,cases,sum(cases,_cases));
_denom=ifn(first.description,denom,sum(denom,_denom));
_numer=ifn(first.description,numer,sum(numer,_numer));
if last.description then do;
cases=_cases;
denom=_denom;
numer=_numer;
output;
end;
run;
Karun,
Here is one way (assuming your data are already organized such that all description records are together and the last record in each group is the one that contains the desired triplet):
data want (drop=_:);
set have;
retain _cases _denom _numer;
by description notsorted;
_cases=ifn(first.description,cases,sum(cases,_cases));
_denom=ifn(first.description,denom,sum(denom,_denom));
_numer=ifn(first.description,numer,sum(numer,_numer));
if last.description then do;
cases=_cases;
denom=_denom;
numer=_numer;
output;
end;
run;
or
data have;
input Description $10. code cases denom numer;
cards;
infections 702 277 179 15
infections 703 288 280 25
infections 704 125 122 7
;
data want;
set have end=last;
length triplet $ 20;
retain newcases newdenom newnumer triplet;
newcases+cases;
newdenom+denom;
newnumer+numer;
triplet=catx('_',triplet,code);
if last ;
proc print;run;
Thanks so much. It works well
Regards
Linlin,
I think your code works only for only the example i put up.
if there are several repeats of the Description then Art's code works perfectly fine
Sorry for not telling it in my question that there are several of those repeats
Regards
Hi Karun,
below is the modified code:
data have;
input Description $10. code cases denom numer;
cards;
infections 702 277 179 15
infections 703 288 280 25
infections 704 125 122 7
vnfections 802 277 179 15
vnfections 803 888 880 85
vnfections 804 125 122 7
;
data want;
set have ;
by Description;
length triplet $ 20;
retain newcases newdenom newnumer triplet;
newcases+cases;
newdenom+denom;
newnumer+numer;
triplet=catx('_',triplet,code);
if last.Description then do ;
output;
call missing(of new:,triplet);
end;
proc print;run;
Hi Thanks a ton for your time.
I understood everything but
call missing(of new:,triplet);
and why this statement has to be after the output stmnt??
Regards
call missing sets all the variables in the list to missing, so they are reset for the new loop
Hi,
When you set the triplet to missing in each loop along with new:
triplet=catx('_',triplet,code);
is is like ""_code=code???(when a missing char value concatenates with a value it would be just the value and not the _code?????
Description=infectio code=702 cases=227 denom=179 numer=15 FIRST.Description=1 LAST.Description=0
triplet=702 newcases=227 newdenom=179 newnumer=15 _ERROR_=0 _N_=1
Description=infectio code=703 cases=288 denom=281 numer=26 FIRST.Description=0 LAST.Description=0
triplet=702_703 newcases=515 newdenom=460 newnumer=41 _ERROR_=0 _N_=2
Description=infectio code=704 cases=125 denom=122 numer=7 FIRST.Description=0 LAST.Description=1
triplet= newcases=. newdenom=. newnumer=. _ERROR_=0 _N_=3
Description=vnfectio code=802 cases=277 denom=179 numer=15 FIRST.Description=1 LAST.Description=0
triplet=802 newcases=277 newdenom=179 newnumer=15 _ERROR_=0 _N_=4
Description=vnfectio code=803 cases=888 denom=880 numer=85 FIRST.Description=0 LAST.Description=0
triplet=802_803 newcases=1165 newdenom=1059 newnumer=100 _ERROR_=0 _N_=5
Description=vnfectio code=804 cases=125 denom=122 numer=7 FIRST.Description=0 LAST.Description=1
triplet= newcases=. newdenom=. newnumer=. _ERROR_=0 _N_=6
Regards
Hi Team,
Good Morning
Could you please help me understand this logic???
Regards
Karun,
You are getting the values you showed, I presume, by using a put _all_ statement in your code AFTER the call missing statement.
If so, you are not getting the values that are being output as the output statement is BEFORE the call missing statement.
Add another put statement before the call missing statement and I think you will be able to understand what the code is doing.
Thanks ART,
Helped me to understand better.
Regards
Here is a Proc SQL approach:
proc sql;
select description,sum(cases) as cases, sum(denom) as denom, sum(numer) as numer, triplet
from have
group by description
having length(triplet)=max(length(triplet));
quit;
Haikuo
max(Triplet) may also work with this SQL syntax provided the example structure holds for other cases.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.