DATA Step, Macro, Functions and more

Summing several records into one record.

Reply
N/A
Posts: 0

Summing several records into one record.

Each ID have several records. I want one record for each ID.

Each ID have 1 to 2 PROD:s. Each record can have CAT in {'A','B'}. If an ID have CAT='A' in any of it's records then CAT should become 'A', otherwise 'B'.

The first PROD in time should become PR1, the second PR2(if any).

The corresponding sums of AMOUNT should be assigned to AM1 and AM2 resp.

Corresponding days D1 and D2 should be calculated by taking the last DATE2 - the first DATE1 for each PROD. First DATE1 for the first PROD should be saved as STRT1 and last DATE2 for the first PROD should be saved as STP1.
If a second PROD exists then STRT2 and STP2 should be saved in the same manner.

Example:

ID***PROD**DATE1*****DATE2*****AMOUNT**CAT
10***111****20050108**20050110****100*******'B'
10***111****20050112**20050117****500*******'A'
10***111****20050121**20050124****200*******'B'
10***222****20050110**20050112****150*******'B'
10***222****20050115**20050120****250*******'B'
20***222****20010601**20010616****700*******'B'

should become:

ID**PR1*STRT1****STP1**D1**AM1*PR2*STRT2**STP2****D2**AM2*CAT
10**111*050108**050124*16**800**222**050110**050120*10**400***'A'
20**222*010601**010616*10**700***(.)***(BLANK)*(BLANK)*(.)**(.)****'B'

Note that the first PROD in time becomes PR1 regardless of what PROD.

This must be possible to do in a single data step. Any help is much appreciated.
Super Contributor
Super Contributor
Posts: 3,174

Re: Summing several records into one record.

Posted in reply to deleted_user
You have/had another recent thread where BY GROUP PROCESSING was demonstrated as a way to address the need to group data-rows based on defined conditions (generated using IF / THEN statements).

Have you attempted any coding at all, given that this data-sample appears to be similar to the "selecting records" post/thread you had going previously? Suggest you reveal SAS programming code that you have attempted to use to solve this query, then obtain useful and most suitable feedback to make your SAS program most effective.

Of course, unless you're just looking to have someone else write the SAS program for you? And, so, there are individuals on these forum channels willing to do so, possibly.

Scott Barry
SBBWorks, Inc.
N/A
Posts: 0

Re: Summing several records into one record.

Scott Barry,

I was expecting a reply from you and with the kind of comments that it actually contains.

Wouldn't it be nice if you could become the SAS Institute official censor of this forum?
Super Contributor
Super Contributor
Posts: 365

Re: Summing several records into one record.

Posted in reply to deleted_user
Hello EmestoC,

Thsi is a possible solution, may be not an optimal one...
[pre]
data i;
infile datalines delimiter="*'";
input ID PROD D1 $ D2 $ AMOUNT CAT $;
date1=MDY(SUBSTR(d1,5,2),SUBSTR(d1,7,2),SUBSTR(d1,1,4));
date2=MDY(SUBSTR(d2,5,2),SUBSTR(d2,7,2),SUBSTR(d2,1,4));
format date1 date2 date7.;
drop d1 d2;
datalines;
10***111****20050108**20050110****100*******'B'
10***111****20050112**20050117****500*******'A'
10***111****20050121**20050124****200*******'B'
10***222****20050110**20050112****150*******'B'
10***222****20050115**20050120****250*******'B'
20***222****20010601**20010616****700*******'B'
;
run;
/* CAT */;
proc sort data=i(keep=id cat where=(Cat='A')) out=c;
by id;
run;
/* AMT */;
proc SQL;
create table a as
select ID, Prod, SUM(AMOUNT) as AM, count(*) as N
from i
group by ID, Prod
;quit;
/* DAYS */;
proc sort data=i out=i;
by id prod date1;
run;
data d;
retain dt1;
set i;
if first.prod then dt1=date1;
if last.prod then do; d=date2-dt1; output; end;
by ID prod;
keep ID prod date1 date2 d;
run;
/* Prods into macro variables */;
proc SQL;
select distinct prod into Smiley Tongue1-Smiley Tongue2 from i
%let p1=%trim(&p1);
%let p2=%trim(&p2);
;quit;
/* MERGE */;
data da;
merge d a;
by id prod;
run;
/* TRANSPOSE date1 date2 d and am */;
proc transpose data=da out=d1 prefix=d1_;
var date1;
id prod;
by id;
run;
proc transpose data=da out=d2 prefix=d2_;
var date1;
id prod;
by id;
run;
proc transpose data=da out=days prefix=d_;
var d;
id prod;
by id;
run;
proc transpose data=da out=am prefix=am_;
var am;
id prod;
by id;
run;
data r0;
retain ID STRT1 STP1 D1 AM1 STRT2 STP2 D2 AM2;
merge d1 (rename=(d1_&p1=STRT1 d1_&p2=STRT2))
d2 (rename=(d2_&p1=STP1 d2_&p2=STP2 ))
days(rename=(d_&p1 =D1 d_&p2 =D2 ))
am (rename=(am_&p1=AM1 am_&p2=AM2 ));
if STRT1=. and STP1=. and D1=. and AM1=. then do;
STRT1=STRT2; STP1=STP2; D1=D2; AM1=AM2;
STRT2=. ; STP2= .; D2= .; AM2= .;
end;
by id;
drop _name_;
run;
/* Add CAT */;
proc SQL;
create table r as
select a.*,
case
when b.cat="A" then "A"
else "B"
end as CAT
from r0 as a left join c as b
on a.ID=b.ID
;quit;
[/pre]
Sincerely,
SPR
Super User
Posts: 10,044

Re: Summing several records into one record.

Posted in reply to deleted_user
Hi.
Why are your logic so complex always.
I totally be messed up by it.

[pre]
options missing=.;
data temp;
input ID $ PROD $ DATE1 : yymmdd8. DATE2 : yymmdd8. AMOUNT CAT$;
datalines;
10 111 20050108 20050110 100 B
10 111 20050112 20050117 500 A
10 111 20050121 20050124 200 B
10 222 20050110 20050112 150 B
10 222 20050115 20050120 250 B
20 222 20010601 20010616 700 B
;run;
proc format ;
value st
.= ' '
low-high=[yymmddn8.]
;
run;
proc sort data=temp;
by id prod;
run;
data op;
set temp;
by id prod;
retain pr1 pr2 strt1 strt2 stp1 stp2 am1 am2 _cat d1 d2;
keep id pr1 pr2 strt1 strt2 stp1 stp2 am1 am2 _cat d1 d2;
format strt1 stp1 strt2 stp2 st.;

if first.prod and last.prod then do;
pr1=prod;
strt1=date1;
stp1=date2;
d1=date2-date1;
am1=amount;
_cat=cat;
call missing(pr2,strt2,stp2,d2,am2);
output;
end;

else do;
if first.prod then am=0;
am+amount;
if first.prod and first.id then do;
strt1=date1;
pr1=prod;
_cat=cat;
end;
if cat eq 'A' then _cat='A' ;
if last.prod and not last.id then do;
am1=am;
stp1=date2;
end;
if first.prod and not first.id then do;
strt2=date1;
pr2=prod;
end;
if last.prod and last.id then do;
am2=am;
stp2=date2;
d1=stp1-strt1;
d2=stp2-strt2;
output;
end;
end;
run;
proc report data=op nowd out=result(drop=_break_);
column id pr1 strt1 stp1 d1 am1 pr2 strt2 stp2 d2 am2 _cat ;
run;
proc print noobs;run;
[/pre]


Good Luck.


Ksharp
Ask a Question
Discussion stats
  • 4 replies
  • 137 views
  • 0 likes
  • 4 in conversation