I have a large data set which i need to agregate based on the below conditions:
1 The data i have:
data have;
input Client $ transaction $ date1 MMDDYY10. date2 MMDDYY10. value1 value2;
cards;
id1 a 05/10/2010 05/10/2010 100 200
id1 a 05/10/2010 05/10/2010 200 200
id1 b 05/10/2010 05/22/2013 200 300
id1 c 05/10/2010 08/15/2016 300 100
id1 a 03/10/2017 03/10/2017 700 200
id1 b 03/10/2017 05/22/2018 200 700
;
RUN;
2. the data i want is as per below:
DATA WANT;
INPUT Client $ date1 MMDDYY10. Sum1 Sum2;
cards;
id1 05/10/2010 300 100
id1 03/10/2017 700 700
;
run;
Basically Sum1 is the sum of value 1 where date1=min(date2) and Sum2 is the sum of value2 where date1=max(date2).
the solution can be in sas or sql.
Any help would be highly appreciated.
Hi @inid
Basically Sum1 is the sum of value 1 where date1=min(date2) and Sum2 is the sum of value2 where date1=max(date2).
In your example, for client id1, min(date2) corresponds to 05/10/2010 and max(date2) to 05/22/2018. Four records have date1 = min(date2) and no record have date1=max(date2). The sum of value1 would 800 in the first case, and 0 in the second.
So it doesn't match your 'WANT' table.
Could you please explain further what is the exact rule? Wouldn't it be:
If so, you can try this:
proc sql;
create table want as
select Client, date1 format=MMDDYY10., sum(v1) as sum1, sum(v2) as sum2
from (select *, value1*(date2=min(date2)) as v1, value2*(date2=max(date2)) as v2
from have as a
group by client, date1)
group by client, date1;
quit;
Best,
proc summary data=have nway;
class client date1;
var value1 value2;
output out=want sum=sum1 sum2;
run;
Hi @inid
Basically Sum1 is the sum of value 1 where date1=min(date2) and Sum2 is the sum of value2 where date1=max(date2).
In your example, for client id1, min(date2) corresponds to 05/10/2010 and max(date2) to 05/22/2018. Four records have date1 = min(date2) and no record have date1=max(date2). The sum of value1 would 800 in the first case, and 0 in the second.
So it doesn't match your 'WANT' table.
Could you please explain further what is the exact rule? Wouldn't it be:
If so, you can try this:
proc sql;
create table want as
select Client, date1 format=MMDDYY10., sum(v1) as sum1, sum(v2) as sum2
from (select *, value1*(date2=min(date2)) as v1, value2*(date2=max(date2)) as v2
from have as a
group by client, date1)
group by client, date1;
quit;
Best,
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.