I have a large data set which i need to agregate based on the below conditions:
1 The data i have:
data have;
input Client $ transaction $ date1 MMDDYY10. date2 MMDDYY10. value1 value2;
cards;
id1 a 05/10/2010 05/10/2010 100 200
id1 a 05/10/2010 05/10/2010 200 200
id1 b 05/10/2010 05/22/2013 200 300
id1 c 05/10/2010 08/15/2016 300 100
id1 a 03/10/2017 03/10/2017 700 200
id1 b 03/10/2017 05/22/2018 200 700
;
RUN;
2. the data i want is as per below:
DATA WANT;
INPUT Client $ date1 MMDDYY10. Sum1 Sum2;
cards;
id1 05/10/2010 300 100
id1 03/10/2017 700 700
;
run;
Basically Sum1 is the sum of value 1 where date1=min(date2) and Sum2 is the sum of value2 where date1=max(date2).
the solution can be in sas or sql.
Any help would be highly appreciated.
Hi @inid
Basically Sum1 is the sum of value 1 where date1=min(date2) and Sum2 is the sum of value2 where date1=max(date2).
In your example, for client id1, min(date2) corresponds to 05/10/2010 and max(date2) to 05/22/2018. Four records have date1 = min(date2) and no record have date1=max(date2). The sum of value1 would 800 in the first case, and 0 in the second.
So it doesn't match your 'WANT' table.
Could you please explain further what is the exact rule? Wouldn't it be:
If so, you can try this:
proc sql;
create table want as
select Client, date1 format=MMDDYY10., sum(v1) as sum1, sum(v2) as sum2
from (select *, value1*(date2=min(date2)) as v1, value2*(date2=max(date2)) as v2
from have as a
group by client, date1)
group by client, date1;
quit;
Best,
proc summary data=have nway;
class client date1;
var value1 value2;
output out=want sum=sum1 sum2;
run;
Hi @inid
Basically Sum1 is the sum of value 1 where date1=min(date2) and Sum2 is the sum of value2 where date1=max(date2).
In your example, for client id1, min(date2) corresponds to 05/10/2010 and max(date2) to 05/22/2018. Four records have date1 = min(date2) and no record have date1=max(date2). The sum of value1 would 800 in the first case, and 0 in the second.
So it doesn't match your 'WANT' table.
Could you please explain further what is the exact rule? Wouldn't it be:
If so, you can try this:
proc sql;
create table want as
select Client, date1 format=MMDDYY10., sum(v1) as sum1, sum(v2) as sum2
from (select *, value1*(date2=min(date2)) as v1, value2*(date2=max(date2)) as v2
from have as a
group by client, date1)
group by client, date1;
quit;
Best,
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.