Hi.
I could use some help condensing multiple rows into 1 based on an ID # and a Date. I'm using SAS EG ver 7.15. i tried using Data > Sort and removing duplicate adjacent rows. While that did make headway it didn't get all of them because I have some other columns where the values differ by row. I've seen similar questions but all have something unique about them that prevents me from getting my attempts to work.
Sample of what I have:
ID # | Date | Amount
100 01Jan18 $1.00
100 01Jan18 $0.00
100 02Feb18 $1.00
200 01Jan18 $2.00
200 01Jan18 $0.00
200 01Jan18 $0.00
200 03Feb18 $1.00
200 03Feb18 $0.00
300 02Jan18 $0.00
300 02Jan18 $0.00
300 02Jan18 $0.00
300 02Jan18 $0.00
What I'm looking for is something like below where each ID has one row per date. I'd like to keep the row where there's a non-zero dollar amount when there is one.
ID # | Date | Amount
100 01Jan18 $1.00
100 02Feb18 $1.00
200 01Jan18 $2.00
200 03Feb18 $1.00
300 02Jan18 $0.00
Any thoughts? Thank you.
I think your explanation is incomplete, in the sense that you don't say what happens if there are two or more rows for a given ID # and Date, and the amount value is above zero for two or more rows. What do we do then? Or is your example complete and there can only be one record per ID # and DATE that has a non-zero Amount?
But based upon the example you show, and assuming (but I hate to assume, could you please clarify if this is the case or not?) that there can only be one non-zero amount value per ID # and date, you could use PROC SUMMARY to get the desired results.
proc summary data=have;
class id date;
var amount;
output out=want sum=;
run;
when the amount value differ like
100 01Jan18 $1.00
100 01Jan18 $0.00
which one would you want to keep?
Also, is your sample a good and a complete representative of your real?
Hi. I would want to keep the record that has a value greater than $0.00. However, there will be some occurrences where all the entries for same ID and Date are $0.00 in which case I just need to keep one of them.
I've omitted several other columns of informational data that I will be keeping but I don't need to include in the decision about which row to keep and which row(s) to condense.
data have;
input ID $ Date :date7. Amount : dollar10.2;
format amount dollar10. date date7.;
cards;
100 01Jan18 $1.00
100 01Jan18 $0.00
100 02Feb18 $1.00
200 01Jan18 $2.00
200 01Jan18 $0.00
200 01Jan18 $0.00
200 03Feb18 $1.00
200 03Feb18 $0.00
300 02Jan18 $0.00
300 02Jan18 $0.00
300 02Jan18 $0.00
300 02Jan18 $0.00
;
proc sql;
create table want as
select distinct *
from have
group by id,date
having amount=max(amount)
order by id,date;
quit;
I think the following is a safe bet
proc sort data=have out=_have;
by id date descending amount;
run;
data want;
set _have;
by id date;
if first.date;
run;
@Toader wrote:
Hi. I would want to keep the record that has a value greater than $0.00. However, there will be some occurrences where all the entries for same ID and Date are $0.00 in which case I just need to keep one of them.
I've omitted several other columns of informational data that I will be keeping but I don't need to include in the decision about which row to keep and which row(s) to condense.
Just sort and keep the last record.
data want ;
set have ;
by id date value ;
if last.date;
run;
I think your explanation is incomplete, in the sense that you don't say what happens if there are two or more rows for a given ID # and Date, and the amount value is above zero for two or more rows. What do we do then? Or is your example complete and there can only be one record per ID # and DATE that has a non-zero Amount?
But based upon the example you show, and assuming (but I hate to assume, could you please clarify if this is the case or not?) that there can only be one non-zero amount value per ID # and date, you could use PROC SUMMARY to get the desired results.
proc summary data=have;
class id date;
var amount;
output out=want sum=;
run;
Are you trying to SUM the Amount by ID & Date, then something like this will work if Amount is in actual numeric values.
proc sort data=have; by id date; run; data want; retain Amount_new; set have; by id date; Amount+Amount_new; if last.date; run;
No Suryakiran. I am not trying to sum the amounts. I'm trying to take x number of events that happened on the same day and turn it into 1 event. For example, a person goes shopping and buys 6 different items. Each item ends up as a row in my table. I don't care that there were 6 items. I care that there was 1 shopping event for that person on that date.
@Toader wrote:
No Suryakiran. I am not trying to sum the amounts. I'm trying to take x number of events that happened on the same day and turn it into 1 event. For example, a person goes shopping and buys 6 different items. Each item ends up as a row in my table. I don't care that there were 6 items. I care that there was 1 shopping event for that person on that date.
Seems like my solution in message 3 would work.
By the way, summing a non-zero amount and many zeros is the same as taking the single value where the amount is non-zero.
data have;
input id date :date9. amt : dollar3.2;
format date date9. amt dollar3.2;
cards;
100 01jan18 $1.00
100 01jan18 $0.00
100 02feb18 $1.00
200 01jan18 $2.00
200 01jan18 $0.00
200 01jan18 $0.00
200 01jan18 $0.00
300 01jan18 $0.00
300 02jan18 $0.00
300 02jan18 $0.00
300 02jan18 $0.00
;
run;
proc sort data=have ;
by id date descending amt ;
run;
proc sort data=have out=want nodupkey;
by id date ;
run;
Your data are already sorted by id/date, so no active sorting is necessary. Instead interleave the data set through 2 mutually exclusive filters (amt>0 followed by amt=0),and keep the first record for each id/date combo.
data have;
input ID $ Date :date7. Amount : dollar10.2;
format amount dollar10. date date7.;
cards;
100 01Jan18 $1.00
100 01Jan18 $0.00
100 02Feb18 $1.00
200 01Jan18 $2.00
200 01Jan18 $0.00
200 01Jan18 $0.00
200 03Feb18 $1.00
200 03Feb18 $0.00
300 02Jan18 $0.00
300 02Jan18 $0.00
300 02Jan18 $0.00
300 02Jan18 $0.00
;
data want;
set have (where=(amount>0)) have (where=(amount=0));
by id date;
if first.date;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.