BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Toader
Calcite | Level 5

Hi.

 

I could use some help condensing multiple rows into 1 based on an ID # and a Date. I'm using SAS EG ver 7.15. i tried using Data > Sort and removing duplicate adjacent rows. While that did make headway it didn't get all of them because I have some other columns where the values differ by row. I've seen similar questions but all have something unique about them that prevents me from getting my attempts to work.

 

Sample of what I have:

ID #      |       Date       |     Amount 

100           01Jan18        $1.00 

100           01Jan18        $0.00

100           02Feb18        $1.00            

200           01Jan18        $2.00

200           01Jan18        $0.00

200           01Jan18        $0.00

200           03Feb18        $1.00

200           03Feb18        $0.00

300           02Jan18        $0.00

300           02Jan18        $0.00

300           02Jan18        $0.00

300           02Jan18        $0.00

 

 

What I'm looking for is something like below where each ID has one row per date. I'd like to keep the row where there's a non-zero dollar amount when there is one.

ID #      |       Date       |     Amount

100           01Jan18        $1.00

100           02Feb18        $1.00

200           01Jan18        $2.00

200           03Feb18        $1.00

300           02Jan18        $0.00

 

Any thoughts? Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

I think your explanation is incomplete, in the sense that you don't say what happens if there are two or more rows for a given ID # and Date, and the amount value is above zero for two or more rows. What do we do then? Or is your example complete and there can only be one record per ID # and DATE that has a non-zero Amount?

 

But based upon the example you show, and assuming (but I hate to assume, could you please clarify if this is the case or not?) that there can only be one non-zero amount value per ID # and date, you could use PROC SUMMARY to get the desired results.

 

proc summary data=have;
    class id date;
    var amount;
    output out=want sum=;
run;
--
Paige Miller

View solution in original post

10 REPLIES 10
novinosrin
Tourmaline | Level 20

when the amount value differ like

 

100           01Jan18        $1.00 

100           01Jan18        $0.00

 

which one would you want to keep?

 

Also, is your sample a good and a complete representative of your real?

Toader
Calcite | Level 5

Hi. I would want to keep the record that has a value greater than $0.00. However, there will be some occurrences where all the entries for same ID and Date are $0.00 in which case I just need to keep one of them.

 

I've omitted several other columns of informational data that I will be keeping but I don't need to include in the decision about which row to keep and which row(s) to condense.

novinosrin
Tourmaline | Level 20
data have;
input ID   $     Date  :date7.     Amount : dollar10.2;
format amount dollar10. date date7.;
cards;
100           01Jan18        $1.00 
100           01Jan18        $0.00
100           02Feb18        $1.00
200           01Jan18        $2.00
200           01Jan18        $0.00
200           01Jan18        $0.00
200           03Feb18        $1.00
200           03Feb18        $0.00
300           02Jan18        $0.00
300           02Jan18        $0.00
300           02Jan18        $0.00
300           02Jan18        $0.00
;


proc sql;
create table want as
select distinct *
from have
group by id,date
having  amount=max(amount)
order by id,date;
quit;

 

I think the following is a safe bet

 

proc sort data=have out=_have;
by id date descending amount;
run;

data want;
set _have;
by id date;
if first.date;
run;

 

 

Tom
Super User Tom
Super User

@Toader wrote:

Hi. I would want to keep the record that has a value greater than $0.00. However, there will be some occurrences where all the entries for same ID and Date are $0.00 in which case I just need to keep one of them.

 

I've omitted several other columns of informational data that I will be keeping but I don't need to include in the decision about which row to keep and which row(s) to condense.


Just sort and keep the last record.

data want ;
  set have ;
  by id date value ;
  if last.date;
run;
PaigeMiller
Diamond | Level 26

I think your explanation is incomplete, in the sense that you don't say what happens if there are two or more rows for a given ID # and Date, and the amount value is above zero for two or more rows. What do we do then? Or is your example complete and there can only be one record per ID # and DATE that has a non-zero Amount?

 

But based upon the example you show, and assuming (but I hate to assume, could you please clarify if this is the case or not?) that there can only be one non-zero amount value per ID # and date, you could use PROC SUMMARY to get the desired results.

 

proc summary data=have;
    class id date;
    var amount;
    output out=want sum=;
run;
--
Paige Miller
SuryaKiran
Meteorite | Level 14

Are you trying to SUM the Amount by ID & Date, then something like this will work if Amount is in actual numeric values.

 

proc sort data=have;
by id date;
run;

data want;
retain Amount_new;
set have;
by id date;
Amount+Amount_new;
if last.date;
run;  

Thanks,
Suryakiran
Toader
Calcite | Level 5

No Suryakiran. I am not trying to sum the amounts. I'm trying to take x number of events that happened on the same day and turn it into 1 event. For example, a person goes shopping and buys 6 different items. Each item ends up as a row in my table. I don't care that there were 6 items. I care that there was 1 shopping event for that person on that date. 

PaigeMiller
Diamond | Level 26

@Toader wrote:

No Suryakiran. I am not trying to sum the amounts. I'm trying to take x number of events that happened on the same day and turn it into 1 event. For example, a person goes shopping and buys 6 different items. Each item ends up as a row in my table. I don't care that there were 6 items. I care that there was 1 shopping event for that person on that date. 


Seems like my solution in message 3 would work.

 

By the way, summing a non-zero amount and many zeros is the same as taking the single value where the amount is non-zero.

--
Paige Miller
r_behata
Barite | Level 11
data have;
input id date :date9. amt : dollar3.2;
format date date9. amt dollar3.2;
cards;
100 01jan18 $1.00
100 01jan18 $0.00
100 02feb18 $1.00
200 01jan18 $2.00
200 01jan18 $0.00
200 01jan18 $0.00
200 01jan18 $0.00
300 01jan18 $0.00
300 02jan18 $0.00
300 02jan18 $0.00
300 02jan18 $0.00
;
run;



proc sort data=have ;
by id date descending amt ;
run;
proc sort data=have out=want nodupkey; by id date ; run;

 

mkeintz
PROC Star

Your data are already sorted by id/date, so no active sorting is necessary.  Instead interleave the data set through 2 mutually exclusive filters (amt>0  followed by amt=0),and keep the first record for each id/date combo.

 

 

data have;
input ID   $     Date  :date7.     Amount : dollar10.2;
format amount dollar10. date date7.;
cards;
100           01Jan18        $1.00 
100           01Jan18        $0.00
100           02Feb18        $1.00
200           01Jan18        $2.00
200           01Jan18        $0.00
200           01Jan18        $0.00
200           03Feb18        $1.00
200           03Feb18        $0.00
300           02Jan18        $0.00
300           02Jan18        $0.00
300           02Jan18        $0.00
300           02Jan18        $0.00
;
data want;
  set have (where=(amount>0))  have (where=(amount=0));
  by id date;
  if first.date;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1184 views
  • 0 likes
  • 7 in conversation