BookmarkSubscribeRSS Feed
chennupriya
Quartz | Level 8

datea    time                     customer
2015-11-20  11:18:36.0000000  lucy
2015-11-20  11:21:09.0000000         lucy

dateb      amount
2015-11-20  250

output

datea    time                     customer  amount
2015-11-20  11:18:36.0000000  lucy   250
2015-11-20  11:21:09.0000000         lucy   250

 

I have three tables and i have an output where it has two records but the amounts are duplicated so how to pick a record with max time . can anyone pls help

6 REPLIES 6
dcruik
Lapis Lazuli | Level 10

This should eliminate your duplicates and keep the record with the max time:

 

data output;
input datea time customer$ amount;
informat datea yymmdd10. time time10. customer $10. amount 6.;
format datea yymmdd10. time time10. customer $10. amount 6.;
datalines;
2015-11-20 11:18:36 lucy 250
2015-11-20 11:21:09 lucy 250
;
run;

proc sort data=output;
by datea customer descending time;
run;

data want;
set output;
by datea customer descending time;
If First.customer;
run;

 

FreelanceReinh
Jade | Level 19

You pointed out that the amounts are duplicated. So, would you want to keep both records if the amounts were different, say, 240 and 250?

 

How would your data look like if there had been two completely different transactions which happened to involve the same amount? For example, Lucy paid 250 for a pair of shoes and another 250 for a coat on the same day. Would it make sense to pick only one of these transactions?

 

Would you still want to pick only one of the two records if they were, say, 12 hours apart rather than only a few minutes?

 

What if the first datetime was 2015-11-20 23:58:36 and the second was 2015-11-21 00:01:09? In this case, grouping by datea would separate them and both would be selected.

 

 

chennupriya
Quartz | Level 8

hi ,

what if the data is like this

 

datea    time                     customer  amount sssid
2015-11-20  11:18:36.0000000  lucy   250    xxx
2015-11-20  11:21:09.0000000         lucy   250    xxx
2015-11-23  11:20:35.0000000            lucy   500    xxx

 

and output should be

 

datea    time                     customer  amount sssid
2015-11-20  11:21:09.0000000         lucy   250    xxx
2015-11-23  11:20:35.0000000            lucy   500    xxx

Reeza
Super User

The concept is correct, use a sort and BY group processing. Sort such that for duplicate records the latest time is at the top and then take the first record for each group.

 

proc sort data=output;
by datea amount descending time;
run;

data want;
set output;
by datea amount;
if first.amount;
run;
PGStats
Opal | Level 21

If everything is the same except the time in those duplicates, then you can simply do:

 

proc sql;
create table want as
select datea, max(time) as time, customer, amount, sssid
from have
group by datea, customer, amount, sssid;
quit; 
PG
dcruik
Lapis Lazuli | Level 10

If you have multiple datea values, then I would just modify the sort by variables and what you are using as the variable with your First. satement.  This would get you what you want with the updated data you provided:

 

data have;
input datea time customer$ amount sssid$;
informat datea yymmdd10. time time20. customer $10. amount 6. sssid $10.;
format datea yymmdd10. time time20. customer $10. amount 6. sssid $10.;
datalines;
2015-11-20 11:18:36 lucy 250 xxx
2015-11-20 11:21:09 lucy 250 xxx
2015-11-23 11:20:35 lucy 500 xxx
;
run;

proc sort data=have;
by customer datea descending time;
run;

data want;
set have;
by customer datea descending time;
If First.datea;
run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 810 views
  • 0 likes
  • 5 in conversation