BookmarkSubscribeRSS Feed
Raghs_Newbee
Calcite | Level 5

Hi Team,

I have 3 million obs with account numbers of different persons with N number of transactions at different dates for the last three years.

I need to collect the Distinct(Account_Number) and the Amount which is actually the last transaction made by the customer.

Thanks

11 REPLIES 11
art297
Opal | Level 21

If there isn't a chance that customers had multiple transactions on the last date, the task would be easy with proc sql, simply using distinct and where date=max(date).

However, since customers probably could have multiple transactions, and if the data are already in date order, then I would recommend sorting the file by account_Number and then using last.account_Number in a datastep.

Raghs_Newbee
Calcite | Level 5

Thanks Art, But unfortunately yes.. few customer have more transactions in the last date.. So i am unable to do so using Proc SQL. The report I am pulling also consits of a lot of other variables that needs either to be summed up or transposed to find the count. I will post some Sample data and the output I need for a Better understanding.

Haikuo
Onyx | Level 15

Do your date value have a time portion to identify different transactions on the same date or do your transaction number serve the same purpose?

Raghs_Newbee
Calcite | Level 5

Yes, The Date Column is in DateTime format. Attached is a Sample Data. I am not worried about the blanks in the Date Column, Just need the Last Date Time Transaction. I did try the Querry of HIMA. I get repeated Account ID when the transaction Amount is Different.

Hima
Obsidian | Level 7

data a;
input account_number $ tran_am $ tran_dt $;
cards;
12345678 12.00 02/16/2012
12345678 14.00 02/15/2012
23456789 15.00 02/14/2012
23456789 13.00 02/13/2012
;
run;

proc sql;
select distinct account_number, tran_am, tran_dt from a
group by account_number having tran_dt = max(tran_dt);
quit;

Output:

                                account_
                                number        tran_am   tran_dt
                                ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ
                                12345678      12.00     02/16/20
                                23456789      15.00     02/14/20


Hima
Obsidian | Level 7

The above query works good even if there are multiple transactions with in the max date.

art297
Opal | Level 21

Hima: your suggested code will produce multiple records when there are multiple transactions on the last day and they are for different amounts.  e.g.:

data a;

input account_number $ tran_am $ tran_dt $;

cards;

12345678 14.00 02/15/2012

12345678 12.00 02/16/2012

12345678  9.00 02/16/2012

23456789 13.00 02/13/2012

23456789 15.00 02/14/2012

23456789  9.00 02/14/2012

;

run;

proc sql;

select distinct account_number, tran_am, tran_dt from a

group by account_number having tran_dt = max(tran_dt);

quit;

Hima
Obsidian | Level 7

Art - I see what you are saying Smiley Happy. If the date in the table has time stamp may be that will work?

Suppose that the format of the tran_date is something like this 12MAY2009:00:00:00.00000 (datetime25.6)

data a;

input account_number $ tran_am $ tran_dt $;

cards;

12345678 14.00 12JAN2012:00:00:00.00000

12345678 12.00 12JAN2012:04:50:18.254602

12345678  9.00 12JAN2012:09:23:00.00000

23456789 13.00 11JAN2012:00:00:00.00000

23456789 15.00 11JAN2012:04:50:18.254602

23456789  9.00 11JAN2012:09:23:00.00000

;

run;

proc sql;

select distinct account_number, tran_am, tran_dt from a

group by account_number having tran_dt = max(tran_dt);

quit;

Raghs_Newbee - Can you please post sample data?
Ksharp
Super User

Yes. the value of datetime is actually the number of second from 01jan1960:00:00:00

Raghs_Newbee
Calcite | Level 5

Hi Hima I have attached some sample data in the post. Thanks to all for the wonderfull help.

Ksharp
Super User

How about:

proc import datafile='c:\Test_Data.xls' out=test dbms=excel replace;
 usedate=no;scantime=no;
run;
data temp(drop=_date);
 set test(rename=(date=_date));
 date=input(_date,datetime20.);
 format date datetime.;
 run;
proc sql;
 create table want as 
  select * from temp
   group by customer_id
    having date=max(date);
quit;


Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 8172 views
  • 0 likes
  • 5 in conversation