Solved: Basic Proc Freq

Stretlow · Posted 04-02-2014 11:24 AM

Hello all.

Ok I'm returning to SAS after 5 years of not using it and am self training again from scratch.

I've got my basic datastep stuff sorted however some of the proc procedures have completely left my mind.

I've read quite a lot of stuff but am struggling to find what I want and I know its really simple (please don't mock!)

I have a data table that contains various variables

3 of those are (with sample observations)

AuthDate AccountID

13Feb2014 123456

12Feb2014 321565

12Feb2014 123456

12Feb2014 321565

What im trying do is count where the same accountid appears >1 on the same day and show the name so the count from this small output would be 1 (321565 on 12Feb2014)

Appreciate any help.

Stret

TarunKumar · Posted 04-04-2014 07:36 AM

data have;

input AuthDate:$9. AccountID:$6. Amount;

datalines;

13Feb2014 123456 20

12Feb2014 321565 50

12Feb2014 123456 10

12Feb2014 321565 60

run;

proc sql;

create table want as

select distinct AuthDate,AccountID from have

group by AuthDate,AccountID

having count(AccountID) >1 and sum(Amount) > 40;

quit;

View solution in original post

Reeza · Posted 04-02-2014 11:40 AM

Specify a table statement, and then add a where clause to your out dataset to keep only records of interest, ie count>1

proc freq data=have;

table authdate*accountID/out=want(where=(count>1)) missing;

run;

jwillis · Posted 04-02-2014 12:03 PM

Here's an example using a sort and a datastep. The code is written so that it only considers two duplicates per accountid. If there are more than 2 rows of accountids, you'll need to check for first.accountid = 0 and last.accountid = 0. I defaulted count to 0. You might want to default it to 1 until you are certain you have captured all combinations of first and last. For example, if first accont = 0 and last account = 0 and first date = 0 and last date = 0, then you would make count = 0 so that you do not double count when you reach last account and last date.

data start;

infile datalines truncover;

input AuthDate date9. AccountID $;

datalines;

13Feb2014 123456

12Feb2014 321565

12Feb2014 123456

12Feb2014 321565

;

run;

proc sort data=start;
by accountid authdate;
run;

data finish;
   set start;
   format authdate date9.;
   by accountid authdate;
   count =0;
   if first.accountid = 1 and last.accountid = 1 then count=0;
   if first.accountid = 1 and last.accountid = 0 then do;
      if first.authdate=1 and last.authdate = 1 then count=0;
   end;
   if first.accountid = 1 and last.accountid = 0 then do;
      if first.authdate=1 and last.authdate = 1 then count=0;
      if first.authdate=1 and last.authdate = 0 then count=0;
   end;
   if first.accountid = 0 and last.accountid = 1 then do;
      if first.authdate=1 and last.authdate = 1 then count=0;
      if first.authdate=0 and last.authdate = 1 then count=1;
   end;

run;

Stretlow · Posted 04-03-2014 03:11 AM

Thank you for your swift responses on this guys.

I'll give them both a whirl later today and report back.

Thank you for your help

Stretlow · Posted 04-04-2014 03:03 AM

Unfortunately I cant get either of those to do what I need. the First proc freq killes my pc!!!

I just need a simple count of where the account id appears on the same day more than once so if there was two on the 1st and and two on the second to the same accountid the count would be 2.

Any other suggestions?

Really do appreciate the help

Stret

TarunKumar · Posted 04-04-2014 03:47 AM

data have;

input AuthDate:$9. AccountID:$6.;

datalines;

13Feb2014 123456

12Feb2014 321565

12Feb2014 123456

12Feb2014 321565

run;

proc sql;

create table want as

select * from have

group by AuthDate,AccountID

having count(AccountID) >1;

quit;

Stretlow · Posted 04-04-2014 04:02 AM

I've never used proc SQL before however this does seem to be giving me the results I need on the count so thank you.

Last question.

Where your code says having count(AccountID) >1;

If I had another variable in the table can you put multiple criteria in that so maybe if i wanted where count >1 and total amount >= 50, is that also possible in that statement?

Thanks again

TarunKumar · Posted 04-04-2014 04:37 AM

yes depend on what you want and how data is.

Stretlow · Posted 04-04-2014 06:23 AM

so for exmaple the fields are the same as above except I now have an extra variable named "Amount".

I might want to look where the accountID count is >1 on the same day (as above) and the total amount associated with those account ids >=40?

Stret

TarunKumar · Posted 04-04-2014 07:36 AM

data have;

input AuthDate:$9. AccountID:$6. Amount;

datalines;

13Feb2014 123456 20

12Feb2014 321565 50

12Feb2014 123456 10

12Feb2014 321565 60

run;

proc sql;

create table want as

select distinct AuthDate,AccountID from have

group by AuthDate,AccountID

having count(AccountID) >1 and sum(Amount) > 40;

quit;

JVarghese · Posted 04-14-2014 05:24 PM

proc sort data=data1;

by authdate acctid;

run;

data new (drop=amt);

retain totalamt ;

set data1;

by authdate acctid ;

if first.authdate & first.acctid then do;

count=1;

totalamt=amt;

end;

if (last.authdate & last.acctid)or (first.authdate & last.acctid) then output;

else do;

count+1;

totalamt+amt;

end;

run;

proc print data=new noobs;

var authdate acctid amt;

where count>1 and totalamt>=40;

run;

Try it with your modifications..

JVarghese · Posted 04-14-2014 05:26 PM

I think Reeza's answer is more efficient !

Tom · Posted 04-14-2014 05:59 PM

You could also use PROC SUMMARY (which is really just PROC MEANS with NOPRINT option).

proc summary data=have nway ;

class accountid authdate ;

var amount ;

output n=count sum=total out=want (where=(count > 1)) ;

run;

Note that if you have missing values of the AMOUNT variable then the value generated by the N statistic will be different than the value in the automatic variable _FREQ_ because the missing values will be excluded.

Stretlow · Posted 04-24-2014 03:35 AM

Hi Chaps.

I've been off work for a week so not had chance to have ago at these (or thank you for your help)

I'll have a pop before the end of the week and report back.

Thanks Again

Stret

Basic Proc Freq

Re: Basic Proc Freq

Re: Basic Proc Freq

Re: Basic Proc Freq

Re: Basic Proc Freq

Re: Basic Proc Freq

Re: Basic Proc Freq

Re: Basic Proc Freq

Re: Basic Proc Freq

Re: Basic Proc Freq

Re: Basic Proc Freq

Re: Basic Proc Freq

Re: Basic Proc Freq

Re: Basic Proc Freq

Re: Basic Proc Freq

Registration is open

SAS Training: Just a Click Away