Help using Base SAS procedures

Basic Proc Freq

Accepted Solution Solved
Reply
Contributor
Posts: 49
Accepted Solution

Basic Proc Freq

Hello all.

Ok I'm returning to SAS after 5 years of not using it and am self training again from scratch.

I've got my basic datastep stuff sorted however some of the proc procedures have completely left my mind.

I've read quite a lot of stuff but am struggling to find what I want and I know its really simple (please don't mock!)

I have a data table that contains various variables

3 of those are (with sample observations)


AuthDate    AccountID  

13Feb2014  123456      

12Feb2014  321565     

12Feb2014  123456      

12Feb2014  321565     


What im trying do is count where the same accountid appears >1 on the same day and show the name so the count from this small output would be 1 (321565 on 12Feb2014)

Appreciate any help.

Stret


Accepted Solutions
Solution
‎04-04-2014 07:36 AM
Frequent Contributor
Posts: 75

Re: Basic Proc Freq

data have;

input AuthDate:$9.    AccountID:$6. Amount;

datalines;

13Feb2014  123456 20    

12Feb2014  321565   50 

12Feb2014  123456    10 

12Feb2014  321565   60

run;

proc sql;

create table want as

select distinct AuthDate,AccountID from have

group by AuthDate,AccountID

having count(AccountID) >1 and sum(Amount) > 40;

quit;

View solution in original post


All Replies
Super User
Posts: 17,912

Re: Basic Proc Freq

Specify a table statement, and then add a where clause to your out dataset to keep only records of interest, ie count>1

proc freq data=have;

table authdate*accountID/out=want(where=(count>1)) missing;

run;

Regular Contributor
Posts: 217

Re: Basic Proc Freq

Here's an example using a sort and a datastep.  The code is written so that it only considers two duplicates per accountid. If there are more than 2 rows of accountids, you'll need to check for first.accountid = 0 and last.accountid = 0.  I defaulted count to 0.  You might want to default it to 1 until you are certain you have captured all combinations of first and last.  For example, if first accont = 0 and last account = 0 and first date = 0 and last date  = 0, then you would make count = 0 so that you do not double count when you reach last account and last date.

data start;

infile datalines truncover;

  input AuthDate date9. AccountID $;

datalines;

13Feb2014 123456      

12Feb2014 321565     

12Feb2014 123456      

12Feb2014 321565

;

run;


proc sort data=start;
by accountid authdate;
run;

data finish;
   set start;
   format authdate date9.;
   by accountid authdate;
   count =0;
   if first.accountid = 1 and last.accountid = 1 then count=0;
   if first.accountid = 1 and last.accountid = 0 then do;
      if first.authdate=1 and last.authdate  = 1 then count=0;
   end;
   if first.accountid = 1 and last.accountid = 0 then do;
      if first.authdate=1 and last.authdate  = 1 then count=0;
      if first.authdate=1 and last.authdate  = 0 then count=0;
   end;
   if first.accountid = 0 and last.accountid = 1 then do;
      if first.authdate=1 and last.authdate  = 1 then count=0;
      if first.authdate=0 and last.authdate  = 1 then count=1;
   end;

run;

Contributor
Posts: 49

Re: Basic Proc Freq

Thank you for your swift responses on this guys.

I'll give them both a whirl later today and report back.

Thank you for your help

Contributor
Posts: 49

Re: Basic Proc Freq

Unfortunately I cant get either of those to do what I need.  the First proc freq killes my pc!!!

I just need a simple count of where the account id appears on the same day more than once so if there was two on the 1st and and two on the second to the same accountid the  count would be 2.

Any other suggestions?

Really do appreciate the help

Stret

Frequent Contributor
Posts: 75

Re: Basic Proc Freq

data have;

input AuthDate:$9.    AccountID:$6.;

datalines;

13Feb2014  123456     

12Feb2014  321565    

12Feb2014  123456     

12Feb2014  321565 

run;

proc sql;

create table want as

select * from have

group by AuthDate,AccountID

having count(AccountID) >1;

quit;

Contributor
Posts: 49

Re: Basic Proc Freq

I've never used proc SQL before however this does seem to be giving me the results I need on the count so thank you.

Last question.

Where your code says     having count(AccountID) >1;

If I had another variable in the table can you put multiple criteria in that so maybe if i wanted where count >1 and total amount >= 50, is that also possible in that statement?

Thanks again

Frequent Contributor
Posts: 75

Re: Basic Proc Freq

yes depend on what you want and how data is.

Contributor
Posts: 49

Re: Basic Proc Freq

so for exmaple the fields are the same as above except I now have an extra variable named "Amount".

I might want to look where the accountID count is >1 on the same day (as above) and the total amount associated with those account ids >=40?

Stret

Solution
‎04-04-2014 07:36 AM
Frequent Contributor
Posts: 75

Re: Basic Proc Freq

data have;

input AuthDate:$9.    AccountID:$6. Amount;

datalines;

13Feb2014  123456 20    

12Feb2014  321565   50 

12Feb2014  123456    10 

12Feb2014  321565   60

run;

proc sql;

create table want as

select distinct AuthDate,AccountID from have

group by AuthDate,AccountID

having count(AccountID) >1 and sum(Amount) > 40;

quit;

Contributor
Posts: 43

Re: Basic Proc Freq

proc sort data=data1;

     by authdate acctid;

run;

data new (drop=amt);

     retain totalamt ;

     set data1;

     by authdate acctid  ;

     if first.authdate & first.acctid then do;

                                                            count=1;

                                                            totalamt=amt;

                                                      end;

  if (last.authdate & last.acctid)or (first.authdate & last.acctid) then output;

  else do;

       count+1;

      totalamt+amt;

   end;

    

run;

proc print data=new noobs;

     var authdate acctid amt;  

     where count>1 and totalamt>=40;

run;

Try it with your modifications..

Contributor
Posts: 43

Re: Basic Proc Freq

I think Reeza's answer is  more efficient !

Super User
Super User
Posts: 6,502

Re: Basic Proc Freq

You could also use PROC SUMMARY (which is really just PROC MEANS with NOPRINT option).

proc summary data=have nway ;

  class accountid authdate ;

  var amount ;

  output n=count sum=total out=want (where=(count > 1)) ;

run;

Note that if you have missing values of the AMOUNT variable then the value generated by the N statistic will be different than the value in the automatic variable _FREQ_ because the missing values will be excluded.

Contributor
Posts: 49

Re: Basic Proc Freq

Hi Chaps.

I've been off work for a week so not had chance to have ago at these (or thank you for your help)

I'll have a pop before the end of the week and report back.

Thanks Again

Stret

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 13 replies
  • 518 views
  • 3 likes
  • 6 in conversation