DATA Step, Macro, Functions and more

first. and last.

Reply
Super Contributor
Posts: 435

first. and last.

I have a raw data file with some 3K records. I need to find out customers with different names and same address.

I tried this code, but got note as follows.

       data rawdata2;

         set rawdata1;  /*(my .csv which has name, address and zip)*/

        if first.name and last.Address and last.zip_code;

         run;

NOTE: Variable 'first.name'n is uninitialized.

NOTE: Variable 'last.Address'n is uninitialized.

NOTE: Variable 'last.zip_code'n is uninitialized.

can someone suggest me to accomplish this task?

Super User
Super User
Posts: 7,942

Re: first. and last.

Your missing a:

by name address zip_code;

However, I would suggest you do:

proc sql;

     create table WANT as

     select     distinct *

     from        HAVE

     group by  *

     having     count(*) gt 1;

quit;

Note, not tested the above as no data, but something along those lines, if you only want those that appear more than once.  If you just want a nice list, then either distinct or proc sort with nodupkey will do the trick.

Super Contributor
Posts: 435

Re: first. and last.

I ran your code and the log as follows. Am I doing something wrong?

108        proc sql;

109         create table rawdata3 as

110         select     distinct *

111         from        rawdata1

112         group by  Address,zip_code

113         having     count(*) gt 1;

NOTE: The query requires remerging summary statistics back with the original data.

NOTE: Table WORK.RAWDATA3 created, with 0 rows and 13 columns.

114        quit;

NOTE: PROCEDURE SQL used (Total process time):

      real time           0.00 seconds

      cpu time            0.00 seconds

Super User
Super User
Posts: 7,942

Re: first. and last.

Not sure about your data, the below code does work.  Try moving the count() up after distinct *, and see what count its returning, maybe your group by only ever means there is one of each:

data have;

  name="AA"; address="ERTY"; zip_code="TYRY"; output;

  name="AA"; address="ERTY"; zip_code="DFGH"; output;

  name="BB"; address="WESR"; zip_code="TLKJH"; output;

run;

proc sql;

  create table WANT as

  select  distinct *

  from   HAVE

  group by NAME,ADDRESS

  having count(*) gt 1;

quit;

Super Contributor
Posts: 435

Re: first. and last.

Thanks again.

If I move the count(*) like below, then what aggregate function I should use in having clause.

        proc sql;

        create table rawdata3 as

         select     distinct count(*)

       from        rawdata1

         group by  Address,zip_code

         having  //*<insert aggregate function>*/   gt 1;

Super User
Super User
Posts: 7,942

Re: first. and last.

Yes, I meant like this:

proc sql;

  create table WANT as

  select  distinct *,

          count(*) as CNT

  from   HAVE

  group by NAME,ADDRESS;

quit;

From the resultant table you can see the distinct values and what they have counted to.  I suspect your grouping is returning a count of 1 for each of them. 

Super Contributor
Posts: 435

Re: first. and last.

My data is likethis,

name address zip

Ravi  abc 123

Raj    abc  123

Hari   def   456

Hari   def   456

Kiran  ghi   789

I need output like,

name address zip

Ravi  abc 123

Raj    abc  123

I need the output with same address and same zip with different names. All the variables are in character.

Super User
Super User
Posts: 7,942

Re: first. and last.

Ok, slightly moved things around, this should work:

data have;

  infile datalines;

  input name $ address $ zip;

datalines;

Ravi  abc 123

Raj    abc  123

Hari   def   456

Hari   def   456

;

run;

proc sql;

  create table WANT as

  select  *

  from   (select distinct NAME,ADDRESS,ZIP from HAVE)

  group by ADDRESS,ZIP

  having count(NAME) > 1;

quit;

Super Contributor
Posts: 435

Re: first. and last.

Thanks a ton!

Can we do this in data step with first. and last.? Because I'm interested to get the solution via first. and last.

Super User
Posts: 7,771

Re: first. and last.

Read up on this:

SAS 9.2 Language Reference: Concepts, Second Edition: BY-Group Processing in the DATA Step

You are missing the basic concepts for this.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Super User
Posts: 7,039

Re: first. and last.

To use BY processing you need to:

SORT the data BY the variables of interest.

SET the data BY the variables of interest.

Then you can reference the FIRST. and LAST. flags.

Let's assume that you want records with more than one value of NAME for the same values of ADDRESS and ZIP.

data want ;

  set have ;

  by zip address name ;

  if NOT (first.address and last.address);

run;

PROC Star
Posts: 7,471

Re: first. and last.

Tom's approach will almost get you what you want, but not quite. However, it will work if you use the proc sort do get rid of the exact matches. e.g.:

data have;

  input name $ address $ zip;

  datalines;

Ravi  abc 123

Raj    abc  123

Hari   def   456

Hari   def   456

;

proc sort data=have out=want nodupkey;

  by address zip name;

run;

data want;

  set want;

  by address zip;

  if not (first.zip and last.zip);

run;

Contributor
Posts: 34

Re: first. and last.

proc sort data=have out=want;

     by address;

run;

data want;

     set want;

if first.address=last.address then delete;

run;

you can also write a proc sort with nodupkey with name in the by statement if you have same name and same address. 

New Contributor
Posts: 2

Re: first. and last.

Posted in reply to BharathBandi

Here is another variation..

proc sort data=rawdata1;

    by zip_code address name;

run;

data differentNames_SameAddress;

     set rawdata1;  /*(my .csv which has name, address and zip)*/

     by zip_code address name;

     /* select records with more than one address */

     if not (fist.address and and last.address) then do; ;

         output differentNames_SameAddress;    

     end;

run;

Ask a Question
Discussion stats
  • 13 replies
  • 511 views
  • 7 likes
  • 7 in conversation