Help using Base SAS procedures

How to get one subset like this?

Reply
Contributor
Posts: 36

How to get one subset like this?

My dataset is shown as below,

case_numbercase_value
a1
a2
a3
a4
b2
b3
b4
b5
c5
c6
c7
c9

I want to extract the observations whose case _value is maximal across each case_number. For this dataset, I want to get one subset like this.

case_numbercase_value
a4
b5
c

9

How to achieve this by coding?

Thanks!

Frequent Contributor
Posts: 95

Re: How to get one subset like this?

data have;

input case_number  $ case_value;

datalines;

a 1

a 2

a 3

a 4

b 2

b 3

b 4

b 5

c 5

c 6

c 7

c 9

;

run;

proc sql;

  create table want as

  select distinct case_number,

  max(case_value) as case_value

  from have

  group by case_number;

quit;

Respected Advisor
Posts: 4,920

Re: How to get one subset like this?

Why use DISTINCT in this query?

PG

PG
Frequent Contributor
Posts: 95

Re: How to get one subset like this?

Thanks for the question since it made me think what's happening with proc sql calls.

It will return the same result set without DISTINCT in select statement with one less operation.

I looked at the method chosen by proc sql with and without DISTINCT keyword in select statement.

When DISTINCT is used "sqxunqs" operation is run as an extra operation to de-dup records which are already unique before this operation.

I should be more careful next time when using DISTINCT keyword especially with large data sets.

Method without DISTINCT:

proc sql _method;

create table want as

  select case_number,

  max(case_value) as case_value

  from have

  group by case_number;

quit;

NOTE: SQL execution methods chosen are:

      sqxcrta

          sqxsumg

              sqxsort

                  sqxsrc( WORK.HAVE )

Method with DISTINCT:

proc sql _method;

create table want as

  select distinct case_number,

  max(case_value) as case_value

  from have

  group by case_number;

quit;

NOTE: SQL execution methods chosen are:

      sqxcrta

          sqxunqs

              sqxsumg

                  sqxsort

                      sqxsrc( WORK.HAVE )

Regular Contributor
Posts: 151

Re: How to get one subset like this?

Another option is Proc Summary.  The 2nd example is based on the maximum value being returned when an ID statement is used.

proc summary data=have nway;

class case_number;

var case_value;

output out=want (drop=_Smiley Happy max=;

run;


or

proc summary data=have nway;

class case_number;

id case_value;

output out=want (drop=_Smiley Happy;

run;

Super Contributor
Posts: 1,636

Re: How to get one subset like this?

or no SQL solution:

data have;

input case_number  $ case_value;

datalines;

a 1

a 2

a 3

a 4

b 2

b 3

b 4

b 5

c 5

c 6

c 7

c 9

;

proc sort;

by case_number  $ case_value;

data want;

   set have;

   by case_number;

   if last.case_number;

proc print;run;

Linlin

Valued Guide
Posts: 765

Re: How to get one subset like this?

hi ... the SQL and SUMMARY solutions seem most "economical", but if you are going to sort, you can always sort again and take advantage of the default EQUALS option that maintains order withing sort groups ...

proc sort data=have out=want;

by case_number descending case_value;

run;


proc sort data=want nodupkey;

by case_number;

run;

Super Contributor
Posts: 1,636

Re: How to get one subset like this?

Hi Mike,

Thank you! I need to practice my Mapping skill learned from you. I am going to post a question at SAS/Graph and ODS Graphics. I am sure you will help me.

- Linlin

Respected Advisor
Posts: 3,156

Re: How to get one subset like this?

FWIW, a hash() solution:

data have;

input case_number$    case_value;

cards;

a    1

a    2

a    3

a    4

b    2

b    3

b    4

b    5

c    5

c    6

c    7

c    9

;

data _null_;

  if 0 then set have;

   dcl hash h(dataset:'have', ordered: 'a',duplicate:'r');

   h.definekey('case_number');

   h.definedata(all:'y');

   h.definedone();

    rc=h.output(dataset:'want');

   stop;

run;

proc print;run;

Haikuo

Super Contributor
Posts: 338

Re: How to get one subset like this?

Hi Haikuo,

I am totally new to hash. Hash looks like a vey handy appraoch.

Coud you please elaborate a bit how you have built the logic in your hash code (time permitting).

This simple data set looks a good opportunity to understand it.

Thanks

Mirisage

Respected Advisor
Posts: 3,156

Re: How to get one subset like this?

Hi Mirisage,

Hash() object is actually more straightforward than many other data step statements, such as 'set', 'merge', 'update', it is just a little different. And you are correct, Above example is ideal to take a peek into hash():

data _null_;

  if 0 then set have; /*This is to set hash() environment, eg variables, before you can define a hash(), in this case, you can also use: call missing (case_number, case_value);*/

   dcl hash h(dataset:'have', ordered: 'a',duplicate:'r'); /*This is to define hash(), named 'h', loaded from 'have', with an order 'a' as 'ascending', if there are duplicates by 'key', then only keep the last record ('r' denotes replace)*/

   h.definekey('case_number'); /*Define search key 'case_number'*/

   h.definedata(all:'y'); /*Define data element, all :'y' denoting with every thing from 'have'

   h.definedone(); /*the end of hash() definition*/

    rc=h.output(dataset:'want'); /output hash 'h' to dataset 'want'*/

   stop;

run;

Here is the help link, where I 've learned most of my hash():

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002576871.htm

Also there are tons of sugi paper on hash(), pay extra attention on those written by Paul dorfman.

Regards,

Haikuo

Super Contributor
Posts: 338

Re: How to get one subset like this?

Hi Haikuo,

Thank you very much for your time and expertise.

Learning resources and your annotated comments are very much appreciated.

Best regards

Mirisage

Ask a Question
Discussion stats
  • 11 replies
  • 358 views
  • 3 likes
  • 8 in conversation