Help using Base SAS procedures

Urvish Shah

Reply
Regular Contributor
Posts: 195

Urvish Shah

Hello

I have the data like this

Data have;

input name $ age;

cards;

A 20

A 21

B 22

C 23

;

nw i wan to eliminate the duplicate obs like this

proc sort data = have nodupkey;

by name;

run;

nw is there any other way to do the same by using Proc Sql in order to reduce the processing time

PROC Star
Posts: 7,363

Urvish Shah

You could always use select distint(name) as onename,age, in a proc sql call, but I doubt if it would improve the processing time.

Trusted Advisor
Posts: 2,113

Urvish Shah

Urvish,

First, it helps the rest of us if you use the topic of your question as the subject (rather than your name).

Use the DISTINCT option in SQL for this.  Documentation in the manual.  Also search the SAS papers for NODUPKEY and DISTINCT to find some good descriptions of their use.

Be aware, SQL is not always faster and may not provide the same results (you should get the same number of record, but not necesssarily the same records).

Doc Muhlbaier

Duke

Regular Contributor
Posts: 184

Urvish Shah

All four rows in the example are different, so DISTINCT is going to return all four.

UrvishShah wrote:

Hello

I have the data like this

Data have;

input name $ age;

cards;

A 20

A 21

B 22

C 23

;

nw i wan to eliminate the duplicate obs like this

proc sort data = have nodupkey;

by name;

run;

nw is there any other way to do the same by using Proc Sql in order to reduce the processing time

PROC Star
Posts: 7,363

Urvish Shah

Howard,

Not if one first selects just based on name (e.g.)

proc sql noprint;

  create table want as

    select distinct(name) as onename

      from have

        group by name

          having count(*) gt 0

;

quit;

and then joins the result back with the original table.  However, as mentioned, I can't see it outperforming proc sort.

Regular Contributor
Posts: 184

Urvish Shah

Seems to me that you can omit the GROUP BY and HAVING clauses and get the same WANT; the HAVING condition can never be false. Then joining this result to the original will get you back to ... the original.

This will work as long as there are no ties:

  create table want as

    select *

      from have

        group by name

          having age EQ max(age)

;

art297 wrote:

Howard,

Not if one first selects just based on name (e.g.)

proc sql noprint;

  create table want as

    select distinct(name) as onename

      from have

        group by name

          having count(*) gt 0

;

quit;

and then joins the result back with the original table.  However, as mentioned, I can't see it outperforming proc sort.

PROC Star
Posts: 7,363

Urvish Shah

Howard,

The following will work with or without ties, but uses the undocumented monotonic function:

Data have;

  input name $ age;

  cards;

A 20

A 20

A 21

B 22

C 23

;

proc sql noprint;

  create table want as

    select distinct(name) as onename

      from have

        group by name

          having min(monotonic()) gt 0

;

quit;

But, I still don't see it necessarily working any faster than using proc sort.

Super User
Posts: 9,682

Urvish Shah

Art.T

I must have to say using monotonic() is dangerous for proc sql, this has been mentioned in sas documentation.

It will result undesirable result,especially for complex SQL.

Ksharp

Regular Contributor
Posts: 184

Urvish Shah

I'm pretty sure that MONOTONIC always returns positive values, so the HAVING and GROUP BY clauses here don't do anything.

DISTINCT is not a function, so the parens around NAME also have no effect.

We still need the AGE column to emulate the SORT/NODUPKEY results. This

  create table want as

    select name , max(age) as age

      from have

        group by name

;

but only in the special case of a single satellite column. Were there multiple satellite columns (like AGE and SEX), no go.

art297 wrote:

Howard,

The following will work with or without ties, but uses the undocumented monotonic function:

Data have;

  input name $ age;

  cards;

A 20

A 20

A 21

B 22

C 23

;

proc sql noprint;

  create table want as

    select distinct(name) as onename

      from have

        group by name

          having min(monotonic()) gt 0

;

quit;

But, I still don't see it necessarily working any faster than using proc sort.

Regular Contributor
Posts: 195

Urvish Shah

Sorry to all for displaying my name insted of the original topic

I am really sorry for that

After posted my question i learned a lot .

Thanks all .

Ask a Question
Discussion stats
  • 9 replies
  • 176 views
  • 0 likes
  • 5 in conversation