SAS Procedures

Help using Base SAS procedures
BookmarkSubscribeRSS Feed
Solph
Pyrite | Level 9

I've read warning about using options of NODUP and NODUPKEY in Proc Sort. That is, both compare the previous observation written to the output data set, so if the observations that you want eliminated are not adjacent in the data set after the sort, they will not be eliminated.

I can understand it being a problem with NODUP, but why it's also a problem with NOUPKEY? Is it true that with NODUPKEY, SAS sorts the data by the variables following BY (say by ID, NUM) and eliminates any duplicates after the first occurrence of the cases. If so, I shouldn't have duplicate cases by ID and NUM, right? Here is an example.

*Records for ID#2 not adjacent and it's sex values are not the same, but I care only about duplicates by ID and NUM;

data aa; input

id sex $ NUM AMOUNT;

datalines;

1 F 3 40

1 F 4 40

2 M 3 20

3 M 3 40

3 M 3 40

2 F 3 20

;

proc print data=aa noobs; run;

proc sort data=aa out=bb nodupkey; by id num; run;

proc print data=bb noobs; run;

/*

1 F 3 40

1 F 4 40

2 M 3 20

3 M 3 40

*/

proc sort data=aa out=cc nodup; by id; run;

proc print data=cc noobs; run;

/*

1 F 3 40

1 F 4 40

2 M 3 20

3 M 3 40

2 F 3 20

*/

Thanks very much. I just wanted to be sure, as I prefer not to use PROC SQL and DISTINCT do to it.

***** BUT while I'm on it, I also have a question about SQL's create table name being the same as the source data name (as I just wanted to overwrite it), such as the following code. SAS would give the following waning, Can I ignore it?

      WARNING: This CREATE TABLE statement recursively references the target table. A consequence of this is a possible data integrity problem.

proc sql; create table aa as select distinct id, num from aa; quit;

proc print data=aa noobs; run;

Thanks,

2 REPLIES 2
Tom
Super User Tom
Super User

NODUPKEY will eliminate all of the duplicates because it is only checking the BY variables.  Since the data is being sorted by those variables there is no way for identical values to be separated by some other value as can happen with NODUP.

It is generally good practice not to do that just because it will make it hard to figure out what happened when you get unexpected results.

But otherwise you only need to be concerned about the warning when your table is in an external database.  That database might have rules or methods that will prevent writing back to the same table as the source.

Solph
Pyrite | Level 9

Thanks for confirming about NODUPKEY, Much appreciated, also about SQL.

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1537 views
  • 0 likes
  • 2 in conversation