BookmarkSubscribeRSS Feed
Mchan890
Calcite | Level 5

I have a large data set:

 

Have:

PO   Amount   Line
1212        300     1         
1233        250     1        
1233        600     2        

1345      1520     1        

1350      1000     2

1350        500     3

 

Want:

PO   Amount   Line
1212        300     1             

1345      1520     1        

 

I know the nodupkey will remove the duplicates, however, I am trying to remove both if there are duplicate.

 

 

13 REPLIES 13
novinosrin
Tourmaline | Level 20

Hi @Mchan890   Please try

 


data have;
input PO   Amount   Line;
cards;
1212        300     1         
1233        250     1        
1233        600     2        
1345      1520     1        
1350      1000     2
1350        500     3
;

data want;
 set have;
 by po;
 if first.po and last.po;
run;

/*or*/


proc sql;
create table want as
select *
from have
group by po
having count(*)=1;
quit;

 

Reeza
Super User

PROC SORT has a UNIQUERECS  UNIQUEOUT option as well.

 

EDIT: corrected, as per @ballardw code - which is a correct answer. 

novinosrin
Tourmaline | Level 20

Thank you @Reeza  learned something new

ballardw
Super User

Since you want unique values of PO then:

data have;
   input PO   Amount   Line   ;
datalines;
1212        300     1         
1233        250     1        
1233        600     2        
1345      1520     1        
1350      1000     2
1350        500     3
;

Proc sort data=have out=have uniqueout=want nouniquekey;
   by po ;
run;

The data set you want is the one specified by the UNIQUEOUT option and the NOUNIQUEKEY is the appropriate equivalent to NODUPEKEY.

 

Please note how the data was provided as data step code so that we have something to test with.

novinosrin
Tourmaline | Level 20

Hmm Okay, so it's called UNIQUEOUT . I will have to remember that. Thank you!

Reeza
Super User
There used to be a UNIQUERECS option, it was removed because it was confusing and didn't work the way people expected and then they added UNIQUEOUT so I get them confused sometimes 🙂
novinosrin
Tourmaline | Level 20

Good point, I think it would help to keep abreast of which ones are active and not. I suppose there must be some docs, but really is confusing. Sir @ballardw  is very good in knowing the latest stuff in the docs

Reeza
Super User
Unfortunately the key is checking it frequently....and I don't actually use SAS day to day anymore, which is why my participation is 'lower' and declining :(. We do everything at my current shop in R.
ballardw
Super User

@Reeza wrote:
There used to be a UNIQUERECS option, it was removed because it was confusing and didn't work the way people expected and then they added UNIQUEOUT so I get them confused sometimes 🙂

I'm glad I wasn't the only one remembering another no longer available option. I wouldn't be surprised that code would run but without the documentation I wasn't about to try.

data_null__
Jade | Level 19

@ballardw wrote:

Since you want unique values of PO then:

data have;
   input PO   Amount   Line   ;
datalines;
1212        300     1         
1233        250     1        
1233        600     2        
1345      1520     1        
1350      1000     2
1350        500     3
;

Proc sort data=have out=have uniqueout=want nouniquekey;
   by po ;
run;

The data set you want is the one specified by the UNIQUEOUT option and the NOUNIQUEKEY is the appropriate equivalent to NODUPEKEY.

 

Please note how the data was provided as data step code so that we have something to test with.


 

Note: OUT= data will not have the same observations, as DATA= as might be mistakenly implied by your example:  data=have out=have

 

 

ballardw
Super User

@data_null__ wrote:

@ballardw wrote:

Since you want unique values of PO then:

data have;
   input PO   Amount   Line   ;
datalines;
1212        300     1         
1233        250     1        
1233        600     2        
1345      1520     1        
1350      1000     2
1350        500     3
;

Proc sort data=have out=have uniqueout=want nouniquekey;
   by po ;
run;

The data set you want is the one specified by the UNIQUEOUT option and the NOUNIQUEKEY is the appropriate equivalent to NODUPEKEY.

 

Please note how the data was provided as data step code so that we have something to test with.


 

Note: OUT= data will not have the same observations, as DATA= as might be mistakenly implied by your example:  data=have out=have

 

 


I thought I was getting an error without the out=have, and probably should have named it Reduced or similar.

data_null__
Jade | Level 19

 

out=_null_

If you don't need the dups.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 2410 views
  • 16 likes
  • 6 in conversation