imcbczm
Obsidian | Level 7

Hi, in this example, rows 1 and 4 are duplicates: if you switch col1 and col2 in one of the rows, it becomes a duplicate of the other row on col1 and col2. I only want to keep one of them. Thank you! Z

 

data have;

input col1 $ col2 $ col3;

datalines;

A B 1

C F 3

D H 5

B A 1

G H 6

;

run;

 

data want;

input col1 $ col2 $ col3;

datalines;

A B 1

C F 3

D H 5

G H 6

;

run;

2 REPLIES
imcbczm
Obsidian | Level 7

Ok, I figured it out, thank you anyway:

 

data temp;

set have;

new1=col1;

new2=col2;

call sortc(new1,new2);

run;

proc print data=temp;

run;

proc sort data=temp out=want2 nodupkey;

by new1 new2;

run;

proc print data=want2;

run;

mkeintz
PROC Star

If the dataset is big, then this may offer some efficiencies.  It passes through the data only once:

 

data need/ view=need;

  set have;

  new1=col1;new2=col2;

  call sortc(new1,new2);

run;

proc sort data=need out=want  nodupkey;

  by new1 new2;

run;

 

 

It takes advantage of the fact that data set need is a data set VIEW, not a data set FILE.  This means it is actualized only when called later (by the proc sort), and its data is passed directly to proc sort without being written to disk.

 

Result: one pass of the data, producing sorted results with all duplicate combinations of col1/col2 removed (rows with the same pair of values in either order, not just rows with the same values in the same order).
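
If you don't want the helper variables new1 and new2 in the final table, here is a minimal sketch of the same view-plus-sort approach with them dropped on output. The drop= data set option on out= is my addition, not part of the code above, and the sketch assumes col1 and col2 have the same length (they are both $8 here because of list input), so that call sortc can swap them without truncation.

data have;
  input col1 $ col2 $ col3;
  datalines;
A B 1
C F 3
D H 5
B A 1
G H 6
;
run;

/* View: nothing is written to disk until proc sort reads it */
data need / view=need;
  set have;
  new1 = col1;
  new2 = col2;
  call sortc(new1, new2);  /* orders the pair so that new1 <= new2 in every row */
run;

/* Sort on the ordered pair, keep one row per pair, drop the helper keys from the output */
proc sort data=need out=want(drop=new1 new2) nodupkey;
  by new1 new2;
run;

With the sample data this should leave four rows, with col1 and col2 as they were in the input. Which of the two swapped rows survives depends on the order in which proc sort encounters them, so if that matters, check the result rather than rely on it.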
