BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Mike018
Fluorite | Level 6

Hello,

             is there anybody that knows if it's possible to take this sorted data and move duplicate records to another dataset in a way that I didn't need to hand code it (with PROC SQL, PROC SORT, ...).

If it's not possible, I'll probably be able to move the records using last. and first. statements.

// INPUT


PROC DATA=AA out=messages;

BY DURAO ORIGINE DESTINATION CHARGE;

RUN;


PROC PRINT DATA=MESSAGES;

RUN;


Obs DURAO ORIGINE DESTINATION CHARGE

1          0           2                2               3.89

2          0           2                6            563.56

3          0           2                6            563.56

4          0           4                2              12.18

5          0           4                2              45.23

6          0           4                2              45.23

7          0           9                2             145.32

8          1           1                1           1103.32

9          1           1                9             714.32

10        1           3                9           5623.02

11        1           4                2               45.23


// OUTPUT

PROC PRINT DATA=MESSAGES;

RUN;

Obs DURAO ORIGINE DESTINATION CHARGE

1            0             2                2                    3.89

2            0             2                6                563.56

3            0            4                 2                  12.18

4            0            4                 2                  45.23

5            0            9                 2                145.32

6            1            1                 1               1103.32

7            1            1                 9                 714.32

8            1            3                 9               5623.02

9            1            4                 2                   45.23

PROC PRINT DATA=DUPLICATE;

RUN;

Obs DURAO ORIGINE DESTINATION CHARGE

1    0      2       6      563.56
2    0      4       2       45.23

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

To achieve the same result as the proc sort code I suggested you could use something like the following with older versions of SAS:

data test duplicate;

  set aa;

  by DURAO ORIGINE DESTINATION CHARGE;

  if first.charge then output test;

  else do;

    output duplicate;

  end;

run;

View solution in original post

8 REPLIES 8
art297
Opal | Level 21

Couldn't you do what you want using the dupout option in a proc sort nodupkey run?

art297
Opal | Level 21

for example:

proc sort data=aa out=test dupout=duplicate nodupkey;

  by DURAO ORIGINE DESTINATION CHARGE;

run;

I'm not sure if you want to remove the duplicates from the original file, so in the above code, the file with the duplicates removed was written to a file called test.

Mike018
Fluorite | Level 6

For a duplicate (2 times the same record), I move 1 record to the new file test and I keep the other in the original data. Your suggestion seems good, but I use SAS 8.2 I've read that the Dupout option is new with SAS 9. Is it the case?

robby_beum
Quartz | Level 8

Since you're on 8.2, I believe you're correct that you'll have to "Use FIRST.variable and LAST.variable (made available with BY-Group processing) to determine if a BY-Group contains unique or duplicate observations." 

Look under the "Full Code" tab at http://support.sas.com/kb/24/626.html for specific details.

robby_beum
Quartz | Level 8

Just to add onto what Art said, SAS notes the following at http://support.sas.com/kb/24/626.html:

"Note: Beginning with SAS 9.1, the DUPOUT= option on a PROC SORT statement specifies an output data set that contains duplicate observations. The new data set will not contain *all* the duplicates, but one record for each duplicate BY value. See the Base SAS(R) 9.1 Procedures Guide for more details. proc sort data=dsn dupout=new_dsn nodupkey; by var; run;"

art297
Opal | Level 21

To achieve the same result as the proc sort code I suggested you could use something like the following with older versions of SAS:

data test duplicate;

  set aa;

  by DURAO ORIGINE DESTINATION CHARGE;

  if first.charge then output test;

  else do;

    output duplicate;

  end;

run;

Mike018
Fluorite | Level 6

@ robby_beum
For some reason, the code they give didn't seem to work for me.

@ art297
Thanks!
The code you gave me seems to works well.

art297
Opal | Level 21

The code that robby pointed to will put ALL dups in the dups file (including the first one) and only put records in the nodups file that don't contain ANY duplicates.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1846 views
  • 3 likes
  • 3 in conversation