BookmarkSubscribeRSS Feed
Sujithpeta
Quartz | Level 8

Hello 

 

I've a dataset with 4 variables and many rows, among the records few rows have duplicate records except one variable (See the attached picture). If any rows that have a, b, c common values and different d value then i want to delete one row and change the value of d to z.

How can I do that?

 

Thanks in advance!


Untitled.png
2 REPLIES 2
Astounding
PROC Star

First, sort your data set if it is not already in order:

 

proc sort data=have;

by a b c;

run;

 

Then get the data set you want:

 

data want;

set have;

by a b c;

if last.c;

if first.c=0 then d='z';

run;

collinelliot
Barite | Level 11

Here's an SQL version that might do it:

 

data have;
input a b c d $;
datalines;	
1 2 3 x
1 2 2 x
1 2 2 y
2 1 1 x
1 2 4 y
1 2 4 x
;

proc sql;
    CREATE TABLE want AS
    SELECT DISTINCT a, b, c, 
           CASE count(DISTINCT d) WHEN 1 THEN d
                ELSE 'z'
           END AS d
    FROM have
    GROUP BY a, b, c;
quit;

I'm not sure how you'd easily get the final sort order you have in your image, but I think this gets the data right.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1345 views
  • 0 likes
  • 3 in conversation