BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Satori
Quartz | Level 8

I have a dataset like this (below). The variable code takes only two values ('C2', 'U2')

The ID variable is not unique. There can be an ID with only code U2, only code C2 and with both codes. I want to keep only the last one of these, meaning I want to keep the observations with the same ID and different codes, and drop all IDs that have only one code.

 

Obs       ID                         code

1       AE0000037163          U2
2       AE0000037282          U2
3       AE0000037693          U2
4       AE0000037738          U2
5       AE0000037738          C2

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
data_null__
Jade | Level 19

I will assume you want to see the NON-unique ID values.

 

data have;
   input Obs ID:$12. code $;
   cards;
1 AE0000037163 U2
2 AE0000037282 U2
3 AE0000037693 U2
4 AE0000037738 U2
5 AE0000037738 C2
;
run;
proc print;
proc sort data=have nounikey out=dups uniout=_null_;
   by id;
   run;
proc print;
   run;

Capture.PNG

View solution in original post

6 REPLIES 6
PaigeMiller
Diamond | Level 26

Please provide data that illustrates the problem. Right now, there is no ID which has both U2 and C2.

--
Paige Miller
Satori
Quartz | Level 8
I have a dataset like this (below). The variable code takes only two values ('C2', 'U2')

The ID variable is not unique. There can be an ID with only code U2, only code C2 and with both codes. I want to keep only the last one of these, meaning I want to keep the observations with the same ID and different codes, and drop all IDs that have only one code.



Obs ID code

1 AE0000037163 U2
2 AE0000037282 U2
3 AE0000037693 U2
4 AE0000037738 U2
5 AE0000037738 C2

PaigeMiller
Diamond | Level 26

From now on, please provide data as WORKING data step code, as I have shown below in creating data set named HAVE.

 

data have;
input Obs ID $12. code $;
cards;
1 AE0000037163 U2
2 AE0000037282 U2
3 AE0000037693 U2
4 AE0000037738 U2
5 AE0000037738 C2
;

proc sql;
    create table want as select id
    from have
    group by id
    having sum(code='U2')>0 and sum(code='C2')>0; 
quit;

 

--
Paige Miller
Satori
Quartz | Level 8
After running your suggested code, I ended up with a list of observations that only shows the ID column, and none of the IDs are duplicate.
PaigeMiller
Diamond | Level 26

@Satori wrote:
After running your suggested code, I ended up with a list of observations that only shows the ID column, and none of the IDs are duplicate.

I don't understand this. Show us what you see. Show us what you want to see.

--
Paige Miller
data_null__
Jade | Level 19

I will assume you want to see the NON-unique ID values.

 

data have;
   input Obs ID:$12. code $;
   cards;
1 AE0000037163 U2
2 AE0000037282 U2
3 AE0000037693 U2
4 AE0000037738 U2
5 AE0000037738 C2
;
run;
proc print;
proc sort data=have nounikey out=dups uniout=_null_;
   by id;
   run;
proc print;
   run;

Capture.PNG

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 477 views
  • 1 like
  • 3 in conversation