BookmarkSubscribeRSS Feed
Zakharkou
Calcite | Level 5

Hello dear SAS community.

 

I have the following question, there are two variables (id1 and id2) in a table (see below), which should be compared with each other.

 

Is it possible to delete the matching observations from this table (position of the observation does not matter), so that only the unique observations for these two variables remain there?

 

 

id1id2
990000990000
990002990000contour
990003990002
990006990003
990007990006
990008990007
990016990008
990018990016
990021990018
990024990021

 

I would be delighted to receive an answer.

11 REPLIES 11
Quentin
Super User

Can you show the code you have tried?  Are you familiar with the IF statement, and the DELETE statement?

BASUG is hosting free webinars ! Check out our recordings of past webinars: https://www.basug.org/videos. Be sure to subscribe to our email list for notification of future BASUG events.
Zakharkou
Calcite | Level 5
I have tried different variants, but have had no success. I think that due to the problem that the rows are sometimes offset, it is more likely to end up with proc sql.
Tom
Super User Tom
Super User

I cannot figure out what you mean.  Can you show what you would want to get for the example input you showed?

Zakharkou
Calcite | Level 5

Yes, of course. Thanks for the feedback.

 

I would like to use this data set:

 

id1id2
990000990000
990002990000contour
990003990002
990006990003
990007990006
990008990007
990016990008
990018990016
990021990018
990024990021


to create the following data set:

 

id1id2
990024990000contour


Ultimately, it's about removing the repetitions on both sides of the table (id1 and id2), even if the rows are "offset".

Tom
Super User Tom
Super User

Ok.  So know we know what you want as the output, but WHY.

Why would you pair 990024 with 990000contour?

They appear to have nothing to do with each other in the original dataset.

 

The only way I can see to get those two values out is to treat the values as ONE variable, not TWO.

data list;
  input id :$20. @@;
cards;
990000	990000
990002	990000contour
990003	990002
990006	990003
990007	990006
990008	990007
990016	990008
990018	990016
990021	990018
990024	990021
;

proc sql;
create table want as 
  select id
  from list
  group by id
  having count(*)=1
;
quit;

proc print;
run;
Obs    id

 1     990000contour
 2     990024
andreas_lds
Jade | Level 19

Could be solved by a data step:

data want;
   merge
      have(keep= id1 rename=(id1=id) in=inOne)
      have(keep= id2 rename=(id2=id) in=inTwo)
   ;
   by id;

   if inOne + inTwo = 1;
run;

 EDIT: Both id-variables must be sorted for this step to work, so using one of the other solution is recommended.

mkeintz
PROC Star

 

As a minor variation, if you want to keep both variables ID1 and ID2, but with a unique row for each unique value, you could:

 

 

data work.have;
  input id1 :$20. id2 :$20.;
cards;
990000 990000
990002 990000contour
990003 990002
990006 990003
990007 990006
990008 990007
990016 990008
990018 990016
990021 990018
990024 990021
;

data want (label="unmatched ID1's and unmatched ID2's");
  set have (keep=id1 rename=id1=id in=in1)
      have (keep=id2 rename=id2=id in=in2);
  by id;
  if first.id=1 and last.id=1;
  if in1=1 then id1=id;
  else id2=id;
  drop id;
run;

proc print;
run;

 

 

which generates:

 

The SAS System

Obs id1 id2
1   990000contour
2 990024  

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Patrick
Opal | Level 21

Here one option.

data work.have;
  input id1 :$20. id2 :$20.;
cards;
990000 990000
990002 990000contour
990003 990002
990006 990003
990007 990006
990008 990007
990016 990008
990018 990016
990021 990018
990024 990021
;


proc sql;
  title 'A EXCLUSIVE UNION B';
  (
    select id1 as id from work.have
    except
    select id2 as id from work.have
  )
  union
  (
    select id2 as id from work.have
    except
    select id1 as id from work.have
  )
  ;
quit;

Patrick_0-1700352924714.png

Above code based on SAS Docu Producing Rows from the First Query or the Second Query

 

Zakharkou
Calcite | Level 5
It looks almost as desired. Would it be possible not to merge the 2 columns, but to leave them as they were with id1 and id2?
andreas_lds
Jade | Level 19

@Zakharkou wrote:
It looks almost as desired. Would it be possible not to merge the 2 columns, but to leave them as they were with id1 and id2?

Here's a lengthy solution:

data work.have;
  input id1 :$20. id2 :$20.;
cards;
990000 990000
990002 990000contour
990003 990002
990006 990003
990007 990006
990008 990007
990016 990008
990018 990016
990021 990018
990024 990021
;


data work.stacked;
	set work.have;
	
	array ids[2] id1 id2;
	
	do source = 1 to dim(ids);
		id = ids[source];
		output;
	end;
	
	drop id1 id2;
run;

proc sort data=work.stacked;
	by id;
run;

data work.unique;
	set work.stacked;
	by id;
	
	if first.id and last.id;
run;

proc sort data=work.unique;
	by source;
run;

proc transpose data=work.unique out=want(drop=_name_) prefix=id;
	id source;
	var id;
run;
Patrick
Opal | Level 21

@Zakharkou wrote:
It looks almost as desired. Would it be possible not to merge the 2 columns, but to leave them as they were with id1 and id2?

Below should work

proc sql;
  title 'A EXCLUSIVE UNION B';
  (
    select id1, ' ' as id2 from work.have
    except
    select id2 as id1, ' ' as id2 from work.have
  )
  union all corr
  (
    select id2, ' ' as id1 from work.have
    except
    select id1 as id2, ' ' as id1 from work.have
  )
  ;
quit;

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1479 views
  • 1 like
  • 6 in conversation