So what I wanted to do is to generate a table with a, a_match, and b. The idea is to match column a and column a_match based on the same b value but only retain one distinctive matched row when a(i)=a_match(i+n) and a_match(i)=a(i+n) and b(i)=b(i+n).
The the following is what I have for now. The next step is to delete certain rows based on the above criteria
proc sql;
create table one as
select distinct a, b
from original
order by a, b;
quit;
proc sql;
create table two as
select a as a_match, b
from merge1
order by a_match, b;
quit;
proc sql;
create table merge as
select a.a, b.a_match, a.b
from one as a left join two as b on
a.b=b.b
order by a.a, b.a_match, a.b;
quit;
After deleteing rows, what I wanted to do next is to the calcuate the number of distinctive b for each distinctive combination of a and a_match.
Thank you!
Could anyone please help me out? Thank you so much!
Are the value of B unique within the two input tables ORIGINAL and MERGE1? Or for either of them?
If so then just merge them and rename the A variables so that you can distinguish them from each other.
data want ;
merge original (rename= (a = original_a) )
merge1 (rename= (a = match_a))
;
by b;
run;
Thank you Tom for the reply.
I already had the merged dataset, and my question is how to retain only one distinctive matched row when a(i)=a_match(i+n) and a_match(i)=a(i+n) and b(i)=b(i+n).
For example, the merged dataset is like this:
a a_match b
123 321 12345678
123 345 9876450
134 345 90485769
321 123 12345678
345 123 9876450
345 134 90485769
Notice that row 1 and 4, 2 and 5, 3 and 6 have the same set of a and a_match as well as same b value. So I wanted to retain only one row with a unique combination of a, a_match, and b. Actually a_match is just a copy of a with a new name, the reason I did this is that I wanted to see which two a's share the same b. And I wanted to know how many b's each two a's share.
One way to detect permutations is to order the values across the variables. Then any normal type of duplication check will work.
You might do something like:
proc sql ;
create table merge2 as
select b
, min(a,a_match) as var1
, max(a,a_match) as var2
, a
, a_match
from merge
;
quit;
proc sort data=merge2 nodupkey ;
by b var1 var2 ;
run;
Brilliant!!
Thank you so much!!
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.