BookmarkSubscribeRSS Feed
Lingshu
Calcite | Level 5

Gday,

 

I would like to select unique matching pair from dataset. I post one dataset as example. 

 

DATA Have;
INFILE Cards;
INPUT X Y $ @@;
CARDS;
1 A 2 A 3 A 4 A 5 B 6 B 7 B 8 B 1 B 2 B 3 B 4 B ;
RUN;

 

In dataset Have, x=1 can match with y=A or Y=B, as well as x=2. My goal is to select unique matching pair. I need to complete several steps: first, I select the pair of 1 and A and output to a new dataset,

         second, I need to exclude the pair having either 1 or A for the rest of dataset and update the orignial dataset

         then, after exlusion, I process to select the second pair in the update dataset, which should be the pair of 2 and B in this case.

Ideally, the final dataset I need will include two pairs, 1 and A, 2and B. I tried different way, such DOW do-loop, link statement or sql subqury. All of them failed. 

 

I am open for any suggestions for this complex question. 

 

Thanks

Lingshu

7 REPLIES 7
Norman21
Lapis Lazuli | Level 10

Hi Lingshu,

 

Welcome to the forum!

 

Interesting problem. How do you define "matching"? Is there one rule, or several rules with an order of preference?

Norman.
SAS 9.4 (TS1M6) X64_10PRO WIN 10.0.17763 Workstation

Kurt_Bremser
Super User

This at least solves your test case:

data have;
input x y $;
cards;
1 A
2 A
3 A
4 A
5 B
6 B
7 B
8 B
1 B
2 B
3 B
4 B
;
run;

proc sort data=have;
by x y;
run;

%macro iterate;
%let done=no;
%let start = 1;
%do %until ("&done" = "yes");

data have (drop=x1 y1);
set have;
retain x1 y1;
if _n_ < &start then call symput('done','yes');
else if _n_ = &start
then do;
  x1 = x;
  y1 = y;
  call symput('done','yes');
end;
else do;
if x = x1 or y = y1
then do;
  call symput('done','no');
  delete;
end;
end;
run;
%let start=%eval(&start+1);
%end;
%mend;
%iterate;
Norman21
Lapis Lazuli | Level 10

Hi

 

Norman.
SAS 9.4 (TS1M6) X64_10PRO WIN 10.0.17763 Workstation

Kurt_Bremser
Super User

@Norman21 wrote:

Hi

 


I don't have the slightest idea, but it is the solution the OP wanted.

Maybe further discussion will lead to a clarification of the selection rules.

Lingshu
Calcite | Level 5

Hi Norman21,

 

Thanks for asking. 

My study has 300 hundreds cases and 1000 controls. I need to match 1 case with 3 controls. I did full matching first, and then figured out 1 case can matched with 500 controls (having variable called scarce to calculate the number of matched control for each case) and 1 control can can at least 2 cases (having variable called demand to calculate the number of matched case for each control). My strategy is to sort case and control by "scarce" and "demand". Pair with lowest scarce and demand scores will be selected first.

 

Lingshu

Norman21
Lapis Lazuli | Level 10

Hi Lingshu,

 

Thanks for the additional information, but I'm still not sure what are your matching criteria.

 

Do the examples in the following describe that you are trying to achieve?

 

http://www2.sas.com/proceedings/sugi29/173-29.pdf

Norman.
SAS 9.4 (TS1M6) X64_10PRO WIN 10.0.17763 Workstation

Lingshu
Calcite | Level 5

Thanks so much KurtBremser,

 

That's awesome. I have an advanced question. The example I gave is to select only one pair for x=1, but if I need to select two pairs for x=1, keeping other selection rules same. In other word, I may need pairs of "1, A" "1, B" "2, C" "2, D".  I tried to use array to create y1-y3 to delete, but it seems not work. Do you have any idea?

Attached is my code.

 

data have;
input x y $;
cards;
1 A
2 A
3 A
4 A
5 B
6 B
7 B
8 B
1 B
2 B
3 B
4 B
5 C
6 C
7 C
8 C
1 D
2 C
3 D
4 D
2 D
;
run;

proc sort data=have;
by x y;
run;

%macro iterate;
%let done=no;
%let start = 1;
%do %until ("&done" = "yes");

data have (drop=x1 y1 y2);
set have;
array a{2} y1-y2;
retain x1 y1;
if _n_ < &start then call symput('done','yes');
else if &start=<_n_ < &start+2
then do;
x1 = x;
do i=1 to 2;
if _n_ =&start-1+i then a{i}=control_id;
else a{i}=0;
end;
call symput('done','yes');
end;
else do;
if x = x1 or y = y1 or y=y2
then do;
call symput('done','no');
delete;
end;
end;
run;
%let start=%eval(&start+2);
%end;
%mend;
%iterate;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 3314 views
  • 1 like
  • 3 in conversation