I want to create a data set of unique pairs of IDs by groups. So I am selecting 2 IDs from up to 9. In this example it only shows up to ID5. They are siblings (ID1-ID9) within families (ahhrhid).
I have tried the macro COMBO and some SQL commands but I can't crack it.
Any help would be much appreciated.
I start with this (ignore _NAME_ & _LABEL_)
Obs | ahhrhid | _NAME_ | _LABEL_ | ID_1 | ID_2 | ID_3 | ID4 | ID5 |
1 | 000003 | xwaveid | XW Cross wave ID | 0100005 | 0100006 | 0100007 |
|
|
2 | 000006 | xwaveid | XW Cross wave ID | 0100012 | 0100013 |
|
|
|
3 | 000012 | xwaveid | XW Cross wave ID | 0100020 | 0100021 |
|
|
|
4 | 000015 | xwaveid | XW Cross wave ID | 0100025 | 0100026 | 0100027 |
|
|
|
And what I wish to get is the following:”
Ahhrhid _Label_ Sib_ID1 Sib_ID2 Age1 Age2
000003 Pair 1 & 2 0100005 0100006 23 20
000003 Pair 1 & 3 0100005 0100007 23 18
000003 Pair 2 & 3 0100020 0100021 20 18
000006 Pair 1 & 2 0100012 0100013 12 14
000012 Pair 1 & 2 0100020 0100021 18 16
000015 Pair 1 & 2 0100025 0100026 25 23
000015 Pair 1 & 3 0100025 0100027 25 24
000015 Pair 2 & 3 0100026 0100027 23 24
I would like to include other variables beside ID like Age if possible.
I totally agree with @Kurt_Bremser: You want something from us so please try and make our life easy.
You haven't provided the Age variable so here how to get to the rows with the ID pairs:
data have;
infile datalines truncover dlm=',' dsd;
input ahhrhid NAME:$10. LABEL:$30. ID_1 - ID_5;
datalines;
3,xwaveid,XW Cross wave ID,100005,100006,100007,,
6,xwaveid,XW Cross wave ID,100012,100013,,,
12,xwaveid,XW Cross wave ID,100020,100021,,,
15,xwaveid,XW Cross wave ID,100025,100026,100027,100028,
;
run;
data want(keep=ahhrhid NAME LABEL sib_id_:);
set have;
array ids {*} id_:;
do _i=1 to dim(ids)-1;
if missing(ids[_i]) then
do;
_i=dim(ids);
leave;
end;
do _j=_i+1 to dim(ids);
if missing(ids[_j]) then leave;
sib_id_1=ids[_i];
sib_id_2=ids[_j];
output;
end;
end;
run;
Please post your example data in a data step similar to this:
data have;
input ahhrhid $ id_1 $ id_2 $ id3 $;
cards;
000003 0100005 0100006 0100007
;
run;
This shows us the exact structure of the data we should be working with, and we only need to do a copy/paste and run to recreate your dataset.
I totally agree with @Kurt_Bremser: You want something from us so please try and make our life easy.
You haven't provided the Age variable so here how to get to the rows with the ID pairs:
data have;
infile datalines truncover dlm=',' dsd;
input ahhrhid NAME:$10. LABEL:$30. ID_1 - ID_5;
datalines;
3,xwaveid,XW Cross wave ID,100005,100006,100007,,
6,xwaveid,XW Cross wave ID,100012,100013,,,
12,xwaveid,XW Cross wave ID,100020,100021,,,
15,xwaveid,XW Cross wave ID,100025,100026,100027,100028,
;
run;
data want(keep=ahhrhid NAME LABEL sib_id_:);
set have;
array ids {*} id_:;
do _i=1 to dim(ids)-1;
if missing(ids[_i]) then
do;
_i=dim(ids);
leave;
end;
do _j=_i+1 to dim(ids);
if missing(ids[_j]) then leave;
sib_id_1=ids[_i];
sib_id_2=ids[_j];
output;
end;
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.