Hi all.
I am struggling with structuring my data set in the correct form.
I have a dataset of persons and where they meet (see example "Dataset; People"). Instead of that structure I want a structure where colunm 1 is person1, column 2 is the meeting point and column 3 is the persons person 1 meets. So some people meet more people and different people.
Dataset; People | |
Meeting point | Person |
Adress 1 | A |
Adress 1 | B |
Adress 2 | A |
Adress 2 | B |
Adress 2 | C |
Adress 3 | B |
Adress 3 | C |
So far I use the code following code:
data people1;
set people1;
run;
data people2;
set people2;
run;
Where the datasets people1 and people2 are exactly the same except the name of column 2:
Dataset; People1 | Dataset; People2 | ||
Meeting point | Person1 | Meeting point | Person2 |
Adress 1 | A | Adress 1 | A |
Adress 1 | B | Adress 1 | B |
Adress 2 | A | Adress 2 | A |
Adress 2 | B | Adress 2 | B |
Adress 2 | C | Adress 2 | C |
Adress 3 | B | Adress 3 | B |
Adress 3 | C | Adress 3 | C |
To create one dataset I use the code:
proc sql;
create table dataset as
select a.meeting_point, a. person1, b.person2
from people1 as a
left join people2 as b
on a.meeting_point=b.meeting_point and a.person1 ne b.person2
where b.person2 ne "";
quit;
The problem with using that code is that the persons in column1 is distinct but not the pairs of people. I get the following:
Meeting point | Person1 | Person2 |
Adress 1 | A | B |
Adress 1 | A | C |
Adress 1 | A | D |
Adress 1 | A | E |
Adress 1 | B | A |
Adress 1 | B | C |
Adress 1 | B | D |
Adress 1 | B | E |
Adress 1 | C | A |
Adress 1 | C | B |
Adress 1 | C | D |
Adress 1 | C | E |
Adress 1 | D | A |
Adress 1 | D | B |
Adress 1 | D | C |
Adress 1 | D | E |
Adress 1 | E | A |
Adress 1 | E | B |
Adress 1 | E | C |
Adress 1 | E | D |
The ones marked with red and dublet pairs and I only need distinct pairs.
Hope someone has a solution.
Thank you!
Your examples make no sense. There are no persons D and E in your initial dataset, and person C was never at address 1.
Then I suggest a small change to your SQL:
data people;
input meeting_point $ person $;
cards;
ad1 a
ad1 b
ad2 a
ad2 b
ad2 c
ad3 b
ad3 c
;
run;
proc sql;
create table dataset as
select a.meeting_point, a.person as person1, b.person as person2
from people as a
left join people as b
on a.meeting_point = b.meeting_point and a.person < b.person
where b.person ne '';
quit;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.