Hi all.
I am struggling with structuring my data set in the correct form.
I have a dataset of persons and where they meet (see example "Dataset; People"). Instead of that structure I want a structure where colunm 1 is person1, column 2 is the meeting point and column 3 is the persons person 1 meets. So some people meet more people and different people.
Dataset; People | |
Meeting point | Person |
Adress 1 | A |
Adress 1 | B |
Adress 2 | A |
Adress 2 | B |
Adress 2 | C |
Adress 3 | B |
Adress 3 | C |
So far I use the code following code:
data people1;
set people1;
run;
data people2;
set people2;
run;
Where the datasets people1 and people2 are exactly the same except the name of column 2:
Dataset; People1 | Dataset; People2 | ||
Meeting point | Person1 | Meeting point | Person2 |
Adress 1 | A | Adress 1 | A |
Adress 1 | B | Adress 1 | B |
Adress 2 | A | Adress 2 | A |
Adress 2 | B | Adress 2 | B |
Adress 2 | C | Adress 2 | C |
Adress 3 | B | Adress 3 | B |
Adress 3 | C | Adress 3 | C |
To create one dataset I use the code:
proc sql;
create table dataset as
select a.meeting_point, a. person1, b.person2
from people1 as a
left join people2 as b
on a.meeting_point=b.meeting_point and a.person1 ne b.person2
where b.person2 ne "";
quit;
The problem with using that code is that the persons in column1 is distinct but not the pairs of people. I get the following:
Meeting point | Person1 | Person2 |
Adress 1 | A | B |
Adress 1 | A | C |
Adress 1 | A | D |
Adress 1 | A | E |
Adress 1 | B | A |
Adress 1 | B | C |
Adress 1 | B | D |
Adress 1 | B | E |
Adress 1 | C | A |
Adress 1 | C | B |
Adress 1 | C | D |
Adress 1 | C | E |
Adress 1 | D | A |
Adress 1 | D | B |
Adress 1 | D | C |
Adress 1 | D | E |
Adress 1 | E | A |
Adress 1 | E | B |
Adress 1 | E | C |
Adress 1 | E | D |
The ones marked with red and dublet pairs and I only need distinct pairs.
Hope someone has a solution.
Thank you!
Your examples make no sense. There are no persons D and E in your initial dataset, and person C was never at address 1.
Then I suggest a small change to your SQL:
data people;
input meeting_point $ person $;
cards;
ad1 a
ad1 b
ad2 a
ad2 b
ad2 c
ad3 b
ad3 c
;
run;
proc sql;
create table dataset as
select a.meeting_point, a.person as person1, b.person as person2
from people as a
left join people as b
on a.meeting_point = b.meeting_point and a.person < b.person
where b.person ne '';
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.