BookmarkSubscribeRSS Feed
juliehn
Fluorite | Level 6

Hi all.

 

I am struggling with structuring my data set in the correct form.

 

I have a dataset of persons and where they meet (see example "Dataset; People"). Instead of that structure I want a structure where colunm 1 is person1, column 2 is the meeting point and column 3 is the persons person 1 meets. So some people meet more people and different people.

 

Dataset; People
Meeting pointPerson
Adress 1A
Adress 1B
Adress 2A
Adress 2B
Adress 2C
Adress 3B
Adress 3C

 

So far I use the code following code:

 

data people1;

set people1;

run;

 

data people2;

set people2;

run;

 

Where the datasets people1 and people2 are exactly the same except the name of column 2:

 

Dataset; People1Dataset; People2
Meeting pointPerson1Meeting pointPerson2
Adress 1AAdress 1A
Adress 1BAdress 1B
Adress 2AAdress 2A
Adress 2BAdress 2B
Adress 2CAdress 2C
Adress 3BAdress 3B
Adress 3CAdress 3C

 

To create one dataset I use the code:

 

proc sql;

create table dataset as

select a.meeting_point, a. person1, b.person2

from people1 as a

left join people2 as b

on a.meeting_point=b.meeting_point and a.person1 ne b.person2

where b.person2 ne "";

quit;

 

The problem with using that code is that the persons in column1 is distinct but not the pairs of people. I get the following:

 

 
Meeting pointPerson1Person2
Adress 1AB
Adress 1AC
Adress 1AD
Adress 1AE
Adress 1BA
Adress 1BC
Adress 1BD
Adress 1BE
Adress 1CA
Adress 1CB
Adress 1CD
Adress 1CE
Adress 1DA
Adress 1DB
Adress 1DC
Adress 1DE
Adress 1EA
Adress 1EB
Adress 1EC
Adress 1ED

 

The ones marked with red and dublet pairs and I only need distinct pairs.

 

Hope someone has a solution.

 

Thank you!

3 REPLIES 3
juliehn
Fluorite | Level 6
You are absolutely right. The dataset is just at small selection/example of the real data set which is extremely large. The important part is the structures and the hurdle is to delete dublicate pairs (i.e A-B is the same as B-A).
Kurt_Bremser
Super User

Then I suggest a small change to your SQL:

data people;
input meeting_point $ person $;
cards;
ad1 a
ad1 b
ad2 a
ad2 b
ad2 c
ad3 b
ad3 c
;
run;

proc sql;
create table dataset as
select a.meeting_point, a.person as person1, b.person as person2
from people as a
left join people as b
on a.meeting_point = b.meeting_point and a.person < b.person
where b.person ne '';
quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 692 views
  • 0 likes
  • 2 in conversation