DATA Step, Macro, Functions and more

Simplify joins

Reply
Contributor
Posts: 22

Simplify joins

There are 2 datasets, names and partners. In partners , id is given but we need to find corresponding names using "names" dataset

data names;
input st_id st_names $;
datalines;
1 John
2 Mark
3 Steve
4 Smith
5 Cook
6 Messi
7 Mary
8 Kris
9 Kevin
;
run;


data partners;
input st_id1 st_id2 st_id3;
datalines;
1 4 7
2 5 8
;
run;

I need to add 3 columns in dataset partners name_id1,name_id2,name_id3 . To get names from dataset "names" as per id , i think we need to join dataset 3 times. What if datasets has million of records, is there any way we can simply the joins?

Super User
Super User
Posts: 7,432

Re: Simplify joins

How does parteners dataset relate to the first dataset?  I am confused as there are three name fields in it, but a name could only have one partner no?  If we assume the the first column is the person matching st_id then:

data names;
input st_id st_names $;
datalines;
1 John
2 Mark
3 Steve
4 Smith
5 Cook
6 Messi
7 Mary
8 Kris
9 Kevin
;
run;

data partners;
input st_id1 st_id2 st_id3;
datalines;
1 4 7
2 5 8
;
run;

proc sql;
  create table WANT as
  select  A.*,
          (select ST_NAMES from NAMES where ST_ID=B.ST_ID2) as ST_NAME2,
          (select ST_NAMES from NAMES where ST_ID=B.ST_ID3) as ST_NAME3
  from    NAMES A
  left join PARTNERS B
  on      A.ST_ID=B.ST_ID1;
quit;

But that only gives the first two records two partners?

Super User
Posts: 6,982

Re: Simplify joins

Transpose the partners dataset into long format first, then do the join. Don't keep data (number of partner) in structure (column name).

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 22

Re: Simplify joins

In actual scenario, names dataset has 80000 + records and partners has million of records , so cant do the transpose.

Super User
Posts: 6,982

Re: Simplify joins


Ps8813 wrote:

In actual scenario, names dataset has 80000 + records and partners has million of records , so cant do the transpose.


And?

A puny little million is no problem for SAS.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 17,963

Re: Simplify joins

Create a format out of names dataset and then apply format. 

 

Data fmt_def;

set table1;

fmtname='name_fmt';

start=st_id;

label=st_name;

run;

 

proc fmt cntlin=fmt_def;

run;

 

data want;

set table2;

array n_init (3) st_id1-st_id3;

array n_res(3) $ st_id_name1 - st_id_name3;

 

do i  = 1 to 3;

n_res (I)= put(n_init(I), name_fmt.);

end;

run;

 

Ask a Question
Discussion stats
  • 5 replies
  • 264 views
  • 6 likes
  • 4 in conversation