create new column in a data set where same variables between two data sets are identified with a 1

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 6
Accepted Solution

create new column in a data set where same variables between two data sets are identified with a 1

I have this data set to deal with, its kinda complicated to explain but I will try my best. Please let me know if I need to elobrate more. 

 

Below I have pasted in what the data sets I have (this is just a small set I made up the actual data set has thousands of rows) and what the desired output needs to be. Basically the variables in Pid (data set b) match some of the variables in Mid (data set a). I need to create a column (F1) where when Mid=Pid the corresponding row is equal to 1, and if the Pid is missing the corresponding row in F1 is 0. I have tried doing this in proq sql, but proc sql is very new to me. I really appreciate any sort of guidance or help. 

Thanks in advance!

 

 

Data set a

MidCid
536
846
1665
1769
2978
3185
4890
5399
56115
58123
63142
71151
180160
540169
120173
160181
175190
142200

 

 

Data set b

Pid
8
16
29
48
53
56
63
71
180
540
120

 

What I want to get

MidCidFid
5360
8461
16651
17690
29781
31850
48901
53991
561151
581230
631421
711511
1801601
5401691
1201731
1601810
1751900
1422000

Accepted Solutions
Solution
‎04-25-2017 11:05 AM
Respected Advisor
Posts: 4,609

Re: create new column in a data set where same variables between two data sets are identified with a

You need a left join:

 

data a;
input Mid	Cid;
datalines;
5	36
8	46
16	65
17	69
29	78
31	85
48	90
53	99
56	115
58	123
63	142
71	151
180	160
540	169
120	173
160	181
175	190
142	200
;

data b;
input Pid;
datalines;
8
16
29
48
53
56
63
71
180
540
120
;

proc sql;
create table c as
select 
    a.*, 
    b.Pid is not missing as Fid
from
    a left join
    b on a.Mid=b.Pid;
select * from c;
quit;


PG

View solution in original post


All Replies
Grand Advisor
Posts: 17,424

Re: create new column in a data set where same variables between two data sets are identified with a

A SQL Left Join is what you need, plus a CASE statement to identify the source record. 

 

 

data have1;
input Mid Cid;
cards;
5 36
8 46
16 65
17 69
29 78
31 85
48 90
53 99
56 115
58 123
63 142
71 151
180 160
540 169
120 173
160 181
175 190
142 200
;
run;
 
Data have2;
input Pid;
cards;
8
16
29
48
53
56
63
71
180
540
120
;
run;

proc sql;
create table want as
select h1.*, 
    case when missing(h2.pid) then 0
    else 1
    end as flag
from have1 as h1
left join have2 as h2
on h1.mid=h2.pid;
quit;
Occasional Contributor
Posts: 6

Re: create new column in a data set where same variables between two data sets are identified with a

Thanks so much for the help! I didn't know about the case statement for proc sql.

Solution
‎04-25-2017 11:05 AM
Respected Advisor
Posts: 4,609

Re: create new column in a data set where same variables between two data sets are identified with a

You need a left join:

 

data a;
input Mid	Cid;
datalines;
5	36
8	46
16	65
17	69
29	78
31	85
48	90
53	99
56	115
58	123
63	142
71	151
180	160
540	169
120	173
160	181
175	190
142	200
;

data b;
input Pid;
datalines;
8
16
29
48
53
56
63
71
180
540
120
;

proc sql;
create table c as
select 
    a.*, 
    b.Pid is not missing as Fid
from
    a left join
    b on a.Mid=b.Pid;
select * from c;
quit;


PG
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 128 views
  • 2 likes
  • 3 in conversation