DATA Step, Macro, Functions and more

Merge/join

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 19
Accepted Solution

Merge/join

I want to combine two data sets but I need all the data from table one to repeat whether or not a match exists in table two. Think of it like table one contains the survey questions and table two the responses. If I look up a respondent, I want to see all the questions even if they left some blank. This works great when merging questions with one respondents responses but with multiple the questions don't repeat so all I see are the questions that were answered.
Table 1
Form_id question
99 1
99 2
99 3
99 4

Table 2
question response respondent_id
2 Apple 1001
3 Car 1001
4 Sky 1001
1 Outside 1002
2 Pear 1002
3 Truck 1002
4 Outside 1002


I want table three to keep question 1 for respondent 1001 even though no record exists for it in table two.
Example of table three for respondent 1001:

Form_id question response respondent_id
99 1
99 2 Apple 1001
99 3 Car 1001
99 4 Outside 1001

Any help you could provide would be greatly appreciated!

Accepted Solutions
Solution
‎03-04-2016 10:19 AM
Occasional Contributor
Posts: 19

Re: Merge/join

You know when you step away from something that's been wracking your brain for hours (or days) and then you return and the solution just smacks you in the face?  Well I just had that moment.  I can't believe I didn't think of this before and it's so simple.  Thank you to those who replied.  I really appreciate the time you put into helping me.
 
data table1;
input form_id question;
cards;
98 1
98 2
98 3
98 4
98 5
99 1
99 2
99 3
99 4
; run;
 
data table2;
input form_id question response $ respondent_id date;
cards;
98 2 Orange 1002 201601
98 3 Bike 1002 201601
98 4 Smurf 1002 201601
98 5 Red 1002 201601
98 1 Inside 1004 201601
98 3 Car 1004 201601
98 4 Crayon 1004 201601
98 5 Yellow 1004 201601
98 1 Inside 1004 201602
98 3 Car 1004 201602
98 4 Crayon 1004 201602
98 5 Yellow 1004 201602
 
99 2 Apple 1001 201601
99 3 Car 1001 201601
99 4 Sky 1001 201601
99 1 Inside 1001 201602
99 2 Apple 1001 201602
99 3 Car 1001 201602
99 4 Sky 1001 201602
99 1 Outside 1002 201601
99 2 Pear 1002 201601
99 3 Truck 1002 201601
99 4 Blueberries 1002 201601
; run;
 
/*Get all unique combinations of form, respondent and date*/
proc sql;
create table tableA as
select distinct form_id, respondent_id, date
from table2;
quit;
 
/*Add the questions from table 1*/
proc sql;
create table tableB as
select a.*, b.question
from tableA a left join table1 b
on a.form_id = b.form_id;
quit;
 
/*Add the responses from table 2*/
proc sql;
create table tableC as
select a.*, b.response
from tableB a left join table2 b
on a.form_id = b.form_id and a.question = b.question and a.respondent_id = b.respondent_id and a.date = b.date;
quit;

View solution in original post


All Replies
Valued Guide
Posts: 858

Re: Merge/join

data table1;
input Form_id question;
cards;
99 1
99 2
99 3
99 4
;run;

data table2;
input question response$ respondent_id;
cards;
2 Apple 1001
3 Car 1001
4 Sky 1001
1 Outside 1002
2 Pear 1002
3 Truck 1002
4 Outside 1002
;run;

proc sql;
create table respondent_id as
select distinct respondent_id
from table2;

data fill (drop=i);
  set respondent_id;
  do I=1 to 4;
    question=I;
    output;
  end;
run;

proc sql;
  create table full_table2 as
  select  COALESCE(A.respondent_id,B.respondent_id) as respondent_id,   
          COALESCE(A.question,B.question) as question,
          A.response
  from    WORK.table2 A
  full join WORK.fill B
  on      A.respondent_id=B.respondent_id
  and     A.question=B.question;
quit;

Regular Contributor
Posts: 234

Re: Merge/join

data dat1;
input form_id question;
cards;
99 1
99 2
99 3
99 4
;
data dat2;
input question response $ respondent_id;
cards;
2 Apple 1001
3 Car 1001
4 Sky 1001
1 Outside 1002
2 Pear 1002
3 Truck 1002
4 Outside 1002
;

proc sort data=dat1;
 by question;
run;

proc sort data=dat2 nodupkey;
 by question;
run;

data want;
  merge dat1(in=a) dat2(in=b);
  by question;
  if a;
  if question=1 then call missing(response,respondent_id);
run;
Occasional Contributor
Posts: 19

Re: Merge/join

Thank you for your solution. The output is just what I was looking for and it fills in data where needed.
Unfortunately I'm having trouble implementing your solution because the real data is more complicated.

For example. I could have multiple form id's and each form could have a different number of questions. As well the question id's are alphanumeric (1v2,2v2...). Your solution only uses table 2 so table 2 would need to include form_id.

Is there any way you could help me again?
Valued Guide
Posts: 858

Re: Merge/join

Sure, give some example data that illustrates some records that have desired results and some that do not.

Occasional Contributor
Posts: 19

Re: Merge/join

Example table desired
Form_id question response respondent_id date
99 1v2 . 1001 201601
99 2v2 apple 1001 201601
99 3v2 car 1001 201601
99 4v2 yellow 1001 201601
99 1v2 3.99 1002 201601
99 2v2 apple 1002 201601
99 3v2 truck 1002 201601
99 4v2 red 1002 201601
98 1v1 out 1001 201601
98 2v1 . 1001 201601
98 3v1 90 1001 201601
98 4v1 piano 1001 201601
98 5v1 radio 1001 201601
98 1v1 in 1001 201602
98 2v1 zoo 1001 201602
98 3v1 . 1001 201602
98 4v1 guitar 1001 201602
98 5v1 radio 1001 201602

I want to see all form_id's and all questions for each respondent_id and date. I only want a response if one was given so it's the only field that should have blanks.

Do you need my desired output table provided in an original input table so you can see what will be provided? It's very similar to what I put in my original post except I added more than one form_id, changed the question from numeric to text and added a date variable.

Thanks in advance!
Valued Guide
Posts: 858

Re: Merge/join

I don't know what ALL form ids or questions means.  I need a list of how many there are and what they are. 

Occasional Contributor
Posts: 19

Re: Merge/join

Ok sure.
For form_id 99 there are 4 questions and the questions are:
1v2
2v2
3v2
4v2
For form_id 98 there are 5 questions and the questions are:
1v1
2v1
3v1
4v1
5v1

This corresponds to the desired table I gave as an example. The real table can have more than 100 form_ids with up to 50 questions each. Each month the number of form_ids and/or the number of questions could change (hence the v1, v2 in the question values). I'm hoping for a solution that can loop through all observations.


Solution
‎03-04-2016 10:19 AM
Occasional Contributor
Posts: 19

Re: Merge/join

You know when you step away from something that's been wracking your brain for hours (or days) and then you return and the solution just smacks you in the face?  Well I just had that moment.  I can't believe I didn't think of this before and it's so simple.  Thank you to those who replied.  I really appreciate the time you put into helping me.
 
data table1;
input form_id question;
cards;
98 1
98 2
98 3
98 4
98 5
99 1
99 2
99 3
99 4
; run;
 
data table2;
input form_id question response $ respondent_id date;
cards;
98 2 Orange 1002 201601
98 3 Bike 1002 201601
98 4 Smurf 1002 201601
98 5 Red 1002 201601
98 1 Inside 1004 201601
98 3 Car 1004 201601
98 4 Crayon 1004 201601
98 5 Yellow 1004 201601
98 1 Inside 1004 201602
98 3 Car 1004 201602
98 4 Crayon 1004 201602
98 5 Yellow 1004 201602
 
99 2 Apple 1001 201601
99 3 Car 1001 201601
99 4 Sky 1001 201601
99 1 Inside 1001 201602
99 2 Apple 1001 201602
99 3 Car 1001 201602
99 4 Sky 1001 201602
99 1 Outside 1002 201601
99 2 Pear 1002 201601
99 3 Truck 1002 201601
99 4 Blueberries 1002 201601
; run;
 
/*Get all unique combinations of form, respondent and date*/
proc sql;
create table tableA as
select distinct form_id, respondent_id, date
from table2;
quit;
 
/*Add the questions from table 1*/
proc sql;
create table tableB as
select a.*, b.question
from tableA a left join table1 b
on a.form_id = b.form_id;
quit;
 
/*Add the responses from table 2*/
proc sql;
create table tableC as
select a.*, b.response
from tableB a left join table2 b
on a.form_id = b.form_id and a.question = b.question and a.respondent_id = b.respondent_id and a.date = b.date;
quit;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 339 views
  • 1 like
  • 3 in conversation