BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jude1
Fluorite | Level 6

I am trying to merge the 2 datasets by var1 and create 3 datasets outputs, one that has var1 in both, one that has var1 with no income, one that has var1 with no grad_year

the data looks like this:

 

data mydata;

input var1 income;

datalines;

1 400

2 500

3 500

7 600

8 900

9 1000

run;

 

data mydata2;

var1 grad_year

1 2018

2 2019

4 2020 

5 2019 

6 2017

11 2019

 

merging is not an issue for me. I think i should use the 'if then output' statement for this but i could not figure it out.

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

Use next code:

data both no_income no_grad_year;
 merge mydata(in=in1)
           mydata2(in=in2)
     ;
   by var1;
         if in1 and in2 then output both; else    /* line edited */
         if in1 and not in2 then output no_grad_year; 
         else output no_income ;
run;
         

View solution in original post

8 REPLIES 8
Shmuel
Garnet | Level 18

Use next code:

data both no_income no_grad_year;
 merge mydata(in=in1)
           mydata2(in=in2)
     ;
   by var1;
         if in1 and in2 then output both; else    /* line edited */
         if in1 and not in2 then output no_grad_year; 
         else output no_income ;
run;
         
jude1
Fluorite | Level 6
Thanks for your help. So I wrote the code and countered a new problem. i keep getting error 22-322 and 202-322 on the data2 if statement. data1 and data3 are coming out correct, but not data2. any tips?

data data1 data2 data3;
merge mydata1(in=a) mydata2(in=b); by var1;
if a and b; output data1;
if b and not a then output data2; else output data3;
run;
proc print data=data1;
proc print data=data2;
proc print data= data3;
JackHamilton
Lapis Lazuli | Level 10

In your code (reformatted to make it easier to read):

 

data data1 data2 data3;
    merge mydata1(in=a) 
          mydata2(in=b); by var1;
if a and b; 
output data1;
if b and not a then 
    output data2; 
else 
    output data3;
run;

When do you think "if b and not a" will be true?  The line "if a and b" has already told SAS that you want only those records where both a and b are true.  So you won't get any records where b is false.  No code after a false subsetting if will be executed (with the exception of subroutines).

 

PGStats
Opal | Level 21

Replace

 

if a and b; output data1;

 

with

 

if a and b then output data1;

PG
Shmuel
Garnet | Level 18

Sorry, I had a typo and missed the word THEN.

I have edited and corrected my first post. 

Sonia2
Calcite | Level 5
I got the same question in exam but I want to clear some concepts, In same question can I use if in1 and in2 then output both; if in1 and not in2 then output no_grad_year;
If not in1 and in2 then output no_income ;
And my second question is when we give answer to question in exam if we look at else if in1 and not in2 then output no_grad_year; in output we will get some missing values if I want answer to question and answer is missing value then I’ll write for or type missing value or what ?? Please explain me.
Shmuel
Garnet | Level 18

Your post:


@Sonia2 wrote:
I got the same question in exam but I want to clear some concepts, In same question can I use if in1 and in2 then output both; if in1 and not in2 then output no_grad_year;
If not in1 and in2 then output no_income ;
And my second question is when we give answer to question in exam if we look at else if in1 and not in2 then output no_grad_year; in output we will get some missing values if I want answer to question and answer is missing value then I’ll write for or type missing value or what ?? Please explain me.

Please notice that IN1 and IN2 - each contains 1 for EXIST and 0 for absent. 

 

As long as you didn't arrive to RETURN statement or to the end of the program, the data is still in the memory and you can write it to output, either the same dataset (making duplicates) or to other dataset.

 

Missing value in a variable is the result of nonexistance record on the dataset that contains this variable.

 

 

 

mkeintz
PROC Star

Consider using the IN= parameters on the MERGE statement, as in:

 

data both  noincome nogradyear;
  merge mydata (in=in1) mydata2 (in=in2);
  by var1;
   ....
run;

Do tests on the values of IN1 and IN2, and then conditional OUTPUT statements based on those tests.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1518 views
  • 5 likes
  • 6 in conversation