DATA Step, Macro, Functions and more

Create a variable based on another dataset

Accepted Solution Solved
Reply
Contributor
Posts: 53
Accepted Solution

Create a variable based on another dataset

I have 2 datasets, call them data1 and data2. They both have the same set of variables, one of them being ID.

 

data2 is a subset of data1. I want to create a variable in data1 that takes on the value of 1 when the ID variable is present in data2, and 0 if not in data2.

 

What is the most efficient way to do this?

 

 

 


Accepted Solutions
Solution
‎06-06-2017 07:21 PM
PROC Star
Posts: 1,562

Re: Create a variable based on another dataset

Efficient means no sorting.

Like this ?

data WANT; 
  if _N_ = 1 then do;  
    declare hash VET(dataset: "DATA2"); 
    VET.definekey("KEYVAR"); 
    VET.definedone();
  end; 
  set DATA1; 
  FLAG=VAT.check();
run;

 

 

 

 

 

View solution in original post


All Replies
Trusted Advisor
Posts: 1,389

Re: Create a variable based on another dataset

If needed sort datasets by ID then use merge:

data want;
merge data1(in=in1)
      data2(in=in2 keep=ID)
 ;  by ID;
       if in1 and in2 then flag=1;
       else flag=2;
run;
Contributor
Posts: 53

Re: Create a variable based on another dataset

Hello - thanks for the response. I tried this and the merged dataset has duplicate observations. Is there a way to just add a variable, flag, to data1, with value 1 if the ID variable is in data2, an value 0 if not? (without doing a merge)?

Solution
‎06-06-2017 07:21 PM
PROC Star
Posts: 1,562

Re: Create a variable based on another dataset

Efficient means no sorting.

Like this ?

data WANT; 
  if _N_ = 1 then do;  
    declare hash VET(dataset: "DATA2"); 
    VET.definekey("KEYVAR"); 
    VET.definedone();
  end; 
  set DATA1; 
  FLAG=VAT.check();
run;

 

 

 

 

 

Super Contributor
Posts: 259

Re: Create a variable based on another dataset

The function check returns 0 if the key was found in the hash object.

 

FLAG = (VAT.check() = 0);

Contributor
Posts: 53

Re: Create a variable based on another dataset

Thank you both! worked perfectly with the edit from andreas.

PROC Star
Posts: 1,562

Re: Create a variable based on another dataset

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 272 views
  • 3 likes
  • 4 in conversation