Solved: Re: How to create a variable based on the matching values of other two...

saramanzella · Posted 07-23-2018 06:26 AM

Hello everyone!

I'm using SAS Studio.

I have the following dataset with three variables: ID is the identification of the respondent, PARTNERID is the identification of the partner of the respondent and YEARSEDU is the number of years of education of the respondent.

ID PARTNERID YEARSEDU

AT-000327-01 AT-000327-02 15

AT-000327-02 AT-000327-01 20

AT-000328-01 . 10

AT-000329-01 AT-000329-02 8

AT-000329-02 AT-000329-01 13

I'd like to create another variable that represents the years of education of the partner of the respondent so that I can obtain the following dataset:

ID PARTNERID YEARSEDU YEARSEDU_P

AT-000327-01 AT-000327-02 15 20

AT-000327-02 AT-000327-01 20 15

AT-000328-01 . 10 .

AT-000329-01 AT-000329-02 8 13

AT-000329-02 AT-000329-01 13 8

I know I should compare the value of ID with every value of PARTNERID and if I find a match use the value of YEARSEDU in the matched PARTNERID row as value for YEARSEDU_P in the ID row.

I really have no idea on how to do that, can you please help me?

Many thanks.

Regards,

Sara

andreas_lds · Posted 07-23-2018 09:11 AM

If the dataset is not to large, using a hash-object can solve the problem:

data work.have;
   length id partnerId $ 15 yearsEdu 8;
   input id partnerId yearsEdu;
   datalines;
AT-000327-01 AT-000327-02 15 
AT-000327-02 AT-000327-01 20 
AT-000328-01 . 10 
AT-000329-01 AT-000329-02 8 
AT-000329-02 AT-000329-01 13 
;
run;

data want;   
   set work.have;

   length yearsEdu_P 8;

   if _n_ = 1 then do;
      declare hash h(dataset: 'work.have(keep=partnerId yearsEdu rename=(yearsEdu = yearsEdu_P))');
      h.defineKey('partnerId');
      h.defineData('yearsEdu_P');
      h.defineDone();
   end;

   if h.find(key: id) ^= 0 then do;
      yearsEdu_P = .;
   end;
run;

View solution in original post

RW9 · Posted 07-23-2018 07:58 AM

This may not be the most efficient method (and not tested as no test data in datastep):

proc sql;
  create table want as 
  select  a.*,
          b.yearsedu_p
  from    have a
  left join (select id,sum(yearsedu) as yearsedu_p from have group by id) b
  on      a.partnerid=b.id;
quit;

andreas_lds · Posted 07-23-2018 09:11 AM

If the dataset is not to large, using a hash-object can solve the problem:

data work.have;
   length id partnerId $ 15 yearsEdu 8;
   input id partnerId yearsEdu;
   datalines;
AT-000327-01 AT-000327-02 15 
AT-000327-02 AT-000327-01 20 
AT-000328-01 . 10 
AT-000329-01 AT-000329-02 8 
AT-000329-02 AT-000329-01 13 
;
run;

data want;   
   set work.have;

   length yearsEdu_P 8;

   if _n_ = 1 then do;
      declare hash h(dataset: 'work.have(keep=partnerId yearsEdu rename=(yearsEdu = yearsEdu_P))');
      h.defineKey('partnerId');
      h.defineData('yearsEdu_P');
      h.defineDone();
   end;

   if h.find(key: id) ^= 0 then do;
      yearsEdu_P = .;
   end;
run;

PeterClemmensen · Posted 07-23-2018 11:43 AM

I think @andreas_ldss solution is probably the most efficient you will find if the data fits in memory.

Here is an alternative data step approach without the hash object.

data work.have;
   length id partnerId $ 15 yearsEdu 8;
   input id partnerId yearsEdu;
   datalines;
AT-000327-01 AT-000327-02 15 
AT-000327-02 AT-000327-01 20 
AT-000328-01 . 10 
AT-000329-01 AT-000329-02 8 
AT-000329-02 AT-000329-01 13 
;
run;

data want(drop=compPartnerID);
   set have nobs=nobs;
   do _N_=1 to nobs;
      set have(rename=(partnerId=compPartnerID yearsEdu=yearsEdu_P)) point=_N_;
      if id=partnerId then output;
   end;
   if missing(partnerId) then do; 
      call missing(yearsEdu_P);
      output;
   end;
run;

The DATA to DATA Step Macro
Blog: SASnrd

How to create a variable based on the matching values of other two variables in the same dataset?

Re: How to create a variable based on the matching values of other two variables in the same dataset

Re: How to create a variable based on the matching values of other two variables in the same dataset

Re: How to create a variable based on the matching values of other two variables in the same dataset

Re: How to create a variable based on the matching values of other two variables in the same dataset

How to create a variable based on the matching values of other two variables in the same dataset?

Re: How to create a variable based on the matching values of other two variables in the same dataset

Re: How to create a variable based on the matching values of other two variables in the same dataset

Re: How to create a variable based on the matching values of other two variables in the same dataset

Re: How to create a variable based on the matching values of other two variables in the same dataset

SAS Innovate 2025: Call for Content

Click image to register for webinar

Classroom Training Available!