BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
saramanzella
Fluorite | Level 6

Hello everyone!

 

I'm using SAS Studio.

I have the following dataset with three variables: ID is the identification of the respondent, PARTNERID is the identification of the partner of the respondent and YEARSEDU is the number of years of education of the respondent.

 

 

ID                          PARTNERID           YEARSEDU        

AT-000327-01       AT-000327-02         15                      

AT-000327-02       AT-000327-01         20                      

AT-000328-01       .                              10                       

AT-000329-01       AT-000329-02           8                      

AT-000329-02       AT-000329-01         13                        

 

I'd like to create another variable that represents the years of education of the partner of the respondent so that I can obtain the following dataset:

 

ID                          PARTNERID           YEARSEDU        YEARSEDU_P  

AT-000327-01       AT-000327-02         15                      20

AT-000327-02       AT-000327-01         20                      15

AT-000328-01       .                              10                       .

AT-000329-01       AT-000329-02           8                      13

AT-000329-02       AT-000329-01         13                        8

 

I know I should compare the value of ID with every value of PARTNERID and if I find a match use the value of YEARSEDU in the matched PARTNERID row as value for YEARSEDU_P in the ID row. 

I really have no idea on how to do that, can you please help me?

 

Many thanks.

Regards,

Sara

1 ACCEPTED SOLUTION

Accepted Solutions
andreas_lds
Jade | Level 19

If the dataset is not to large, using a hash-object can solve the problem:

 

data work.have;
   length id partnerId $ 15 yearsEdu 8;
   input id partnerId yearsEdu;
   datalines;
AT-000327-01 AT-000327-02 15 
AT-000327-02 AT-000327-01 20 
AT-000328-01 . 10 
AT-000329-01 AT-000329-02 8 
AT-000329-02 AT-000329-01 13 
;
run;

data want;   
   set work.have;

   length yearsEdu_P 8;

   if _n_ = 1 then do;
      declare hash h(dataset: 'work.have(keep=partnerId yearsEdu rename=(yearsEdu = yearsEdu_P))');
      h.defineKey('partnerId');
      h.defineData('yearsEdu_P');
      h.defineDone();
   end;

   if h.find(key: id) ^= 0 then do;
      yearsEdu_P = .;
   end;
run;

View solution in original post

3 REPLIES 3
RW9
Diamond | Level 26 RW9
Diamond | Level 26

This may not be the most efficient method (and not tested as no test data in datastep):

proc sql;
  create table want as 
  select  a.*,
          b.yearsedu_p
  from    have a
  left join (select id,sum(yearsedu) as yearsedu_p from have group by id) b
  on      a.partnerid=b.id;
quit;
andreas_lds
Jade | Level 19

If the dataset is not to large, using a hash-object can solve the problem:

 

data work.have;
   length id partnerId $ 15 yearsEdu 8;
   input id partnerId yearsEdu;
   datalines;
AT-000327-01 AT-000327-02 15 
AT-000327-02 AT-000327-01 20 
AT-000328-01 . 10 
AT-000329-01 AT-000329-02 8 
AT-000329-02 AT-000329-01 13 
;
run;

data want;   
   set work.have;

   length yearsEdu_P 8;

   if _n_ = 1 then do;
      declare hash h(dataset: 'work.have(keep=partnerId yearsEdu rename=(yearsEdu = yearsEdu_P))');
      h.defineKey('partnerId');
      h.defineData('yearsEdu_P');
      h.defineDone();
   end;

   if h.find(key: id) ^= 0 then do;
      yearsEdu_P = .;
   end;
run;
PeterClemmensen
Tourmaline | Level 20

I think @andreas_ldss solution is probably the most efficient you will find if the data fits in memory.

 

Here is an alternative data step approach without the hash object.

 

data work.have;
   length id partnerId $ 15 yearsEdu 8;
   input id partnerId yearsEdu;
   datalines;
AT-000327-01 AT-000327-02 15 
AT-000327-02 AT-000327-01 20 
AT-000328-01 . 10 
AT-000329-01 AT-000329-02 8 
AT-000329-02 AT-000329-01 13 
;
run;

data want(drop=compPartnerID);
   set have nobs=nobs;
   do _N_=1 to nobs;
      set have(rename=(partnerId=compPartnerID yearsEdu=yearsEdu_P)) point=_N_;
      if id=partnerId then output;
   end;
   if missing(partnerId) then do; 
      call missing(yearsEdu_P);
      output;
   end;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 725 views
  • 3 likes
  • 4 in conversation