BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Demographer
Pyrite | Level 9

I have a dataset pop_dm that contains individuals with some characteristics. I have another dataset lfp which contains parameters from a logit regression for the labor force participation. I want to use those parameters to calculate in pop_dm the individual probability of being in the labor force. I would like to know if there is a simple way to merge both datasets. The variable intercept from lfp should for instance match individuals by sex only. Similarly, the variable agegr_edu_p (parameters for the interaction agegr and edu) should match by sex, agegr and education. I thus cannot use a single merge statement, because the “by” are not the same for the different parameters.

 

My first idea was to do it in many data steps, but it looks over complicated.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

You could either use SQL joins or data step hash lookups. Below a SQL approach (code not tested).

proc sql;
  create table want as
    select 
      p.*,
      t1.intercept,
      t2.agegr_edu_p
    from 
      pop_dm p

      left join
      (
        select sex, intercept
        from lfp
        where not missing(intercept)
      ) t1
      on p.sex=t1.sex

      left join
      (
        select sex, agegr, edu, agegr_edu_p
        from lfp
        where not missing(agegr_edu_p)
      )t2
      on p.sex=t2.sex and p.agegr=t2.agegr and p.edu=t2.edu
    ;
quit;

 

And here a hash lookup (not tested).

data want;
  if _n_=1 then
    do;
      if 0 then set lpf(keep=intercept agegr_edu_p);

      dcl hash h1(dataset:'lpf(keep=sex intercept where=(not missing(intercept))');
      h1.defineKey('sex');
      h1.defineData('intercept');
      h1.defineDone();

      dcl hash h2(dataset:'lpf(keep=sex agegr edu agegr_edu_p where=(not missing(agegr_edu_p))');
      h2.defineKey('sex', 'agegr', 'edu');
      h2.defineData('agegr_edu_p');
      h2.defineDone();
    end;

    set pop_dm;
    h1.find();
    h2.find();
run;

 

View solution in original post

3 REPLIES 3
Kurt_Bremser
Super User

Create a format for the sex:

data cntlin;
set ifp;
where nmiss(sex,intercept) = 0 and nmiss(edu,agegr,young_kid,region) = 4;
fmtname = 'sex';
type = 'n';
keep fmtname type sex intercept;
rename sex=start intercept=label;
run;

proc format cntlin=cntlin;
run;

Similarly, extract the data for the more complex value agegr_edu_p, then join on sex,agegr,edu and apply the format during that.

ballardw
Super User

It might help to share the code used to build the model. Several of the modeling procedures have an option to build a special data that can be used to score another data set using the result from the model using Proc PLM.

 

 

Patrick
Opal | Level 21

You could either use SQL joins or data step hash lookups. Below a SQL approach (code not tested).

proc sql;
  create table want as
    select 
      p.*,
      t1.intercept,
      t2.agegr_edu_p
    from 
      pop_dm p

      left join
      (
        select sex, intercept
        from lfp
        where not missing(intercept)
      ) t1
      on p.sex=t1.sex

      left join
      (
        select sex, agegr, edu, agegr_edu_p
        from lfp
        where not missing(agegr_edu_p)
      )t2
      on p.sex=t2.sex and p.agegr=t2.agegr and p.edu=t2.edu
    ;
quit;

 

And here a hash lookup (not tested).

data want;
  if _n_=1 then
    do;
      if 0 then set lpf(keep=intercept agegr_edu_p);

      dcl hash h1(dataset:'lpf(keep=sex intercept where=(not missing(intercept))');
      h1.defineKey('sex');
      h1.defineData('intercept');
      h1.defineDone();

      dcl hash h2(dataset:'lpf(keep=sex agegr edu agegr_edu_p where=(not missing(agegr_edu_p))');
      h2.defineKey('sex', 'agegr', 'edu');
      h2.defineData('agegr_edu_p');
      h2.defineDone();
    end;

    set pop_dm;
    h1.find();
    h2.find();
run;

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 732 views
  • 0 likes
  • 4 in conversation