comparing populations

SAS Community-

Please advise on a way to approach this question. No doubt it's very simple. I just can't seem to figure out the best way to tackle it.

As an example, I have a number of variables, including educated (1=yes, 0=no) and subject ID. Information was collected on both the primary subjects and the friends of each primary subject.

To identify the friends of a primary subject, each friend was assigned a subject ID similar to the primary subject's ID (the same characters except for the last one).


Primary subject A: ID = 99980; friends of A: IDs = 99982 and 99983

Primary subject B: ID = 2A345; friends of B: IDs = 2A346 and 2A348

My study focuses on the education practices of the friends of the eligible primary subjects. Please help me with code that will identify the friends of the eligible primary subjects, where an eligible primary subject is defined as one with educated = 1 (yes).


Re: comparing populations

Posted in reply to sophia_SAS


Do you have a variable that identifies which IDs are primary IDs? If so, my code may be helpful to you.

data have;
  input id $ p $ educated;
  cards;
99980 p 1
99981 f 0
99982 f 1
99983 f 1
2A345 p 0
2A346 f 1
2A347 f 1
2A348 f 1
3B345 p 1
3B346 f 0
3B347 f 1
3B348 f 0
;
run;

data primary; /* this dataset has all the primary IDs with educated=1 */
  set have;
  if upcase(p)='P' and (educated=1);
  length new_id $4;
  new_id=substr(id,1,4); /* group key: first four characters of the ID */
run;

proc sort data=primary; by new_id; run;

data friends; /* this dataset has all the friend IDs */
  set have;
  if upcase(p)='F';
  length new_id $4;
  new_id=substr(id,1,4); /* group key: first four characters of the ID */
run;

proc sort data=friends; by new_id; run;

data want; /* this dataset has all the friends with an educated primary subject */
  merge primary(in=a keep=new_id) friends(in=b);
  by new_id;
  if a and b;
run;

proc print; run;

                             Obs    new_id     id      p    educated
                              1      3B34     3B346    f        0
                              2      3B34     3B347    f        1
                              3      3B34     3B348    f        0
                              4      9998     99981    f        0
                              5      9998     99982    f        1
                              6      9998     99983    f        1
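
For comparison, the same selection could probably be done in one PROC SQL step by joining the data to itself on the ID with its last character removed. This is only a sketch: it assumes the dataset have built above (with p coded as 'p' or 'f'), it assumes the IDs within a group differ only in their last character, and the table name want_sql is made up for the example.

```sas
/* Sketch only: assumes HAVE has id, p ('p' or 'f') and educated (1/0). */
/* WANT_SQL is a hypothetical output name.                              */
proc sql;
  create table want_sql as
    select f.id, f.p, f.educated
      from have as f
           inner join
           have as pr
        /* match on the ID minus its last character */
        on substr(f.id, 1, length(f.id) - 1) =
           substr(pr.id, 1, length(pr.id) - 1)
      where upcase(f.p) = 'F'       /* keep friend rows only          */
        and upcase(pr.p) = 'P'      /* paired with their primary row  */
        and pr.educated = 1         /* where the primary is eligible  */
      order by f.id;
quit;
```

With the sample data this should pick the same six friend rows as the merge. Because it strips only the last character, it does not depend on every ID being five characters long.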


Re: comparing populations

Posted in reply to sophia_SAS

I didn't look at all of Linlin's suggested code but, in the event that you don't have p already defined, you could easily create it with something like the following:

data havenow;
  input id $ educated;
  cards;
99980 1
99981 0
99982 1
99983 1
2A345 0
2A346 1
2A347 1
2A348 1
3B345 1
3B346 0
3B347 1
3B348 0
;
run;


proc sql;
  create table have as
    /* assumes the primary subject has the lowest ID in its group */
    select *,substr(id,1,length(id)-1) as partid,
      case
        when id=min(id) then 'p'
        else 'f'
      end as p
      from havenow
        group by calculated partid
  ;
quit;

