## comparing populations

SAS Community-

Please advise on a way to approach this question. No doubt it's very simple. I just can't seem to figure out the best way to tackle it.

As an example, I have a number of variables including educated (1=yes, 0=no) and subject ID.  Information was collected on both the primary subject and friends of the primary subject.

To identify friends of the primary subjects, friends were assigned a similar subject ID as the primary subject (same digits except for the last one).

Example:

Primary subject A: ID = 99980; friends of A: ID = 99982 and ID = 99983

Primary subject B: ID = 2A345; friends of B: ID = 2A346 and ID = 2A348

My study focuses on the education practices of the friends of the eligible primary subjects.  Please help me with code that will identify the friends of the eligible primary subjects, where an eligible primary subject is defined as one with 1=yes for the variable educated.

Thanks!

Hi,

Do you have a variable that identifies which ID is a primary ID? If so, my code may be helpful to you.

data have;

input id \$ p \$ educated;

cards;

99980 p 1

99981 f 0

99982 f 1

99983 f 1

2A345 p 0

2A346 f 1

2A347 f 1

2A348 f 1

3B345 p 1

3B346 f 0

3B347 f 1

3B348 f 0

;

data primary;/* this dataset has all the primary IDs with educated=1 */

set have;

if upcase(p)='P' and (educated=1);

length new_id \$4;

new_id=substr(id,1,4);

run;

proc sort; by new_id; run;

data friends;/* this dataset has all the friend IDs */

set have;

if upcase(p)='F';

length new_id \$4;

new_id=substr(id,1,4);

run;

proc sort; by new_id; run;

data want; /* this dataset has all the friends with educated primary subject */

merge primary(in=a keep=new_id) friends(in=b);

by new_id;

if a and b;

run;

proc print;run;

Obs    new_id     id      p    educated

1      3B34     3B346    f        0

2      3B34     3B347    f        1

3      3B34     3B348    f        0

4      9998     99981    f        0

5      9998     99982    f        1

6      9998     99983    f        1

Linlin
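For what it's worth, the same selection could probably be done in one PROC SQL step without the sorts and the merge. This is only a sketch: it assumes the dataset `have` above with the variables `id`, `p`, and `educated`, and the output name `want_sql` is just something I made up for illustration.

```sas
proc sql;
  /* want_sql is a hypothetical output name */
  create table want_sql as
  select f.*
  from have as f
  where upcase(f.p) = 'F'
    /* keep a friend only if the ID minus its last character matches
       the same prefix of a primary subject with educated = 1 */
    and substr(f.id, 1, length(f.id) - 1) in
        (select substr(id, 1, length(id) - 1)
         from have
         where upcase(p) = 'P' and educated = 1)
  order by f.id;
quit;
```

Dropping only the last character, rather than hard-coding the first four, should keep this working if the IDs are not all five characters long.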

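I didn't look at all of Linlin's suggested code but, in the event that you don't already have p defined, you could easily create it with something like the following. Note that it treats the lowest ID in each group as the primary subject, which holds for your examples: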

data havenow;

input id \$ educated;

cards;

99980 1

99981 0

99982 1

99983 1

2A345 0

2A346 1

2A347 1

2A348 1

3B345 1

3B346 0

3B347 1

3B348 0

;

proc sql;

create table have as

select *,substr(id,1,length(id)-1) as partid,

case

when id=min(id) then 'p'

else 'f'

end as p

from havenow

group by calculated partid

;

quit;
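To go from there to the friends of the educated primary subjects, a self-join on partid might be enough. Again just a sketch: it assumes the `have` table created by the PROC SQL step above (with `partid` and `p`), and `want_friends` is a hypothetical output name.

```sas
proc sql;
  /* want_friends is a hypothetical output name */
  create table want_friends as
  select fr.id, fr.educated
  from have as fr
       inner join have as pr
       on fr.partid = pr.partid      /* same group of IDs */
  where fr.p = 'f'                   /* row is a friend */
    and pr.p = 'p'                   /* matched row is the primary subject */
    and pr.educated = 1              /* primary subject is eligible */
  order by fr.id;
quit;
```

With the sample data this should keep the friends in the 9998 and 3B34 groups and drop the 2A34 group, whose primary subject has educated = 0.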

