Solved: Re: Creating a variable based on subsequent observations - Page 2

art297 · Posted 08-12-2011 06:05 PM

My thought process in creating the code was that both MomID and DadID would both have missing values at the beginning. Thus, if that is true, no additional sort should be necessary. The only purpose of the sort was to put the mothers and fathers at the end of each residence's records.

chuakp · Posted 08-12-2011 08:11 PM

Thanks for the help. If I could extend this problem one step further,I need to now figure out how to store mom's degree and assign it to all of her children. The individual's degree is represented by the variable DEGREE. The variable HASMOM indicates whether a child has a mother in the survey. Here's what the dataset (called temp1) looks like:

DUID	PID	DUPERSID	MOMID	DEGREE	MOTHER	HASMOM
1001	101	1001101		1	1
1001	102	1001102		2
1001	103	1001103		3	1
1001	104	1001104	1001101	8		1
1001	105	1001105	1001103	8		1
1002	101	1002101		1
1003	101	1003101		1
1003	102	1003102		2
1003	103	1003103	1003102	8		1
1003	104	1003104	1003102	8		1

I basically want DUPERSID 1001104 to have a variable MOMDEGREE that equals 1 (which is that child's mom's degree). DUPERSID 1001105 should have MOMDEGREE = 3, DUPERSID 1003103 and 1003104 should both have MOMDEGREE = 2. Using the code below, I was able to successfully assign MOMDEGREE to children in one-mother dwellings, but I run into problems with the multiple mother dwellings.

proc sort data = temp1;

by duid descending mother;

run;

data temp1_temp; set temp1;

by duid;

retain hold_momdegree;

length hold_momdegree $100.;

if first.DUID then do;

hold_momdegree="";

end;

momdegree = 0;

if mother = 1 then hold_momdegree = degree;

if hasmom = 1 then momdegree = hold_momdegree;

run;

art297 · Posted 08-12-2011 11:11 PM

I think that you only need a fairly simple datastep merge or sql join. In a datastep, you could probably do it with something like:

data have;

infile cards truncover;

input (DUID PID DUPERSID MomID DadID) ($);

cards;

1001 101 1001101

1001 102 1001102

1001 103 1001103

1001 104 1001104 1001101 1001102

1001 105 1001105 1001103 1001102

1002 101 1002101

1003 101 1003101

1003 102 1003102

1003 103 1003103 1003102 1003101

1003 104 1003104 1003102 1003101

;

proc sort data=have;

by DUID descending MomID;

run;

data want (drop=Hold:);

set have;

by DUID;

retain Hold_MomID Hold_DadID;

length Hold_MomID Hold_DadID $100;

if first.DUID then do;

Hold_MomID="";

Hold_DadID="";

end;

Mother=0;

Father=0;

if missing(MomID) then do;

if index(Hold_MomID,DUPERSID) gt 0 then Mother=1;

end;

else Hold_MomID=catx(" ",Hold_MomID,MomID);

if missing(DadID) then do;

if index(Hold_DadID,DUPERSID) gt 0 then Father=1;

end;

else Hold_DadID=catx(" ",Hold_DadID,DadID);

run;

data mdegrees;

input (DUID PID DUPERSID MOMID DEGREE) ($) MOTHER HASMOM;

cards;

1001 101 1001101 . 1 1 .

1001 102 1001102 . 2 . .

1001 103 1001103 . 3 1 .

1001 104 1001104 1001101 8 . 1

1001 105 1001105 1001103 8 . 1

1002 101 1002101 . 1 . .

1003 101 1003101 . 1 . .

1003 102 1003102 . 2 . .

1003 103 1003103 1003102 8 . 1

1003 104 1003104 1003102 8 . 1

;

run;

proc sort data=mdegrees (keep=dupersid degree

rename=(degree=mdegree

dupersid=MomID)

)

out=mdegrees;

by MomID mdegree;

run;

proc sort data=want;

by MomID;

run;

data new_want;

merge want (in=a) mdegrees (in=b);

by MomID;

if a;

run;

chuakp · Posted 08-15-2011 07:55 AM

Thank you - this wored well.

DBailey · Posted 08-11-2011 03:26 PM

I think a straightforward sql approach would be easier to understand:

data work.detail_data;
input
Dwelling_ID
Person_ID
ID $
Mom_ID
Dad_ID;
datalines;
1001 101 1001_101 . .
1001 102 1001_102 . .
1001 103 1001_103 . .
1001 104 1001_104 101 .
1001 105 1001_105 . 102
;
run;

proc sql;
select
t1.dwelling_id,
t1.person_id,
t1.id,
t1.Mom_ID,
t1.Dad_ID,
case when count(t2.person_id)>0 then 1 else 0 end as IsMom,
case when count(t3.person_id)>0 then 1 else 0 end as IsDad
from
work.detail_data t1
left outer join work.detail_data t2
on t1.Person_ID=t2.Mom_id
left outer join work.detail_data t3
on t1.Person_ID=t3.Dad_ID
group by
t1.dwelling_id,
t1.person_id,
t1.id,
t1.mom_id;
quit;

produces this output

Dwelling_ID	Person_ID	ID	Mom_ID	Dad_ID	IsMom	IsDad
1001	101	1001_101	.	.	1	0
1001	102	1001_102	.	.	0	1
1001	103	1001_103	.	.	0	0
1001	104	1001_104	101	.	0	0
1001	105	1001_105	.	102	0	0

Ksharp · Posted 08-11-2011 10:39 PM

OK.There are other alternative ways in SAS. Both proc format and Hash Table are best tool for querying question.But Hash Table is faster and need more code.

data one;
input dwellingid : $4. personid : $3. id : $8. (momid dadid) (: $3.);
datalines;
1001 101 1001_101 N/A N/A
1001 102 1001_102 N/A N/A
1001 103 1001_103 N/A N/A
1001 104 1001_104 101 102
1001 105 1001_105 103 102
;
run;
proc sort data=one(keep=momid) out=one1 nodupkey;by momid;run;
proc sort data=one(keep=dadid) out=one2 nodupkey;by dadid;run;
data monid;
 set one1(rename=(momid=start) where=(start ne 'N/A'));
 retain fmtname 'momid' label 'Y' type 'C';
run; 
data dadid;
 set one2(rename=(dadid=start) where=(start ne 'N/A'));
 retain fmtname 'dadid' label 'Y' type 'C';
run;
proc format  cntlin=monid;run;
proc format cntlin=dadid;run;
data want;
 set one;
 mother=ifn(put(personid,$momid.)='Y',1,0);
 father=ifn(put(personid,$dadid.)='Y',1,0);
run;

Ksharp

Re: Creating a variable based on subsequent observations

Re: Creating a variable based on subsequent observations

Re: Creating a variable based on subsequent observations

Re: Creating a variable based on subsequent observations

Re: Creating a variable based on subsequent observations

Re: Creating a variable based on subsequent observations

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away