BookmarkSubscribeRSS Feed
krishnaonline
Calcite | Level 5
Kindly help me with the logic for this one.

If the relation is husband, then all the other observations following it until another husband occurs should be read into a single record, and i need to get corrresponding spouse and child name in the same husbands record but with spouse and child name.

Data to be processed:
NAME AGE GENDER RELATION
A 40 M HUSBAND
B 30 F sPOUSE
C 10 M CHILD
D 40 M HUSBAND
E 30 F SPOUSE
F 10 M CHILD


REPORT:

Name age gender relation spouse child
a 40 m husband b c
d 40 m husband e f
3 REPLIES 3
SushilNayak
Obsidian | Level 7
Hey Krishna,
Try this out

data dx;
infile datalines missover;
input NAME :$10. AGE :3. GENDER :$1. RELATION :$10.;
datalines;
A 40 M HUSBAND
B 30 F sPOUSE
C 10 M CHILD
D 40 M HUSBAND
E 30 F SPOUSE
F 10 M CHILD
;
data dxx;
set dx;
retain counter 0;
if strip(relation) = 'HUSBAND' then counter=counter+1 ;
run;

proc summary nway missing data= dxx;
class counter ;
output out=sushil(drop=_: relation_2 relation_3 gender_2 gender_3 age_2 age_3 counter)
idgroup( out{3} (NAME AGE GENDER RELATION)=);
run;
proc print data=sushil;run;

if you don't like proc summary, you can do datastep manipulation after the dxx dataset creation. I kinda love summary procedure, hence used it over here.

hope that helps 🙂
GertNissen
Barite | Level 11
If your husband ALWAYS has only one SPOUSE and one CHILD, you can simply use this
[pre]
data dx;
infile datalines ;
input NAME :$10. AGE :3. GENDER :$1. RELATION :$10. / S_NAME :$10. / C_NAME :$10.;
datalines;
A 40 M HUSBAND
B 30 F SPOUSE
C 10 M CHILD
D 40 M HUSBAND
E 30 F SPOUSE
F 10 M CHILD
;
[/pre]

But I guess that is not always the case - Below I have added an extra child in the datalines.

[pre]
data TEST;
length name_HUSBAND type R_Name $10;
retain name_HUSBAND count;
input @8 type :$10. @; put type=;
if type='HUSBAND' then
do;
count=0;
input @1 name_HUSBAND :$10.;
end;
else
do;
if type='CHILD' then
do;
count+1;
type=cats(type,count);
end;
input @1 R_Name :$10.;
output TEST;
end;
datalines;
A 40 M HUSBAND
B 30 F SPOUSE
C 10 M CHILD
D 40 M HUSBAND
E 30 F SPOUSE
F 10 M CHILD
G 12 F CHILD
;
run;

proc transpose data=test out=trans(drop=_name_) prefix=name_;
by name_HUSBAND;
id type;
var r_name;
run;
[/pre]
Cynthia_sas
SAS Super FREQ
Hi:
With just a few modifications to this program, you do NOT need the PROC TRANSPOSE step. You can create one obs for every family by using ARRAYS to hold the childrens' names, ages and genders and by RETAINING all the family information until you've collected 1 family.

Support you had this data stored in c:\temp\famdata.txt:
[pre]
A 40 M HUSBAND
B 30 F SPOUSE
C 10 M CHILD
D 40 M HUSBAND
E 30 F SPOUSE
F 10 M CHILD
G 12 F CHILD
H 40 M HUSBAND
I 30 F SPOUSE
J 12 M CHILD
K 14 F CHILD
L 40 M HUSBAND
M 12 M CHILD
N 14 F CHILD
O 40 M HUSBAND
P 38 F SPOUSE
[/pre]

(Note how the last family has no children and the second to the last family is a single-parent family with a husband and no spouse.)

Now the COUNT variable shows the number of children and could be used in subsequent programs as a DO loop counter. Here is the output from the program shown below.

cynthia
The Output
[pre]
test

name_ name_
Obs HUSBAND h_age h_gen Spouse s_age s_gen cn1 cn2 cn3 cn4 cg1 cg2 cg3 cg4 a1 a2 a3 a4 count

1 A 40 M B 30 F C M 10 . . . 1
2 D 40 M E 30 F F G M F 10 12 . . 2
3 H 40 M I 30 F J K M F 12 14 . . 2
4 L 40 M NONE . M N M F 12 14 . . 2
5 O 40 M P 38 F . . . . 0

[/pre]

using this program:
The ChangedProgram
[pre]
options nodate nonumber nocenter;
data TEST;
keep name_HUSBAND h_age h_gen
name_SPOUSE s_age s_gen
cn1 cn2 cn3 cn4
cg1 cg2 cg3 cg4
a1 a2 a3 a4 count;

length name_HUSBAND $10 h_age 8 h_gen $1
name_Spouse $10 s_age 8 s_gen $1
type R_Name $10;
array cn $10 cn1-cn4;
array cg $1 cg1-cg4;
array a a1-a4;
retain name_HUSBAND h_age h_gen
name_SPOUSE s_age s_gen
cn1 cn2 cn3 cn4 cg1 cg2 cg3 cg4
a1 a2 a3 a4 count;
label name_Husband = 'Husband'
name_Spouse = 'Spouse'
count = 'Number Children';

** note: can only use END= with a FILE, not with DATALINES;
infile 'c:\temp\famdata.txt' pad end=eof;
input @8 type :$10. @;
put type=;
if type='HUSBAND' then do;
** Output "saved" info every time a new HUSBAND dataline;
** is read -- after first dataline.;
** (when on 1st dataline, have not read the whole family yet);
** But when on a HUSBAND record, the previous family has been read.;
if _n_ gt 1 then output TEST;

** input new HUSBAND record;
input @1 name_HUSBAND $ h_age h_gen $;

** clear arrays for retained children info and ;
** reset spouse info for new family;
do i = 1 to 4 by 1;
cn(i) = ' ';
cg(i) = ' ';
a(i) = .;
end;
name_SPOUSE='NONE';
s_age = .;
s_gen = ' ';
count = 0;
end;
else do;
** Read input dataline into 2 holding values and;
** then assign to variables based on type.;
input @1 R_Name :$10. r_age r_gen $;

if type='CHILD' then do;
count+1;
cn(count) = R_Name;
cg(count) = r_gen;
a(count) = R_Age;
end;
if type = 'SPOUSE' then do;
name_SPOUSE = R_Name;
s_age = R_Age;
s_gen = r_gen;
end;
end;

** on last dataline, output the last family;
if eof = 1 then output TEST;
run;

proc print data=test;
title 'test';
run;

[/pre]

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 682 views
  • 0 likes
  • 4 in conversation