I have a raw data file that needs to be read into a dataset.
it looks like this:
----+----1----+----2----+----3----+----4----+----5
Harry Mary 123 Vermont Ave.
Joe 3
Lisa 2
Felix Laura 6 Baltic Drive
Karen 12
Thomas Felicia 424 Suffolk place
Edward Bella 666 Vampire Lane
Jeff 9
Jared 7
Tim 11
The variables are Father, Mother, Address followed by Child (if any) and Child Age.
I need to create a dataset that will have one observation per child with repeating Father, Mother and Address information. If no children then Child and age value should be blank. Please Help .. Thanks
Desired Result would look like this:
Father Mother Address Child Age
---------------------------------------------------
Harry Mary 123 Vermont Ave. Joe 3
Harry Mary 123 Vermont Ave. Lisa 2
Felix Laura 6 Baltic Drive Karen 12
Thomas Felicia 424 Suffolk place .
Edward Bella 666 Vampire Lane Jeff 9
Edward Bella 666 Vampire Lane Jared 7
Edward Bella 666 Vampire Lane Tim 11
How about this?
data want;
infile cards truncover eof=done;
input @1 father $9. mother $10. address $40.;
do _N_=1 by 1 until(0);
input child $9. age ?? 3.0 @@;
if missing(age) then leave;
output;
input;
end;
done: /* This is to avoid the "missing card" message in log */
if _N_=1 then do; /* output when no children read */
call missing(Child); /* Father variable has been read as Child */
output;
end;
cards;
Harry Mary 123 Vermont Ave.
Joe 3
Lisa 2
Felix Laura 6 Baltic Drive
Karen 12
Thomas Felicia 424 Suffolk place
Edward Bella 666 Vampire Lane
Jeff 9
Jared 7
Tim 11
;
run;
Is this a .txt file?
Assuming you have no delimiters between variables and
each name contains one string only without blanks embedded
then try next code:
data test;
length father $10 mother $10 address $40; /* addapt length to max expected */
retain father mother address no_of_childs;
infile datalines;
input a_line $80.
/* check is there an address */
if scan(a_line,3) ne ' ' then do;
if _N_ > 1 then do;
if no_of_childs = 0 then do;
child= ' ';
child_age = .;
output;
end;
a_line = compbl(a_line);
father = scan(a_line,1);
mother = scan(a_line,2);
add_start = index(a_line, scan(a_line, 3));
address = substr(a_line, add_start);
no_of_childs = 0;
end;
end;
else do;
input child $ child_age;
output;
no_of_childs +1;
end;
datalines;
Harry Mary 123 Vermont Ave.
Joe 3
Lisa 2
Felix Laura 6 Baltic Drive
Karen 12
Thomas Felicia 424 Suffolk place
Edward Bella 666 Vampire Lane
Jeff 9
Jared 7
Tim 11
;
run;
Hi Shmuel
I ran your program verbatim (had to add the ";" for first input statement) ... sorry, but not getting the required result.
I have edited the initial question text to include the desired result ..
Though solution is already selected, please find attached alternative code debugged and fixed
with hope you find some interest in it:
data test;
length father $10 mother $10 child $10 tst $3
address $40; /* addapt length to max expected */
retain father mother address no_of_childs;
infile datalines truncover;
input a_line $80.;
/* check is there an address */
tst = scan(a_line,3);
if tst ne ' ' then do;
if _N_ > 1 then do;
if no_of_childs = 0 then do;
child= ' ';
child_age = .;
output;
end;
end;
a_line = compbl(a_line);
father = scan(a_line,1);
mother = scan(a_line,2);
add_start = index(a_line, scan(a_line, 3));
address = substr(a_line, add_start);
no_of_childs = 0;
return;
end;
child = scan(a_line,1);
child_age = scan(a_line,2);
output;
no_of_childs +1;
drop a_line no_of_childs add_start;
datalines;
Harry Mary 123 Vermont Ave.
Joe 3
Lisa 2
Felix Laura 6 Baltic Drive
Karen 12
Thomas Felicia 424 Suffolk place
Edward Bella 666 Vampire Lane
Jeff 9
Jared 7
Tim 11
;
run;
How about this?
data want;
infile cards truncover eof=done;
input @1 father $9. mother $10. address $40.;
do _N_=1 by 1 until(0);
input child $9. age ?? 3.0 @@;
if missing(age) then leave;
output;
input;
end;
done: /* This is to avoid the "missing card" message in log */
if _N_=1 then do; /* output when no children read */
call missing(Child); /* Father variable has been read as Child */
output;
end;
cards;
Harry Mary 123 Vermont Ave.
Joe 3
Lisa 2
Felix Laura 6 Baltic Drive
Karen 12
Thomas Felicia 424 Suffolk place
Edward Bella 666 Vampire Lane
Jeff 9
Jared 7
Tim 11
;
run;
Sorry, I overdid that a bit. "do _N_=1 by 1;" is enough. It is just that I started out with "do until(0);" (meaning "do forever", or at least until you reach a LEAVE or DELETE statement), but then I found I that I needed the loop counter (_N_) to identify couples with no children. And I forgot to remove the "until(0)" bit.
"do _N_=1 by 1;" means the same, but with a counter, like "do _N_=1 to infinity;".
If your infile is a real file, and not a Cards statement, the DO UNTIL construct can be used like this (CARDS statement does not accept the END= option):
data want;
infile <filename here!> truncover end=done;
input @1 father $9. mother $10. address $40.;
do _N_=1 by 1 until(done);
input child $9. age ?? 3.0 @@;
if missing(age) then leave;
output;
input;
end;
if _N_=1 then do; /* output when no children read */
call missing(Child);
output;
end;
run;
Which will make the program a little bit simpler.
data want;
input;
length Father Mother Address $ 100;
retain Father Mother Address ;
temp=scan(_infile_,2,' ');
if not missing(temp) and notdigit(strip(temp)) then do;
Father=scan(_infile_,1,' ');
Mother=scan(_infile_,2,' ');
call scan(_infile_,2,p,l,' ');
Address=left(substr(_infile_,p+l));
end;
else do;
Child=scan(_infile_,1,' ');
Age=scan(_infile_,2,' ');
end;
drop temp p l;
cards;
Harry Mary 123 Vermont Ave.
Joe 3
Lisa 2
Felix Laura 6 Baltic Drive
Karen 12
Thomas Felicia 424 Suffolk place
Edward Bella 666 Vampire Lane
Jeff 9
Jared 7
Tim 11
;
run;
data final_want;
set want;
by Father Mother Address notsorted;
if missing(child) and not (first.Address and last.Address) then delete;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.