BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi all,

I have data set of three columns, name, year and age. My purpose is to fill up age record based on year.

The data set looks like,

Name year age
A 1998
A 1999
A 2000 44
A 2001
B 1996 36
B 1997
B 1998
..........................
C 1999
C 2000 38
C 2002
D 1994 33
D 1995 34
D 1996
D 1997


Here is the thing. For a specific person, I can always identify his/her age by year by one or more records, like A, if he is 44 in 2000, then in 1998 and 1999, he should be 42 and 43, respectively.
The problem is that; the identifier (44 in A's case) can show up irregularly. In A's case, it shows up at the 3rd record; in B's case, the age identifier shows up at the 1st place.
Another problem is that, year record can skip. In C's case, it goes from 1999, 2000, and then skip 2001, jumping to 2002 directly.

Basically, I want my data look like,

Name year age
A 1998 42
A 1999 43
A 2000 44
A 2001 45
B 1996 36
B 1997 37
B 1998 38
..........................
C 1999 37
C 2000 38
C 2002 39
D 1994 33
D 1995 34
D 1996 35
D 1997 36

Please help and many thanks!
8 REPLIES 8
Vasile01
Fluorite | Level 6
Hi,

You provided a part of the desired outcome as:
.........................
C 1999 37
C 2000 38
C 2002 39

Is that correct?
Warm regards,
Vasile
deleted_user
Not applicable
My bad.

It should be 40 for 2002.
Vasile01
Fluorite | Level 6
Hi,

Here it is an idea using SQL statements:


proc sql;

create table temp as
select name, year-age as yb from person
where age is not missing;

create table temp2 as
select distinct name, yb from temp;

update person
set age=year-(select yb from temp2
where name=person.name
)
where age is missing
;
quit;

Warm regards,
Vasile
Ksharp
Super User
Or the following code is more readable ,But it is not suited for large dataset.


[pre]
data test;
infile datalines missover;
input Name $ year age;
datalines;
A 1998
A 1999
A 2000 44
A 2001
B 1996 36
B 1997
B 1998
C 1999
C 2000 38
C 2002
D 1994 33
D 1995 34
D 1996
D 1997
;
run;
proc sort data=test;
by name;
run;
data result;
merge test test(rename=(year=_year age=_age) where=(_age is not missing));
by name;
if missing(age) then age=_age + (year - _year);
drop _:;
run;

[/pre]




Ksharp
deleted_user
Not applicable
hello,

I changed my code as you made a correction to the output data:

[pre]
data out;
merge test (drop=age) test(where=(^missing(_age)) rename=(age=_age year=_year));
by name;

retain born;

if first.name then do born=_year-_age;
age=year-born;
end;

else age=year-born;

drop born _age _year;

run;
[/pre]

Marius
Ksharp
Super User
Hi.
deleted_user
Not applicable
Hello,

On solution may be merging the data with itself:

[pre]
data test;
infile datalines missover;
input Name $ year age;
datalines;
A 1998
A 1999
A 2000 44
A 2001
B 1996 36
B 1997
B 1998
C 1999
C 2000 38
C 2002
D 1994 33
D 1995 34
D 1996
D 1997
;

data out;
merge test (drop=age) test(where=(^missing(_age)) rename=(age=_age year=_year));
by name;
if first.name then do born=_year-_age;
age=year-born;
end;

else age+1;

drop born _age _year;

run;
[/pre]

Marius
Ksharp
Super User
It looks like Cartesian Product is helpful for you.


[pre]
data test;
infile datalines missover;
input Name $ year age;
datalines;
A 1998
A 1999
A 2000 44
A 2001
B 1996 36
B 1997
B 1998
C 1999
C 2000 38
C 2002
D 1994 33
D 1995 34
D 1996
D 1997
;
run;
data result;
declare hash hh(hashexp: 10 );
hh.definekey('name');
hh.definedata('_year','_age');
hh.definedone();

do until(last);
set test(rename=(year=_year age=_age)) end=last;
if not missing(_age) then hh.replace();
end;

do until(_last);
set test end=_last;
call missing(_year,_age);
hh.find();
if missing(age) then age=_age + ( year - _year);
output;
end;
drop _:;
run;

[/pre]



Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1018 views
  • 0 likes
  • 3 in conversation