Hi all,
Here is the scenario from the attached dataset. The variables are STRUCTURE_ID, YEAR_BUILT, YEAR_RECONSTRUCTED, Inspection_Year, and CONDITION. Each structure is inspected as shown in the Inspection_Year column. The data are sorted based on the STRUCTURE_ID and Inspection_Year. I would like to add another column Structure_Age based on the following conditions.
If the structure is not reconstructed (YEAR_RECONSTRUCTED=0) or CONDITION data before reconstruction is not available then;
Structure_Age = Inspection_Year - YEAR_BUILT
If the structure is reconstructed, I want to see if the CONDITION is improved (higher CONDITION value than the year before YEAR_RECONSTRUCTED in CONDITION column. then,
Structure_Age = Inspection_Year - YEAR_RECONSTRUCTED
If the structure is reconstructed but the CONDITION is not improved (same CONDITION value as the year before YEAR_RECONSTRUCTED in CONDITION column. then,
Structure_Age = Inspection_Year - YEAR_BUILT
data want;
input STRUCTURE_ID $ YEAR_BUILT YEAR_RECONSTRUCTED INSPECTION YEAR CONDITION $ STRUCTURE_AGE;
cards;
/*STRUCTURE_AGE is 1983-1958 = 25 for CONDITION 5 below. */
1 1958 0 1983 5 25;
/*STRUCTURE_AGE is 1984-1958 = 26 for CONDITION 4 below.
The rest of CONDITION 4 are 0, I dont need it. */
1 1958 0 1984 4 26;
1 1958 0 1985 4 0;
1 1958 0 1986 4 0;
1 1958 0 1987 4 0;
1 1958 0 1988 4 0;
/* The structure is reconstructed in 1992 and CONDITION is improved in 1993 inspection
and it is different than 1988 CONDITION 4, therefore STRUCTIRE_AGE is 1993-1992 =1 as below.
For the rest of CONDITION 8, STRUCTURE_AGE = 0, I dont need it. */
1 1958 1992 1993 8 1;
1 1958 1992 1994 8 0;
1 1958 1992 1995 8 0;
/* The CONDITION drops to 7 in 1996 and STRCUTURE_AGE= 1196-1992 = 4 as below.
STRUCTURE_AGE for the rest of CONDITION 7 is 0, I dont need it. if the CONDIION changes
to 6, I need STRUCTURE_AGE to be computed, if not to be 0. */
1 1958 1992 1996 7 4;
1 1958 1992 1997 7 0;
1 1958 1992 1998 7 0;
1 1958 1992 1999 7 0;
1 1958 1992 2000 7 0;
1 1958 1992 2001 7 0;
/* The CONDITION in 1983 is 6 and STRUTURE_AGE = 1983 - 1937 = 46 and for the rest
of CODNTION 6 it is 0 as I don't need it. in 1990 the CONDIION is 5 and STRUCTURE_AGE
1990 - 1937 = 53 and for the rest of CONDITION 5 it is 0 as below. */
10027 1937 0 1983 6 46;
10027 1937 0 1984 6 0;
10027 1937 0 1985 6 0;
10027 1937 0 1986 6 0;
10027 1937 0 1987 6 0;
10027 1937 0 1988 6 0;
10027 1937 0 1989 6 0;
10027 1937 0 1990 5 53;
10027 1937 0 1991 5 0;
10027 1937 0 1992 5 0;
10027 1937 0 1993 5 0;
10027 1937 0 1994 5 0;
/* The bridge is reconstructed in 1994, but CONDITION is not improved in 1995, as it is
the same CONDITION as 1994 inspection,therefore STRCUTURE_AGE is 0 as below.*/
10027 1937 1994 1995 5 0;
I appreciate your time and help!
proc sort data=inputdata;
by STRUCTURE_ID YEAR_BUILT YEAR_RECONSTRUCTED Inspection_Year;
data output data;
set input data;
by STRUCTURE_ID YEAR_BUILT YEAR_RECONSTRUCTED Inspection_Year;;
retain year_to_subtract 0;
x=lag(condition);
if first.structure_id then year_to_subtract=Year_Built;
if first.structure_id = 0 and first.year_reconstructed then do;
if condition > x then year_to_subtract = Year_reconstructed;
end;
Age=inspection_year-year_to_subtract;
if ^first.YEAR_RECONSTRUCTED then Age=0;
run;
Please provide, in a SAS data step form, how you want the output data to look.
@mmhxc5 wrote:
If the structure is not reconstructed (YEAR_RECONSTRUCTED=0) then, Structure_Age = Inspection_Year - YEAR_BUILT for each first distinct CONDITION.
If the structure is reconstructed, I wan to see if the CONDITION is improved (higher CONDITION value than the year before YEAR_RECONSTRUCTED. then, Structure_Age = Inspection_Year - YEAR_RECONSTRUCTED, for each first distinct CONDITION.
If the structure is reconstructed but the CONDITION is not improved (same CONDITION value as the year before YEAR_RECONSTRUCTED. then, Structure_Age = Inspection_Year - YEAR_BUILT, for each first distinct CONDITION.
If a bridge is reconstructed but CONDITION data before reconstruction is not available then, Structure_Age = Inspection_Year - YEAR_BUILT, because I don't know if the reconstruction was effective.
You almost alway want the structure_age "for each first distinct condition". What do you want for all the other records? This question could be answered with an output data set.
@mkeintz, thank you for your help. Please see the output I want in my original post.
proc sort data=inputdata;
by STRUCTURE_ID YEAR_BUILT YEAR_RECONSTRUCTED Inspection_Year;
data output data;
set input data;
by STRUCTURE_ID YEAR_BUILT YEAR_RECONSTRUCTED Inspection_Year;;
retain year_to_subtract 0;
x=lag(condition);
if first.structure_id then year_to_subtract=Year_Built;
if first.structure_id = 0 and first.year_reconstructed then do;
if condition > x then year_to_subtract = Year_reconstructed;
end;
Age=inspection_year-year_to_subtract;
if ^first.YEAR_RECONSTRUCTED then Age=0;
run;
Hi @smantha , I revised a little bit of your code as below and it worked.
if first.structure_id = 0 and first.year_reconstructed then do;
if condition > x and year_reconstructed ne 0 then year_to_subtract = Year_reconstructed;
Thanks for your help!
Some parts of your explanation are very fuzzy to me but will try to do the problem anyway
proc sort data=inputdata;
by STRUCTURE_ID YEAR_BUILT YEAR_RECONSTRUCTED Inspection_Year;
data output data;
set input data;
by STRUCTURE_ID YEAR_BUILT YEAR_RECONSTRUCTED Inspection_Year;;
retain year_to_subtract 0;
x=lag(condition);
if first.structure_id then year_to_subtract=Year_Built;
if first.structure_id = 0 and first.year_reconstructed then do;
if condition > x then year_to_subtract = Year_reconstructed;
end;
Age=inspection_year-year_to_subtract;
run;
@smantha wrote:
Some parts of your explanation are very fuzzy to me but will try to do the problem anyway
proc sort data=inputdata; by STRUCTURE_ID YEAR_BUILT YEAR_RECONSTRUCTED Inspection_Year; retain year_to_subtract 0; x=lag(condition); if first.structure_id then year_to_subtract=Year_Built; if first.structure_id = 0 and first.year_reconstructed then do; if condition > x then year_to_subtract = Year_reconstructed; end; Age=inspection_year-year_to_subtract; run;
It is - at least for my internal clock - very, very early in the morning, so maybe i miss something, but afaik proc sort does not support if-statement and everything else after BY-statement.
thank you for pointing that out. Corrected my code to what is required.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.