- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi all,
Here is the scenario from the attached dataset. The variables are STRUCTURE_ID, YEAR_BUILT, YEAR_RECONSTRUCTED, Inspection_Year, and CONDITION. Each structure is inspected as shown in the Inspection_Year column. The data are sorted based on the STRUCTURE_ID and Inspection_Year. I would like to add another column Structure_Age based on the following conditions.
If the structure is not reconstructed (YEAR_RECONSTRUCTED=0) or CONDITION data before reconstruction is not available then;
Structure_Age = Inspection_Year - YEAR_BUILT
If the structure is reconstructed, I want to see if the CONDITION is improved (higher CONDITION value than the year before YEAR_RECONSTRUCTED in CONDITION column. then,
Structure_Age = Inspection_Year - YEAR_RECONSTRUCTED
If the structure is reconstructed but the CONDITION is not improved (same CONDITION value as the year before YEAR_RECONSTRUCTED in CONDITION column. then,
Structure_Age = Inspection_Year - YEAR_BUILT
data want;
input STRUCTURE_ID $ YEAR_BUILT YEAR_RECONSTRUCTED INSPECTION YEAR CONDITION $ STRUCTURE_AGE;
cards;
/*STRUCTURE_AGE is 1983-1958 = 25 for CONDITION 5 below. */
1 1958 0 1983 5 25;
/*STRUCTURE_AGE is 1984-1958 = 26 for CONDITION 4 below.
The rest of CONDITION 4 are 0, I dont need it. */
1 1958 0 1984 4 26;
1 1958 0 1985 4 0;
1 1958 0 1986 4 0;
1 1958 0 1987 4 0;
1 1958 0 1988 4 0;
/* The structure is reconstructed in 1992 and CONDITION is improved in 1993 inspection
and it is different than 1988 CONDITION 4, therefore STRUCTIRE_AGE is 1993-1992 =1 as below.
For the rest of CONDITION 8, STRUCTURE_AGE = 0, I dont need it. */
1 1958 1992 1993 8 1;
1 1958 1992 1994 8 0;
1 1958 1992 1995 8 0;
/* The CONDITION drops to 7 in 1996 and STRCUTURE_AGE= 1196-1992 = 4 as below.
STRUCTURE_AGE for the rest of CONDITION 7 is 0, I dont need it. if the CONDIION changes
to 6, I need STRUCTURE_AGE to be computed, if not to be 0. */
1 1958 1992 1996 7 4;
1 1958 1992 1997 7 0;
1 1958 1992 1998 7 0;
1 1958 1992 1999 7 0;
1 1958 1992 2000 7 0;
1 1958 1992 2001 7 0;
/* The CONDITION in 1983 is 6 and STRUTURE_AGE = 1983 - 1937 = 46 and for the rest
of CODNTION 6 it is 0 as I don't need it. in 1990 the CONDIION is 5 and STRUCTURE_AGE
1990 - 1937 = 53 and for the rest of CONDITION 5 it is 0 as below. */
10027 1937 0 1983 6 46;
10027 1937 0 1984 6 0;
10027 1937 0 1985 6 0;
10027 1937 0 1986 6 0;
10027 1937 0 1987 6 0;
10027 1937 0 1988 6 0;
10027 1937 0 1989 6 0;
10027 1937 0 1990 5 53;
10027 1937 0 1991 5 0;
10027 1937 0 1992 5 0;
10027 1937 0 1993 5 0;
10027 1937 0 1994 5 0;
/* The bridge is reconstructed in 1994, but CONDITION is not improved in 1995, as it is
the same CONDITION as 1994 inspection,therefore STRCUTURE_AGE is 0 as below.*/
10027 1937 1994 1995 5 0;
I appreciate your time and help!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
proc sort data=inputdata;
by STRUCTURE_ID YEAR_BUILT YEAR_RECONSTRUCTED Inspection_Year;
data output data;
set input data;
by STRUCTURE_ID YEAR_BUILT YEAR_RECONSTRUCTED Inspection_Year;;
retain year_to_subtract 0;
x=lag(condition);
if first.structure_id then year_to_subtract=Year_Built;
if first.structure_id = 0 and first.year_reconstructed then do;
if condition > x then year_to_subtract = Year_reconstructed;
end;
Age=inspection_year-year_to_subtract;
if ^first.YEAR_RECONSTRUCTED then Age=0;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Please provide, in a SAS data step form, how you want the output data to look.
@mmhxc5 wrote:
If the structure is not reconstructed (YEAR_RECONSTRUCTED=0) then, Structure_Age = Inspection_Year - YEAR_BUILT for each first distinct CONDITION.
If the structure is reconstructed, I wan to see if the CONDITION is improved (higher CONDITION value than the year before YEAR_RECONSTRUCTED. then, Structure_Age = Inspection_Year - YEAR_RECONSTRUCTED, for each first distinct CONDITION.
If the structure is reconstructed but the CONDITION is not improved (same CONDITION value as the year before YEAR_RECONSTRUCTED. then, Structure_Age = Inspection_Year - YEAR_BUILT, for each first distinct CONDITION.
If a bridge is reconstructed but CONDITION data before reconstruction is not available then, Structure_Age = Inspection_Year - YEAR_BUILT, because I don't know if the reconstruction was effective.
You almost alway want the structure_age "for each first distinct condition". What do you want for all the other records? This question could be answered with an output data set.
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@mkeintz, thank you for your help. Please see the output I want in my original post.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
proc sort data=inputdata;
by STRUCTURE_ID YEAR_BUILT YEAR_RECONSTRUCTED Inspection_Year;
data output data;
set input data;
by STRUCTURE_ID YEAR_BUILT YEAR_RECONSTRUCTED Inspection_Year;;
retain year_to_subtract 0;
x=lag(condition);
if first.structure_id then year_to_subtract=Year_Built;
if first.structure_id = 0 and first.year_reconstructed then do;
if condition > x then year_to_subtract = Year_reconstructed;
end;
Age=inspection_year-year_to_subtract;
if ^first.YEAR_RECONSTRUCTED then Age=0;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @smantha , I revised a little bit of your code as below and it worked.
if first.structure_id = 0 and first.year_reconstructed then do;
if condition > x and year_reconstructed ne 0 then year_to_subtract = Year_reconstructed;
Thanks for your help!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Some parts of your explanation are very fuzzy to me but will try to do the problem anyway
proc sort data=inputdata;
by STRUCTURE_ID YEAR_BUILT YEAR_RECONSTRUCTED Inspection_Year;data output data;set input data;by STRUCTURE_ID YEAR_BUILT YEAR_RECONSTRUCTED Inspection_Year;;
retain year_to_subtract 0;
x=lag(condition);
if first.structure_id then year_to_subtract=Year_Built;
if first.structure_id = 0 and first.year_reconstructed then do;
if condition > x then year_to_subtract = Year_reconstructed;
end;
Age=inspection_year-year_to_subtract;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@smantha wrote:
Some parts of your explanation are very fuzzy to me but will try to do the problem anyway
proc sort data=inputdata; by STRUCTURE_ID YEAR_BUILT YEAR_RECONSTRUCTED Inspection_Year; retain year_to_subtract 0; x=lag(condition); if first.structure_id then year_to_subtract=Year_Built; if first.structure_id = 0 and first.year_reconstructed then do; if condition > x then year_to_subtract = Year_reconstructed; end; Age=inspection_year-year_to_subtract; run;
It is - at least for my internal clock - very, very early in the morning, so maybe i miss something, but afaik proc sort does not support if-statement and everything else after BY-statement.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
thank you for pointing that out. Corrected my code to what is required.