Hi dear programmers,
I have a dataset of over 100,000 rows and three variables. The variables are;
STRUCTURE_NUMBER_008
SUPERSTRUCTURE_COND_059, and
Inspection_Year
The SUPERSTRUCTURE_COND_059 gets value from 9 to 0 and any value can be repeated, STRUCTURE_NUMBER_008 is the ID for a bridge, and the inspecto_year is self explanatory and get value from 1992 to 2017.
STRUCTURE_NUMBER_008 can enter and leave any time between 1992 to 2017.
The dataset is sorted based on STRUCTURE_NUMBER_008 and Inspection_Year.
I would like to generate a frequency table of STRUCTURE_NUMBER_008*SUPERSTRUCTURE_COND_059 condional to the following criteria.
If the frequency STRUCTURE_NUMBER_008*SUPERSTRUCTURE_COND_059 counts any row except the last reading of STRUCTURE_NUMBER_008 and SUPERSTRUCTURE_COND_059 I want a variable call it CENSOR to be generated as 1, and if the frequency STRUCTURE_NUMBER_008*SUPERSTRUCTURE_COND_059 counts the last row of STRUCTURE_NUMBER_008 and SUPERSTRUCTURE_COND_059 the variable CENSOR to be generated as 0.
I would be grateful of your time and help for writing a code to do the job.
Please let me know if more explanation or if the data set is required.
Thanks
You will get a better and likely quicker result if you provide a small example data set with data similar to your actual data and what you want for the result, in actual values, given that example data set.
@ballardw, Thank you for asking more clarification. I am attaching the dataset and the following is an example for what I want.
Based on the attached image, there are two values for SUPERSTRUCTURE_COND_059 of STRUCTURE_NUMBER_008 =00000000000H311. That is, SUPERSTRUCTURE_COND_059=9 which is repeated two times, and SUPERSTRUCTURE_COND_059=8 which is repeated 22 times as shown in Frequency Count column. I want to generate a variable called CENSOR to have a value of 1 for SUPERSTRUCTURE_COND_059=9 with Frequency_count =2, and CENSOR=0 for SUPERSTRUCTURE_COND_059=8 with Frequenency_count=22 becuase SUPERSTRUCTURE_COND_059=8 is the last observation for STRUCTURE_NUMBER_008=00000000000H311.
Similarly, There are three values for SUPERSTRUCTURE_COND_059 of STRUCTURE_NUMBER_008=00000000000P403. That is, 6 which is repeated 13 times, 5 which is repeated 10 times, and 3 which is repeated 1 time as shown in Frequency_count column. Here, I want the variable CENSOR=1 for SUPERSTRUCTURE_COND_059=6 and SUPERSTRUCTURE_COND_059=5, and CENSOR =0 for SUPERSTRUCTURE_COND_059=3 because the SUPERSTRUCTURE_COND_059=3 is the last observation for STRUCTURE_NUMBER_008=00000000000P403.
The attached is the dataset that I want to work with.
Thanks,
I have attached the file that I created the frequncy_column, but the original file is psc_post02 that I referred in my first post.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.