Hi dear programmers,
I have a dataset of over 100,000 rows and three variables. The variables are;
STRUCTURE_NUMBER_008
SUPERSTRUCTURE_COND_059, and
Inspection_Year
The SUPERSTRUCTURE_COND_059 gets value from 9 to 0 and any value can be repeated, STRUCTURE_NUMBER_008 is the ID for a bridge, and the inspecto_year is self explanatory and get value from 1992 to 2017.
STRUCTURE_NUMBER_008 can enter and leave any time between 1992 to 2017.
The dataset is sorted based on STRUCTURE_NUMBER_008 and Inspection_Year.
I would like to generate a frequency table of STRUCTURE_NUMBER_008*SUPERSTRUCTURE_COND_059 condional to the following criteria.
If the frequency STRUCTURE_NUMBER_008*SUPERSTRUCTURE_COND_059 counts any row except the last reading of STRUCTURE_NUMBER_008 and SUPERSTRUCTURE_COND_059 I want a variable call it CENSOR to be generated as 1, and if the frequency STRUCTURE_NUMBER_008*SUPERSTRUCTURE_COND_059 counts the last row of STRUCTURE_NUMBER_008 and SUPERSTRUCTURE_COND_059 the variable CENSOR to be generated as 0.
I would be grateful of your time and help for writing a code to do the job.
Please let me know if more explanation or if the data set is required.
Thanks
You will get a better and likely quicker result if you provide a small example data set with data similar to your actual data and what you want for the result, in actual values, given that example data set.
@ballardw, Thank you for asking more clarification. I am attaching the dataset and the following is an example for what I want.
Based on the attached image, there are two values for SUPERSTRUCTURE_COND_059 of STRUCTURE_NUMBER_008 =00000000000H311. That is, SUPERSTRUCTURE_COND_059=9 which is repeated two times, and SUPERSTRUCTURE_COND_059=8 which is repeated 22 times as shown in Frequency Count column. I want to generate a variable called CENSOR to have a value of 1 for SUPERSTRUCTURE_COND_059=9 with Frequency_count =2, and CENSOR=0 for SUPERSTRUCTURE_COND_059=8 with Frequenency_count=22 becuase SUPERSTRUCTURE_COND_059=8 is the last observation for STRUCTURE_NUMBER_008=00000000000H311.
Similarly, There are three values for SUPERSTRUCTURE_COND_059 of STRUCTURE_NUMBER_008=00000000000P403. That is, 6 which is repeated 13 times, 5 which is repeated 10 times, and 3 which is repeated 1 time as shown in Frequency_count column. Here, I want the variable CENSOR=1 for SUPERSTRUCTURE_COND_059=6 and SUPERSTRUCTURE_COND_059=5, and CENSOR =0 for SUPERSTRUCTURE_COND_059=3 because the SUPERSTRUCTURE_COND_059=3 is the last observation for STRUCTURE_NUMBER_008=00000000000P403.
The attached is the dataset that I want to work with.
Thanks,
I have attached the file that I created the frequncy_column, but the original file is psc_post02 that I referred in my first post.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.