BookmarkSubscribeRSS Feed
mmhxc5
Quartz | Level 8

Hi dear programmers,

I have a dataset of over 100,000 rows and three variables. The variables are;

STRUCTURE_NUMBER_008

SUPERSTRUCTURE_COND_059, and

Inspection_Year

 

The SUPERSTRUCTURE_COND_059 gets value from 9 to 0 and any value can be repeated, STRUCTURE_NUMBER_008 is the ID for a bridge, and the inspecto_year is self explanatory and get value from 1992 to 2017. 

STRUCTURE_NUMBER_008 can enter and leave any time between 1992 to 2017.

The dataset is sorted based on STRUCTURE_NUMBER_008 and Inspection_Year.

 

I would like to generate a frequency table of  STRUCTURE_NUMBER_008*SUPERSTRUCTURE_COND_059 condional to the following criteria.

 

If the frequency STRUCTURE_NUMBER_008*SUPERSTRUCTURE_COND_059 counts any row except the last reading of STRUCTURE_NUMBER_008 and SUPERSTRUCTURE_COND_059 I want a variable call it CENSOR to be generated as 1, and if the frequency STRUCTURE_NUMBER_008*SUPERSTRUCTURE_COND_059 counts the last row of STRUCTURE_NUMBER_008 and SUPERSTRUCTURE_COND_059 the variable CENSOR to be generated as 0.

 

I would be grateful of your time and help for writing a code to do the job.

 

Please let me know if more explanation or if the data set is required.

Thanks

2 REPLIES 2
ballardw
Super User

You will get a better and likely quicker result if you provide a small example data set with data similar to your actual data and what you want for the result, in actual values, given that example data set.

mmhxc5
Quartz | Level 8

@ballardw, Thank you for asking more clarification. I am attaching the dataset and the following is an example for what I want.

Capture.JPG

Based on the attached image, there are two values for SUPERSTRUCTURE_COND_059 of STRUCTURE_NUMBER_008 =00000000000H311. That is, SUPERSTRUCTURE_COND_059=9 which is repeated two times, and SUPERSTRUCTURE_COND_059=8 which is repeated 22 times as shown in Frequency Count column. I want to generate a variable called CENSOR to have  a value of 1 for SUPERSTRUCTURE_COND_059=9 with Frequency_count =2, and CENSOR=0 for SUPERSTRUCTURE_COND_059=8 with Frequenency_count=22 becuase SUPERSTRUCTURE_COND_059=8 is the last observation for  STRUCTURE_NUMBER_008=00000000000H311. 

 

Similarly, There are three values for SUPERSTRUCTURE_COND_059 of STRUCTURE_NUMBER_008=00000000000P403. That is, 6 which is repeated 13 times, 5 which is repeated 10 times, and 3 which is repeated 1 time as shown in Frequency_count column. Here, I want the variable CENSOR=1 for SUPERSTRUCTURE_COND_059=6 and SUPERSTRUCTURE_COND_059=5, and CENSOR =0 for SUPERSTRUCTURE_COND_059=3 because the SUPERSTRUCTURE_COND_059=3 is the last observation for  STRUCTURE_NUMBER_008=00000000000P403.

 

The attached is the dataset that I want to work with.

Thanks,

 

I have attached the file that I created the frequncy_column, but the original file is psc_post02 that I referred in my first post.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 658 views
  • 0 likes
  • 2 in conversation