BookmarkSubscribeRSS Feed
mmhxc5
Quartz | Level 8

Hi dear programmers,

I have a dataset of over 100,000 rows and three variables. The variables are;

STRUCTURE_NUMBER_008

SUPERSTRUCTURE_COND_059, and

Inspection_Year

 

The SUPERSTRUCTURE_COND_059 gets value from 9 to 0 and any value can be repeated, STRUCTURE_NUMBER_008 is the ID for a bridge, and the inspecto_year is self explanatory and get value from 1992 to 2017. 

STRUCTURE_NUMBER_008 can enter and leave any time between 1992 to 2017.

The dataset is sorted based on STRUCTURE_NUMBER_008 and Inspection_Year.

 

I would like to generate a frequency table of  STRUCTURE_NUMBER_008*SUPERSTRUCTURE_COND_059 condional to the following criteria.

 

If the frequency STRUCTURE_NUMBER_008*SUPERSTRUCTURE_COND_059 counts any row except the last reading of STRUCTURE_NUMBER_008 and SUPERSTRUCTURE_COND_059 I want a variable call it CENSOR to be generated as 1, and if the frequency STRUCTURE_NUMBER_008*SUPERSTRUCTURE_COND_059 counts the last row of STRUCTURE_NUMBER_008 and SUPERSTRUCTURE_COND_059 the variable CENSOR to be generated as 0.

 

I would be grateful of your time and help for writing a code to do the job.

 

Please let me know if more explanation or if the data set is required.

Thanks

2 REPLIES 2
ballardw
Super User

You will get a better and likely quicker result if you provide a small example data set with data similar to your actual data and what you want for the result, in actual values, given that example data set.

mmhxc5
Quartz | Level 8

@ballardw, Thank you for asking more clarification. I am attaching the dataset and the following is an example for what I want.

Capture.JPG

Based on the attached image, there are two values for SUPERSTRUCTURE_COND_059 of STRUCTURE_NUMBER_008 =00000000000H311. That is, SUPERSTRUCTURE_COND_059=9 which is repeated two times, and SUPERSTRUCTURE_COND_059=8 which is repeated 22 times as shown in Frequency Count column. I want to generate a variable called CENSOR to have  a value of 1 for SUPERSTRUCTURE_COND_059=9 with Frequency_count =2, and CENSOR=0 for SUPERSTRUCTURE_COND_059=8 with Frequenency_count=22 becuase SUPERSTRUCTURE_COND_059=8 is the last observation for  STRUCTURE_NUMBER_008=00000000000H311. 

 

Similarly, There are three values for SUPERSTRUCTURE_COND_059 of STRUCTURE_NUMBER_008=00000000000P403. That is, 6 which is repeated 13 times, 5 which is repeated 10 times, and 3 which is repeated 1 time as shown in Frequency_count column. Here, I want the variable CENSOR=1 for SUPERSTRUCTURE_COND_059=6 and SUPERSTRUCTURE_COND_059=5, and CENSOR =0 for SUPERSTRUCTURE_COND_059=3 because the SUPERSTRUCTURE_COND_059=3 is the last observation for  STRUCTURE_NUMBER_008=00000000000P403.

 

The attached is the dataset that I want to work with.

Thanks,

 

I have attached the file that I created the frequncy_column, but the original file is psc_post02 that I referred in my first post.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 859 views
  • 0 likes
  • 2 in conversation