BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mmhxc5
Quartz | Level 8

Hi,

I have a dataset of over 3000 observations for bridges inspected since 1992 -2017. Each bridge is inspected once each year. I would like to delete the first five years and last five years of inspection result for each bridge. Chosing STRUCTURE_NUMBER_008 or Inspection_Year might work as the variable to delete, because the beginning and last year of inspection is not the same for all bridges. The attached image shows two bridge IDs sorted by STRUCTURE_NUMBER_008 (First column) and Inspection_Year(Last column). STRUCTURE_NUMBER_008 is random name given for each bridge. I would be glad if anyone could help me to do this task. I use SAS 9.4 in Windows 10 and I have the dataset as permanent SAS files.

Thanks,

Capture.JPG

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Please follow the comments and logic and see if this helps

 

/*Creating a sample data with 120 bridges with each bridge having 25 observations*/
data sample;
do bridge=1 to 120;
	do var=1 to 25;
	output;
	end;
end;
run;

/*Output */
data want;
/*1st loop count the number of records in each bridge*/
do n=1 by 1 until(last.bridge);
set sample;
by bridge;
end;
/*2nd loop remove first five and last five observation for each bridge 
and output the rest*/
do _n_=1 by 1 until(last.bridge);
set sample;
by bridge;
if 6<=_n_<= n-5 then output;
end;
run;

View solution in original post

10 REPLIES 10
novinosrin
Tourmaline | Level 20
data want;
do p=6 to nobs-5;
   set have nobs=nobs point=p;
    output;
end;
stop;
run;
mmhxc5
Quartz | Level 8

Thank you for you response. Your code only removes the first five and last five observation of the whole dataset which is over 3000 observations.

What I want is to remove first five and last five observation for each bridge which has 25 observation each. The dataset of 120 bridges makes the 3000 observations. I only want to retain the middle 15 observation for each bridge. The STRUCTURE _NUMBER_008 is a mix of character and numeric. The dataset is ordered based on the STRUCTURE_NUMBER_008 and Inspection_Year.

Thanks,

novinosrin
Tourmaline | Level 20

Please follow the comments and logic and see if this helps

 

/*Creating a sample data with 120 bridges with each bridge having 25 observations*/
data sample;
do bridge=1 to 120;
	do var=1 to 25;
	output;
	end;
end;
run;

/*Output */
data want;
/*1st loop count the number of records in each bridge*/
do n=1 by 1 until(last.bridge);
set sample;
by bridge;
end;
/*2nd loop remove first five and last five observation for each bridge 
and output the rest*/
do _n_=1 by 1 until(last.bridge);
set sample;
by bridge;
if 6<=_n_<= n-5 then output;
end;
run;

mmhxc5
Quartz | Level 8

Thank you for your timely reply. I really appreciate it. Based on your code I created a "Sample" data, then I defined the "want" and also the set "sample" to the end of your code. How can I link this code to the Bridge dataset I have? I mean, which part of your code will call the bridge dataset which has all the data that I want to process?

 

 

 

Ksharp
Super User

It is forwards and backwards tricks .

 

data sample;
do bridge=1 to 10;
	do var=1 to 25;
	output;
	end;
end;
run;

data want;
 merge sample sample(firstobs=6 keep=bridge rename=(bridge=_bridge));
 if bridge ne lag5(bridge) or bridge ne _bridge then delete;
 drop _bridge;
 run;

proc print;run;
mmhxc5
Quartz | Level 8

Thank you for your timely reply. I really appreciate it. There is only two data calling in your code,  "sample" and "want". How can I link this code to the Bridge dataset I have? I mean, which part of your code will call the bridge dataset which has all the data that I want to process?

Ksharp
Super User

‘Sample' dataset is 'Bridge' dataset ,according to Art.T 's code.

art297
Opal | Level 21

Both @novinosrin and @Ksharp created a file called sample to represent your bridge dataset. They did not expect you to create a sample dataset but, rather, replace it with your actual bridge dataset.

 

However, you have to decide whether you want to drop the first and last 5 records for each structure_no_008 OR, for each structure_no_008, you want to drop any inspection_year that was less than the earliest+4 and/or greater than the last inspection_year -4.

 

My suggested code does the latter:

 

data bridge (drop=structure_no var);
  do structure_no=1 to 120;
    structure_no_008=catt('000000000000H',put(structure_no,z3.));
    inspection_year=1996;
    do var=1 to 25;
      inspection_year+1;
      if mod(structure_no,40) eq 0 then do;
        if inspection_year ne 2000 then output;
      end;
      else output;
    end;
  end;
run;

/*Output */
data want (drop=start last);
/*1st loop count the number of records in each bridge*/
  do until(last.structure_no_008);
    set bridge;
    by structure_no_008;
    if first.structure_no_008 then start=inspection_year;
    else if last.structure_no_008 then last=inspection_year;
  end;
/*2nd loop only keep inspection_years if they are greater than the
  earliest inspection_year+4 and lower than latest inspection-4*/
  do until(last.structure_no_008);
    set bridge;
    by structure_no_008;
    if start+5 <=inspection_year<=last-5 then output;
  end;
run;

Art, CEO, AnalystFinder.com

 

mmhxc5
Quartz | Level 8

Thank you @novinosrin, @Ksharp, @Rick_SAS, and @art297 for your help. Appreciated. @art297 your clarification was very helpful.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 5849 views
  • 6 likes
  • 5 in conversation