BookmarkSubscribeRSS Feed
dustychair
Pyrite | Level 9

Hi all,

I want to select when itemstatus_01 and itemstatus_02...itemstatus_48 are all  "O". The problem is number of itemstatus variable changes for each grade. For example for grade 5 itemstatus_45 variable is blank since grade 5 kids did not take itemstatus_45 however grade 6 kids took it. Any ideas?

Thanks

 

 

 

5 REPLIES 5
PaigeMiller
Diamond | Level 26

So when itemstatus_45 is blank, and the rest are 'O', do you want to select it or not?

--
Paige Miller
dustychair
Pyrite | Level 9

Hi @PaigeMiller  Thank you for your respones.

The data set is something like that:

grade itemstatus_01 itemstatus_02 itemstatus_45

5 O O

5 O F

6 F F O

6 O O O

6 F O O 

so I want to select when grade is 5 and all itemstatus variables are "O". For grade 5 it is up to itemstatus_44; for grade 6 it is 48. I am trying to write a loop but I could not.

ballardw
Super User

How many different rules for the number of "items" are to be processed?

Are all of these "items" all always exactly one character long? I ask because the shortest code will work if "O" does not also appear in a composite form such as "OOP".

 

Here is one way that may get you started. Since you did not provide any example I made a small set with 5 variables to demonstrate ways to count if the 'O' is not duplicated in any variable.

data junk;
   informat i1 - i6 $1.;
   input i1 - i6;
   num_0 = countc(cats(of i:),'O');
   num_0_2= countc(cats(of i1-i3,i5),'O');
datalines;
O O O O O O
O A O A O A
;

The above code uses the CATS function to combine the variables into a single string and then uses the CountC function to see how many times the 'O' appears which is why knowledge of actual codes used is important. If 'OP' is legal then the above does not work as it would count that as one of the 'O' values.

The Cats function will allow use of variable lists when the key word "of" is used. The colon following part of variable name says "use all variable names that start with" up to the colon. You would use Itemstatus_: for that purpose. The second shows a limited sequential list "of i1-i3" uses i1,12 and 13. You can add other individual variables or other lists.

 

The question about grade 5 vs 6 may not be important at the counting step as the Itemstatus_: list with the CATS function means that blanks get excluded. The question would be LATER when you are using that count with different grades.

mkeintz
PROC Star

So for grade 5 you want to detect 44 consecutive O's in itemstatus_01 through itemstatus_44.

And for grade 6 you want to detect 48 consecutive O's in itemstatus_01 through itemstatus_48.

 

You can concatenate the values of itemstatus_01 through itemstatus_44 (or 48) and test against a constant string value of 44 (or 48) O's:

 

data mysample (drop=_:);
  retain _qualify5 "OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO";      **44 letter O's;
  retain _qualify6 "OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO0000";  **48 letter O's;
  set have;
  if (grade=5 and cats(of itemstatus_01-itemstatus_44)=_qualify5)
  or (grade=6 and cats(of itemstatus_01-itemstatus_48)=_qualify6);
run;

If you're worried about typing exactly correct number of O's, then:

 

data mysample (drop=_:);
  retain _qualify5   "%sysfunc(repeat(O,43))";   **44 letter O's;
  retain _qualify6   "%sysfunc(repeat(O,47))";   **48 letter O's;
  set have;
  if (grade=5 and cats(of itemstatus_01-itemstatus_44)=_qualify5)
  or (grade=6 and cats(of itemstatus_01-itemstatus_48)=_qualify6);
run;

Note: to produce N x's, you have to use N-1 as the repetition count in the repeat function.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PaigeMiller
Diamond | Level 26

Here's my attempt. I am using variables named Y1 Y2 etc instead of typing the much longer variable names you have. Flag=1 indicates a value that was not either 'O' or missing was found. Flag=0 indicates all variables have 'O' or missing.

 

data want;
    set have;
    array y y:;
    flag=0;
    do i=1 to dim(y);
        if y(i) not in ('O',' ') then do;
            flag=1;
            leave;
        end;
    end;
    drop i;
run;
--
Paige Miller

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1219 views
  • 2 likes
  • 4 in conversation