BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ManitobaMoose
Quartz | Level 8

Hi,

 

I am learning from SAS by Example. Here is a relatively simple problem that  I do not have the answer for, but someone on here probably knows.

 

Essentially, each student has taken up to 5 quizzes, with a score from 1 to 5, but some quizzes are missing for some students. I have learned from Learning SAS by Example how to exclude Max Min, mean etc. calculations using the N function if, for example, the total number of missing quizzes is more than 7. However, what if I want to exclude calculating the max, min, mean, etc for any given student who hasn't taken at least 4 quizes (ie. who has 2 or more quizzes missing). .

 

All my programming is below. Any help would be very much appreciated. Thanks!

 

----------------------------------------------------

Here is the original data file:

----------------------------------------

libname SASdata '/folders/myfolders/SASData' ;

Data SASData.surveyMy ;
Input Subj$ Gender$ Age Income Q1 Q2 Q3 Q4 Q5 ;

Datalines ;
001 M 23 28000 1 . 1 2 3
002 F 55 76123 4 5 2 1 1
003 M 38 36500 2 2 2 2 1
004 F 67 128000 4 3 2 2 4
005 M 22 23060 3 3 3 4 2
006 M 63 90000 2 3 5 4 3
007 F 45 76100 5 . . 3 3
;

 

As you can see, there are 7 total missing quizzes. Student 007 has two missing quizzes.

 

Here is the data that calculates max, min, and mean only if a total of 7 or less quizzes is missing:

libname sasdata '/folders/myfolders/SASData' ;

Data Prob11_4 ;
    Set sasdata.surveymy ;
    If n(of Q1-Q5) lt 7 then Maxscore = Max(of Q1-Q5) ;
    If n(of Q1-Q5) lt 7 then SecondHiScore = Max(Q1-Q5, 2) ;
    If n(of Q1-Q5) lt 7 then Minscore = Min(of Q1-Q5) ;
    If n(of Q1-Q5) lt 7 then QuesAve = Mean(of Q1-Q5) ;
run ;

Now, how do I write code which calculates the Max, Min, and Mean for all students EXCEPT a student which has 2 missing quizzes. In this example, that is student 007.

 

Thanks for your help!

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

Instead of runing the same IF test 4 times, run it once.  If the test is successful (NMISS>=2) then do the 4 calculations inside a "do group":

 

    if nmiss(of Q1-Q5)  ge 2 then do;
      Maxscore = Max(of Q1-Q5) ;
      SecondHiScore = Max(Q1-Q5, 2) ;
      Minscore = Min(of Q1-Q5) ;
      QuesAve = Mean(of Q1-Q5) ; 
    end;

 

But your second stat   "SecondHiScore=max(q1-a5,2)" is meaningless.  It just adds the value 2 to your list of values for X1 through X5  Take a look at the documentation on the ORDINAL function.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

6 REPLIES 6
ManitobaMoose
Quartz | Level 8

Sorry. I meant to say, "As you can see. there are 3 missing quizzes". (not 7).

ballardw
Super User

Perhaps you want to use NMISS, the number of missing to exclude.

I was very curious about about why you used n(of Q1-Q5) lt 7. Since there are only 5 variables the number of nonmissing values is always less than 7.

 

Provide your exact rule for when not to calculate some statistic. One example doesn't provide a rule. Do you require 3 , 4 or 5 values to calculate a valid mean? What value to you want if the mean isn't valid? Missing or some other code value?

 

You may also be interested in the Largest function.

largest(2,q1,q2,q3,q4,q5) will return the second largest value. Max with the 2 as you used would return 2 if all of the Q values were 1.

ManitobaMoose
Quartz | Level 8

Hi,

 

Thanks for answering.

 

The reason I used n(of Q1-Q5) lt 7. is that I was following the problem in the Book Learning SAS by example. The great frustration I am having with that book is that it used to come with a CD that contained the practice problems that are referred to in the book, but no longer does. I downloaded files associated with the book from th internet, but that link only contains a few of the problems in the book, and so in other cases I am using other datasets that came with the downloadable link to substitute for those not available to me. I didn't realize that the lt n(of Q1-Q5) lt 7 was referring to the number of errors per observation. I thought that meant the total for all observations. Now it makes sense to use a n(of Q1-Q5) lt 2 to exclude more than 2 missing data points per observation.

 

I did switch out the largest function in place of the max function. It appears the max function only takes the max, not the second or third etc largest.

 

IF ANYONE KNOWS WHERE I CAN FIND THE LINK FOR LEARNING SAS BY EXAMPLE THAT INCLUDES ALL THE DATA SETS THAT WERE SUPPOSED TO COME WITH THE BOOK THAT WOULD BE EXTREMELY HELPFUL.  I have the book with the red cover.  It says it was originally published in 2007.

 

Thanks!

ManitobaMoose
Quartz | Level 8

This is the website I used to download the files. I would say that less than 25% of the files referred to in the book problems can be found at this link:

 

https://support.sas.com/publishing/authors/extras/60864_code.pdf

Reeza
Super User

https://support.sas.com/downloads/package.htm?pid=1709#

 

There's 120 files of various types in the link above. 

 

The link you referenced had 58 files.

mkeintz
PROC Star

Instead of runing the same IF test 4 times, run it once.  If the test is successful (NMISS>=2) then do the 4 calculations inside a "do group":

 

    if nmiss(of Q1-Q5)  ge 2 then do;
      Maxscore = Max(of Q1-Q5) ;
      SecondHiScore = Max(Q1-Q5, 2) ;
      Minscore = Min(of Q1-Q5) ;
      QuesAve = Mean(of Q1-Q5) ; 
    end;

 

But your second stat   "SecondHiScore=max(q1-a5,2)" is meaningless.  It just adds the value 2 to your list of values for X1 through X5  Take a look at the documentation on the ORDINAL function.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2725 views
  • 2 likes
  • 4 in conversation