DATA Step, Macro, Functions and more

N (not missing) for Q1-Q5. How to calculate Not missing two or more quizzes for each given student

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 121
Accepted Solution

N (not missing) for Q1-Q5. How to calculate Not missing two or more quizzes for each given student

Hi,

 

I am learning from SAS by Example. Here is a relatively simple problem that  I do not have the answer for, but someone on here probably knows.

 

Essentially, each student has taken up to 5 quizzes, with a score from 1 to 5, but some quizzes are missing for some students. I have learned from Learning SAS by Example how to exclude Max Min, mean etc. calculations using the N function if, for example, the total number of missing quizzes is more than 7. However, what if I want to exclude calculating the max, min, mean, etc for any given student who hasn't taken at least 4 quizes (ie. who has 2 or more quizzes missing). .

 

All my programming is below. Any help would be very much appreciated. Thanks!

 

----------------------------------------------------

Here is the original data file:

----------------------------------------

libname SASdata '/folders/myfolders/SASData' ;

Data SASData.surveyMy ;
Input Subj$ Gender$ Age Income Q1 Q2 Q3 Q4 Q5 ;

Datalines ;
001 M 23 28000 1 . 1 2 3
002 F 55 76123 4 5 2 1 1
003 M 38 36500 2 2 2 2 1
004 F 67 128000 4 3 2 2 4
005 M 22 23060 3 3 3 4 2
006 M 63 90000 2 3 5 4 3
007 F 45 76100 5 . . 3 3
;

 

As you can see, there are 7 total missing quizzes. Student 007 has two missing quizzes.

 

Here is the data that calculates max, min, and mean only if a total of 7 or less quizzes is missing:

libname sasdata '/folders/myfolders/SASData' ;

Data Prob11_4 ;
    Set sasdata.surveymy ;
    If n(of Q1-Q5) lt 7 then Maxscore = Max(of Q1-Q5) ;
    If n(of Q1-Q5) lt 7 then SecondHiScore = Max(Q1-Q5, 2) ;
    If n(of Q1-Q5) lt 7 then Minscore = Min(of Q1-Q5) ;
    If n(of Q1-Q5) lt 7 then QuesAve = Mean(of Q1-Q5) ;
run ;

Now, how do I write code which calculates the Max, Min, and Mean for all students EXCEPT a student which has 2 missing quizzes. In this example, that is student 007.

 

Thanks for your help!

 

 

 


Accepted Solutions
Solution
‎09-11-2017 08:03 PM
Trusted Advisor
Posts: 1,288

Re: N (not missing) for Q1-Q5. How to calculate Not missing two or more quizzes for each given stud

Posted in reply to ManitobaMoose

Instead of runing the same IF test 4 times, run it once.  If the test is successful (NMISS>=2) then do the 4 calculations inside a "do group":

 

    if nmiss(of Q1-Q5)  ge 2 then do;
      Maxscore = Max(of Q1-Q5) ;
      SecondHiScore = Max(Q1-Q5, 2) ;
      Minscore = Min(of Q1-Q5) ;
      QuesAve = Mean(of Q1-Q5) ; 
    end;

 

But your second stat   "SecondHiScore=max(q1-a5,2)" is meaningless.  It just adds the value 2 to your list of values for X1 through X5  Take a look at the documentation on the ORDINAL function.

View solution in original post


All Replies
Frequent Contributor
Posts: 121

Re: N (not missing) for Q1-Q5. How to calculate Not missing two or more quizzes for each given stud

Posted in reply to ManitobaMoose

Sorry. I meant to say, "As you can see. there are 3 missing quizzes". (not 7).

Super User
Posts: 13,083

Re: N (not missing) for Q1-Q5. How to calculate Not missing two or more quizzes for each given stud

Posted in reply to ManitobaMoose

Perhaps you want to use NMISS, the number of missing to exclude.

I was very curious about about why you used n(of Q1-Q5) lt 7. Since there are only 5 variables the number of nonmissing values is always less than 7.

 

Provide your exact rule for when not to calculate some statistic. One example doesn't provide a rule. Do you require 3 , 4 or 5 values to calculate a valid mean? What value to you want if the mean isn't valid? Missing or some other code value?

 

You may also be interested in the Largest function.

largest(2,q1,q2,q3,q4,q5) will return the second largest value. Max with the 2 as you used would return 2 if all of the Q values were 1.

Frequent Contributor
Posts: 121

Re: N (not missing) for Q1-Q5. How to calculate Not missing two or more quizzes for each given stud

Hi,

 

Thanks for answering.

 

The reason I used n(of Q1-Q5) lt 7. is that I was following the problem in the Book Learning SAS by example. The great frustration I am having with that book is that it used to come with a CD that contained the practice problems that are referred to in the book, but no longer does. I downloaded files associated with the book from th internet, but that link only contains a few of the problems in the book, and so in other cases I am using other datasets that came with the downloadable link to substitute for those not available to me. I didn't realize that the lt n(of Q1-Q5) lt 7 was referring to the number of errors per observation. I thought that meant the total for all observations. Now it makes sense to use a n(of Q1-Q5) lt 2 to exclude more than 2 missing data points per observation.

 

I did switch out the largest function in place of the max function. It appears the max function only takes the max, not the second or third etc largest.

 

IF ANYONE KNOWS WHERE I CAN FIND THE LINK FOR LEARNING SAS BY EXAMPLE THAT INCLUDES ALL THE DATA SETS THAT WERE SUPPOSED TO COME WITH THE BOOK THAT WOULD BE EXTREMELY HELPFUL.  I have the book with the red cover.  It says it was originally published in 2007.

 

Thanks!

Frequent Contributor
Posts: 121

Re: N (not missing) for Q1-Q5. How to calculate Not missing two or more quizzes for each given stud

This is the website I used to download the files. I would say that less than 25% of the files referred to in the book problems can be found at this link:

 

https://support.sas.com/publishing/authors/extras/60864_code.pdf

Super User
Posts: 22,873

Re: N (not missing) for Q1-Q5. How to calculate Not missing two or more quizzes for each given stud

Posted in reply to ManitobaMoose

https://support.sas.com/downloads/package.htm?pid=1709#

 

There's 120 files of various types in the link above. 

 

The link you referenced had 58 files.

Solution
‎09-11-2017 08:03 PM
Trusted Advisor
Posts: 1,288

Re: N (not missing) for Q1-Q5. How to calculate Not missing two or more quizzes for each given stud

Posted in reply to ManitobaMoose

Instead of runing the same IF test 4 times, run it once.  If the test is successful (NMISS>=2) then do the 4 calculations inside a "do group":

 

    if nmiss(of Q1-Q5)  ge 2 then do;
      Maxscore = Max(of Q1-Q5) ;
      SecondHiScore = Max(Q1-Q5, 2) ;
      Minscore = Min(of Q1-Q5) ;
      QuesAve = Mean(of Q1-Q5) ; 
    end;

 

But your second stat   "SecondHiScore=max(q1-a5,2)" is meaningless.  It just adds the value 2 to your list of values for X1 through X5  Take a look at the documentation on the ORDINAL function.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 195 views
  • 1 like
  • 4 in conversation