turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Programming
- /
- N (not missing) for Q1-Q5. How to calculate Not m...

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

09-11-2017 05:03 PM

Hi,

I am learning from SAS by Example. Here is a relatively simple problem that I do not have the answer for, but someone on here probably knows.

Essentially, each student has taken up to 5 quizzes, with a score from 1 to 5, but some quizzes are missing for some students. I have learned from Learning SAS by Example how to exclude Max Min, mean etc. calculations using the N function if, for example, the total number of missing quizzes is more than 7. However, what if I want to exclude calculating the max, min, mean, etc for any given student who hasn't taken at least 4 quizes (ie. who has 2 or more quizzes missing). .

All my programming is below. Any help would be very much appreciated. Thanks!

----------------------------------------------------

Here is the original data file:

----------------------------------------

libname SASdata '/folders/myfolders/SASData' ;

Data SASData.surveyMy ;

Input Subj$ Gender$ Age Income Q1 Q2 Q3 Q4 Q5 ;

Datalines ;

001 M 23 28000 1 . 1 2 3

002 F 55 76123 4 5 2 1 1

003 M 38 36500 2 2 2 2 1

004 F 67 128000 4 3 2 2 4

005 M 22 23060 3 3 3 4 2

006 M 63 90000 2 3 5 4 3

007 F 45 76100 5 . . 3 3

;

As you can see, there are 7 total missing quizzes. Student 007 has two missing quizzes.

Here is the data that calculates max, min, and mean only if a total of 7 or less quizzes is missing:

libname sasdata '/folders/myfolders/SASData' ;

Data Prob11_4 ;

Set sasdata.surveymy ;

If n(of Q1-Q5) lt 7 then Maxscore = Max(of Q1-Q5) ;

If n(of Q1-Q5) lt 7 then SecondHiScore = Max(Q1-Q5, 2) ;

If n(of Q1-Q5) lt 7 then Minscore = Min(of Q1-Q5) ;

If n(of Q1-Q5) lt 7 then QuesAve = Mean(of Q1-Q5) ;

run ;

Now, how do I write code which calculates the Max, Min, and Mean for all students EXCEPT a student which has 2 missing quizzes. In this example, that is student 007.

Thanks for your help!

Accepted Solutions

Solution

09-11-2017
08:03 PM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ManitobaMoose

09-11-2017 06:27 PM

Instead of runing the same IF test 4 times, run it once. If the test is successful (NMISS>=2) then do the 4 calculations inside a "do group":

```
if nmiss(of Q1-Q5) ge 2 then do;
Maxscore = Max(of Q1-Q5) ;
SecondHiScore = Max(Q1-Q5, 2) ;
Minscore = Min(of Q1-Q5) ;
QuesAve = Mean(of Q1-Q5) ;
end;
```

But your second stat "SecondHiScore=max(q1-a5,2)" is meaningless. It just adds the value __2__ to your list of values for X1 through X5 Take a look at the documentation on the

All Replies

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ManitobaMoose

09-11-2017 05:06 PM

Sorry. I meant to say, "As you can see. there are 3 missing quizzes". (not 7).

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ManitobaMoose

09-11-2017 06:01 PM

Perhaps you want to use NMISS, the number of missing to exclude.

I was very curious about about why you used n(of Q1-Q5) lt 7. Since there are only 5 variables the number of nonmissing values is always less than 7.

Provide your exact rule for when not to calculate some statistic. One example doesn't provide a rule. Do you require 3 , 4 or 5 values to calculate a valid mean? What value to you want if the mean isn't valid? Missing or some other code value?

You may also be interested in the Largest function.

largest(2,q1,q2,q3,q4,q5) will return the second largest value. Max with the 2 as you used would return 2 if all of the Q values were 1.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ballardw

09-11-2017 07:09 PM

Hi,

Thanks for answering.

The reason I used n(of Q1-Q5) lt 7. is that I was following the problem in the Book Learning SAS by example. The great frustration I am having with that book is that it used to come with a CD that contained the practice problems that are referred to in the book, but no longer does. I downloaded files associated with the book from th internet, but that link only contains a few of the problems in the book, and so in other cases I am using other datasets that came with the downloadable link to substitute for those not available to me. I didn't realize that the lt n(of Q1-Q5) lt 7 was referring to the number of errors per observation. I thought that meant the total for all observations. Now it makes sense to use a n(of Q1-Q5) lt 2 to exclude more than 2 missing data points per observation.

I did switch out the largest function in place of the max function. It appears the max function only takes the max, not the second or third etc largest.

IF ANYONE KNOWS WHERE I CAN FIND THE LINK FOR LEARNING SAS BY EXAMPLE THAT INCLUDES ALL THE DATA SETS THAT WERE SUPPOSED TO COME WITH THE BOOK THAT WOULD BE EXTREMELY HELPFUL. I have the book with the red cover. It says it was originally published in 2007.

Thanks!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ballardw

09-11-2017 07:19 PM

This is the website I used to download the files. I would say that less than 25% of the files referred to in the book problems can be found at this link:

https://support.sas.com/publishing/authors/extras/60864_code.pdf

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ManitobaMoose

09-12-2017 11:12 AM

https://support.sas.com/downloads/package.htm?pid=1709#

There's 120 files of various types in the link above.

The link you referenced had 58 files.

Solution

09-11-2017
08:03 PM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ManitobaMoose

09-11-2017 06:27 PM

Instead of runing the same IF test 4 times, run it once. If the test is successful (NMISS>=2) then do the 4 calculations inside a "do group":

```
if nmiss(of Q1-Q5) ge 2 then do;
Maxscore = Max(of Q1-Q5) ;
SecondHiScore = Max(Q1-Q5, 2) ;
Minscore = Min(of Q1-Q5) ;
QuesAve = Mean(of Q1-Q5) ;
end;
```

But your second stat "SecondHiScore=max(q1-a5,2)" is meaningless. It just adds the value __2__ to your list of values for X1 through X5 Take a look at the documentation on the