Do Loops

Super Contributor
Posts: 1,040

Do Loops

Hi,

The following example deletes records that have 10 or more missing values (numeric and character combined).

In this example, why has the author used j = 1 to dim(nm)? Can't it be written as i = 1 to dim(nm)?

I didn't understand this concept. Could anyone explain?


Thanks


data complete;

set maybeokay;

misscount = 0;

array ch(*) _character_ ;

array nm(*) _numeric_ ;

do i = 1 to dim(ch);

if ch(i) = ' ' then misscount + 1;

end;

do j = 1 to dim(nm);

if nm(j) = . then misscount + 1;

end;

drop i j ;

if misscount ge 10 then delete;

run;

http://www2.sas.com/proceedings/sugi26/p052-26.pdf


All Replies
Super User
Posts: 6,502

Re: Do Loops

I assume it is because that is what made sense at the time.

You are right that they could have re-used I as the loop variable for the second loop. But if the loops were nested that would cause trouble.
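To illustrate the nested case, here is a minimal sketch. The two-dimensional array grid and the variables x1-x6 are made up for this example and are not part of the original program.

```sas
data _null_;
  /* hypothetical 2 x 3 array; x1-x6 are never assigned, so all are missing */
  array grid(2,3) x1-x6;
  misscount = 0;
  do i = 1 to dim(grid, 1);      /* outer loop over rows */
    do j = 1 to dim(grid, 2);    /* inner loop needs its own index variable */
      if grid(i,j) = . then misscount + 1;
    end;
  end;
  put misscount=;
run;
```

If the inner loop also used i, it would overwrite the outer counter and the outer loop would not visit every row.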


In the past I would normally use the DO OVER construct for situations like this, where the index variable is meaningless, and so avoid having to worry about dropping the variable.  (Note that it doesn't solve the issue of nested DO loops, as DO OVER uses the automatic _I_ variable for indexing.)

Note that setting MISSCOUNT to zero is critical in this program because of the sum statement (var + value;), which causes MISSCOUNT to be retained across observations.
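A rough sketch of what the DO OVER version of the original program might look like. It reuses the data set and variable names from the question; DO OVER is older syntax, so treat this as an illustration and not a recommendation.

```sas
data complete;
  set maybeokay;
  misscount = 0;
  /* arrays declared without a subscript so they can be used with DO OVER */
  array ch _character_;
  array nm _numeric_;
  do over ch;
    if ch = ' ' then misscount + 1;   /* ch refers to the current element */
  end;
  do over nm;
    if nm = . then misscount + 1;
  end;
  /* no DROP statement for an index: DO OVER uses the automatic _I_ variable */
  if misscount ge 10 then delete;
run;
```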

Now that SAS has added so many new functions over the years I would use the NMISS() function to count the _NUMERIC_ missings.  You could probably do something with COUNTW() and CATX() to count missing characters, but you would need to find an unused character to serve as the delimiter.

misscount=0;

array ch _character_;

misscount=nmiss(of _numeric_) + dim(ch) - countw(catx('00'x,of ch(*)),'00'x);

Note that setting MISSCOUNT to zero is critical here also because it will be included in the _NUMERIC_ variable list.

Both programs will have trouble when there are no character variables in the input data set, since you cannot define an array with no elements. You could solve that by creating a dummy character variable.

misscount=0;

retain _ch 'A';

drop _ch;

array ch _character_;

misscount=nmiss(of _numeric_) + dim(ch) - countw(catx('00'x,of ch(*)),'00'x) ;

Super Contributor
Posts: 1,040

Re: Do Loops

Hi,

Thanks for the detailed explanation.

Could you explain what's happening here? It's a little confusing as to why '00'x was used.

Also, why is the subtraction done?

misscount=nmiss(of _numeric_) + dim(ch) - countw(catx('00'x,of ch(*)),'00'x) ;


Thanks

Super User
Posts: 6,502

Re: Do Loops

COUNTW() counts words separated by a delimiter. Here it counts the non-missing values, hence the need to subtract its result from the number of items in the array. I used binary zero ('00'x) as the delimiter because it is extremely unlikely to appear as an actual character in your data.  But see the response below, which uses another function I didn't find: CMISS() works for both numeric and character variables.

misscount=cmiss(of _all_);

It would help if SAS were more consistent in its documentation. I looked for a See Also section in the documentation for NMISS() and missed CMISS() because it was mentioned only in the Comparisons section.

Comparisons

The NMISS function returns the number of missing values, whereas the N function returns the number of nonmissing values. NMISS requires numeric values, whereas CMISS works with both numeric and character values. NMISS works with multiple numeric values, whereas MISSING works with only one value that can be either numeric or character.
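A small sketch of the difference described in that passage, using literal values chosen only for illustration:

```sas
data _null_;
  n_num = nmiss(1, ., 3);          /* numeric arguments only: one is missing */
  n_any = cmiss('A', ' ', 1, .);   /* mixed types: one blank plus one numeric missing */
  one   = missing(' ');            /* a single value of either type */
  put n_num= n_any= one=;          /* expected in the log: n_num=1 n_any=2 one=1 */
run;
```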

Super Contributor
Posts: 1,040

Re: Do Loops

So when you concatenate the values of all the character variables with '00'x, won't dim(ch) be equal to the countw result?

Is it that when a character value is missing, one '00'x ends up next to another '00'x, and since there is no word between the two, countw has nothing to count there?

And only when there is '00'x word '00'x does countw have a word to count?

Does it work like that?

Also, what is the little x doing beside the zeros, and why is it outside the quotes?

Thanks

Solution
‎06-02-2013 05:46 PM
Super User
Posts: 6,502

Re: Do Loops

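CATX() will skip missing (blank) character values.  So catx('|','A',' ','C') will be 'A|C'.  Note that a numeric missing value is not skipped under the default MISSING= option: it is formatted as '.', so catx('|','A',' ',9,.,'C') gives 'A|9|.|C'.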
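Here is a minimal worked example of how that feeds into the count. The variables c1-c4 and their values are made up for illustration.

```sas
data _null_;
  length c1-c4 $8;
  c1 = 'A'; c2 = ' '; c3 = 'C'; c4 = ' ';    /* two of the four values are blank */
  array ch(*) c1-c4;
  /* CATX drops the blank items, so only 'A' and 'C' are joined by '00'x */
  nonmiss   = countw(catx('00'x, of ch(*)), '00'x);
  misscount = dim(ch) - nonmiss;             /* 4 items - 2 words = 2 missing */
  put nonmiss= misscount=;
run;
```

So dim(ch) and the countw result differ by exactly the number of blank values, which is why the subtraction gives the missing count.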

Super Contributor
Posts: 1,040

Re: Do Loops

Finally, why is the little x in '00'x outside of the quotation marks?

Super User
Posts: 6,502

Re: Do Loops

That is how to represent a literal value using hexadecimal digits.  For example a space is represented by '20'x, tab by '09'x.

There are many other literals, such as date ('01JAN1960'd), time ('09:30't), and datetime ('01JAN1960:09:30'dt).

SAS Constants in Expressions
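A quick sketch that prints what a few of these literals evaluate to. The variable names are arbitrary, and the hex example assumes an ASCII system (on EBCDIC hosts the code points differ).

```sas
data _null_;
  letter = '41'x;               /* hex literal: code 41 is 'A' in ASCII */
  d  = '01JAN1960'd;            /* date literal: days since 01JAN1960 */
  t  = '09:30't;                /* time literal: seconds since midnight */
  dt = '01JAN1960:09:30'dt;     /* datetime literal: seconds since 01JAN1960 00:00 */
  put letter= d= t= dt=;
run;
```

The suffix after the closing quote tells SAS how to interpret the quoted text, which is why it sits outside the quotes.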

Respected Advisor
Posts: 4,654

Re: Do Loops

How about :

misscount = 0;

misscount = cmiss(of _all_);

or, if you don't need the number of missing values:

data complete;

set maybeokay;

if cmiss(of _all_) < 10;

run;

PG

PG
Respected Advisor
Posts: 4,654

Re: Do Loops

Yes, the author could have used i instead of j for the second loop. The result would have been the same. The difference in efficiency would have been insignificant. The name of the do loop variable matters only if you want to use it after the end of the loop. For example :

do i = 1 to dim(ch);

     if ch(i) = ' ' then leave;

end;

do j = 1 to dim(nm);

     if nm(j) = . then leave;

end;

/* a loop that runs to completion ends with its index at dim + 1 */

hasMissing = i <= dim(ch) or j <= dim(nm);

PG

PG