turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- Changing values of a macro-variable (as a boundary...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-14-2014 03:32 AM

Hello everyone,

I have a little problem. I have this kind of data set with say, 1 000 000 observations :

VAR_1 VAR_2 VAR_3 [...] VAR_122 SPAN

The SPAN variable gives me a number between 1 and 122 and what I'd like to do is **count **for each **observation **the number of missing values between VAR_1 and VAR_&SPAN.

I don't really know how to perform that calculation since it's a "row calculation". Writing it wrong, it would look like

N_MISS = count missing(VAR_1-VAR_&SPAN)

But I have no idea about how to perform that calculation for each **row, ** I only know how to do that in column (more simple ahah)

Could you help me please ?

Thanks a lot

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to GuiVtzl

05-14-2014 03:37 AM

use array-s Base SAS(R) 9.4 Procedures Guide, Second Edition.

---->-- ja karman --<-----

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to jakarman

05-14-2014 05:17 AM

Thank you Jaap for your answer.

I am not that familiar with arrays, but I tried looking at that.

So I guess I should start doing something like :

array var

array span

array n_miss

Is that correct ?

But then i don't see how i should do my loop.

I thought it would be :

do i=1 to 122

n_miss = cmiss (of var[1]-var[SPAN];

leave;

end;

run;

But again, I am not at ease yet with arrays.

Could you correct me please ?

Thanks a lot

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to GuiVtzl

05-14-2014 05:39 AM

I think you would need something more like:

n_miss=0;

do I=1 to 122;

if var*=. then n_miss=n_miss+1;*

end;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-14-2014 05:44 AM

Thank you RW9,

the problem is that your code seems to count the missing values for the 122 variables "VAR" for each observation.

In my case I need for each obs to count the missing values from VAR_1 to VAR_j where j is the value of variable SPAN.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to GuiVtzl

05-14-2014 05:51 AM

Maybe I misunderstand, but for the code above you would get:

datastep start.

arrive at observation1.

set n_miss to 0

start do loop with 1

if var[1] is missing the n_miss+1;

do loop with 2

if var[2] is missing then n_miss+1;

...

end loop, n_miss contains, for the first observation, a count of all columns var1-var122 where the value is missing.

now move onto observation 2.

This seems to be what you are asking?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to GuiVtzl

05-14-2014 05:54 AM

My fault Id did not include a good link for array-s, this one is better. SAS(R) 9.4 Language Reference: Concepts, Second Edition fo concepts and as reference SAS(R) 9.4 Statements: Reference, Second Edition your array declaration should be something like:

array aspan{*} var_1 - var_122 ; /* the * is counting the 122 automatically */

After that you can refer to the element of an array using loop-processing.

This is offering full freedom like matrix processing.

do i=1 to ..;

span= <some processing aspan{i} > ;

end;

Reviewing your question, There are a lot of functions that are working on a list of variables. My first link was one in that direction.

you are just wanting to count the number of missings? SAS(R) 9.4 Functions and CALL Routines: Reference, Second Edition (NMISS)

span=nmiss(of var_1 - var_122) ;

That is using the list processing of variables SAS(R) 9.4 Language Reference: Concepts, Second Edition.

That a list of variables can be the argument is not a very common concept in programming languages.

---->-- ja karman --<-----

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to jakarman

05-14-2014 06:06 AM

Thank you both !

I am going to put a little example to show what I want to count (in bold the variables I already have, I want to create the variable N_MISS):

**VAR_1 VAR_2 VAR_3 VAR_4 VAR_5 SPAN** N_MISS

** 5 10 . . 10 5 ** 2

**2 3 . . . 3 ** 1

** . . 1 5 . 4 ** 2

------------------------------------------------

I am sorry if I get it wrong but I have the impression that when writing things like nmiss(of var_1 - var_122) ; or

n_miss=0;

do I=1 to 122;

if var*=. then n_miss=n_miss+1;*

end;

I will get N_MISS values 2 3 3 instead of the 2 1 2 above

My problem seems to me (again, I am probably wrong !) that my loop isn't fixed since for each observation (row) I don't use the same span to count. For my example above, 1st line I count on var1-var5, 2nd line I count on var1-var3 and 3rd line I count on var1-var4.

Thank you very much for your help, those arrays are totally new to me (discovered them this morning ahah)

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to GuiVtzl

05-14-2014 06:12 AM

Ok, so you want to count the missing based on another variable span, that's fine just change to:

n_miss=0;

do I=1 to 122;

if var*=. and I <= span then n_miss=n_miss+1;*

end;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-14-2014 06:44 AM

I think this would be easier and more intuitive:

do I=1 to span;

if var*=. then n_miss=n_miss+1;*

end;

---------------------------------------------------------------------------------------------

Maxims of Maximally Efficient SAS Programmers

Maxims of Maximally Efficient SAS Programmers

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to KurtBremser

05-14-2014 10:49 AM

I mixed all of your suggestions and it did exactly what I wanted !

Thanks a lot guys for your help ! Plus, it made me discover the arrays ! Great !

Have a nice day !

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to GuiVtzl

05-14-2014 10:50 AM

Is it really necessary to do the iteration?

data work.want;

set work.have;

length n_miss 8;

array vars

n_miss = nmiss(of var_ - (dim(vars) - span);

run;

The result of nmiss(of var_ has to be reduced by the number of variables which are expected to missing.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to andreas_lds

05-15-2014 01:30 AM

If it is not guaranteed that variables beyond SPAN are always missing, this approach will give wrong results.

---------------------------------------------------------------------------------------------

Maxims of Maximally Efficient SAS Programmers

Maxims of Maximally Efficient SAS Programmers

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to GuiVtzl

05-14-2014 12:54 PM

Hello,

Here is a macro which also provides the desired result.

%macro tocalc (nbvar);

data want;

set have;

mc = span-(%eval(&nbvar) - N(of var1-var&nbvar));

%put var&nbvar;

run;

%mend tocalc;

%tocalc(5);