turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- General Programming
- /
- NOBS= variable and DIM(array) as array subscript?

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-27-2015 07:05 AM

Hi,

My objective is to get NOBS=variable and the values of dim(char) and dim(numeric) from one dataset as use the same as array subscript in another datastep or datasteps(plural) or if possible even in the same datastep. Although, I understand the concept, i just don't know how to technically write a statement to get that to nobs and dim from let's say dataset1.

I'd appreciate if anyone can demonstrate an example.

Please and Thanks,

Charlotte

Accepted Solutions

Solution

01-27-2015
09:07 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to CharlotteCain

01-27-2015 09:07 AM

For use in many DATA steps, macro language would be appropriate:

data _null_;

set first_dataset nobs=_total_obs_;

array nums {*} _numeric_;

array chars {*} _character_;

call symputx('n_obs', _total_obs_);

call symputx('n_nums', dim(nums));

call symputx('n_chars', dim(chars));

stop;

run;

That gives you three macro variables to work with in later steps &N_OBS, &N_NUMS, and &N_CHARS.

You can get these values in the same DATA step, without macro language:

data want;

if 0 then do;

set first_datset nobs=_total_obs_;

array nums {*} _numeric_;

array chars {*} _character_;

end;

set second_dataset;

*** Add statements to process second_dataset;

run;

This gives you three DATA step numbers to work with: _total_obs_, dim(nums), and dim(chars). Note that the contents of the arrays depend only on the first data set, and the second SET statement does not impact the arrays.

Good luck.

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to CharlotteCain

01-27-2015 07:13 AM

Please describe a bit more about what you wish to achieve (and why).

Also, some sample input/output data, and some pseudo code might help us follow your thoughts.

Data never sleeps

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to CharlotteCain

01-27-2015 08:04 AM

Yes, perhaps more information. You can get obs from sashelp.vtable - variable nobs plus various other methods. You could also query sashelp.vcolumn with a count function to get number of array eg.

proc sql;

select count(1)

into :NUM_VARS

from SASHELP.VCOLUMN

where LIBNAME="WORK"

and MEMNAME="ABC"

and substr(NAME,1,3)="COL";

quit;

Will give you a number in the macro variable of how many columns would have COL prefix.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to CharlotteCain

01-27-2015 08:54 AM

Let us take a small example.

Have the Data Set, CLASS, from SASHELP.CLASS.

data class;

set sashelp.class;

run;

We wish to count the Frequency of AGE. We assume that the Maximum Age of the CLASS is 16. (This assumption is not needed as we could find the Max and Min Age in the previous step).

We use an array K with a dimension of 16 as no AGE can exceed it in the data set.

We contrive a trivial way to use the number observations of the data set so that we read one observation at a time using I as a counter until the counter(I) does not exceed NUM which is the total number of observations of the Data Set. When I = NUM, the do-loop ends and the next do-loop,

do i = 1 to dim(k);

is carried out. The Subscript of the array K takes values from 1 to dim(k) which happens to be 1 to 16. In this do-loop, if the array-cell contains a value > 0, then it is output. The Output data set is shown below.

Hope this helps you to understand what you seek.

data want;

array k[16] _temporary_;

do i = 1 by 1 while ( i <= num);

set class nobs = num;

k[Age] + 1;

end;

do i = 1 to dim(k);

if k* > 0 then do;*

Age = i;

Freq = k*;*

output;

end;

end;

keep Age Freq;

run;

Obs Age Freq

1 11 2

2 12 5

3 13 3

4 14 4

5 15 4

6 16 1

Solution

01-27-2015
09:07 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to CharlotteCain

01-27-2015 09:07 AM

For use in many DATA steps, macro language would be appropriate:

data _null_;

set first_dataset nobs=_total_obs_;

array nums {*} _numeric_;

array chars {*} _character_;

call symputx('n_obs', _total_obs_);

call symputx('n_nums', dim(nums));

call symputx('n_chars', dim(chars));

stop;

run;

That gives you three macro variables to work with in later steps &N_OBS, &N_NUMS, and &N_CHARS.

You can get these values in the same DATA step, without macro language:

data want;

if 0 then do;

set first_datset nobs=_total_obs_;

array nums {*} _numeric_;

array chars {*} _character_;

end;

set second_dataset;

*** Add statements to process second_dataset;

run;

This gives you three DATA step numbers to work with: _total_obs_, dim(nums), and dim(chars). Note that the contents of the arrays depend only on the first data set, and the second SET statement does not impact the arrays.

Good luck.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

01-27-2015 11:46 AM

Beautiful and neat, Thank you Sir,

In your second example wouldn't the contents of the PDV include variable descriptors of both datasets with missing values to variables of dataset1?

If yes, dropping or keeping them be necessary later?

/*coz i thought if 0 then set would still fully compile the entire descriptive portions occupying more memory but doing it one pass is awesome too, may i ask what do you reckon would be most efficient?*/

Please accept my apologies if i am wrong and silly.

@hkuo @datsp and RW, Many thanks for the quick response. I really appreciate it

Regards,

Charlotte from England!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to CharlotteCain

01-28-2015 09:49 AM

Yes, you're correct about the PDV. You could add this statement:

drop _numeric_ _character_;

Just be sure to add it before the second SET statement ... and test it of course. But the definition of _numeric_ and _character_ in a DATA step is supposed to mean all the numeric (or character) variables defined so far. So this should work.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to CharlotteCain

01-27-2015 09:20 AM

This?

**PROC** **SQL** NOPRINT;

SELECT NOBS, NUM_CHARACTER, NUM_NUMERIC INTO:NOBS TRIMMED, :NUM_CHARACTER TRIMMED, :NUM_NUMERIC TRIMMED

FROM DICTIONARY.TABLES WHERE LIBNAME='SASHELP' AND MEMNAME='CLASS';

**QUIT**;

%PUT NOBS=&NOBS NUM_CHARACTER=&NUM_CHARACTER NUM_NUMERIC=&NUM_NUMERIC;

Haikuo