turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- proc freq: accross the dataset

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-22-2009 03:35 PM

Hello,

Is it possible to get a frequency report across a whole dataset without having the results separated by variables?

I have a dataset of roughly 1200 obs and 400 variables and I'd like to know how many cells contain X numerical value. That's it, that's all.

Thanks.

Is it possible to get a frequency report across a whole dataset without having the results separated by variables?

I have a dataset of roughly 1200 obs and 400 variables and I'd like to know how many cells contain X numerical value. That's it, that's all.

Thanks.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-22-2009 03:37 PM

This would probably be better done in a data step, where you can count the number of times you encounter a value of X in each row. Then, you can sum up the counts across all rows.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-23-2009 09:05 AM

Think you can detail that a bit more? I get the general idea obviously but have no idea where to start when implementing the solution.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-23-2009 10:01 AM

Perhaps...

[pre]

data have;

input a b c;

cards;

1 3 4

3 2 1

4 5 6

;;;;

run;

data count4;

set have end=eof;

array x _numeric_;

do _n_ = 1 to dim(x);

count4 + x[_n_] eq 4;

end;

if eof then output;

keep count4;

run;

proc print;

run;

[/pre]

[pre]

data have;

input a b c;

cards;

1 3 4

3 2 1

4 5 6

;;;;

run;

data count4;

set have end=eof;

array x

do _n_ = 1 to dim(x);

count4 + x[_n_] eq 4;

end;

if eof then output;

keep count4;

run;

proc print;

run;

[/pre]

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-23-2009 12:59 PM

That works very well. Thanks a lot.

I have 32 datasets that I need to run this on, is it possible to list all of them and display the results ?

dataset1 X

dataset2 Y

...

I have 32 datasets that I need to run this on, is it possible to list all of them and display the results ?

dataset1 X

dataset2 Y

...

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-23-2009 04:17 PM

I not sure my example program is going to be the best approach for that. You’re doling out the specs one a time making it some difficult to understand the ins and outs of your process.

How about example data and example output, then you should get better help.

How about example data and example output, then you should get better help.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-23-2009 09:52 PM

The following should be easy to automate or modify to suit your needs.

[pre]

%let value_to_match=8.9;

proc summary data=SASHELP.CITIDAY(drop=DATE); *count occurences of each value;

class _numeric_;

ways 1;

output out=TEST_SUM;

run;

data TEST_FINAL; *count occurences of a given value;

set TEST_SUM end=LASTOBS;

VALUE=coalesce(of _numeric_);

if abs(sum(VALUE, - &value_to_match))<1e-2 then COUNT+_FREQ_;

if LASTOBS then put "VALUE_TO_MATCH=&value_to_match " COUNT=;

run;

[pre]

%let value_to_match=8.9;

proc summary data=SASHELP.CITIDAY(drop=DATE); *count occurences of each value;

class _numeric_;

ways 1;

output out=TEST_SUM;

run;

data TEST_FINAL; *count occurences of a given value;

set TEST_SUM end=LASTOBS;

VALUE=coalesce(of _numeric_);

if abs(sum(VALUE, - &value_to_match))<1e-2 then COUNT+_FREQ_;

if LASTOBS then put "VALUE_TO_MATCH=&value_to_match " COUNT=;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-23-2009 10:10 PM

Can't seem to be able to put all in one reply.

Maybe doing it in one go would be possible, depending on how you want to test;

[pre]

data TEST;

length VALUES $200;

set SASHELP.CITIDAY(drop=DATE) end=LASTOBS;

VALUES=catx(' ',of _numeric_);

COUNT+count(' '||VALUES," &value_to_match ");

if LASTOBS then put "VALUE_TO_MATCH=&value_to_match " COUNT=;

run;[/pre]

Maybe doing it in one go would be possible, depending on how you want to test;

[pre]

data TEST;

length VALUES $200;

set SASHELP.CITIDAY(drop=DATE) end=LASTOBS;

VALUES=catx(' ',of _numeric_);

COUNT+count(' '||VALUES," &value_to_match ");

if LASTOBS then put "VALUE_TO_MATCH=&value_to_match " COUNT=;

run;[/pre]

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-28-2009 11:46 AM

Well, it does work very well. All I'm trying to do now is either

1) Replicate this procedure for each of the 32 datasets I have to count in and merge the outputs to get a total or

2) Find a way to count through each of my 32 datasets and get one total for all of them.

Any help is appreciated.

Thanks.

1) Replicate this procedure for each of the 32 datasets I have to count in and merge the outputs to get a total or

2) Find a way to count through each of my 32 datasets and get one total for all of them.

Any help is appreciated.

Thanks.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-28-2009 03:01 PM

Actually, my problem has evolved.

I am running the above code to fetch the number times a particular value appears in a dataset. I would like to produce a report containing the results of this code for 32 different datasets.

The report could be in HTML format or Excel and look like the following:

ds1 588

ds2 122

ds3 112

...

Since I am running this code through a shell in a VB application, it would be easy to loop through the 32 datasets. For that matter, is it possible to extract the count value from each iteration and store it in a variable to display further on? Does anyone see any other possibilities?

I am running the above code to fetch the number times a particular value appears in a dataset. I would like to produce a report containing the results of this code for 32 different datasets.

The report could be in HTML format or Excel and look like the following:

ds1 588

ds2 122

ds3 112

...

Since I am running this code through a shell in a VB application, it would be easy to loop through the 32 datasets. For that matter, is it possible to extract the count value from each iteration and store it in a variable to display further on? Does anyone see any other possibilities?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-28-2009 03:53 PM

If you could put all of your data sets into a single location and issue a LIBNAME on it, then you could do something like the following:

%macro count;

proc sql noprint;

select count(memname) into :table_count separated by ''

from dictionary.members where memtype='DATA' and upcase(libname)="&libname";

select memname into :table1-:table&table_count

from dictionary.members where memtype='DATA' and upcase(libname)="&libname";

quit;

%do i=1 %to &table_count;

%let this_table=&&&table&i;

%put NOTE: running matcing count on table &this_table..;

data this_count;

set &libname..&this_table end=lastobs;

length table_name $ 32;

array _values {*} _numeric_;

do idx=1 to dim(_values);

if abs( _values(idx) - &value_to_match ) le 1e-12 then match_count + 1;

end;

if lastobs then do;

table_name = "&this_table";

output;

end;

keep table_name match_count;

run;

proc append base=report new=this_count;

run;

%end;

%mend count;

%let value_to_match=0;

%let libname=MY_LIBNAME;

%count;

[revised to remove use of FUZZ]

Message was edited by: kmg

%macro count;

proc sql noprint;

select count(memname) into :table_count separated by ''

from dictionary.members where memtype='DATA' and upcase(libname)="&libname";

select memname into :table1-:table&table_count

from dictionary.members where memtype='DATA' and upcase(libname)="&libname";

quit;

%do i=1 %to &table_count;

%let this_table=&&&table&i;

%put NOTE: running matcing count on table &this_table..;

data this_count;

set &libname..&this_table end=lastobs;

length table_name $ 32;

array _values {*} _numeric_;

do idx=1 to dim(_values);

if abs( _values(idx) - &value_to_match ) le 1e-12 then match_count + 1;

end;

if lastobs then do;

table_name = "&this_table";

output;

end;

keep table_name match_count;

run;

proc append base=report new=this_count;

run;

%end;

%mend count;

%let value_to_match=0;

%let libname=MY_LIBNAME;

%count;

[revised to remove use of FUZZ]

Message was edited by: kmg

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-28-2009 03:55 PM

Opps -- in my prior posting, instead of the function FUZZ, you should compare the absolute difference to a reasonable precision value (as shown in other people's posts), for example, something like:

if abs( _values(idx) - &value_to_match ) le 1e-12 then match_count + 1;

if abs( _values(idx) - &value_to_match ) le 1e-12 then match_count + 1;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-29-2009 02:01 AM

So that mismatches involving missing values are detected, I prefer:

if abs(sum( _values(idx), - &value_to_match) ) le 1e-12 then match_count + 1;

if abs(sum( _values(idx), - &value_to_match) ) le 1e-12 then match_count + 1;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-28-2009 03:54 PM

Certainly, you could use macros to loop over all data sets.

Since your answer is already stored in a SAS data set, I'm not sure I understand your question: "is it possible to extract the count value from each iteration and store it in a variable to display further on?"

Since your answer is already stored in a SAS data set, I'm not sure I understand your question: "is it possible to extract the count value from each iteration and store it in a variable to display further on?"