Hi,
suppose I have a very big table with millions of rows. Some columns are almost full, some less, some have very few data and some are completely empty.
Is it possible to count the number of entries that are in each column?
Thank you
You could do something like the following:
data test;
set sashelp.class;
if _n_ eq 7 then do;
call missing(name);
call missing(height);
end;
run;
proc format;
value $missfmt ' '='Missing' other='Not Missing';
value missfmt . ='Missing' other='Not Missing';
run;
proc freq data=test;
format _CHARACTER_ $missfmt.;
tables _CHARACTER_ / missing nocum nopercent;
format _NUMERIC_ missfmt.;
tables _NUMERIC_ / missing nocum nopercent;
run;
You could do something like the following:
data test;
set sashelp.class;
if _n_ eq 7 then do;
call missing(name);
call missing(height);
end;
run;
proc format;
value $missfmt ' '='Missing' other='Not Missing';
value missfmt . ='Missing' other='Not Missing';
run;
proc freq data=test;
format _CHARACTER_ $missfmt.;
tables _CHARACTER_ / missing nocum nopercent;
format _NUMERIC_ missfmt.;
tables _NUMERIC_ / missing nocum nopercent;
run;
Hi Arthur and Patrick, thanks for the quick reply and both of your codes are very helpful!!
Just some small extra questions if I may:
when doing the proc freq, I realized that some columns which I thought were completely empty and which I was going to delete actually had some data, albeit small, which I might use in the future.
1) Is it possible to see what is the number of the observation where these data are present ?
2) is it possible to make an output file of all the columns but conditional on data being present on columns of my choice?
Again thank you very much!!!
Not sure what you're asking. You could always use a datastep, with a where statement, only selecting records that have non-missing values for the variables you are interested in.
Hi arthur,
I have millions of lines of code, and in a special column of interest there are around 1000 rows with nonmissing data, and the rest are missing.
Is it possible to know where those nonmissing values are, i.e to output their observation number?
Thank you
Hi Arthur,
I actually figured out how to do my second little extra question, its quite simple if I'm correct and is of the following:
data a;
input name$ nu ;
datalines;
a 1
s 3
d .
g 4
t .
kk .
f 6
;
run;
data b;
set a;
if nu ne " ";
run;
thanks!
Just about. But that, alone, won;t show you which observations were missing. How about adding one extra variable to your code:
data b;
set a;
if nu ne " ";
observation=_n_;
run;
Thanks!!
Below code counting the number of distinct values per column in a table. Is that what you're after? It could become quite a big table if you're having a lot of different values.
data sample;
set sashelp.class;
if _n_=5 then call missing(sex);
run;
ods _all_ close;
ods output OneWayFreqs = Freqs(keep=table F_: Frequency rename=(table=Varname));;
proc freq data=sample ;
table _all_ /nocum nopercent missing;
quit;
ods _all_ close;
ods listing;
data want(keep=Varname Value Frequency);
set freqs;
Varname=prxchange('s/^table(.*)/\1/oi',1,Varname);
Value=coalescec(of F_:);
run;
/*proc print data=want;*/
/*run;*/
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.