BookmarkSubscribeRSS Feed
thummala
Obsidian | Level 7

Hi All,

 

Can you please help me with the following query. My goal is to count the number of non-missing values and stack those counts to original dataset. Since the counts are all generated in numeric format, when I stack the counts to original dataset it throws an error when it encounters character variable.

 

data final;
set sashelp.electric;
run;

 

ods output cnt=cnt;
proc iml;
use final;
read all var _NUM_ into x[colname=nNames];
n = countn(x,"col");
nmiss = countmiss(x,"col");

read all var _CHAR_ into x[colname=cNames];
close final;
c = countn(x,"col");
cmiss = countmiss(x,"col");

/* combine results for num and char into a single table */
Names = cNames || nNames;
rNames = {" Missing", "Not Missing"};
cnt = (cmiss // c) || (nmiss // n);

print cnt[r=rNames c=Names label=""];
ods trace off;

 

data final;
set final cnt;
run;

 

6 REPLIES 6
Reeza
Super User

Is this question in a course or SAS tutorial online somewhere?

 

It has been asked and answered recently, so search on here. 

I think this is what you're looking for:

https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/Send-this-IML-output-to-dataset-instead-o...

 

 

And if you genuinely need a way to count missing, PROC FREQ works faster and is easier. 

https://gist.github.com/statgeek/2de1faf1644dc8160fe721056202f111

Rick_SAS
SAS Super FREQ

Nice job counting the missing/nonmissing observations, but you cannot append a numeric value to a character variable.  Your choices are 

1. Keep the statistics separate from the data (which I recommend)

2. Convert the numbers to character values, which is probably not a good idea.

 

If you decide to choose (2), use the CHAR function to convert from numeric to character values, as follows:

 

proc iml;
....

cnt1 = char(cmiss // c);  /* Yuck! Convert to character */
cnt2 = (nmiss // n);

create cnt1 from cnt1[c=cNames r=rNames];
append from cnt1[r=rNames];
close;
create cnt2 from cnt2[c=nNames r=rNames];
append from cnt2[r=rNames];
close;

data cnt;
merge cnt1 cnt2;
by rNames;
run;

data final;
length rNames $11;
set final cnt;
run;
mkeintz
PROC Star

Once a variable type is specified as character, you can't treat it as numeric selectively for specific rows.  What you can do for those variables is convert the numeric counts of non-missing to characters, and right justify the results.  Will character representations of numeric values be of use to you?

 

 

Even though this is an IML forum, here's a DATA step solution:

 

data final;
set sashelp.electric;
run;


data want (drop=i);
  set final end=eof;
  array nvar {*} _numeric_;
  array cvar {*} _character_;

  array n_n {100} _temporary_;
  array n_c {100} _temporary_;

  do i=1 to dim(nvar);
      n_n{i}+n(nvar{i});
  end;
  do i=1 to dim(cvar);
      n_c{i}+(cvar{i}^=' ');
  end;
  output;
  if eof then do;
    do i=1 to dim(nvar); nvar{i}=n_n{i}; end;
    do i=1 to dim(cvar); cvar{i}=put(n_n{i},8.);cvar{i}=right(cvar{i});end;
    output;
  end;
run;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
thummala
Obsidian | Level 7
Hi All,

I am working on a clinical data where I have to check about 100 datasets in each study and the variables differ on each dataset. I di try using PROC FREQ , but due to the size of the dataset and number of variables associated in each dataset, it is taking long time and when searching for the work around solution I did get that piece of code(provided by Rick) and wanted to tweak it as per my need. But wasn't able to successfully convert the character variables into character and reached out to the forum.
Tom
Super User Tom
Super User

PROC FREQ shouldn't need much memory to count missings.

You should first define some formats to categorize the values in misisng or non-mising.

 

proc sql ;
 value miss low-high='Non-Missing' other='Missing';
 value $miss ' '='Missing' other='Non-Missing;
run;

You can then use PROC FREQ.  You can use the ODS OUTPUT option to collect the statistics.

proc freq data=have ;
  format _numeric_ miss. _character_ $miss. ;
  tables _all_;
run;

Or use PROC SUMMARY instead using the CLASS statement.

proc summary data=have missing chartype ;
  format _numeric_ miss. _character_ $miss. ;
  class _all_;
  ways 1 ;
  output out=want n=count;
run;

 

Ksharp
Super User

data final;
set sashelp.electric;
run;

proc iml;
use final nobs nobs;
read all var _NUM_ into x[colname=nNames]; 
nmiss = countmiss(x,"col");
n=nobs-nmiss;

read all var _CHAR_ into y[colname=cNames]; 
cmiss = countmiss(y,"col");
c=nobs-cmiss;
close final;

Names = cNames || nNames;
rNames = {" Missing", "Not Missing"};
cnt = (cmiss // c) || (nmiss // n);

create want from cnt[r=rnames c=names];
append from cnt[r=rnames];
close;
quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

From The DO Loop
Want more? Visit our blog for more articles like these.
Discussion stats
  • 6 replies
  • 5869 views
  • 0 likes
  • 6 in conversation