BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
belboy
Obsidian | Level 7

Hi SAS community,

I'm working with a Hash object and I need help in figuring out how to return a count of total missing and non-missing values for all observations.  Here is the code I'm using:

 


data _null_;
set &dataset end=last;
array _xchar{*} _character_;
array _xnum{*} _numeric_;

if _n_ = 1 then do;
*call missing(min_val, max_val);
length
_colName $ 32
_nMiss 8
_n 8
_type $9;


declare hash hmiss( ordered: "Y");
hmiss.defineKey("_colName");
hmiss.defineData("_colName", "_n", "_nMiss", "_Type", "min_val", "max_val");
hmiss.defineDone();
end;

do i = 1 to dim(_xchar);
_nMiss=0;
_n=0;
_colName = vname(_xchar{i});
_rc = hmiss.find();
_nMiss + missing(_xchar{i}) = 1;
_n + missing(_xchar{i}) = 0;
_Type = vtype(_xchar{i});


if _Type = "C" then _Type = "Character";

if _rc = 0 then do;
if _xchar{i} < min_val then do;
min_val = _xchar{i};
hmiss.replace();
if _xchar{i} = "" then count+1;
end;
if _xchar{i} > max_val then do;
max_val = _xchar{i};
hmiss.replace();
end;
end;

else do;
min_val = _xchar{i};
max_val = _xchar{i};
hmiss.add();
end;
end;

do i = 1 to dim(_xnum);
_nMiss=0;
_n=0;
_colName = vname(_xnum{i});
_rc = hmiss.find();
_nMiss + missing(_xnum{i}) = 1;
_n + _xnum{i} = 0;
_Type = vtype(_xnum{i});
/*_min_colName = min(of _xnum{i});
_max_colName = max(of _xnum{i});*/

if _Type = "N" then _Type = "Numeric";

if _rc = 0 then do;
if _xnum{i} < min_val then do;
min_val = _xnum{i};
hmiss.replace();
end;


if _xnum{i} > max_val then do;
max_val = _xnum{i};
hmiss.replace();
end;
end;
else do;
min_val = _xnum{i};
max_val = _xnum{i};
hmiss.add();
end;

end;

if last = 1 then do;
hmiss.output(dataset: "hashmiss");
end;
run;

 

proc print data=hashmiss /*(drop=_n _nmiss )*/;
run;

 

Here are my results below.  I'm able to get the min value and max value of all variables by using character and numeric arrays.  However, total observations for the &dataset = 16467.  I would like for the sum of  _n and _nMiss to total 16467.  So for example,  INPUT_VALUE  will have _n = 16467 and _nMiss = 0, or if BIL_NUMBER has missing values then _n = 16400 and _nMiss = 67 for example.

Obs _colName _n _nMiss _type min_val max_val
2INPUT_VALUE20CharacterNot found in blsSpans bLS & SP
3DSINPUT_VALUE820Character048127326133458128770333
4BIL_NUMBER1151Character 7600017342
5MBIL_NUMBER50Character7500727273UNKNOWN
6NEW_BIL1131Numeric.750550764394
7OLD_BIL271Numeric.803048128221
8SUSSTR_BIL164640Character27326180288471561111
9SUSSTR_XREF191Character 458128258407
10UNI_ID271Character 8048128220250
11USER_09_131571Character 2884715555556
12USER_35_1911Character 03000048228549312

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

Why do you want to do it ? Hash is not right way to do it.

 



data _null_;


set sashelp.heart end=last;
array _xchar{*} _character_;
array _xnum{*} _numeric_;

if _n_=1 then do;
 length _var_name $ 40;
 declare hash h();
 h.definekey('_var_name');
 h.definedata('_var_name','_n','_n_miss');
 h.definedone();
end;


do i=1 to dim(_xchar);
  _var_name=vname(_xchar{i});
  rc=h.find();
  if rc=0 then do;
   if missing(_xchar{i}) then  _n_miss=_n_miss+1;
   else  _n=_n+1;
   h.replace();
  end;
  else do;
   if missing(_xchar{i}) then  do;_n_miss=1;_n=0;end;
   else do;_n_miss=0;_n=1; end;
   h.replace();
  end;
end;



do i=1 to dim(_xnum);
  _var_name=vname(_xnum{i});
  rc=h.find();
  if rc=0 then do;
   if missing(_xnum{i}) then  _n_miss=_n_miss+1;
   else  _n=_n+1;
   h.replace();
  end;
  else do;
   if missing(_xnum{i}) then  do;_n_miss=1;_n=0;end;
   else do;_n_miss=0;_n=1; end;
   h.replace();
  end;
end;


if last then h.output(dataset:'want');
run;

View solution in original post

7 REPLIES 7
andreas_lds
Jade | Level 19

Can you post the data you are using? Or post the expected result if the following modified version of sashlelp.class is processed:

data work.class;
    set sashelp.class;
    
    if _n_ in (1,3,7,11,17) then Age = .;
    if _n_ in (2, 3, 8, 12) then Sex = ' ';
    if _n_ in (2, 7, 15, 18) then Weight =.;
    if _n_ in (3, 11, 15, 16) then Height =.;
 run;
Astounding
PROC Star

This is sort of like using a sledgehammer to kill a fly.  For numerics, SAS does all this for you:

 

proc means data=have n nmiss min max;

run;

 

This type of report has been asked for a few times, so you will find it addressed on these message boards.

 

Are your reporting requirements rigid or will the simple approaches be just fine?

SASKiwi
PROC Star

While I admire the intellectual challenge of your approach, @Astounding's solution is way better if other SAS users need to understand or maintain what you have done.

belboy
Obsidian | Level 7
Thanks - unfortunately proc means does not generate the results I need for
character variables. Is there a simple way to do this with proc freq? I'd
like to have min max _n _nMiss for all variables if possible with a simple
procedure. If not, I'd like the community to offer their suggestions or
ideas!
Astounding
PROC Star

You can handle character variables separately, by creating a format.  For example:

 

proc format;
value $missfmt ' '='Missing' other='NonMissing';
run;
proc freq data=have;
tables _character_ / missing;
format _character_ $missfmt.;
run;

This is truly a quick-and-dirty report.  But there are various ODS options that can reformat the results.

 

Also note, there is not much point that I see in computing a minimum and maximum for a character variable.  Especially when your character variables contain digits, note that "9" is greater than "100" as a character string.

Ksharp
Super User

Why do you want to do it ? Hash is not right way to do it.

 



data _null_;


set sashelp.heart end=last;
array _xchar{*} _character_;
array _xnum{*} _numeric_;

if _n_=1 then do;
 length _var_name $ 40;
 declare hash h();
 h.definekey('_var_name');
 h.definedata('_var_name','_n','_n_miss');
 h.definedone();
end;


do i=1 to dim(_xchar);
  _var_name=vname(_xchar{i});
  rc=h.find();
  if rc=0 then do;
   if missing(_xchar{i}) then  _n_miss=_n_miss+1;
   else  _n=_n+1;
   h.replace();
  end;
  else do;
   if missing(_xchar{i}) then  do;_n_miss=1;_n=0;end;
   else do;_n_miss=0;_n=1; end;
   h.replace();
  end;
end;



do i=1 to dim(_xnum);
  _var_name=vname(_xnum{i});
  rc=h.find();
  if rc=0 then do;
   if missing(_xnum{i}) then  _n_miss=_n_miss+1;
   else  _n=_n+1;
   h.replace();
  end;
  else do;
   if missing(_xnum{i}) then  do;_n_miss=1;_n=0;end;
   else do;_n_miss=0;_n=1; end;
   h.replace();
  end;
end;


if last then h.output(dataset:'want');
run;
belboy
Obsidian | Level 7

Ah yes, thanks!  Sometimes help from others will produce a "lightbulb" effect....