My data looks like below. How do I keep the non missing values but also where the data is missing for the entire column (which happens) put a zero in its place?
I have tried:
data tryit2;
set check2 (keep=Complete_n where=(Complete_n > 0));
if Complete_n = . then Complete_n = 0;
set check2 (keep=partial_n where=(partial_n > 0));
if partial_n =. then partial_n = 0;
set check2 (keep=com_par_percent where=(com_par_percent > 0));
if com_par_percent =. then com_par_percent = 0;
set check2 (keep=no_res_percent where=(no_res_percent > 0));
if com_par_percent =. then com_par_percent = 0;
set check2 (keep=Death_n where=(Death_n > 0));
if Death_n =. then Death_n = 0;
set check2 (keep=Revised_n where=(Revised_n > 0));
if Revised_n =. then Revised_n = 0;
set check2 (keep=Conversion_n where=(Conversion_n > 0));
if Conversion_n =. then Conversion_n = 0;
set check2 (keep=declined_n where=(declined_n >= 0));
if declined_n =. then declined_n = 0;
set check2 (keep=tot_pat_for_comp where=(tot_pat_for_comp > 0));
if com_par_percent =. then com_par_percent = 0;
set check2 (keep=clin_visit_percent where=(clin_visit_percent > 0));
if clin_visit_percent =. then clin_visit_percent = 0;
set check2 (keep=xray_percent where=(xray_percent > 0));
if xray_percent =. then xray_percent = 0;
set check2 (keep=faam_percent where=(faam_percent > 0));
if faam_percent =. then faam_percent = 0;
set check2 (keep=ffi_percent where=(ffi_percent > 0));
if ffi_percent =. then ffi_percent = 0;
set check2 (keep=vr12_percent where=(vr12_percent > 0));
if vr12_percent =. then vr12_percent = 0;
set check2 (keep=pt_percent where=(pt_percent > 0));
if pt_percent =. then pt_percent = 0;
set check2 (keep=md_percent where=(md_percent > 0));
if md_percent =. then md_percent = 0;
set check2 (keep=vas_percent where=(vas_percent > 0));
if vas_percent =. then vas_percent = 0;
set check2 (keep=survey_percent where=(survey_percent > 0));
if survey_percent =. then survey_percent = 0;
run;
However, the problem with this code is that a missing value is returned before I can change the missing value to 0. When this happens the data set is empty because it there were no values greater than 0. I have converted all blanks to a zero but I cannot returned where >=0 because if a 0 comes first then that value is returned. So I would want to return anything >0 first but if missing/zero return a zero. Thank you
Obs | Complete_n | partial_n | Conversion_n | Death_n | Revised_n | declined_n | com_par_percent | no_res_percent | tot_pat_for_comp | clin_visit_percent | xray_percent | faam_percent | ffi_percent | vr12_percent | pt_percent | md_percent | vas_percent | survey_percent |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 59 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
2 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
3 | . | 200 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
4 | . | . | 10 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
5 | . | . | . | 10 | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
6 | . | . | . | . | 23 | . | . | . | . | . | . | . | . | . | . | . | . | . |
7 | . | . | . | . | . | . | 50.6 | . | . | . | . | . | . | . | . | . | . | . |
8 | . | . | . | . | . | . | . | 49.4 | 512 | . | . | . | . | . | . | . | . | . |
9 | . | . | . | . | . | . | . | . | . | . | 40 | . | . | . | . | . | . | . |
10 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
Please replace lines:
do i=1 to dim(vout);
if vin(i) = . then vout)i) = 0;
else vout)i) = vin(i);
end;
with next code:
do i=1 to dim(vout);
if vin(i) = . then vout)i) = max(0 , vout(i));
else vout)i) = vin(i);
end;
drop v1-v18;
I hope this will fix the issue.
Does next code end with wanted results?
data new;
set check2;
array vx {*} Complete_n partial_n Conversion_n Death_n Revised_n
declined_n com_par_percent no_res_percent tot_pat_for_comp
clin_visit_percent xray_percent faam_percent ffi_percent
vr12_percent pt_percent md_percent vas_percent survey_percent;
do i=1 to dim(vx);
if missing(vx(i)) then vx(i)=0;
end;
drop i;
run;
Please supply a sample data (few observations, few variables) in a data step with cards or datalines and the wanted result.
Is there no any ID variable? If all your input should result into one observation then do:
%let vars = Complete_n partial_n Conversion_n Death_n Revised_n
declined_n com_par_percent no_res_percent tot_pat_for_comp
clin_visit_percent xray_percent faam_percent ffi_percent
vr12_percent pt_percent md_percent vas_percent
survey_percent;
data want;
set have(rename=(Complete_n=v1 partial_n=v2 Conversion_n=v3
Death_n=v4Revised_n=v5 declined_n=v6 com_par_percent =v7
no_res_percent=v8 tot_pat_for_comp=v9 clin_visit_percent=v10
xray_percent=v11 faam_percent=v12 ffi_percent=v13
vr12_percent=v14 pt_percent=v15 md_percent=v16 vas_percent=v17
survey_percent=v18 )) end=eof ;
retain &vars;
array vin v1-v18; /* adapt to number of variables */
array vout &vars;
do i=1 to dim(vout);
if vin(i) = . then vout)i) = max(0 , vout(i));
else vout)i) = vin(i);
end;
drop v1-v18;
if eof then output;
run;
In case there is an ID (one or more variables) then do:
%let vars = Complete_n partial_n Conversion_n Death_n Revised_n
declined_n com_par_percent no_res_percent tot_pat_for_comp
clin_visit_percent xray_percent faam_percent ffi_percent
vr12_percent pt_percent md_percent vas_percent
survey_percent;
data want;
set have(rename=(Complete_n=v1 partial_n=v2 Conversion_n=v3
Death_n=v4 ........<up to last variable> ));
by ID;
retain &vars;
array vin v1-v20; /* adapt to number of variables */
array vout &vars;
if first.ID then do i=1 to dim(vout); vout(i)=0; end;
do i=1 to dim(vout);
if vin(i) ^= . then vout(i) = vin(i);
end;
if last.ID then output;
run;
I ran the code:
%let vars = Complete_n partial_n Conversion_n Death_n Revised_n
declined_n com_par_percent no_res_percent tot_pat_for_comp
clin_visit_percent xray_percent faam_percent ffi_percent
vr12_percent pt_percent md_percent vas_percent
survey_percent;
data want;
set check2(end=eof
rename=(Complete_n=v1 partial_n=v2 Conversion_n=v3
Death_n=v4 Revised_n=v5 declined_n=v6 com_par_percent =v7 no_res_percent=v8 tot_pat_for_comp=v9
clin_visit_percent=v10 xray_percent=v11 faam_percent=v12 ffi_percent=v13 vr12_percent=v14
pt_percent=v15 md_percent=v16 vas_percent=v17 survey_percent=v18 ));
retain &vars;
array vin v1-v18; /* adapt to number of variables */
array vout &vars;
do i=1 to dim(vout);
if vin(i) = . then vout(i) = 0;
else vout(i) = vin(i);
end;
if eof then output;
run;
When I run it I get the following error:
449 %let vars = Complete_n partial_n Conversion_n Death_n Revised_n
450 declined_n com_par_percent no_res_percent tot_pat_for_comp
451 clin_visit_percent xray_percent faam_percent ffi_percent
452 vr12_percent pt_percent md_percent vas_percent
453 survey_percent;
454 data want;
455 set check2(end=eof
---
22
ERROR 22-7: Invalid option name END.
456 rename=(Complete_n=v1 partial_n=v2 Conversion_n=v3
457 Death_n=v4 Revised_n=v5 declined_n=v6 com_par_percent =v7 no_res_percent=v8
457! tot_pat_for_comp=v9
458 clin_visit_percent=v10 xray_percent=v11 faam_percent=v12 ffi_percent=v13
458! vr12_percent=v14
459 pt_percent=v15 md_percent=v16 vas_percent=v17 survey_percent=v18 ));
460 retain &vars;
461 array vin v1-v18; /* adapt to number of variables */
462 array vout &vars;
463 do i=1 to dim(vout);
464 if vin(i) = . then vout(i) = 0;
465 else vout(i) = vin(i);
466 end;
467 if eof then output;
468 run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.WANT may be incomplete. When this step was stopped there were 0
observations and 38 variables.
WARNING: Data set WORK.WANT was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
Sorry, my fault.
The correct syntax is:
set have(rename=(Complete_n=v1 partial_n=v2 Conversion_n=v3
Death_n=v4 ........<up to last variable> )) end=eof ;The original post
I'll edit the original post.
With the following code:
data want;
set check2(rename=(Complete_n=v1 partial_n=v2 Conversion_n=v3 Death_n=v4 Revised_n=v5 declined_n=v6 com_par_percent =v7
no_res_percent=v8 tot_pat_for_comp=v9 clin_visit_percent=v10 xray_percent=v11 faam_percent=v12
ffi_percent=v13 vr12_percent=v14 pt_percent=v15 md_percent=v16 vas_percent=v17 survey_percent=v18 )) end=eof ;
retain &vars;
array vin v1-v18; /* adapt to number of variables */
array vout &vars;
do i=1 to dim(vout);
if vin(i) = . then vout(i) = 0;
else vout(i) = vin(i);
end;
if eof then output;
run;
My data set "want" looks like this:
data new;
input v1-v18 Complete_n partial_n Conversion_n Death_n Revised_n declined_n com_par_percent no_res_percent tot_pat_for_comp
clin_visit_percent xray_percent faam_percent ffi_percent vr12_percent pt_percent md_percent vas_percent survey_percent;
cards;
. . . . . . . . . . . . . . . . . . 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Please replace lines:
do i=1 to dim(vout);
if vin(i) = . then vout)i) = 0;
else vout)i) = vin(i);
end;
with next code:
do i=1 to dim(vout);
if vin(i) = . then vout)i) = max(0 , vout(i));
else vout)i) = vin(i);
end;
drop v1-v18;
I hope this will fix the issue.
As those 18 variables are the only variables of your dataset and all of them are numeric,
next code is much more simple and should give you the wanted result:
proc summary data=have noprint;
class;
var _all_;
output out=want sum=;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.