Folks,
I have a number of variables, and I would like to find out how many of their values are greater than 0. For convenience I've read them into an array, but I'm having trouble getting the count of values that are greater than 0 or . (missing).
Any help would be most welcome.
data want;
set test;
array vars {*} untaxedother adjustednetprofit
amount
amtcarersallowance
amtcharged
annuitypension
balcharges
bills
cae
distminretire
distprsaretire
dividends
employer_prsi
eudirtamt
exempt
exemptnotded
farmretirementpension
farmretscheme
foreignnotpaye
foreigntradenotax
foreigntradetaxded
gaindisposalhigher
gaindisposallower
gainhitaxableoffshorefunds
gaintaxablegainrtlifepolicies
gaintaxablelifepolicies;
do _i_=1 to dim(vars);
if vars{_i_}=0 then vars{_i_}=.;
end;
run;
proc means data=want n min max mean median;
var vars;
run;
How can I write the results to a data set that gives, for each numeric variable:
the number of rows with a negative value
the number of rows with a positive value
the number of rows with a zero value
data HAVE;
input COUNTY $ SCHOOL $ ENROLLMENT VAXA VAXB SCHOOLTYPE $ ;
cards;
countyA littlet 50 48 45 private
countyA happyda 100 88 77 public
countyA playtim -25 22 23 private
countyB busybee -23 22 21 public
countyB childti -27 25 25 public
;
run;
proc format;
value posneg
low - <0 = 'Negative'
0= 'Zero'
0-high = 'Positive'
;
run;
proc freq data=HAVE;
tables _numeric_;
format _numeric_ posneg.;
run;
You could just use the sum() function:
data want;
set test;
array vars {*} untaxedother adjustednetprofit ...;
want_sum=sum(of vars{*});
run;
An array is a temporary construct that exists only while the DATA step that defines it is running. It is not written to the output data set, so you cannot refer to it in the PROC MEANS step that follows.
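One way around this, sketched here under assumptions, is to keep the variable list in a macro variable and use it in both steps. The macro variable name varlist is hypothetical, and the list is shortened to three of the variables from the question for illustration.

```sas
/* Hypothetical macro variable; extend it with the full list of variables */
%let varlist = untaxedother adjustednetprofit amount;

data want;
  set test;
  /* The array exists only inside this DATA step */
  array vars {*} &varlist;
  do _i_ = 1 to dim(vars);
    /* Turn zeros into missing values so PROC MEANS ignores them */
    if vars{_i_} = 0 then vars{_i_} = .;
  end;
  drop _i_;
run;

/* Reuse the same list of real variable names, not the array name */
proc means data=want n min max mean median;
  var &varlist;
run;
```

Note that N counts non-missing values, so after the zeros are set to missing it reports the number of non-zero values, which still includes any negative ones.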
Are you running your means on all numeric variables in the dataset?
If yes, consider
proc means data=want n min max mean median;
var _numeric_;
run;
This is very easy in IML.
data HAVE;
input COUNTY $ SCHOOL $ ENROLLMENT VAXA VAXB SCHOOLTYPE $ ;
cards;
countyA littlet 50 48 45 private
countyA happyda 100 88 77 public
countyA playtim -25 22 23 private
countyB busybee -23 22 21 public
countyB childti -27 25 25 public
;
run;
proc iml;
use have;
read all var _num_ into x;
close;
count=(x>0)[,+];
create count var{count};
append;
close;
quit;
data want;
merge have count;
run;
You can try this:
data want(drop=i);
set test;
array vars {*} untaxedother adjustednetprofit ....;
gt_0_cnt=0;
do i=1 to dim (vars);
gt_0_cnt + ifn(vars(i) > 0,1,0);
end;
run;
For every row in your data set, the new variable (gt_0_cnt) holds the number of variables in the array that have a value greater than 0.
Hope this helps,
Ahmed
I'll often do something like this to get a feel for what is in the data, but with a little more information. For example:
proc format;
value posneg low - <0 = 'Negative' 0 = 'Zero' 0 - high = 'Positive';
run;
proc freq data=have;
tables _numeric_;
format _numeric_ posneg.;
run;
If you only want to report on some of the numeric variables, you can replace _numeric_ with the actual list as part of the TABLES statement. (The FORMAT statement can stay as is.)
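For the request elsewhere in this thread to get the negative, zero and positive counts into a data set, here is a minimal sketch that uses a different technique from the format approach: a DATA step flags each value, and PROC SUMMARY adds up the flags. The data set names flags and counts and the variable names varname, neg, zero, pos, n_neg, n_zero and n_pos are assumptions made for illustration; have is the sample data set from the thread.

```sas
/* One output row per numeric variable per input row, with 0/1 flags */
data flags;
  set have;
  /* _numeric_ picks up only the numeric variables read from HAVE,  */
  /* because the flag variables are not yet defined at this point   */
  array nums {*} _numeric_;
  length varname $32;
  do _i_ = 1 to dim(nums);
    varname = vname(nums{_i_});
    neg  = (not missing(nums{_i_}) and nums{_i_} < 0);
    zero = (nums{_i_} = 0);
    pos  = (nums{_i_} > 0);
    output;
  end;
  keep varname neg zero pos;
run;

/* Sum the flags to get one row of counts per variable */
proc summary data=flags nway;
  class varname;
  var neg zero pos;
  output out=counts (drop=_type_ _freq_) sum=n_neg n_zero n_pos;
run;
```

With the sample data, ENROLLMENT should come out with 3 negative and 2 positive rows. Missing values are not counted in any of the three columns.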