Hello,
I have a dataset 'ds' and I would like to calculate the 95% CI using _denX and _numX. There are 3 sets of _den and _num. I would like to use a do loop to calculate them all at once. I am new to SAS and tried but failed. 😅 Could anyone guide me on this?
If a do loop is not the right way, could you show me the right way? I would like to know as many approaches as possible.
Thanks.
/*Sample dataset*/
data ds;
infile datalines dsd truncover;
input id _den1 _den2 _den3 _num1 _num2 _num3;
datalines;
1,4,7,6,0,3,2,
2,4,7,6,1,0,3,
3,4,7,6,0,2,1,
4,4,7,6,2,1,0
;
/*Looping code. Failed.*/
data want;
set ds;
do i=1 to 3;
_pi = round((_numi/_deni),.0001);
if _pi=0 then _lowi=0;
if _pi=1 then _highi=100;
if _pi ne 0 then _lowi=round((1-betainv(.975,(_deni-_numi+1),_numi)),.0001)*100;
if _pi ne 1 then _highi=round((1-betainv(.025,(_deni-_numi),_numi+1)),.0001)*100;
resulti = '['||strip(put(_lowi, 5.1))||', '||strip(put(_highi, 5.1))||']';
end;
run;
You did not include any ARRAY statements in your data step. Read the documentation on how arrays work.
So perhaps something like this:
data ds;
input id _den1-_den3 _num1-_num3;
datalines;
1 4 7 6 0 3 2
2 4 7 6 1 0 3
3 4 7 6 0 2 1
4 4 7 6 2 1 0
;
data want;
set ds;
array _den _den1-_den3;
array _num _num1-_num3;
array _low _low1-_low3;
array _high _high1-_high3;
array result $15 result1-result3;
do i=1 to dim(_den);
_pi = round((_num[i]/_den[i]),.0001);
if _pi=0 then _low[i]=0;
else _low[i]=round((1-betainv(.975,(_den[i]-_num[i]+1),_num[i])),.0001)*100;
if _pi=1 then _high[i]=100;
else _high[i]=round((1-betainv(.025,(_den[i]-_num[i]),_num[i]+1)),.0001)*100;
result[i] = cats('[',put(_low[i], 5.1),',',put(_high[i], 5.1),']');
end;
drop i _pi ;
run;
Your code also looks for variables named _deni and _numi; neither of those exists in the data set. You have to use variable names that exist in the data set, otherwise the code will fail.
This seems like a job for an ARRAY.
data ds;
infile datalines dsd truncover;
input id _den1 _den2 _den3 _cnt1 _cnt2 _cnt3;
datalines;
1,4,7,6,0,3,2,
2,4,7,6,1,0,3,
3,4,7,6,0,2,1,
4,4,7,6,2,1,0
;
data want;
set ds;
array n _cnt1-_cnt3;
array d _den1-_den3;
array low _low1-_low3;
array h _high1-_high3;
array r $16 result1-result3;
do i=1 to dim(n);
_pi = round((n(i)/d(i)),.0001);
if _pi=0 then low(i)=0;
if _pi=1 then h(i)=100;
if _pi ne 0 then low(i)=round((1-betainv(.975,(d(i)-n(i)+1),n(i))),.0001)*100;
if _pi ne 1 then h(i)=round((1-betainv(.025,(d(i)-n(i)),n(i)+1)),.0001)*100;
r(i)= '['||strip(put(low(i), 5.1))||', '||strip(put(h(i), 5.1))||']';
end;
run;
This code could be further simplified, but I will leave it as is. Also, wide data sets are usually not preferred; if this were a long data set, no arrays would be needed.
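To illustrate the point about long data sets, here is a minimal sketch. It assumes a hypothetical long layout with one row per id and set, using made-up variable names grp, den and num (these are not in the original data), and the first two ids from the sample data. The same interval formula is applied once per row, so no ARRAY statement and no do loop are needed.

```sas
/* Hypothetical long layout: one row per id and set (grp). */
/* The names grp, den, num are assumptions for illustration. */
data ds_long;
input id grp den num;
datalines;
1 1 4 0
1 2 7 3
1 3 6 2
2 1 4 1
2 2 7 0
2 3 6 3
;

data want_long;
set ds_long;
length result $16;
p = num/den;
/* Same formula as in the thread, applied to the current row only */
if p=0 then low=0;
else low=round((1-betainv(.975,(den-num+1),num)),.0001)*100;
if p=1 then high=100;
else high=round((1-betainv(.025,(den-num),num+1)),.0001)*100;
result = cats('[',put(low, 5.1),',',put(high, 5.1),']');
drop p;
run;
```

If the wide layout is needed for a report afterwards, the long result could be reshaped back at the end, for example with PROC TRANSPOSE.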
It would help tremendously if you would use the same variable names in your text as in your data set. There is no _numX variable in your data set. I assume you mean _cntX.
Thanks so much for the prompt reply! I did notice the variable name issue and updated the sample data code. 😅
Please pardon me, as the following question might sound silly:
I am new to do loops. I thought that once we define 'do i=1 to 3', a variable name like _deni would automatically resolve to _den1, _den2, _den3 as the loop runs. Did I misunderstand it? If that is incorrect, is it possible to create/calculate such variables in a do loop?
Thanks.
Do loops work fine. So something like:
do i=1 to 3;
put "Current value of I is " i ;
end;
works fine.
But your mistake is trying to reference other variables. If you code something like:
i=2;
deni=45;
You have created two variables: one named i and one named DENi. You have not made any attempt to reference a variable named DEN2.
_deni is the name of a variable, not an array element. There is no variable by that name in your data set. _highi is the name of a variable, not an array element. There is no variable by that name in your data set.
However, if you have an ARRAY statement and refer to h(i), this means the i-th variable of array H, which is one of several arrays I defined; H contains the _HIGH variables.
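The difference between a literal variable name and an array reference can be seen in a small sketch. The data set name demo and the values are hypothetical, chosen only for illustration: inside the loop, _deni stays a single variable literally named _deni, while d(i) walks through _den1, _den2 and _den3.

```sas
/* Hypothetical demo; data set name and values are illustrative only */
data demo;
_den1 = 4; _den2 = 7; _den3 = 6;
array d _den1-_den3;  /* d(1), d(2), d(3) refer to _den1, _den2, _den3 */
do i=1 to dim(d);
_deni = i;            /* one variable literally named _deni, overwritten each pass */
d(i) = d(i)*10;       /* updates _den1, then _den2, then _den3 */
end;
put _all_;            /* write all variables to the log for inspection */
run;
```

After the step runs, _den1 to _den3 have each been multiplied by 10, and there is still only one extra variable, _deni, holding the last value assigned to it.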