Hello,
I am pretty new to SAS and I was wondering if someone could help me out with this. Suppose I have the following data set:
ID Var1 Var2 Var3 Var4
1 1 1 1 1
2 1 0 1 0
3 0 1 0 0
4 0 0 0 1
I need to be able to count the number of ones in each row before there is a zero and once there is a zero the count stops. I need to be able to do this starting from var1 going to var4, and also starting from var2 going to var4, and also from var3 to var4.
My first output variable should look like this:
ID count_A
1 4
2 1
3 0
4 0
The second output variable should look like this:
ID count_B
1 3
2 0
3 1
4 0
The third output variable should look like this:
ID count_C
1 2
2 1
3 0
4 0
Thanks
@lpn wrote:
Hello,
I am pretty new to SAS and I was wondering if someone could help me out with this. Suppose I have the following data set:
ID Var1 Var2 Var3 Var4
1 1 1 1 1
2 1 0 1 0
3 0 1 0 0
4 0 0 0 1
I need to be able to count the number of ones in each row before there is a zero and once there is a zero the count stops. I need to be able to do this starting from var1 going to var4, and also starting from var2 going to var4, and also from var3 to var4.
My first output variable should look like this:
ID count_A
1 4
2 1
3 0
4 0
The second output variable should look like this:
ID count_B
1 3
2 0
3 1
4 0
The third output variable should look like this:
ID count_C
1 2
2 1
3 0
4 0
Thanks
It is best to post data as a data step to remove any possibly confusion as to data type or variable names. Post in a code box opened using the forum's {I} or "running man" icon as the main message windows will reformat text and sometimes the pasted code will not run.
When you need to do something repetitive and especially with groups of variables arrays are the typical tool to address the values without having to create lots of extra variables.
For example with your data:
data have; input ID Var1 Var2 Var3 Var4; datalines; 1 1 1 1 1 2 1 0 1 0 3 0 1 0 0 4 0 0 0 1 ; data want; set have; array v var1-var4; array c count1-count3; do i=1 to dim(c); tcount=0; do j= i to dim(v); if v[j]=1 then tcount+1; else leave; end; c[i]=tcount; end; drop i j tcount; run;
There are several reasons to prefer using a numeric suffix such as 1, 2, 3 (or if you insist _1, _2 , _3) because of the way SAS creates new variables on an array statement such as the Array C above, instead of _A _B _C.
Typical tools used above: loop counter I and j , accumulator Tcount. Leave is a special instruction to exit a do loop when encountered. In the above case it leaves the inner loop when the value of the currently examine VAR variable is not 1.
Hello @lpn and welcome to the SAS Support Communities!
A nice feature of SAS is that there are often several different ways to achieve the same result. They may differ in performance, length of code, elegance, robustness, readability, time to develop, ease of maintenance or generalization, etc.
In your case you could, for example, apply character functions such as LENGTHN, SCAN and SUBSTR to the concatenation of Var1, ..., Var4 to obtain the desired counts directly (i.e., each count in a single expression). Alternatively, you could compute them just numerically (assuming that Var1, ..., Var4 are numeric variables):
data want;
set have;
count3=var3*( var4+1);
count2=var2*(count3+1);
count1=var1*(count2+1);
run;
@FreelanceReinh wrote:
Hello @lpn and welcome to the SAS Support Communities!
A nice feature of SAS is that there are often several different ways to achieve the same result. They may differ in performance, length of code, elegance, robustness, readability, time to develop, ease of maintenance or generalization, etc.
In your case you could, for example, apply character functions such as LENGTHN, SCAN and SUBSTR to the concatenation of Var1, ..., Var4 to obtain the desired counts directly (i.e., each count in a single expression). Alternatively, you could compute them just numerically (assuming that Var1, ..., Var4 are numeric variables):
data want; set have; count3=var3*( var4+1); count2=var2*(count3+1); count1=var1*(count2+1); run;
And you will develop your own tools based on scale of the exercise.
How many lines of code would be needed if instead of Var1 to Var4 you had Var1 to Var100 and needed Count1 to Count99?
@ballardw wrote:
How many lines of code would be needed if instead of Var1 to Var4 you had Var1 to Var100 and needed Count1 to Count99?
Nine.
data want(drop=count100);
set have;
array var[100];
array count[100];
count100=var100;
do _n_=99 to 1 by -1;
count[_n_]=var[_n_]*(count[_n_+1]+1);
end;
run;
Or did you mean this as an exercise for the OP?
@FreelanceReinh wrote:
@ballardw wrote:
How many lines of code would be needed if instead of Var1 to Var4 you had Var1 to Var100 and needed Count1 to Count99?
Nine.
data want(drop=count100); set have; array var[100]; array count[100]; count100=var100; do _n_=99 to 1 by -1; count[_n_]=var[_n_]*(count[_n_+1]+1); end; run;
Or did you mean this as an exercise for the OP?
@FreelanceReinh You read my mind. I realized a bit after my post that I should have more directly aimed that at the OP.
data have;
input ID Var1 Var2 Var3 Var4;
datalines;
1 1 1 1 1
2 1 0 1 0
3 0 1 0 0
4 0 0 0 1
;
data want;
set have;
count_a=lengthn(scan(cats(of var1-var4),1,'0','m'));
count_b=lengthn(scan(cats(of var2-var4),1,'0','m'));
count_c=lengthn(scan(cats(of var3-var4),1,'0','m'));
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.