BookmarkSubscribeRSS Feed
lpn
Calcite | Level 5 lpn
Calcite | Level 5

Hello,

 

I am pretty new to SAS and I was wondering if someone could help me out with this. Suppose I have the following data set: 

  

ID           Var1   Var2    Var3   Var4

1               1         1          1     1

2               1         0          1     0

3               0         1          0     0

4               0         0          0     1

 

I need to  be able to count the number of ones in each row before there is a zero and once there is a zero the count stops. I need to be able to do this starting from var1 going to var4, and also starting from var2 going to var4, and also from var3 to var4.

 

My first output variable should look like this:

 

ID   count_A

1     4

2     1   

3     0

4     0

 

The second output variable should look like this:

 

ID   count_B

1     3

2     0  

3     1

4     0

 

The third output variable should look like this:

 

ID   count_C

1     2

2     1  

3     0

4     0

 

 

Thanks

6 REPLIES 6
ballardw
Super User

@lpn wrote:

Hello,

 

I am pretty new to SAS and I was wondering if someone could help me out with this. Suppose I have the following data set: 

  

ID           Var1   Var2    Var3   Var4

1               1         1          1     1

2               1         0          1     0

3               0         1          0     0

4               0         0          0     1

 

I need to  be able to count the number of ones in each row before there is a zero and once there is a zero the count stops. I need to be able to do this starting from var1 going to var4, and also starting from var2 going to var4, and also from var3 to var4.

 

My first output variable should look like this:

 

ID   count_A

1     4

2     1   

3     0

4     0

 

The second output variable should look like this:

 

ID   count_B

1     3

2     0  

3     1

4     0

 

The third output variable should look like this:

 

ID   count_C

1     2

2     1  

3     0

4     0

 

 

Thanks


It is best to post data as a data step to remove any possibly confusion as to data type or variable names. Post in a code box opened using the forum's {I} or "running man" icon as the main message windows will reformat text and sometimes the pasted code will not run.

 

When you need to do something repetitive and especially with groups of variables arrays are the typical tool to address the values without having to create lots of extra variables.

For example with your data:

data have;
input ID           Var1   Var2    Var3   Var4;
datalines;
1               1         1          1     1
2               1         0          1     0
3               0         1          0     0
4               0         0          0     1
;

data want;
   set have;
   array v var1-var4;
   array c count1-count3;
   do i=1 to dim(c);
      tcount=0;
      do j= i to dim(v);
         if v[j]=1 then tcount+1;
         else leave;
      end;
      c[i]=tcount;
   end;
   drop i j tcount;
run;

There are several reasons to prefer using a numeric suffix such as 1, 2, 3 (or if you insist _1, _2 , _3) because of the way SAS creates new variables on an array statement such as the Array C above, instead of _A _B _C.

 

Typical tools used above: loop counter I and j , accumulator Tcount. Leave is a special instruction to exit a do loop when encountered. In the above case it leaves the inner loop when the value of the currently examine VAR variable is not 1.

FreelanceReinh
Jade | Level 19

Hello @lpn and welcome to the SAS Support Communities!

 

A nice feature of SAS is that there are often several different ways to achieve the same result. They may differ in performance, length of code, elegance, robustness, readability, time to develop, ease of maintenance or generalization, etc.

 

In your case you could, for example, apply character functions such as LENGTHN, SCAN and SUBSTR to the concatenation of Var1, ..., Var4 to obtain the desired counts directly (i.e., each count in a single expression). Alternatively, you could compute them just numerically (assuming that Var1, ..., Var4 are numeric variables):

data want;
set have;
count3=var3*(  var4+1);
count2=var2*(count3+1);
count1=var1*(count2+1);
run;
ballardw
Super User

@FreelanceReinh wrote:

Hello @lpn and welcome to the SAS Support Communities!

 

A nice feature of SAS is that there are often several different ways to achieve the same result. They may differ in performance, length of code, elegance, robustness, readability, time to develop, ease of maintenance or generalization, etc.

 

In your case you could, for example, apply character functions such as LENGTHN, SCAN and SUBSTR to the concatenation of Var1, ..., Var4 to obtain the desired counts directly (i.e., each count in a single expression). Alternatively, you could compute them just numerically (assuming that Var1, ..., Var4 are numeric variables):

data want;
set have;
count3=var3*(  var4+1);
count2=var2*(count3+1);
count1=var1*(count2+1);
run;

And you will develop your own tools based on scale of the exercise.

How many lines of code would be needed if instead of Var1 to Var4 you had Var1 to Var100 and needed Count1 to Count99?

FreelanceReinh
Jade | Level 19

@ballardw wrote:

How many lines of code would be needed if instead of Var1 to Var4 you had Var1 to Var100 and needed Count1 to Count99?


Nine.

data want(drop=count100);
set have;
array var[100];
array count[100];
count100=var100;
do _n_=99 to 1 by -1;
  count[_n_]=var[_n_]*(count[_n_+1]+1);
end;
run;

Or did you mean this as an exercise for the OP?

ballardw
Super User

@FreelanceReinh wrote:

@ballardw wrote:

How many lines of code would be needed if instead of Var1 to Var4 you had Var1 to Var100 and needed Count1 to Count99?


Nine.

data want(drop=count100);
set have;
array var[100];
array count[100];
count100=var100;
do _n_=99 to 1 by -1;
  count[_n_]=var[_n_]*(count[_n_+1]+1);
end;
run;

Or did you mean this as an exercise for the OP?


@FreelanceReinh You read my mind. I realized a bit after my post that I should have more directly aimed that at the OP.

Ksharp
Super User
data have;
input ID           Var1   Var2    Var3   Var4;
datalines;
1               1         1          1     1
2               1         0          1     0
3               0         1          0     0
4               0         0          0     1
;

data want;
 set have;
 count_a=lengthn(scan(cats(of var1-var4),1,'0','m'));
 count_b=lengthn(scan(cats(of var2-var4),1,'0','m'));
 count_c=lengthn(scan(cats(of var3-var4),1,'0','m'));
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1564 views
  • 6 likes
  • 4 in conversation