BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Valor123
Calcite | Level 5

I have a table of data that I want to add an extra line to, indicating if two variables stay the same or if one of them changes.

 

For example if I had a table like this: 

 

A1
A1
A2
A3
B1
B1
B4
C1
C2
C2

 

I would want to add a new column starting at 1 and going up by 1 every time either column 1 or column 2 changes. In the example this would create the following result:

 

A11
A11
A22
A33
B14
B14
B45
C16
C27
C27

 

The real dataset is much larger than this, about 400.000 entries, but the principle remains the same. I need an automated and scalable solution. I figured I could turn the columns into arrays, use a for loop to create column 3 and add it to the dataset, but was not able to figure this out. 

 

Any help is appreciated, cheers!

 

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Try this

 

data have;
input a $ b;
datalines;
A 1
A 1
A 2
A 3
B 1
B 1
B 4
C 1
C 2
C 2
;

data want;
   set have;
   if lag(a) ne a | lag(b) ne b then c+1;
run;

View solution in original post

3 REPLIES 3
PeterClemmensen
Tourmaline | Level 20

Try this

 

data have;
input a $ b;
datalines;
A 1
A 1
A 2
A 3
B 1
B 1
B 4
C 1
C 2
C 2
;

data want;
   set have;
   if lag(a) ne a | lag(b) ne b then c+1;
run;
Valor123
Calcite | Level 5
worked in one go, thank you!
novinosrin
Tourmaline | Level 20

Hi @Valor123  Good morning

data have;
input id $ var1;
cards;
A	1
A	1
A	2
A	3
B	1
B	1
B	4
C	1
C	2
C	2
;

data want;
 set have;
 by id var1 ;
 if first.var1 then want+1;
run;