I have a table of data that I want to add an extra line to, indicating if two variables stay the same or if one of them changes.
For example if I had a table like this:
I would want to add a new column starting at 1 and going up by 1 every time either column 1 or column 2 changes. In the example this would create the following result:
A | 1 | 1 |
A | 1 | 1 |
A | 2 | 2 |
A | 3 | 3 |
B | 1 | 4 |
B | 1 | 4 |
B | 4 | 5 |
C | 1 | 6 |
C | 2 | 7 |
C | 2 | 7 |
The real dataset is much larger than this, about 400.000 entries, but the principle remains the same. I need an automated and scalable solution. I figured I could turn the columns into arrays, use a for loop to create column 3 and add it to the dataset, but was not able to figure this out.
Any help is appreciated, cheers!