BookmarkSubscribeRSS Feed
trevand
Obsidian | Level 7

I would like to create a new identifier (new_id) that identifies a new subgroup whenever the indicator switches from 0 to 1. Has some and idea how to accomplish this? 

 

group_id indicator new_id
1 1 1a
1 0 1a
1 0 1a
1 1 1b
1 0 1b
1 0 1b
1 0 1b
2 1 2a
2 0 2a
4 REPLIES 4
PaigeMiller
Diamond | Level 26

Much much much easier to create sequential numbers.

 

data want;
    set have;
    if indicator=1 then new_id+1;
run;

 

I'm pretty sure anything you want to do with this NEW_ID variable can be done with the sequential numbers.

--
Paige Miller
LinusH
Tourmaline | Level 20

Agree with @PaigeMiller sequences are easier to maintain/use.

If you need the group_id  within the new variable you can try this:

data have;
	length group_id indicator 8;
	input group_id indicator;
	datalines;
1 1
1 0
1 0
1 1
1 0
1 0
1 0
2 1
2 0
;
run;

data want;
	set have;
	by group_id;
	if first.group_id then index = 0;
	if indicator = 1 then index+1;
	new_id = cats(put(group_id,best.),'-',put(index,best.));
run;

But to get the result you need, try this:

data want;
	set have;
	by group_id;
	if first.group_id then index = 0;
	if indicator = 1 then index+1;
	new_id = cats(put(group_id,best.),substr('abcdefghijklmnopqrstuvwxyz',index,1));
run;

It was not clear from your example how many times the indicator can change, but I guess the maximum with this technique  would be 26.

 

 

Data never sleeps
PaigeMiller
Diamond | Level 26

@LinusH wrote:

Agree with @PaigeMiller sequences are easier to maintain/use.

 

It was not clear from your example how many times the indicator can change, but I guess the maximum with this technique  would be 26.


Good point. That's a problem if you need more than 26 letters. Another problem is that if the variable GROUP_ID goes above 9, then the next value of NEW_ID is 10a. Since you also have 1a, these will not sort properly. If you sort on these NEW_ID values, 10a, 10b, ... will be followed by 1a. And if GROUP_ID goes above 9, you will also need a LENGTH statement to make this work at all.

 

Plus I am skeptical that there is a valid use for identifiers like 1a and 10a and so on. Who can remember what they represent when looking at the output? I also worry that the OP has oversimplified a real world problem and any answers we give will not fit the real world problem.

--
Paige Miller
PaigeMiller
Diamond | Level 26

@trevand 

please tell us what you are going to do with variable NEW_ID

--
Paige Miller

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 267 views
  • 2 likes
  • 3 in conversation