BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
renanm
Obsidian | Level 7

Hello everyone, 

 

I've tried tackling the problem below multiple ways including using FIRST. and LAST. in BY-GROUP analysis, WHILE loops, but have not been unsuccessful so far. I'm running out of time and if anyone can provide some direction it'd be greatly appreciated. 

 

Here's the situation:

 

I have a data set which is 'tenurized' based on the occurrence of a particular event. In this case, the event being a 'limit placement' which happens at a particular date defined as 'Limit_dt'. Therefore, as of the 'Limit_dt' the 'Tenure' field gets set to = 0 and increments by 1 until the last row for that particular customer group (ID). The data is sorted by ID, Limit_dt and Date. 

 

See below an illustration of what the data looks like as well as better explanation of each field. Field Tenure2 (highlighted in yellow) is what I'm attempting to tackle right now. Please ignore it for now. 

 

Data Structure.PNG 

 

Field description.PNG


Essentially, what I need to do here is 're-tenurize' this data set based on the FIRST occurrence Score_flag ='Y' within each customer group. To expand a bit on this, Score_flag = 'Y' when Score >= Score30 so this flag can basically jump back and forth between 'Y' and 'N' from month to month. 

 

What I need to do is create a field named 'Tenure2' which gets set to '0' at the FIRST occurrence (and ONLY the first occurrence for each ID) of Score_flag='Y' for a particular ID and increment by 1 from there until the last record/row for that customer (ID). The field 'Tenure2' in the illustration above exemplifies the values that this field should take. 

 

Hoping someone can direct me to a simple solution to this. 

 

Thanks in advance for your support.  

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Add another flag. If this doesn't work, please post sample data following these instructions:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

data WANT;
  retain TENURE2;
  set HAVE;
  by ID;
  length first_flag $1.;
  retain first_flag;
  if first.ID then call missing(TENURE2, first_flag);;
  if score_flag='Y' and missing(first_flag) then do;
  tenure2=0;
first_flag='Y'; 
end;
  else TENURE2=TENURE2+1;
run;

@renanm wrote:

Hi @ChrisNZ , thanks for your attempt at helping to solve this. 

 

This is similar to one of my previous attempts but unfortunately it doesn't solve the problem here. 

 

Here are some of the reasons why:

 

#1 - Whenever there are consecutive Score_flag = 'Y' month after month, the Tenure2 fields gets the value '0' again and again instead of incrementing by 1.

 

#2 - Similarly, if Score_flag alternates between 'Y' and 'N' from month to month, every time it gets set to 'Y' again, the field Tenure2 gets re-set to '0' again.

 

See illustrations of these situations below:

 

Example 1:

 

EX1.PNG

 

Example 2:

EX2.PNG

 

Example 3:

 

EX3.PNG

 

Essentially, Tenure2 needs to be set to '0' ONLY at the FIRST occurrence of Score_flag = 'Y' within each customer group. 

 

This is the missing piece I can't seem to figure out. 

 

Any additional help  is much appreciated. 

 

Thanks ! 


 

View solution in original post

12 REPLIES 12
ChrisNZ
Tourmaline | Level 20

Like this?

data WANT;
  retain TENURE2;
  set HAVE;
  by ID;
  if first.ID then TENURE2=.;
  if SCORE_FLAG='Y' then TENURE2=0;
  else TENURE2=TENURE2+1;
run;

 

renanm
Obsidian | Level 7

Hi @ChrisNZ , thanks for your attempt at helping to solve this. 

 

This is similar to one of my previous attempts but unfortunately it doesn't solve the problem here. 

 

Here are some of the reasons why:

 

#1 - Whenever there are consecutive Score_flag = 'Y' month after month, the Tenure2 fields gets the value '0' again and again instead of incrementing by 1.

 

#2 - Similarly, if Score_flag alternates between 'Y' and 'N' from month to month, every time it gets set to 'Y' again, the field Tenure2 gets re-set to '0' again.

 

See illustrations of these situations below:

 

Example 1:

 

EX1.PNG

 

Example 2:

EX2.PNG

 

Example 3:

 

EX3.PNG

 

Essentially, Tenure2 needs to be set to '0' ONLY at the FIRST occurrence of Score_flag = 'Y' within each customer group. 

 

This is the missing piece I can't seem to figure out. 

 

Any additional help  is much appreciated. 

 

Thanks ! 

Reeza
Super User
Don't post data as images, this means we can't test any code because we'd have to type out your data.
renanm
Obsidian | Level 7

Hi @Reeza , my apologies. I'm new to this forum and learning how to go about this. 

 

How would you suggest I go about illustrating the situation? I've created a hypothetical data set which mirrors the issue I'm having due to confidentiality reasons. 

 

Thanks for your support and advice.  

renanm
Obsidian | Level 7

Thanks @Reeza for sharing these instructions and guidelines. I'll ensure to follow them the next time around. 

Reeza
Super User

Add another flag. If this doesn't work, please post sample data following these instructions:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

data WANT;
  retain TENURE2;
  set HAVE;
  by ID;
  length first_flag $1.;
  retain first_flag;
  if first.ID then call missing(TENURE2, first_flag);;
  if score_flag='Y' and missing(first_flag) then do;
  tenure2=0;
first_flag='Y'; 
end;
  else TENURE2=TENURE2+1;
run;

@renanm wrote:

Hi @ChrisNZ , thanks for your attempt at helping to solve this. 

 

This is similar to one of my previous attempts but unfortunately it doesn't solve the problem here. 

 

Here are some of the reasons why:

 

#1 - Whenever there are consecutive Score_flag = 'Y' month after month, the Tenure2 fields gets the value '0' again and again instead of incrementing by 1.

 

#2 - Similarly, if Score_flag alternates between 'Y' and 'N' from month to month, every time it gets set to 'Y' again, the field Tenure2 gets re-set to '0' again.

 

See illustrations of these situations below:

 

Example 1:

 

EX1.PNG

 

Example 2:

EX2.PNG

 

Example 3:

 

EX3.PNG

 

Essentially, Tenure2 needs to be set to '0' ONLY at the FIRST occurrence of Score_flag = 'Y' within each customer group. 

 

This is the missing piece I can't seem to figure out. 

 

Any additional help  is much appreciated. 

 

Thanks ! 


 

renanm
Obsidian | Level 7

Thanks @Reeza. This seems to solve it and quite interesting the way you approached this. Helped me to learn a lot. Much appreciated. 

ChrisNZ
Tourmaline | Level 20

> Whenever there are consecutive Score_flag = 'Y' month after month, the Tenure2 fields gets the value '0' 

This seems easy to solve. Have you even tried?

 

data WANT;
  retain TENURE2;
  set HAVE;
  by ID;
  if first.ID then TENURE2=.;
  if SCORE_FLAG='Y' & TENURE2=. then TENURE2=-1;
  TENURE2=TENURE2+1;
run;

And yes, as @Reeza said, posting images as data prevents us from running code on actual data (unless we do the work for you and create actual data, but why would we do your work?)

renanm
Obsidian | Level 7

Hi @ChrisNZ ,

 

Thanks again for your help.

 

As mentioned in my response to Reeza above, I'm new to this forum and did not know posting images was not acceptable. I thought that would be a good way to clearly illustrate the situation I was dealing with and clearly demonstrate it to the contributors of the forum.

 

It won't happen again. I'll follow the guidelines shared by Reeza next time. That wasn't meant for you to do my work for me, I simply
didn't know better.

 

And to answer your other question of whether or not I have even tried, the answer is YES. I have tried multiple times and searched for different solutions. Unfortunately, as you can tell, I'm learning and wasn't able to find a solution. I'll get there and hopefully one day be
able to help others with similar situations.

 

Thanks for sharing another alternative to the problem. 

 

ballardw
Super User

@renanm wrote:

Hi @ChrisNZ ,

 

Thanks again for your help.

 

As mentioned in my response to Reeza above, I'm new to this forum and did not know posting images was not acceptable. I thought that would be a good way to clearly illustrate the situation I was dealing with and clearly demonstrate it to the contributors of the forum.

It isn't that images are "not acceptable" but that they can be misleading. We will get a number of posts each week often involving dates but sometimes other values, that appear to be one thing, SAS date value or numeric and are actually character values in the users data. Or "dates" that look like 20190215 that are actually simple numbers and do not have the proper values for use with date functions and formats.

Also testing code without actual values is difficult and too time consuming for most of us to type. There is a SAS macro program at  https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

 

We also like to see the code that you tried because sometimes all it takes is tweak to get it working. It is not uncommon for someone coming from a different programming or scripting language to use something that may be valid in SAS but means something different than the other language. The fun thing with this can be the rules SAS uses.

A moderately common issue is comparing values:

 

if x = 3 or 4 then do <some code>; for example. Does NOT throw an error but will always do the code represented by <some code>

Reason: the value 4, being non-zero is treated as true all by itself and x is only compared to 3 because there was no instruction to compare x to 4. So the code could be fixed by either

if x=3 or x=4 then do <some code>.

or

if x in (3,4) then do <some code>.

So if you had posted code with something like the first If we could relatively quickly point that out. Code and log entries should be pasted into code boxes opened with either the {I} or "running man" icon to preserve formatting. The main message windows will reformat text removing lots of white space and that often makes data step code fail in one way or another. Error messages from the log also often have diagnostic characters like an _ where the error is found. If you paste one of those into the main message window the _ character usually ends up under something not related to the error.

renanm
Obsidian | Level 7

Very helpful context and guidelines. Thanks @ballardw ! 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1771 views
  • 3 likes
  • 4 in conversation