BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
martyvd
Fluorite | Level 6

I have an issue similar to the one described here, but a bit more complex. See example data below. I need to flag when the type changes, but only for certain invalid changes. For example changing from type 1 to type 2 or vice versa is valid but changing from type 2 to type 3 is not. 

 

ID         Type       Date            Flag

0001       1          03/03/15

0001       1          03/04/15

0001       1          03/05/15

0001       2          03/06/15         1

0001       2          03/07/15

0002       1          03/03/15

0002       1          03/04/15

0002       2          03/05/15         1

0002       2          03/06/15 

0003       1          03/03/15

0003       2          03/04/15         1

0003       2          03/05/15

0004       2          03/06/15

 

I am hoping to create a variable for each observation that indicates valid change, invalid change, or no change. I would appreciate any suggestions. 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Use LAG() to check the difference.

 

data want;
set have;

by ID date; *may need to presort your data;

*get previous Type value;
prev_type = lag(Type);

*if first for that ID, don't use previous value since it will be for wrong ID;
if first.id then call missing(prev_type);

*rules to set flags, for more rules add more if/then statements;
if type=3 and prev_type = 2 then flag=1;



run;

@martyvd wrote:

I have an issue similar to the one described here, but a bit more complex. See example data below. I need to flag when the type changes, but only for certain invalid changes. For example changing from type 1 to type 2 or vice versa is valid but changing from type 2 to type 3 is not. 

 

ID         Type       Date            Flag

0001       1          03/03/15

0001       1          03/04/15

0001       1          03/05/15

0001       2          03/06/15         1

0001       2          03/07/15

0002       1          03/03/15

0002       1          03/04/15

0002       2          03/05/15         1

0002       2          03/06/15 

0003       1          03/03/15

0003       2          03/04/15         1

0003       2          03/05/15

0004       2          03/06/15

 

I am hoping to create a variable for each observation that indicates valid change, invalid change, or no change. I would appreciate any suggestions. 


 

View solution in original post

6 REPLIES 6
Reeza
Super User

Use LAG() to check the difference.

 

data want;
set have;

by ID date; *may need to presort your data;

*get previous Type value;
prev_type = lag(Type);

*if first for that ID, don't use previous value since it will be for wrong ID;
if first.id then call missing(prev_type);

*rules to set flags, for more rules add more if/then statements;
if type=3 and prev_type = 2 then flag=1;



run;

@martyvd wrote:

I have an issue similar to the one described here, but a bit more complex. See example data below. I need to flag when the type changes, but only for certain invalid changes. For example changing from type 1 to type 2 or vice versa is valid but changing from type 2 to type 3 is not. 

 

ID         Type       Date            Flag

0001       1          03/03/15

0001       1          03/04/15

0001       1          03/05/15

0001       2          03/06/15         1

0001       2          03/07/15

0002       1          03/03/15

0002       1          03/04/15

0002       2          03/05/15         1

0002       2          03/06/15 

0003       1          03/03/15

0003       2          03/04/15         1

0003       2          03/05/15

0004       2          03/06/15

 

I am hoping to create a variable for each observation that indicates valid change, invalid change, or no change. I would appreciate any suggestions. 


 

ballardw
Super User

@martyvd wrote:

I have an issue similar to the one described here, but a bit more complex. See example data below. I need to flag when the type changes, but only for certain invalid changes. For example changing from type 1 to type 2 or vice versa is valid but changing from type 2 to type 3 is not. 

 

ID         Type       Date            Flag

0001       1          03/03/15

0001       1          03/04/15

0001       1          03/05/15

0001       2          03/06/15         1

0001       2          03/07/15

0002       1          03/03/15

0002       1          03/04/15

0002       2          03/05/15         1

0002       2          03/06/15 

0003       1          03/03/15

0003       2          03/04/15         1

0003       2          03/05/15

0004       2          03/06/15

 

I am hoping to create a variable for each observation that indicates valid change, invalid change, or no change. I would appreciate any suggestions. 


Have you covered all of the possible values and "valid" changes? Can any values be missing? If missing then what are the rules involving that for a "valid" or "invalid" change.

 

And how do you want to indicate an invalid change? BTW, you didn't show any values of 3 or what appears to be an invalid change so your example data is incomplete. If practical you should show enough cases to cover the possibilities.

 

I am assuming that you meant to specify "within values of ID". If so, what do you want to indicate for an ID that only has 1 record? Since there is not another to compare then you likely do not want a flag value that appears the same as "no change".

martyvd
Fluorite | Level 6
I do have a list of all possible values as well as valid changes. I didn't think it was important to list that here.

There are no missing values for id, type, or date in my data set. Basically want I want to do is set rules for valid changes and set all other conditions to be invalid.

You are correct that I mean "within values of ID". For an ID that has only one record I suppose it is fine for the flag value to be missing.
ballardw
Super User

@martyvd wrote:
I do have a list of all possible values as well as valid changes. I didn't think it was important to list that here.

 

There are coding approaches that can work for specific values such as: thisvalue - thatvalue > 1 that might work when 1 to 3 is not valid (3 - 1 = 2) but what if 2 to 5 is acceptable? That specific calculation would not be invalid.

So if there are not specific rules then all of the valid (or invalid , which ever is shorter) combinations are needed for code to have a chance to work the first time.

 

If one value can only be valid followed by one specific value, fine. That's a relatively short list. But if 2 could be followed by 4, 7, 8, 10 and 12 we have a completely different kettle of monkeys, or is that a barrel of fish?

martyvd
Fluorite | Level 6

The specific rules are as follows:

1 to 2 or vice versa is valid.
3 to 13 or vice versa is valid.
3 to 23 or vice versa is valid.
97,98, or 99 to any other value is valid.

 

All other changes are invalid. 

Reeza
Super User
Given what is here you should be able to adapt the code to your rules. Are you having issues doing that? If so, post your code please.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1323 views
  • 0 likes
  • 3 in conversation