I have an issue similar to the one described here, but a bit more complex. See example data below. I need to flag when the type changes, but only for certain invalid changes. For example changing from type 1 to type 2 or vice versa is valid but changing from type 2 to type 3 is not.
ID Type Date Flag
0001 1 03/03/15
0001 1 03/04/15
0001 1 03/05/15
0001 2 03/06/15 1
0001 2 03/07/15
0002 1 03/03/15
0002 1 03/04/15
0002 2 03/05/15 1
0002 2 03/06/15
0003 1 03/03/15
0003 2 03/04/15 1
0003 2 03/05/15
0004 2 03/06/15
I am hoping to create a variable for each observation that indicates valid change, invalid change, or no change. I would appreciate any suggestions.
Use LAG() to check the difference.
data want;
set have;
by ID date; *may need to presort your data;
*get previous Type value;
prev_type = lag(Type);
*if first for that ID, don't use previous value since it will be for wrong ID;
if first.id then call missing(prev_type);
*rules to set flags, for more rules add more if/then statements;
if type=3 and prev_type = 2 then flag=1;
run;
@martyvd wrote:
I have an issue similar to the one described here, but a bit more complex. See example data below. I need to flag when the type changes, but only for certain invalid changes. For example changing from type 1 to type 2 or vice versa is valid but changing from type 2 to type 3 is not.
ID Type Date Flag
0001 1 03/03/15
0001 1 03/04/15
0001 1 03/05/15
0001 2 03/06/15 1
0001 2 03/07/15
0002 1 03/03/15
0002 1 03/04/15
0002 2 03/05/15 1
0002 2 03/06/15
0003 1 03/03/15
0003 2 03/04/15 1
0003 2 03/05/15
0004 2 03/06/15
I am hoping to create a variable for each observation that indicates valid change, invalid change, or no change. I would appreciate any suggestions.
Use LAG() to check the difference.
data want;
set have;
by ID date; *may need to presort your data;
*get previous Type value;
prev_type = lag(Type);
*if first for that ID, don't use previous value since it will be for wrong ID;
if first.id then call missing(prev_type);
*rules to set flags, for more rules add more if/then statements;
if type=3 and prev_type = 2 then flag=1;
run;
@martyvd wrote:
I have an issue similar to the one described here, but a bit more complex. See example data below. I need to flag when the type changes, but only for certain invalid changes. For example changing from type 1 to type 2 or vice versa is valid but changing from type 2 to type 3 is not.
ID Type Date Flag
0001 1 03/03/15
0001 1 03/04/15
0001 1 03/05/15
0001 2 03/06/15 1
0001 2 03/07/15
0002 1 03/03/15
0002 1 03/04/15
0002 2 03/05/15 1
0002 2 03/06/15
0003 1 03/03/15
0003 2 03/04/15 1
0003 2 03/05/15
0004 2 03/06/15
I am hoping to create a variable for each observation that indicates valid change, invalid change, or no change. I would appreciate any suggestions.
@martyvd wrote:
I have an issue similar to the one described here, but a bit more complex. See example data below. I need to flag when the type changes, but only for certain invalid changes. For example changing from type 1 to type 2 or vice versa is valid but changing from type 2 to type 3 is not.
ID Type Date Flag
0001 1 03/03/15
0001 1 03/04/15
0001 1 03/05/15
0001 2 03/06/15 1
0001 2 03/07/15
0002 1 03/03/15
0002 1 03/04/15
0002 2 03/05/15 1
0002 2 03/06/15
0003 1 03/03/15
0003 2 03/04/15 1
0003 2 03/05/15
0004 2 03/06/15
I am hoping to create a variable for each observation that indicates valid change, invalid change, or no change. I would appreciate any suggestions.
Have you covered all of the possible values and "valid" changes? Can any values be missing? If missing then what are the rules involving that for a "valid" or "invalid" change.
And how do you want to indicate an invalid change? BTW, you didn't show any values of 3 or what appears to be an invalid change so your example data is incomplete. If practical you should show enough cases to cover the possibilities.
I am assuming that you meant to specify "within values of ID". If so, what do you want to indicate for an ID that only has 1 record? Since there is not another to compare then you likely do not want a flag value that appears the same as "no change".
@martyvd wrote:
I do have a list of all possible values as well as valid changes. I didn't think it was important to list that here.
There are coding approaches that can work for specific values such as: thisvalue - thatvalue > 1 that might work when 1 to 3 is not valid (3 - 1 = 2) but what if 2 to 5 is acceptable? That specific calculation would not be invalid.
So if there are not specific rules then all of the valid (or invalid , which ever is shorter) combinations are needed for code to have a chance to work the first time.
If one value can only be valid followed by one specific value, fine. That's a relatively short list. But if 2 could be followed by 4, 7, 8, 10 and 12 we have a completely different kettle of monkeys, or is that a barrel of fish?
The specific rules are as follows:
1 to 2 or vice versa is valid.
3 to 13 or vice versa is valid.
3 to 23 or vice versa is valid.
97,98, or 99 to any other value is valid.
All other changes are invalid.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.