I have an issue similar to the one described here, but a bit more complex. See example data below. I need to flag when the type changes, but only for certain invalid changes. For example changing from type 1 to type 2 or vice versa is valid but changing from type 2 to type 3 is not.
ID Type Date Flag
0001 1 03/03/15
0001 1 03/04/15
0001 1 03/05/15
0001 2 03/06/15 1
0001 2 03/07/15
0002 1 03/03/15
0002 1 03/04/15
0002 2 03/05/15 1
0002 2 03/06/15
0003 1 03/03/15
0003 2 03/04/15 1
0003 2 03/05/15
0004 2 03/06/15
I am hoping to create a variable for each observation that indicates valid change, invalid change, or no change. I would appreciate any suggestions.
Use LAG() to check the difference.
data want;
set have;
by ID date; *may need to presort your data;
*get previous Type value;
prev_type = lag(Type);
*if first for that ID, don't use previous value since it will be for wrong ID;
if first.id then call missing(prev_type);
*rules to set flags, for more rules add more if/then statements;
if type=3 and prev_type = 2 then flag=1;
run;
@martyvd wrote:
I have an issue similar to the one described here, but a bit more complex. See example data below. I need to flag when the type changes, but only for certain invalid changes. For example changing from type 1 to type 2 or vice versa is valid but changing from type 2 to type 3 is not.
ID Type Date Flag
0001 1 03/03/15
0001 1 03/04/15
0001 1 03/05/15
0001 2 03/06/15 1
0001 2 03/07/15
0002 1 03/03/15
0002 1 03/04/15
0002 2 03/05/15 1
0002 2 03/06/15
0003 1 03/03/15
0003 2 03/04/15 1
0003 2 03/05/15
0004 2 03/06/15
I am hoping to create a variable for each observation that indicates valid change, invalid change, or no change. I would appreciate any suggestions.
Use LAG() to check the difference.
data want;
set have;
by ID date; *may need to presort your data;
*get previous Type value;
prev_type = lag(Type);
*if first for that ID, don't use previous value since it will be for wrong ID;
if first.id then call missing(prev_type);
*rules to set flags, for more rules add more if/then statements;
if type=3 and prev_type = 2 then flag=1;
run;
@martyvd wrote:
I have an issue similar to the one described here, but a bit more complex. See example data below. I need to flag when the type changes, but only for certain invalid changes. For example changing from type 1 to type 2 or vice versa is valid but changing from type 2 to type 3 is not.
ID Type Date Flag
0001 1 03/03/15
0001 1 03/04/15
0001 1 03/05/15
0001 2 03/06/15 1
0001 2 03/07/15
0002 1 03/03/15
0002 1 03/04/15
0002 2 03/05/15 1
0002 2 03/06/15
0003 1 03/03/15
0003 2 03/04/15 1
0003 2 03/05/15
0004 2 03/06/15
I am hoping to create a variable for each observation that indicates valid change, invalid change, or no change. I would appreciate any suggestions.
@martyvd wrote:
I have an issue similar to the one described here, but a bit more complex. See example data below. I need to flag when the type changes, but only for certain invalid changes. For example changing from type 1 to type 2 or vice versa is valid but changing from type 2 to type 3 is not.
ID Type Date Flag
0001 1 03/03/15
0001 1 03/04/15
0001 1 03/05/15
0001 2 03/06/15 1
0001 2 03/07/15
0002 1 03/03/15
0002 1 03/04/15
0002 2 03/05/15 1
0002 2 03/06/15
0003 1 03/03/15
0003 2 03/04/15 1
0003 2 03/05/15
0004 2 03/06/15
I am hoping to create a variable for each observation that indicates valid change, invalid change, or no change. I would appreciate any suggestions.
Have you covered all of the possible values and "valid" changes? Can any values be missing? If missing then what are the rules involving that for a "valid" or "invalid" change.
And how do you want to indicate an invalid change? BTW, you didn't show any values of 3 or what appears to be an invalid change so your example data is incomplete. If practical you should show enough cases to cover the possibilities.
I am assuming that you meant to specify "within values of ID". If so, what do you want to indicate for an ID that only has 1 record? Since there is not another to compare then you likely do not want a flag value that appears the same as "no change".
@martyvd wrote:
I do have a list of all possible values as well as valid changes. I didn't think it was important to list that here.
There are coding approaches that can work for specific values such as: thisvalue - thatvalue > 1 that might work when 1 to 3 is not valid (3 - 1 = 2) but what if 2 to 5 is acceptable? That specific calculation would not be invalid.
So if there are not specific rules then all of the valid (or invalid , which ever is shorter) combinations are needed for code to have a chance to work the first time.
If one value can only be valid followed by one specific value, fine. That's a relatively short list. But if 2 could be followed by 4, 7, 8, 10 and 12 we have a completely different kettle of monkeys, or is that a barrel of fish?
The specific rules are as follows:
1 to 2 or vice versa is valid.
3 to 13 or vice versa is valid.
3 to 23 or vice versa is valid.
97,98, or 99 to any other value is valid.
All other changes are invalid.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.