Hello,
I am trying to test whether or not records of a particular category all have consistent values in 3 fields, ideally within a data step (no other reason than this is how all of the other tests are done, and I'm a sucker for consistency).
Unfortunately this is work related, and the data is highly confidential... but I believe I can very closely align this to a typical "store front" type data structure for all intents and purposes.
There are 5 key fields I need to consider within this metaphor. Account Number, Order Number, Store Name, Product Name, Product Price. I need to test that for each Account Number, for each Order Number... Store Name, Product Name, and Product Price are all consistent (the same). Where this analogy breaks from reality. is that the Store Name / Product Name / Product Price can be different for different Accounts... but for each Account, they must be the same.
I would love to share a code sample, but I feel as though Data Steps can only look at 1 records at a time so I'm just not sure how to even get started... But maybe a mock data sample might help. The desired result here, is that due to Order 003 being different... all 3 records get flagged as inconsistent. (My tests all return a 0 or a 1 for pass/fail in the summary report.)
Account # Order # StoreName ProductName ProductPrice
00000001 001 Store1 Product1 100.00
00000001 002 Store1 Product1 100.00
00000001 003 Store2 Product1 100.00
Each account can have any number of Orders. I read up on being able to look at the previous/next row... but I couldn't ascertain how to apply that to my given situation. I thought maybe I would simply need to iterate... Compare 001 to 002, then 002 to 003... But that thought line died when I asked myself how I would then update 001 to flag as fail if 002 vs. 003 failed the integrity check.
I feel at this point I'm over complicating and over thinking the problem. Any assistance is much appreciated.
I apologize for the lack of code and real data.
Best Regards,
-Kade
@Kade_M Hi and welcome to the SAS Community 🙂
There are many ways to do this. Here is one. I added another group to your data for demonstration purpose.
Feel free to ask if you have any questions.
data have;
input Account $ Order $ StoreName $ ProductName $ ProductPrice;
datalines;
00000001 001 Store1 Product1 100.00
00000001 002 Store1 Product1 100.00
00000001 003 Store2 Product1 100.00
00000002 001 Store1 Product1 100.00
00000002 002 Store1 Product1 100.00
00000002 003 Store1 Product1 100.00
;
data want (drop=_:);
do until (last.Account);
set have;
by Account;
if first.Account then do;
_StoreName = StoreName;
_ProductName = ProductName;
_ProductPrice = ProductPrice;
flag = 0;
end;
if _StoreName ne StoreName |
_ProductName ne ProductName |
_ProductPrice ne ProductPrice then flag = 1;
end;
do until (last.Account);
set have;
by Account;
output;
end;
run;
Result:
Account Order StoreName ProductName ProductPrice flag 00000001 001 Store1 Product1 100 1 00000001 002 Store1 Product1 100 1 00000001 003 Store2 Product1 100 1 00000002 001 Store1 Product1 100 0 00000002 002 Store1 Product1 100 0 00000002 003 Store1 Product1 100 0
@Kade_M Hi and welcome to the SAS Community 🙂
There are many ways to do this. Here is one. I added another group to your data for demonstration purpose.
Feel free to ask if you have any questions.
data have;
input Account $ Order $ StoreName $ ProductName $ ProductPrice;
datalines;
00000001 001 Store1 Product1 100.00
00000001 002 Store1 Product1 100.00
00000001 003 Store2 Product1 100.00
00000002 001 Store1 Product1 100.00
00000002 002 Store1 Product1 100.00
00000002 003 Store1 Product1 100.00
;
data want (drop=_:);
do until (last.Account);
set have;
by Account;
if first.Account then do;
_StoreName = StoreName;
_ProductName = ProductName;
_ProductPrice = ProductPrice;
flag = 0;
end;
if _StoreName ne StoreName |
_ProductName ne ProductName |
_ProductPrice ne ProductPrice then flag = 1;
end;
do until (last.Account);
set have;
by Account;
output;
end;
run;
Result:
Account Order StoreName ProductName ProductPrice flag 00000001 001 Store1 Product1 100 1 00000001 002 Store1 Product1 100 1 00000001 003 Store2 Product1 100 1 00000002 001 Store1 Product1 100 0 00000002 002 Store1 Product1 100 0 00000002 003 Store1 Product1 100 0
@PeterClemmensen Thank you! I am really excited about your solution as there are several things I'm unfamiliar with, so lots to learn here!
Looking over the code, it makes perfect sense. I have a couple questions though, just to make sure I understand what is happening.
My last question, I've been doing all of my tests in a DATA step as I mentioned... Each one sorta looks like this redacted sample below. Could I add this code into the same data step? I feel the answer is no, because then it would execute this loop every single row as it processes the other tests.
*************************************************************;
* TEST_1 scrubbed description
*************************************************************;
IF field_1 ne '.' AND field_2 ne '.' THEN
IF field_1 > field_2 THEN
DO;
TEST_1 = 1;
HAS_ERROR = 1;
END;
ELSE TEST_1 = 0;
ELSE TEST_1 = 0;
Time in invaluable, thank you for sharing yours to help me learn.
Best Regards,
-Kade
@Kade_M, anytime 🙂 Glad you found your solution.
- As to the drop=_: part, you are correct. I use the colon operator to specify a list of variables that begin with a common prefix.
- The technique here is known as a 'Double DoW Loop'. Plenty of information online if you want to get a good grasp of it. Shortly put, the second loop ensures that I flag every observation of the By-Group where there is an inconsistency. And not only the observations where the inconsistency is present.
As to your final question, I'm not sure. I don't know how or where your field variables are calculated. However, my guess is that you can put the code at the bottom of the data step, ie after the second loop.
Regards.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.