Hi,
So I'm working in EG 7.1 and I have a dataset with over 4 million records and another dataset with around 200,000 records. For explanation purposes I'll call them lunch and time, respectively.
I need to output all the person_ids from lunch that aren't in time. I have tried the following -
Functionally, the code works. It outputs around 3 million records, which seems plausible, but the result I need is approximately 150,000. At this point I suspect there's something my boss has forgotten to tell me about the data, but I was wondering whether anyone has other ideas on how to do this.
Thanks in advance.
If I understand this right then you've used multiple coding approaches for combining the data and you've always got the same result. Looks very much like your boss didn't tell you something - or the data is different from what your boss thinks it is.
Maybe run a PROC FREQ over your lunch dataset to see whether there are data quality (DQ) issues, such as a few matching IDs with very high (too high) volumes.
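Something along these lines might be a starting point. It is only a sketch, and it assumes both datasets sit in WORK as lunch and time and that person_id has the same type in both. The output names lunch_id_counts, unmatched_rows and unmatched_ids are made up for illustration. The first part lists the IDs that occur most often in lunch. The second part compares the number of unmatched rows with the number of distinct unmatched IDs: if lunch holds several rows per person, the row count could be far higher than the ID count, which might account for some of the gap between 3 million and 150,000, but that is a guess about the data.

```sas
/* Count how many rows each person_id has in lunch (assumed to be WORK.LUNCH). */
proc freq data=work.lunch noprint;
    tables person_id / out=work.lunch_id_counts (drop=percent);
run;

/* Put the highest-volume IDs first. */
proc sort data=work.lunch_id_counts;
    by descending count;
run;

/* Look at the 20 most frequent IDs. */
proc print data=work.lunch_id_counts (obs=20);
run;

/* Compare unmatched row count with unmatched distinct-ID count. */
proc sql;
    select count(*)                  as unmatched_rows,
           count(distinct person_id) as unmatched_ids
    from work.lunch
    where person_id not in (select person_id from work.time);
quit;
```

If unmatched_ids turns out to be close to the expected figure, de-duplicating on person_id (for example with SELECT DISTINCT) may be all that is missing; if not, the difference probably lies in a filter or business rule that isn't in the data itself.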