BookmarkSubscribeRSS Feed
David_Billa
Rhodochrosite | Level 12

In the following code I want to conditionally create the variables based on the validation which was in the code below. In case if the 'Insurance_ID' variable is missing in the dataset where we read through SET Statement then I want to open the other dataset (insurance_source) and see whether that variable is missing. If it is still missing then my Validation is right else I want a note in the log file as stated in the comment portion /**/

 

Because Insurance_ID is first created in insurance_source dataset as a Primary key and this variable is available in other dataset 'insurance_data' which is being coming from insurance_source dataset. so if the Insurance_ID is missing in the bothe the datasets then my validation is right and if it is missing in one of the dataset 'insurance_data' then it means something has went wrong while creating this dataset which i need to investigate and therefore I need a note in the log file.

 

data error_data;
length FLT_ID 8	ERR_MSG $100 FLT_REC $200;
keep FLT_ID ERR_MSG FLT_REC;
set insurance_data;
FLT_REC=cats(SOURCE_CD,'|',FILE_ID,'|',Delivery_ID)
FLT_ID=44;
ERR_MSG_TXT="Insurance ID cannot be missing";
if missing(Insurance_ID) then
/*look at the other dataset insurance_source and check for Insurance_ID is missing and if still missing then our validation is right else write a note in the log stating inspect the source dataset*/
output;
run;
7 REPLIES 7
Kurt_Bremser
Super User

What do you mean by "missing"?

The variable is not there in the dataset, or it contains missing values?

If it is the latter, on which condition do you join the two datasets to look for a non-missing value?

David_Billa
Rhodochrosite | Level 12
Missing means missing values.

I'm not certain about join condition. Both the tables are part of ETL.
Kurt_Bremser
Super User

@David_Billa wrote:
Missing means missing values.

I'm not certain about join condition. Both the tables are part of ETL.

Well, you need to know which observation from the second dataset you should select to replace a missing insurance_id in the first, or you'll get unpredictable results.

 

To further clarify, you should post example data for both datasets (in usable form, as shown in https://communities.sas.com/t5/SAS-Programming/Data-validation-and-writing-the-error-record-in-a-var...), and what you expect as result.

David_Billa
Rhodochrosite | Level 12
It's OK if I get the unpredicted results. Atleast I know the way to develop
further.
Tom
Super User Tom
Super User

That sounds like a complex operation. But I am a little confused about want you are trying to do.

 

Currently it sounds like you already have a list of errors and want to confirm that they are errors. So you want scan a list of error codes.  And then for certain error codes perform some separate process that tests whether that error code was properly raised.  Sort of quality control process?

 

Or do you mean that you have a control file that describes what tests you want perform.  So there is record in the metadata/control file that says check if variable X in dataset Y has any (all?) missing values.  

 

For either of these I would traditionally use the control/metadata/error dataset to drive the creation of the code that needs to run.  Then run the code and collect the results.  Now with the introduction of DOSUBL() function might be able to run steps in a separate execution space while your current data step pauses so you could control the whole flow with a single data step.

 

In any case it might make sense to create macros that generate the code for the tests based on input parameters. Then the code generation in either case is just a matter of passing the values to the parameters in the macro call.

Astounding
PROC Star

I can think of several ways to go about this, some interesting, some boring.  But all of them depend on what is in your INSURANCE_SOURCE data set.  I know it's clear as day to you, but it's clear as mud to us.  Either list a few observations, or describe what is in it.  The programming can't begin until we understand what is in the data.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 610 views
  • 0 likes
  • 4 in conversation