BookmarkSubscribeRSS Feed
Saawan
Obsidian | Level 7

I have joined 3 tables (table1, table 2, table 3) with Empid which is common in all three tables.

The final merged table is Four.

 

 

Data Four;
    merge table1(in=a)
               table2(in=b)
               table3(in=c);

    by Empid;
    if a and b and c;
run;


proc print data = Four;
Run;

 

I got the final table Four. 

 

My question is :

What steps would you take to ensure your dataset is correct apart from reading the log and apart from reading the first few lines of the code as  formal testing?Smiley Happy

8 REPLIES 8
PGStats
Opal | Level 21

First step: read your code again Smiley Happy

 

It is missing the statement

 

by Empid;

 

then look at the first few lines of all datasets involved. Formal testing would require specific business rules.

PG
Saawan
Obsidian | Level 7

Formal testing is not the answer that I am looking for.

 

I know how does this code works. Because that code has been written by me. Apart from that do you have any other info. available for my question? 

 

Thanks for your advice.

Astounding
PROC Star

We tend to take this for granted, so:  READ THE LOG.

 

That's how SAS communicates what happened.  Besides errors / notes / warnings, look at how many observations came in from each source and how many you ended up with.

 

*********************************

 

EDITED:  Given what you describe, then I would add one more item to your checklist.  Check the last 10 observations to see that they come out as expected.  MERGE without a BY statement is not so common, and the anomalies would tend to come at the end of the resulting data set.

Saawan
Obsidian | Level 7

Reading the log, checking the errors and warning messages are the basic rules in SAS when creating a dataset.

 

Anyways thanks for the reply.

Patrick
Opal | Level 21

@Saawan

For formal testing have test cases and define the criteria for "met".

 

For unit testing create test data for your test cases and define the expected outcome. Then run your code and compare the actual outcome with the expected outcome.

 

I would also have some generic test cases like: No errors or warnings and eventually also performance metrics to be met if this is relevant.

Saawan
Obsidian | Level 7
That's Cool. Compare would work in my case.
Reeza
Super User

@Saawan wrote:

I have joined 3 tables (table1, table 2, table 3) with Empid which is common in all three tables.

The final merged table is Four.

Data Four;

merge table1(in=a)

           table2(in=b)

           table3(in=c);

by Empid;

if a and b and c;

run;

proc print data = Four;

Run;

 

I got the final table Four. 

 

My question is :

What steps would you take to ensure your dataset is correct apart from reading the log and apart from reading the first few lines of the code as  formal testing?Smiley Happy


 

Not sure this is formal testing but:

1. Pick a few cases and check the merge is what you expect

2. Check any edge cases possible, are missings correctly excluded, if you have records in one dataset but not others are they dropped. Check the number of records, N of records in the output should be equal to or less than the number of records in any of the contributing data sets.

3. If you have a character variable check if the results differ if you upcase/lowcase everything.

 

 


@Saawan wrote:

apart from reading the first few lines of the code 


 

You should read every line of your code. I'm assuming you mean the first few records of your output data set. 

 

In the future, please included modifications to your question in the thread, not as an edit. 

Reeza
Super User
Note, I've updated the subject line on your main post. SAS query is too generic and doesn't help to explain your question. Please use more descriptive subject lines in the future and include your code in a code editor rather than as text in the forum.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1810 views
  • 2 likes
  • 5 in conversation