BookmarkSubscribeRSS Feed
Tbell1
Calcite | Level 5

I am trying to merge two dataset of unequal rows but after merging i got completely empty data. The first dataset has 50 columns and 6320 rows while the second dataset has 5 columns and 220 rows. was trying to merge by a common variable but I got empty data after my merge. This is my code, what am i missing?

 

data BTS201505WTHR (drop=div:);
merge BTS201505 WTHRDATA(keep= FlightDate PRCP SNWD SNOW TMAX TMIN);
by FlightDate ;
if Origin='ORD';
run ;

8 REPLIES 8
Astounding
PROC Star
You are missing the notes in the log that tell you what happened.

If you want help, post them.
Tbell1
Calcite | Level 5
There were no errors in the log but the output of the merged data did not return any values for the columns in the second dataset
andreas_lds
Jade | Level 19

Looks as if the subsetting-if does not what you expect it to do. Right now you drop any observation not having "ORD" as value of Origin, so checking the values of the variable is the first thing i would do.

Tbell1
Calcite | Level 5
I removed the if subset but there was no difference, i still got empty columns for the merged dataset
andreas_lds
Jade | Level 19

@Tbell1 wrote:
I removed the if subset but there was no difference, i still got empty columns for the merged dataset

Have read the log? Are there any unexpected notes? Can you post the log as text, not as screenshot, not as attachment?

Tbell1
Calcite | Level 5
NOTE: There were 6320 observations read from the data set BTS201505.
NOTE: There were 220 observations read from the data set WTHRDATA.
NOTE: The data set BTS201505WTHR has 6245 observations and 42 variables.
NOTE: DATA statement used (Total process time):
real time 5.99 seconds
cpu time 2.86 seconds
Patrick
Opal | Level 21

In case the columns in your keep list for the 2nd data set are empty then they will overwrite same named columns from your first data set.

WTHRDATA(keep= FlightDate PRCP SNWD SNOW TMAX TMIN);

For cases where there is no match between the DS over FlightDate all columns from the DS not contributing to the obs will be missing.

 

Are there any other case where you've got missings in a column that isn't already missing in the source table?

 

The merge statement is already so long around that it's certainly not a SAS bug so it must be either merge logic or source data.

Tom
Super User Tom
Super User

Check whether or not you are actually getting any matches.  Are you expecting a one to one match?  A one to many match?  Or a many to many match?

 

You can use the IN= datastep option to create a variable with indicator of whether that dataset contributes to the merge.

Assign the values to a new variable to keep them in the results.

 

data BTS201505WTHR (drop=div:);
  merge BTS201505(in=in1)
       WTHRDATA(in=in2 keep= FlightDate PRCP SNWD SNOW TMAX TMIN)
  ;
  by FlightDate ;
  length source $32 ;
  if in1 and in2 then source='BOTH';
 else if in1 then source='BTS201505';
 else source='WTHRDATA';
run ;

proc freq ;
  tables source;
run;

 

Do the variables from the second dataset (PRCP SNWD SNOW TMAX TMIN) already exist in the first dataset?

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2898 views
  • 0 likes
  • 5 in conversation