BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mv_fes
Calcite | Level 5

On the merge function, looking the on-line documentation (http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/viewer.htm#a001318494.htm)

There is a description when two records are merged, and one field has the same name.  The documentation indicates the last record read will be the value in the merged record.

(from the website)

If you do not use the RENAME= option and a variable exists in more than one data set, then the value of that variable in the last data set read is the value that goes into the result record.

 

My question: is there ever a time when the field in the first record will ever be the value in the result record?

IE:

Record1: field expressn (size char(4)), description code

Record2: field expressn (size char(1)), express (value x or space)

 

Data merge_record;

Merge (record1, record2)

On fields a, b

 

Would merge_record.expressn ever be size char(4) – descriptive code?  As I understand the on-line documentation, the merge_record.expressn will always be size char(1) – express (value x or space).

 

Any help would be much appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

The size (and other attributes) will be set from the first variable encountered during the compilation phase, but the contents will be from the last record read during the execution phase.

View solution in original post

2 REPLIES 2
Kurt_Bremser
Super User

The size (and other attributes) will be set from the first variable encountered during the compilation phase, but the contents will be from the last record read during the execution phase.

Astounding
PROC Star

As @Kurt_Bremser explained, the definition of the new variable will be $4. 

 

Here are some cases that illustrate this concept of "last record read" when merging by a b:

 

  • When there is a 1 to 1 match, the value read comes from the second data set and thus will always be no more than 1 character long.
  • When there is a 1 to many match (many observations in Record 2 for the same a b combination), again the last value read comes from the second data set and thus will always be no more than 1 character long.
  • When there is a mismatch and the a b combination does not appear in the second data set, the last record read comes from the first data set and can be up to 4 characters long.
  • The tricky case:  when there is a many to 1 match (multiple observations for the a b combination from the first data set, but just one observation in the second data set), here's what results for that a b combination.  For the first observation, the value comes from the second data set (1 character at most).  For all the remaining observations in that a b combination, the last value comes from the first data set and will be up to 4 characters long.

 

It's trickier than it looks, and is based on the idea that SAS never re-reads an observation.  It just keeps whatever it read last.  You should probably create a few test data sets to try it out and see what you get.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 803 views
  • 0 likes
  • 3 in conversation