Solved: Question about SAS merge function

mv_fes · Posted 12-01-2018 06:05 PM

On the merge function, looking the on-line documentation (http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/viewer.htm#a001318494.htm)

There is a description when two records are merged, and one field has the same name. The documentation indicates the last record read will be the value in the merged record.

(from the website)

If you do not use the RENAME= option and a variable exists in more than one data set, then the value of that variable in the last data set read is the value that goes into the result record.

My question: is there ever a time when the field in the first record will ever be the value in the result record?

IE:

Record1: field expressn (size char(4)), description code

Record2: field expressn (size char(1)), express (value x or space)

Data merge_record;

Merge (record1, record2)

On fields a, b

Would merge_record.expressn ever be size char(4) – descriptive code? As I understand the on-line documentation, the merge_record.expressn will always be size char(1) – express (value x or space).

Any help would be much appreciated.

Kurt_Bremser · Posted 12-01-2018 07:09 PM

The size (and other attributes) will be set from the first variable encountered during the compilation phase, but the contents will be from the last record read during the execution phase.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

Kurt_Bremser · Posted 12-01-2018 07:09 PM

The size (and other attributes) will be set from the first variable encountered during the compilation phase, but the contents will be from the last record read during the execution phase.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Astounding · Posted 12-01-2018 08:11 PM

As @Kurt_Bremser explained, the definition of the new variable will be $4.

Here are some cases that illustrate this concept of "last record read" when merging by a b:

When there is a 1 to 1 match, the value read comes from the second data set and thus will always be no more than 1 character long.
When there is a 1 to many match (many observations in Record 2 for the same a b combination), again the last value read comes from the second data set and thus will always be no more than 1 character long.
When there is a mismatch and the a b combination does not appear in the second data set, the last record read comes from the first data set and can be up to 4 characters long.
The tricky case: when there is a many to 1 match (multiple observations for the a b combination from the first data set, but just one observation in the second data set), here's what results for that a b combination. For the first observation, the value comes from the second data set (1 character at most). For all the remaining observations in that a b combination, the last value comes from the first data set and will be up to 4 characters long.

It's trickier than it looks, and is based on the idea that SAS never re-reads an observation. It just keeps whatever it read last. You should probably create a few test data sets to try it out and see what you get.

Question about SAS merge function

Re: Question about SAS merge function

Re: Question about SAS merge function

Re: Question about SAS merge function

Question about SAS merge function

Re: Question about SAS merge function

Re: Question about SAS merge function

Re: Question about SAS merge function

The 2025 SAS Hackathon has begun!