I am trying to merge HPI score based on the zip code. However, it only shows up once instead of applying the score to each row. This code worked for other part of my coding but I am not sure why it's not working for this:
[prefinal_missing]
orgID hplzip ...
111112 90017
111113 90017
111114 90017
[HPI_hpl]
hplzip HPIscore
90017 0.026
Code:
proc sort data=prefinal_missing; by hplzip; run;
proc sort data=HPI_hpl; by hplzip; run;
data test;
merge prefinal_missing(in=in1) HPI_hpl;
by hplzip;
if in1;
attrib _all_ label='';
run;
What comes out:
orgID hplzip HPIscore ...
111112 90017 0.026
111113 90017 .
111114 90017 .
What I want:
orgID hplzip HPIscore ...
111112 90017 0.026
111113 90017 0.026
111114 90017 0.026
That is usually caused by making the mistake of having the variable already existing in MANY dataset.
data test;
merge prefinal_missing(in=in1 drop=hpiscore) HPI_hpl;
by hplzip;
if in1;
attrib _all_ label='';
run;
Your code works for me:
Are there errors in the log? Are the values of hplzip numeric or character? If character, are there unnecessary spaces before or after the characters? Or perhaps there are non-printable characters in some of the values of HPLZIP.
From now on, please (EVERY SINGLE TIME) present data as working SAS data step code, such as this:
data HPI_hpl;
input hplzip HPIscore;
cards;
90017 0.026
;
You can use this macro to create this working DATA step code for you: https://blogs.sas.com/content/sastraining/2016/03/11/jedi-sas-tricks-data-to-data-step-macro/
That is usually caused by making the mistake of having the variable already existing in MANY dataset.
data test;
merge prefinal_missing(in=in1 drop=hpiscore) HPI_hpl;
by hplzip;
if in1;
attrib _all_ label='';
run;
Use Tom's solution and note that by using
options msglevel=I;
the MERGE step writes an INFO message about this potential issue to the log:
INFO: The variable HPIscore on data set WORK.PREFINAL_MISSING will be overwritten by data set WORK.HPI_HPL.
Which should remind you to drop the conflicting variable from dataset PREFINAL_MISSING, as Tom has suggested, and hence avoid that log message.
Actually, the INFO message describes that in the first observation of a BY group the HPIscore from PREFINAL_MISSING is overwritten with the corresponding value from HPI_HPL (as desired). But the problem is that in the second observation of the BY group the retained value of HPIscore from HPI_HPL is overwritten with the new value read from PREFINAL_MISSING and it is never restored, while more values from PREFINAL_MISSING are read in subsequent observations of the BY group -- apparently missing values in your case.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.