Hi again!
This seems like a VERY easy fix that I can't seem to get down. As the title states, I am trying to merge based on an ID variable (PID). When I go to merge the two datasets, it creates new rows even when the PID matches in both datasets. The new dataset puts all of one dataset's variables into a new row and makes the other dataset's variables missing.
My code looks like this:
proc sort data= sub.aptcorrelations;
by pid;
proc sort data= sub.master_dataset053019;
by pid;
data Merged;
set sub.aptcorrelations sub.master_dataset053019;
by pid;
run;
I checked the variables and both PID variables are numeric and 8 in length. They do have different labels which may matter, but I do not think that it does.
I tried changing the order of the datasets and using an inner join. What could I possibly be missing here?
I attached a picture for clarity. I am trying to get everything on one row and matched by PID.
Here is the merge log:
1 ;*';*";*/;quit;run;
2 OPTIONS PAGENO=MIN;
3 %LET _CLIENTTASKLABEL='Program';
4 %LET _CLIENTPROCESSFLOWNAME='Process Flow';
5 %LET _CLIENTPROJECTPATH='S:\Alcstudy\ComboDataEntry\Joe\Inflation\Subcategories\MasterDataset and RegressionLogProp1.egp'
5 ! ;
6 %LET _CLIENTPROJECTNAME='MasterDataset and RegressionLogProp1.egp';
7 %LET _SASPROGRAMFILE=;
8
9 ODS _ALL_ CLOSE;
10 OPTIONS DEV=ACTIVEX;
11 GOPTIONS XPIXELS=0 YPIXELS=0;
12 FILENAME EGSR TEMP;
13 ODS tagsets.sasreport13(ID=EGSR) FILE=EGSR
14 STYLE=HtmlBlue
15 STYLESHEET=(URL="file:///C:/Program%20Files/SASHome/SASEnterpriseGuide/7.1/Styles/HtmlBlue.css")
16 NOGTITLE
17 NOGFOOTNOTE
18 GPATH=&sasworklocation
19 ENCODING=UTF8
20 options(rolap="on")
21 ;
NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR
22
23 GOPTIONS ACCESSIBLE;
24 data Merged;
25 set sub.aptcorrelations sub.master_dataset053019;
26 by pid;
27 run;
NOTE: There were 191 observations read from the data set SUB.APTCORRELATIONS.
NOTE: There were 616 observations read from the data set SUB.MASTER_DATASET053019.
NOTE: The data set WORK.MERGED has 807 observations and 7692 variables.
NOTE: DATA statement used (Total process time):
real time 0.66 seconds
cpu time 0.14 seconds

Are you trying to interleave(sorted append) or merge?
If match merge is really your objective, the code should be using a merge statement
data Merged;
merge sub.aptcorrelations sub.master_dataset053019;
by pid;
run;
You have one-to-many relationships for some of the pid values.
Corrected: looks like you may have no matches at all, as the obs count of the output is the sum of the obscounts of the inputs.
Are you trying to interleave(sorted append) or merge?
If match merge is really your objective, the code should be using a merge statement
data Merged;
merge sub.aptcorrelations sub.master_dataset053019;
by pid;
run;
I am an absolute moron. Thank you. I looked in every single place except the most obvious one. Wow.
@novinosrin wrote:
Are you trying to interleave(sorted append) or merge?
If match merge is really your objective, the code should be using a merge statement
data Merged; merge sub.aptcorrelations sub.master_dataset053019; by pid; run;
Gosh. Well, it's been the end of a working day ....
lol Well it's 9:51 AM here and am fresh although got a sinus headache that's annoying but coffee helps . Have a great afternoon 🙂
Indeed. Brand the scarlet letter on me and let me face my public humiliation.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.