I have two large tables I am trying to merge. Table 1 is called services and table 2 is customers. I used the data step merge below and it appears that i am missing some records. There can be multiple customers tied to the same service using the 'merge by unitA' but it seems that the code is picking and merging only on the first instance and missing subsequent customer records which should also be matched too but is missing- this is technically working but I am missing customer records. Can this be correct?
DATA YFYS YFNS NFYS;
MERGE Service1(in=a) customer1(in=b); by unitA;
IF A AND B THEN OUTPUT YFYS;
IF A=1 AND B=0 THEN OUTPUT YFNS;
IF A=0 AND B=1 THEN OUTPUT NFYS; RUN;
I think thats it. its a one to many merge. What do you suggest as the best way to accomplish this?
If it really is a many-to-one merge (which way? many customers to a service -- or vice versa?), then the program you showed will accomplish what you apparently want. You describe the problem as "I am missing some records". That's not enough to make a diagnosis, let alone a prescription.
Please show a sample of the data that become missing after running the merge statement. That's what @ballardw was asking for.
Help us help you.
Below is a sample of the data. I am merging by 'Rx'. My output table (want) is missing the second row for jen.
That does not look like a one to one or one to many relationship. Multiple people could have the same RX. And at least some of the RX have multiple PH values (whatever that is). Use SQL to combine them, it will perform a many to many join.
Also learn how to share your data as simple data steps:
data customer; input id $ race $ Rx $; cards; jen w b12 mik h xyy rob b d20 ; data service; input Rx $ ph $; cards; b12 wit b12 wan o18 chi xyy gon d20 rie ; data want; input id $ race $ rx $ ph $; cards; jen w b12 wit jen w b12 wan mik h xyy gon rob b d20 rie ;
Here is code to combine CUSTOMER and SERVICE based on match of RX. I assumed you want all customers even if they do not have any matching service records so used a LEFT JOIN.
proc sql; create table try as select a.id,a.race,a.rx,b.ph from customer a left join service b on a.rx = b.rx order by 1,2,3 ; quit;
Obs id race Rx ph 1 jen w b12 wit 2 jen w b12 wan 3 mik h xyy gon 4 rob b d20 rie
Let's discuss the first problem you identify:
... stuff deleted ...
but I have also used Proc SQL to tie the tables together and still missing records.
"Missing records" is way too amibiguous to diagnose. But from what you have shown us, any missing records are not due to size of the data sets. Please provide an explicit example of a record you expected in the result, but was missing. And provide the initial records from which you expected it to be produced.
You should be able to extract those records into a sample, and then quickly run both your SQL code and DATA step merge code that resulted in missing records.
Show us those codes, and the log file that went with them. Then diagnosis (and maybe prescription) is possible.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.