Hi! I am joing to join/merge/ append...not sure which!?
I have five tables, and I want to merge all table- all rows and all variables. There are some variables that overlap, but some variables that are unique to that table. For example, PatientID is in all tables, but varialbe NDC is only in one table that I am merging (but I need NDC populated in the final table with any missing values assigned to missing). I need all rows and all variables in the final table...with values assigned missing if variable was not in the merged table. Make sense?
I've tried append... but some many values assigned to missing when they should have been populated.
Proc datasets force;
Append base=asthma.FULLDX_FY11to14_IP_OP data=tedi.fulldx_FY11to14_MIL_crgF;
Append base=asthma.FULLDX_FY11to14_IP_OP data= tedni.ALLDX_OPER_FY11to14_MIL_CRGF;
Append base= asthma.FULLDX_FY11to14_IP_OP data= caper.ALLDX_OPER_FY11to14_MIL_CRGF;
Append base=asthma.FULLDX_FY11to14_IP_OP data=sidr.fulldx_FY11to14_MIL_crgF;
run;
I've tried UNION, UNION ALL...both gave tons of errors and did not process.
I tried MERGE, but did not include a BY statment. SHould I have?
TIA for your help!!
Hello @jenim514,
You haven't mentioned the SET statement yet, although the result of a data step of the form shown below would be similar to what you have described as your intended result.
data want;
set have1
have2
have3
have4
have5;
run;
Of course, the above data step could be enhanced. For example, if HAVE1 - HAVE5 were sorted by PatientID, the statement by PatientID; could be inserted in order to keep observations from the same patient together. A variable SOURCE could be introduced indicating which of the five input datasets contributed the respective observation. ...
That said, the main question is: What will be the purpose of the new dataset?
This question should govern the choice of the most appropriate concatenation/interleaving/merging technique.
For example: If both HAVE 2 and HAVE4 contribute one observation from PatientID=1234, would it make sense to have these combined into one observation of WANT?
Append in a number of forms such as Proc Append as well as Proc datasets, Union and data step SET stack the data "vertically" as in each row of each contributing set is added sequentially.
It appears that you likely want a MERGE (data step) or JOIN (Proc Sql).
Since you have multiple tables the orders of steps can be very important. It would help if you could show a few rows of each data set and exactly what the output data set should look like as well as any known rules for which "overlapping" varibles end up in the final data.
It sounds like PatientId is likely to be a key in tying things together but we need to know if it appears multiple times in any of the contributing data sets. If so, then we need to know what rule there may be for matching the different rows in one set to those in another.
Hello @jenim514,
You haven't mentioned the SET statement yet, although the result of a data step of the form shown below would be similar to what you have described as your intended result.
data want;
set have1
have2
have3
have4
have5;
run;
Of course, the above data step could be enhanced. For example, if HAVE1 - HAVE5 were sorted by PatientID, the statement by PatientID; could be inserted in order to keep observations from the same patient together. A variable SOURCE could be introduced indicating which of the five input datasets contributed the respective observation. ...
That said, the main question is: What will be the purpose of the new dataset?
This question should govern the choice of the most appropriate concatenation/interleaving/merging technique.
For example: If both HAVE 2 and HAVE4 contribute one observation from PatientID=1234, would it make sense to have these combined into one observation of WANT?
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.