Hi,
I am trying to concatenate multiple files from a single folder. I am getting some warnings and would like to know how to resolve the problem. Thanks
The main thing is to make sure the variables are defined consistently across the datasets.
How did you create the original datasets?
How did they end up with the PATIENT_FIRST_NAME and NOTES variables defined with different lengths in the different datasets?
You can use PROC CONTENTS to see the variable attributes. The key things to check are the TYPE and the LENGTH. You definitely cannot combine datasets when the same variable is defined with different types in the two datasets. If the LENGTH of a character variable differs between datasets, that is what is causing the message you are seeing. You might also need to check the FORMATS (if any) attached to the variables: if you fix the length of the character variable, the values will be stored correctly, but if you accidentally end up with a format that displays only some of the characters in the variable, it will still cause you trouble down the line.
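As a rough sketch of that check, the following lists the attributes of one monthly dataset and then compares type, length, and format of the two variables across every dataset in the library. The libref `a`, its path, and the dataset name are taken from the code posted later in this thread; adjust them to your own setup.

```sas
libname a "/_B/P4/PC/SASDC/idkey_2020_2021";

/* Full attribute listing (type, length, format) for one monthly dataset */
proc contents data=a.idkey_for_202011;
run;

/* Side-by-side comparison of the two variables across all datasets in libref A. */
/* DICTIONARY.COLUMNS stores LIBNAME and MEMNAME in upper case.                 */
proc sql;
  select memname, name, type, length, format
    from dictionary.columns
    where libname = 'A'
      and upcase(name) in ('PATIENT_FIRST_NAME', 'NOTES')
    order by name, memname;
quit;
```

Any variable that shows more than one TYPE or LENGTH in the second listing is a candidate for the warning.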
Hi @CathyVI
The proper solution is to follow @Tom's advice and get your input data sets into a common form before you append them.
I suppose the data come from a monthly CSV file or something similar and are imported with PROC IMPORT, so the variables get lengths that depend on the current month's actual data. Write your own data steps instead, so you can control what is happening.
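A minimal sketch of such a hand-written import step might look like this. The file path and the two-column layout are hypothetical, since the real CSV structure is not shown in this thread; the point is only that the LENGTH statement fixes the attributes regardless of what a given month's data contains.

```sas
/* Hypothetical path and layout: a comma-separated file with a header row */
/* and two columns, PATIENT_FIRST_NAME and NOTES.                         */
data idkey_for_202011;
  infile "/path/to/idkey_202011.csv" dsd firstobs=2 truncover lrecl=32767;
  /* Fixed lengths, identical in every monthly import step */
  length PATIENT_FIRST_NAME $60 NOTES $512;
  input PATIENT_FIRST_NAME NOTES;
run;
```

Using the same LENGTH statement in every monthly step means the datasets can later be concatenated without length conflicts.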
As your warnings say: The variables in your output get the length and attributes of the first dataset in the set statement, and longer names or notes in later datasets are truncated, so you get incomplete data.
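The truncation can be reproduced with two small made-up datasets (the names `short`, `long`, and `combined` are purely illustrative):

```sas
/* First dataset: NOTES is defined with length 5 */
data short;
  length NOTES $5;
  NOTES = "abcde";
run;

/* Second dataset: NOTES is defined with length 10 */
data long;
  length NOTES $10;
  NOTES = "abcdefghij";
run;

/* NOTES takes its length (5) from SHORT, the first dataset on SET, */
/* so the value coming from LONG is cut to "abcde".                 */
data combined;
  set short long;
run;

proc print data=combined;
run;
```

Running this should put a warning about multiple lengths for NOTES in the log, and the second row of `combined` holds only the first five characters.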
A quick and dirty solution - definitely not recommended for anything that isn't a one-shot job, and certainly not for something that ends up in production - is this:
libname a "/_B/P4/PC/SASDC/idkey_2020_2021";

data idkey_all;
  /* Declare length and format before SET so they override the attributes of the first dataset */
  length PATIENT_FIRST_NAME $60 NOTES $512;
  format PATIENT_FIRST_NAME $60. NOTES $512.;
  set
    a.idkey_for_202011 a.idkey_for_202012 a.idkey_for_202101
    a.idkey_for_202102 a.idkey_for_202103 a.idkey_for_202104;
run;
The lengths should be set to (at least) the maximum lengths found across all monthly data sets. You have the correct lengths when the code runs without warnings.
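Rather than finding the lengths by trial and error, one option is to look up the maximum defined length per character variable in DICTIONARY.COLUMNS. This is only a sketch: it assumes all monthly datasets live in libref `a` and that their names start with `IDKEY_FOR_`, as in the code above.

```sas
libname a "/_B/P4/PC/SASDC/idkey_2020_2021";

/* Largest defined length of each character variable across the monthly datasets */
proc sql;
  select upcase(name) as varname, max(length) as max_length
    from dictionary.columns
    where libname = 'A'
      and substr(memname, 1, 10) = 'IDKEY_FOR_'
      and type = 'char'
    group by calculated varname;
quit;
```

The reported `max_length` values are the minimum you would need in the LENGTH statement.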