BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ramgouveia
Obsidian | Level 7

I want to merge 20 datasets with the same variable names - three character variables and one numeric.

When execute this code:

 

data COMB;
set COMB01 COMB02 COMB03 COMB04 COMB06 COMB07 COMB08 COMB09 COMB11 COMB12 COMB13 COMB14 COMB16 COMB17 COMB18 COMB19 COMB21 COMB22 COMB23 COMB24;
run;

I get duplicate registers from the characters variables with different and incorrect values in the numeric variable.
Is 20 datasets too much for this comand?

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

The right solution is to go back to the steps that created the 20 input datasets and make sure that you define the variables consistently.  For example instead of using PROC IMPORT to read a text file write your own data step so that you have control over how the variables are defined.

 

But for your simple application with ONLY FOUR VARIABLES you can just fix it in this step by adding a LENGTH statement before the SET statement to define all four of the variables.  That will let you define the ORDER, the TYPE and the storage LENGTH of each variable.  You migth want to also set the FORMAT attached to the variable to avoid the problem that the character values look truncated because the display width is shorter than the storage length.

 

Let's assume the other two variables are numeric then the code might look like this

data COMB;
  length var1 8 ACTIVITY $30 COUNTERPART_AREA $50 var2 8 ;
  set COMB01-COMB09 COMB11-COMB14 COMB16-COMB19 COMB21-COMB24;
  format var2 date9.  var1 activity counterpart ;
run;

 

PS What you are doing is not "merging" the datasets. You are just concatenating (or appending) the datasets.

View solution in original post

6 REPLIES 6
Tom
Super User Tom
Super User

The code you posted should work fine, if the datasets have the same variables defined the same way.

You can simplify the code by using dataset lists.

data COMB;
  set COMB01-COMB09 COMB11-COMB14 COMB16-COMB19 COMB21-COMB24;
run;

No idea what a "duplicate registers" means.  Can you share the SAS log for this step with all of the notes/warnings/errors that SAS generates?

 

 

 

ramgouveia
Obsidian | Level 7

@Tom 

 

Here it is the log

766        /*Junta todas as COMB*/
767        data COMB;
768        set COMB01 COMB02 COMB03 COMB04 COMB06 COMB07 COMB08 COMB09 COMB11 COMB12 COMB13 COMB14 COMB16 COMB17 COMB18 COMB19
768      ! COMB21 COMB22 COMB23 COMB24;
769        run;

WARNING: Multiple lengths were specified for the variable ACTIVITY by input data set(s). This can cause truncation of data.
WARNING: Multiple lengths were specified for the variable COUNTERPART_AREA by input data set(s). This can cause truncation of data.
NOTE: There were 165 observations read from the data set WORK.COMB01.
NOTE: There were 330 observations read from the data set WORK.COMB02.
NOTE: There were 330 observations read from the data set WORK.COMB03.
NOTE: There were 165 observations read from the data set WORK.COMB04.
NOTE: There were 330 observations read from the data set WORK.COMB06.
NOTE: There were 660 observations read from the data set WORK.COMB07.
NOTE: There were 638 observations read from the data set WORK.COMB08.
NOTE: There were 231 observations read from the data set WORK.COMB09.
NOTE: There were 22 observations read from the data set WORK.COMB11.
NOTE: There were 44 observations read from the data set WORK.COMB12.
NOTE: There were 44 observations read from the data set WORK.COMB13.
NOTE: There were 22 observations read from the data set WORK.COMB14.
NOTE: There were 11 observations read from the data set WORK.COMB16.
NOTE: There were 22 observations read from the data set WORK.COMB17.
NOTE: There were 22 observations read from the data set WORK.COMB18.
NOTE: There were 11 observations read from the data set WORK.COMB19.
NOTE: There were 3718 observations read from the data set WORK.COMB21.
NOTE: There were 5247 observations read from the data set WORK.COMB22.
NOTE: There were 770 observations read from the data set WORK.COMB23.
NOTE: There were 440 observations read from the data set WORK.COMB24.
NOTE: The data set WORK.COMB has 13222 observations and 4 variables.
NOTE: DATA statement used (Total process time):
32                                                         The SAS System                             16:34 Thursday, March 30, 2023

      real time           0.01 seconds
      cpu time            0.01 seconds
      

Thank you for your help.

ramgouveia
Obsidian | Level 7

Thank you @Reeza 

 

Your link also solved the problem.

Tom
Super User Tom
Super User

The right solution is to go back to the steps that created the 20 input datasets and make sure that you define the variables consistently.  For example instead of using PROC IMPORT to read a text file write your own data step so that you have control over how the variables are defined.

 

But for your simple application with ONLY FOUR VARIABLES you can just fix it in this step by adding a LENGTH statement before the SET statement to define all four of the variables.  That will let you define the ORDER, the TYPE and the storage LENGTH of each variable.  You migth want to also set the FORMAT attached to the variable to avoid the problem that the character values look truncated because the display width is shorter than the storage length.

 

Let's assume the other two variables are numeric then the code might look like this

data COMB;
  length var1 8 ACTIVITY $30 COUNTERPART_AREA $50 var2 8 ;
  set COMB01-COMB09 COMB11-COMB14 COMB16-COMB19 COMB21-COMB24;
  format var2 date9.  var1 activity counterpart ;
run;

 

PS What you are doing is not "merging" the datasets. You are just concatenating (or appending) the datasets.

ballardw
Super User

 

How exactly do you know that the numeric variable is "incorrect". Be very explicit and include the FORMAT assigned to the resulting variable.

 

When asking questions about any result it is best practice to copy the LOG, include the code submitted along with ALL notes, messages, warnings or errors. On the forum open a text box using the </> icon above the message and paste all of that text. That way we don't have to ask a bunch of questions that the log answers. The text box is important to preserve the formatting of any diagnostic messages that SAS often includes as the main message windows on this forum reformat text when pasted.

 

Numeric values may be more problematic depending on the original values and how they were read though typically the main issue is going to be the format assigned. If you are seeing things like 1E10 or 1E-12, that is exponential notation and occurs when the value is large or very small but the default format assigned to the variable won't display all of the digits.

An example of how the format affects the appearance:

data _null_;
   x= 10000000000;
   y= 0.0000000132;
   file print;
   put 'Best5 format ' x= best5. y= best5.;
   put 'F6 format'     x= f6.    y= f6. ;
   put 'f8.3 format'   x= f8.3   y= f8. ;
   put 'f28.13 format'   x= f28.13   y= f28.13;
   put 'Best16. format' x= best16. y= best16.;
   put 'Best32. format' x= best32. y= best32.;
run;

The results will show in the Results window.

If you have a value with more than 14 digits make sure that it should actually be a number. If the value is actually something like an identification code or account number there may be issues with things like missing leading zero and rounding of large values due to the precision of numeric storage that SAS uses.

 

 

 

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 870 views
  • 3 likes
  • 4 in conversation