BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Guerraje
Quartz | Level 8

Hello, 

I am trying to merge 4 datasets on the variable county. When trying to do this I reviece the error:

 
WARNING: Multiple lengths were specified for the BY variable county by input data sets. This might cause unexpected results.
ERROR: BY variables are not properly sorted on data set WORK.NAMES.
 
I am not sure what this means, or how to fix the code from the dataset work.names so that I can complete my merge. The code for the work.names dataset is: 
proc sql noprint;
select catx ('=',nliteral(_name_),nliteral(substr(_name_,7)))
into : rename separated by ' '
from names
where _name_ like 'county%'
;
quit;
%put &=sqlobs;
%put &=rename; 
run; 

Data fixedcounty;
SET coimpt.insurance;
rename =rename;
run; 

 


PROC TRANSPOSE DATA = fixedcounty OUT= names;
var county:;
run;

data names;
set names;
county=tranwrd(_name_,"county", "");
 my_new_county = tranwrd(county,"_", " ");
run;


proc print data= names;
run;
 proc sort data= names; BY  descending county;
run; 
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Make sure that all datasets have the same length for the BY variable(s). How you do that depends on how the variables are created (read from external file, existing dataset, transpose, ...). Set to the longest of all defined lengths to avoid truncation.

View solution in original post

11 REPLIES 11
Reeza
Super User
You don't include the code that would generate that error. Your code above looks like it has some errors as well, so make sure that everything before your merge has no errors in the log.
Guerraje
Quartz | Level 8

There are no errors in the entire log but this one. 

Tom
Super User Tom
Super User

@Guerraje wrote:

There are no errors in the entire log but this one. 


The code you posted only has a BY statement in the PROC SORT step.  That step would never generate an error message like the one you mentioned.

 

Post the lines of text from the actual SAS log.  Use the Insert Code button (looks like < / > ) to get a pop-up window where you can paste/edit the lines of text copied from the SAS log.  This will preserve the formatting of the text.  That is important because SAS clearly marks in the SAS log the line of code where it detected an error.

Guerraje
Quartz | Level 8

Tom, this is the log from where that error is created. 

 

thank you, 



1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 68 69 ****************CoAnlCombined_Analysis **********; 70 Data CoAnlCombined_Analysis; 71 Merge HisLatpercent 72 names 73 languagepercent 74 work.obese 75 ; 76 By County ; 77 run; WARNING: Multiple lengths were specified for the BY variable county by input data sets. This might cause unexpected results. ERROR: BY variables are not properly sorted on data set WORK.NAMES. county=YUMA languageHisLat=. denominator=. HisLatfinal=. _NAME_=countyYUMA _LABEL_=countyYUMA COL1=78.16 my_new_county=YUMA language_avg=. languagefinal=. estimate= state= risk_pct_obese=. FIRST.county=1 LAST.county=1 _ERROR_=1 _N_=1 NOTE: The SAS System stopped processing this step because of errors. NOTE: There were 1 observations read from the data set WORK.HISLATPERCENT. NOTE: There were 2 observations read from the data set WORK.NAMES. NOTE: There were 1 observations read from the data set WORK.LANGUAGEPERCENT. NOTE: There were 1 observations read from the data set WORK.OBESE. WARNING: The data set WORK.COANLCOMBINED_ANALYSIS may be incomplete. When this step was stopped there were 0 observations and 13 variables. WARNING: Data set WORK.COANLCOMBINED_ANALYSIS was not replaced because this step was stopped. NOTE: DATA statement used (Total process time): real time 0.00 seconds user cpu time 0.01 seconds system cpu time 0.00 seconds memory 2354.59k OS Memory 28596.00k Timestamp 12/07/2021 01:31:05 AM Step Count 206 Switch Count 0 Page Faults 0 Page Reclaims 357 Page Swaps 0 Voluntary Context Switches 0 Involuntary Context Switches 1 Block Input Operations 0 Block Output Operations 8 78 79 80 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 90
Tom
Super User Tom
Super User

What type of investigations did you do?

 

 

 WARNING: Multiple lengths were specified for the BY variable county by input data sets. This might cause unexpected results.

 

 

Did you check the datasets to see if COUNTY is defined with the same length in all four of those datasets?  You can use PROC CONTENTS to get a report on a dataset.  How did you create those four datasets?  In each one how did you create the variable COUNTY?

 

 

 ERROR: BY variables are not properly sorted on data set WORK.NAMES.
 county=YUMA languageHisLat=. denominator=. HisLatfinal=. _NAME_=countyYUMA _LABEL_=countyYUMA COL1=78.16 my_new_county=YUMA
 language_avg=. languagefinal=. estimate=  state=  risk_pct_obese=. FIRST.county=1 LAST.county=1 _ERROR_=1 _N_=1

 

 

Did you try sorting datasets by COUNTY before running this data step? It does seem very strange that the first value of COUNTY is "YUMA".

 

How does this data step relate to the other code you posted?

What is the overall goal of your program?

What are your inputs?

What is the output you want?

Guerraje
Quartz | Level 8

Tom, 

 

This is the last step of a project where I am trying to merge 4 data sets. The whole idea is to manipulate every data set so at the end I can merge all four on county. I am guessing they are not the same lengths. I went through and did a proc sort on each data set to see if that was the problem and it did not resolve the problem. I sorted by descending order. 

you mentioned checking to see if the lengths are defined the same way, how would I check and adjust for this? That could certainly be the problem because that is not something I know how to check/adjust for.

 

Thank you

Reeza
Super User

Your BY statement in your SORT should be very the same as in your merge.

Different lengths will generate a warning but if it's just trailing spaces it'll be fine. I'm assuming you've viewed the files and verified no truncation is happening, ie that New York isn't truncated to New York C (8 chars) or something.

You have not posted your full code and log with the issues, showing both the sort, code and log with errors/notes.
Kurt_Bremser
Super User

Make sure that all datasets have the same length for the BY variable(s). How you do that depends on how the variables are created (read from external file, existing dataset, transpose, ...). Set to the longest of all defined lengths to avoid truncation.

Guerraje
Quartz | Level 8
The length was indeed what I needed to fix. Thank you so much!
ballardw
Super User

 

I suspect you did not actually mean this step:


Data fixedcounty;
SET coimpt.insurance;
rename =rename;
run; 

Did you look at the Fixedcounty data set? I am going to guess that the only difference from coimpt.insurance is the addition of a variable with all missing values that is named Rename.

 

The rename statement does not start with Rename =. It is Rename followed by the rename pairs. Which you seem to have attempted to create in a macrovariable named Rename and then not used.

I think you intended:

Data fixedcounty;
   SET coimpt.insurance;
   rename &rename.;
run; 

Which would likely mean that your data set names is completely incorrect prior to the transpose and would not expect the transpose to work.

 


@Guerraje wrote:

Hello, 

I am trying to merge 4 datasets on the variable county. When trying to do this I reviece the error:

 
WARNING: Multiple lengths were specified for the BY variable county by input data sets. This might cause unexpected results.
ERROR: BY variables are not properly sorted on data set WORK.NAMES.
 
I am not sure what this means, or how to fix the code from the dataset work.names so that I can complete my merge. The code for the work.names dataset is: 
proc sql noprint;
select catx ('=',nliteral(_name_),nliteral(substr(_name_,7)))
into : rename separated by ' '
from names
where _name_ like 'county%'
;
quit;
%put &=sqlobs;
%put &=rename; 
run; 

Data fixedcounty;
SET coimpt.insurance;
rename =rename;
run; 

 


PROC TRANSPOSE DATA = fixedcounty OUT= names;
var county:;
run;

data names;
set names;
county=tranwrd(_name_,"county", "");
 my_new_county = tranwrd(county,"_", " ");
run;


proc print data= names;
run;
 proc sort data= names; BY  descending county;
run; 

 

Tom
Super User Tom
Super User

What are you trying to do?  The code you posted does not appear to have anything to do with the question you asked.

 

I am trying to merge 4 datasets on the variable county. 

Show example input dataset(s) and what you want as output.

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 11 replies
  • 4736 views
  • 2 likes
  • 5 in conversation