hit a problem with PROC SQL sorting
after pulling some data together with proc sql, I ran a short data step to check for duplicate names and found the table was NOT sorted by the column in the "ORDER BY schemeName" clause.[pre]513 proc sql ;
514 create table sch_names_low as
515 select a.psrNumber, a.schemeName, b.statusdate, b.status
516 from mi.tbl_bau_stats_details a
517 join mi.tbl_bau_stats_valuations b
518 on a.psrNumber = b.psrNumber
519 where schemeName lt 'a'
520 order by schemeName
NOTE: Table WORK.SCH_NAMES_LOW created, with 258 rows and 4 columns.
522 quit ;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.25 seconds
cpu time 0.03 seconds
524 data schn_dup ;
525 set ;
526 by schemeName ;
527 if first.schemeName & last.schemeName then delete ;
ERROR: BY variables are not properly sorted on data set WORK.SCH_NAMES_LOW.
PSRNumber=10###### SchemeName=[c& S (Neatxxxxxxxxxxxxxxxxxxxxxx Scheme 155EP373 StatusDate=04JAN2005:00:00:00 Status=Wound Up FIRST.SchemeName=1 LAST.SchemeName=1 _ERROR_=1 _N_=3
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 4 observations read from the data set WORK.SCH_NAMES_LOW[/pre]
No amount of SORTSEQ= handling could resolve the problem.
Usage note 4999 http://support.sas.com/kb/4/999.html hints at the problem. These data come from sql server where [ appears to have a different sort value than in SAS![pre]571 data _null_ ;
572 set SCH_NAMES_LOW( obs=4) ;
573 put schemeName= $char5. schemeName $hex10. ;
SchemeName="J" F 224A222046
SchemeName=(C M) 2843204D29
SchemeName=[c& S 5B63262053
NOTE: There were 4 observations read[/pre]
Helpfully, PROC SQL did not mark the table created (with ORDER BY) as sorted, when the sort was performed on the SQL Server.
However, Proc SQL might inform and properly warn with some MSGLEVEL=i message or WARNING .
Is there some option I have missed which creates this warning?
I was extracting data from a DB2 system (zOS) to SAS on a distributed system (AIX).
The sort sequence did not match.
Since there were no predefined SORTSEQ that suited my needs, I've solved this trouble by creating a new SORT sequence (matching tho one at DB2) and setting the system option SORTSEQ to this table in the autoexec file.
sorry the message was so long, that you missed
> These data come from sql server where [
At least this scenario is a little better than my previous experience in this area.
Then, a cimport of a sas table downloaded to windows from zOS, continued to provide sortedby information - including sortseq=ebcdic. That was ignored in proc sql sas913 - so sql joined with no errors when the first row in the table downloaded from zOS was higher than all rows in the other table and both tables were declared to be in the corresponding order (apart from the sortseq=ebcdic on the table from zOS).
At least in this scenario, the implicit pass-thru of the query including the "ORDER BY" clause, by the sas/access to odbc engine connected to sql server, results in no sortedby information being placed in the header of the output data set.
So I've removed that order-by clause, and plan not to use it where ever I'm pulling data through pass-thru sql.
I still think proc sql could generate a warning when the effect of order-by is outside the interpretation of SAS.