Hi Friends,
I am working on migration of process from SAS 9.3 on Unix Aix to SAS 9.4 on linux.
for one of my process, I find different default ordering of observations in output datasets (created from SQL left join from source tables) in SAS 9.3 M2 (On unix AIX)and SAS 9.4 M4 (on Linux redhat) .
This different ordering of observations are causing different results in downstream as we are removing duplicates based of few variables which don't uniquely represent each row in final table. I tried to use _method_tree option to find the default order of SQL join but it didn't provide much information for the differences.
proc sql _method _tree;
...
quit;
Is it possible to find the default ordering of observations on Unix Aix? I will replicate the same ordering explicitly then.
Any help will be appreciated.
Thanks ,
Kulbhushan
We struck this issue as well when migrating to SAS 9.4. Our solution was to add explicit ORDER BY statements to our SQL queries. This ensured we got the same data outputs in 9.4 versus 9.3.
If row order is crucial to your process, add order by clauses to your SQL queries.
yeah in that case we have to change current production too to have match for passing migration. I am trying to find if somehow I can find default order in current production so that I can put that in order clause of newly migrated code to SAS 9.4 server. Thanks for the feedback.
If your data on Unix Aix is SAS data then Proc Contents will report on the SORTEDBY information, or the Dictionary.Columns table has a field SORTEDBY that indicates the order in the key sequence a variable may appear so you could recover that information to add Order By to syntax.
sorted by in proc content provide me the sorted keys on which we explicitly sorted. but the problem is we still have duplicates in those keys.
for examples: we sorted in id1 id2 in production. so you can see ID1 ID2 is not unique and the other variables are sorted by default in SQL join.
But the issue is in SAS 9.3 PROC SQL join is sorting differently than SAS 9.4 on other variables(not ID1 ID2). in later step we are picking last record for ID1 & ID2. but last record is different in SAS 9.3 & SAS 9.4 SQL join.
Sources of Join are exact match(even order)
ID1 ID2 start_dt var2 var3 var4
1 1 1feb2019 22.3 33.2 10
1 1 1feb2019 22.3 33.2 10
1 1 3feb2019 22.3 33.2 10
1 1 2feb2019 24.3 34.2 10
I am sorry If I didn't able to explain it to you better,
Thanks.
Sounds like you were lucky to get consistent results in the old system.
If you are removing duplicates based on only a partial key and you want to consistently get the same record kept then you probably need to order by ALL of the variables.
proc sort data=have out=sorted ;
by key1 key2 _all_;
run;
data nodups;
set sorted;
by key1 key2;
if first.key2;
run;
yeah that's the last solution, we don't want to change current production. if I can fix it in new server i.e. SAS 9.4 using default order in current system on duplicates variables. otherwise the last option to add order clause in both current production and newly migrated system.
Thanks for the reply.
We struck this issue as well when migrating to SAS 9.4. Our solution was to add explicit ORDER BY statements to our SQL queries. This ensured we got the same data outputs in 9.4 versus 9.3.
yeah that will be the last solution. we have to justify to the business new order and then fix current production and keep same order new system i.e. SAS 9.4.
Thank you so much!
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.