Hi All,
I have two data steps, one in Proc SQL the other in bases SAS. Both of the codes are sourcing the exact same variables from the same source and are being sorted by the same variable.
proc sql;
create table vals_base as
select
ACCTAPPLNUMBER as application_id,
VALUATIONDATE,
DIM_CO_VALUATION_SSK,
VALUATIONTYPE,
LASTMODIFIEDDATE_DATE,
NUMBEROFBATHROOMS,
NUMBEROFBEDROOMS,
PROPERTYBUILT,
VALUEINCURRENTCONDITION,
PROPERTYTYPE,
PROPERTYSUBTYPE
from TEST_APP.DPR_DIM_APPLICATION_VALUATIONS
order by ACCTAPPLNUMBER;
quit;
proc sort data=test_app.DPR_DIM_APPLICATION_VALUATIONS (keep = DIM_CO_VALUATION_SSK
VALUATIONDATE VALUATIONTYPE VALUEINCURRENTCONDITION LASTMODIFIEDDATE_DATE
NUMBEROFBATHROOMS NUMBEROFBEDROOMS PROPERTYBUILT ACCTAPPLNUMBER
PROPERTYTYPE PROPERTYSUBTYPE ) out=vals_base (rename = (ACCTAPPLNUMBER = application_id));
by ACCTAPPLNUMBER;
run;
Why would the outputs differ in sort beyond the sort key. For example the valuationtype variable is outputted different in both outputs although it is not listed as a variable to be sorted.
Base SAS output
application_id | VALUATIONDATE | VALUATIONTYPE |
1538 | 28FEB2014:00:00:00 | Quest - Homebuyers |
1538 | 17FEB2014:00:00:00 | Quest - Homebuyers |
1538 | 28FEB2014:00:00:00 | Quest - Full Valuation |
Proc SQL output
application_id | VALUATIONDATE | VALUATIONTYPE |
1538 | 28FEB2014:00:00:00 | Quest - Full Valuation |
1538 | 17FEB2014:00:00:00 | Quest - Homebuyers |
1538 | 28FEB2014:00:00:00 | Quest - Homebuyers |
The source table does contain multiple application Id's.
Thanks
Adnan
The secondary sort order is not guranteed in sql. Well, the sort order is never guranteed in sql regardless unless you force or specify a vars in order by clause like 1,2,3 to get desired
However proc sort/datastep are sequential processing. Therefore spits out the order that exists in the dataset that is by read.
SQL is very tricky in general to imagine how the sql processor would work for a particular logic
A simpler example to illustrate:
proc sql;
create table w as
select *
from sashelp.class
order by sex;
quit;
SAS Output
Judy | F | 14 | 64.3 | 90 |
Jane | F | 12 | 59.8 | 84.5 |
Joyce | F | 11 | 51.3 | 50.5 |
Barbara | F | 13 | 65.3 | 98 |
Carol | F | 14 | 62.8 | 102.5 |
Mary | F | 15 | 66.5 | 112 |
Louise | F | 12 | 56.3 | 77 |
Alice | F | 13 | 56.5 | 84 |
Janet | F | 15 | 62.5 | 112.5 |
proc sort data=sashelp.class out=w1;
by sex;
run;
SAS Output
Alice | F | 13 | 56.5 | 84.0 |
Barbara | F | 13 | 65.3 | 98.0 |
Carol | F | 14 | 62.8 | 102.5 |
Jane | F | 12 | 59.8 | 84.5 |
Janet | F | 15 | 62.5 | 112.5 |
Joyce | F | 11 | 51.3 | 50.5 |
Judy | F | 14 | 64.3 | 90.0 |
The secondary sort order is not guranteed in sql. Well, the sort order is never guranteed in sql regardless unless you force or specify a vars in order by clause like 1,2,3 to get desired
However proc sort/datastep are sequential processing. Therefore spits out the order that exists in the dataset that is by read.
SQL is very tricky in general to imagine how the sql processor would work for a particular logic
A simpler example to illustrate:
proc sql;
create table w as
select *
from sashelp.class
order by sex;
quit;
SAS Output
Judy | F | 14 | 64.3 | 90 |
Jane | F | 12 | 59.8 | 84.5 |
Joyce | F | 11 | 51.3 | 50.5 |
Barbara | F | 13 | 65.3 | 98 |
Carol | F | 14 | 62.8 | 102.5 |
Mary | F | 15 | 66.5 | 112 |
Louise | F | 12 | 56.3 | 77 |
Alice | F | 13 | 56.5 | 84 |
Janet | F | 15 | 62.5 | 112.5 |
proc sort data=sashelp.class out=w1;
by sex;
run;
SAS Output
Alice | F | 13 | 56.5 | 84.0 |
Barbara | F | 13 | 65.3 | 98.0 |
Carol | F | 14 | 62.8 | 102.5 |
Jane | F | 12 | 59.8 | 84.5 |
Janet | F | 15 | 62.5 | 112.5 |
Joyce | F | 11 | 51.3 | 50.5 |
Judy | F | 14 | 64.3 | 90.0 |
Don't expect any order you did not force. See Maxim 31.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.