My following PROC SQL sort the common variable I joined the table with. Is there a code for lookup table where the common variable (Country_Code in this case) does not get sorted?
My expected output:
Emp_Name Country Code
David Japan
Joy France
Murad India
Paul Italy
Vick USA
Maya France
DATA Lookup_CountryCode;
INPUT Country_Name $ Country_Code Region $;
DATALINES;
Japan 107 Asia
USA 109 America
France 103 Europe
Canada 111 America
India 105 Asia
Nepal 115 Asia
Italy 102 Europe
;
RUN;
DATA Employee;
INPUT Emp_Name $ Country_Code;
DATALINES;
David 107
Joy 103
Murad 105
Paul 102
Vick 109
Maya 103
;
RUN;
PROC SQL;
CREATE TABLE want AS
SELECT a.Emp_Name, b.Country_Name
FROM Employee a LEFT JOIN Lookup_CountryCode b
on a.Country_Code=b.Country_Code;
QUIT;
You can't control row order with SQL unless you use an order by clause.
You must use a data step, or add a sorting variable.
I tried this and it worked:
PROC SQL;
CREATE TABLE want AS
SELECT a.Emp_Name,
( select Country_Name
from Lookup_CountryCode
where Country_Code = a.Country_Code ) as country_name
FROM Employee as a;
select * from want;
QUIT;
@PGStats Maybe add option
proc sql nothreads;
to ensure that multithreading is off, otherwise there is no way to be sure that row order is kept as is.
Yeah. I can imagine (but not test) that multithreading would disturb output order. Anyway, output order is never guaranteed unless there is an ORDER BY clause.
Multithreading is when SAS splits a task among many processors that run in parallel. I don't know exactly why your code reordered the data table and mine didn't. To my knowledge, this isn't documented anywhere and might change in a future SAS version. I agree 100% with @LinusH's comment below.
You could also create formats and then apply them to your dataset. You can apply your formats and not modify your dataset unless you need both, code and name.
By a data step it can be achieved.
Hold the Country_Name in a temporary array using Country_Code as its Index. Then read a record by record from Lookup_Country data set, load the array. Then lookup the array using Country_Code from Employee. If Country_Code is found in Lookup_CountryCode data set then the coreesponding cell will have Country_Name, otherwise a missing value. Thus the order of rows of Employee is undisturbed.
data need;
array k[102:115] $8 _temporary_;
length Emp_Name $8;
do until(eof);
set Lookup_CountryCode(drop = Region) end = eof;
k[Country_Code] = Country_Name;
end;
do until(last);
set Employee end = last;
if not missing (k[Country_Code]) then do;
Country_Name = k[Country_Code];
output;
end;
end;
drop Country_Code;
run;
This is how you do it with a format:
data cntlin;
set lookup_countrycode;
drop region;
fmtname = 'country_lookup';
type = 'N';
rename
country_code=start
country_name=label
;
run;
proc format library=work cntlin=cntlin;
run;
data want;
set employee;
country_name = put(country_code,country_lookup.);
run;
Original sort order is not a sustainable concept when transforming data. if your original sort order is important, you need a columns/variable that gives you the appropriate sort order (which you could use to sort the result after the processing, like in an ORDER BY clause in SQL).
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.