DATA Step, Macro, Functions and more

Lookup using PROC SQL

Reply
Regular Contributor
Posts: 218

Lookup using PROC SQL

My following PROC SQL sort the common variable I joined the table with. Is there a code for lookup table where the common variable (Country_Code in this case) does not get sorted? 

 

My expected output:

Emp_Name Country Code
David        Japan
Joy            France
Murad        India
Paul           Italy
Vick            USA
Maya         France

 

DATA Lookup_CountryCode;
INPUT Country_Name $ Country_Code Region $;
DATALINES;
Japan 107 Asia
USA 109 America
France 103 Europe
Canada 111 America
India 105 Asia
Nepal 115 Asia
Italy 102 Europe
;
RUN;


DATA Employee;
INPUT Emp_Name $ Country_Code;
DATALINES;
David 107
Joy 103
Murad 105
Paul 102
Vick 109
Maya 103
;
RUN;


PROC SQL;
CREATE TABLE want AS
SELECT a.Emp_Name, b.Country_Name 
FROM Employee a LEFT JOIN Lookup_CountryCode b
on a.Country_Code=b.Country_Code;
QUIT;
PROC Star
Posts: 1,760

Re: Lookup using PROC SQL

You can't control row order with SQL unless you use an order by clause.

You must use a data step, or add a sorting variable.

Respected Advisor
Posts: 4,927

Re: Lookup using PROC SQL

I tried this and it worked:

 

PROC SQL;
CREATE TABLE want AS
SELECT a.Emp_Name, 
    (   select Country_Name 
        from Lookup_CountryCode 
        where Country_Code = a.Country_Code  ) as country_name
FROM Employee as a;
select * from want;
QUIT;
PG
PROC Star
Posts: 1,760

Re: Lookup using PROC SQL

@PGStats Maybe add option

 

proc sql nothreads;

 

to ensure that multithreading is off, otherwise there is no way to be sure that row order is kept as is.

 

Respected Advisor
Posts: 4,927

Re: Lookup using PROC SQL

Yeah. I can imagine (but not test) that multithreading would disturb output order. Anyway, output order is never guaranteed unless there is an ORDER BY clause.

PG
Regular Contributor
Posts: 218

Re: Lookup using PROC SQL

Hi PG,
Thanks for your help.
What is multithreading?
In which case your code is going to fail?
Thanks,
Respected Advisor
Posts: 4,927

Re: Lookup using PROC SQL

Multithreading is when SAS splits a task among many processors that run in parallel. I don't know exactly why your code reordered the data table and mine didn't. To my knowledge, this isn't documented anywhere and might change in a future SAS version. I agree 100% with @LinusH's comment below.

PG
Super User
Posts: 19,847

Re: Lookup using PROC SQL

You could also create formats and then apply them to your dataset. You can apply your formats and not modify your dataset unless you need both, code and name. 

Super Contributor
Posts: 298

Re: Lookup using PROC SQL

By a data step it can be achieved.

 

Hold the Country_Name in a temporary array using Country_Code as its Index. Then read a record by record from Lookup_Country data set, load the array. Then lookup the array using Country_Code from Employee. If Country_Code is found in Lookup_CountryCode data set  then the coreesponding cell will have Country_Name, otherwise a missing value. Thus the order of rows of Employee is undisturbed.

 

data need;
   array k[102:115] $8 _temporary_;
   length Emp_Name $8;
   do until(eof);
      set Lookup_CountryCode(drop = Region) end = eof;
      k[Country_Code] = Country_Name;
   end;

   do until(last);
      set Employee end = last;
      if not missing (k[Country_Code]) then do;
         Country_Name = k[Country_Code];
         output;
      end;
   end;
drop Country_Code;
run;

 

Super User
Posts: 7,832

Re: Lookup using PROC SQL

This is how you do it with a format:

data cntlin;
set lookup_countrycode;
drop region;
fmtname = 'country_lookup';
type = 'N';
rename
  country_code=start
  country_name=label
;
run;

proc format library=work cntlin=cntlin;
run;

data want;
set employee;
country_name = put(country_code,country_lookup.);
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 5,434

Re: Lookup using PROC SQL

Original sort order is not a sustainable concept when transforming data. if your original sort order is important, you need a columns/variable that gives you the appropriate sort order (which you could use to sort the result after the processing, like in an ORDER BY clause in SQL).

Data never sleeps
Ask a Question
Discussion stats
  • 10 replies
  • 643 views
  • 3 likes
  • 7 in conversation