BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
loubams3
Calcite | Level 5

proc sql;
select Manager_Name, Employee_Name, Total_Sales format=COMMA10.2
from (select l.Employee_ID, sum(Total_Retail_Price) as Total_Sales, Employee_Name, Country
from orion.order_fact as l,
orion.employee_addresses as p
where l.Employee_ID=p.Employee_ID and year(Order_Date)=2011
Group by Country, l.Employee_ID, Employee_Name) as l,
(select l.Employee_ID, Manager_ID, Employee_Name as Manager_Name
from orion.employee_organization as l,
orion.employee_addresses as p
where l.Manager_ID=p.Employee_ID) as p
where l.Employee_ID=p.Employee_ID
Order by Country, Manager_Name, Total_Sales desc;
quit;

 

Employee_Name is shown as Zhou, Tom, but I need it to change it to Tom Zhou

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

If the data is well formatted you can easily use SCAN() as well.

 

catx(" ", scan(employee_name, 2, ","), scan(employee_name, 1, ",")) as new_name

View solution in original post

12 REPLIES 12
PeterClemmensen
Tourmaline | Level 20

Classic use of the PRXCHANGE Function

 

I stole this example directly from the PRXCHANGE Documentation (Example 2).

 

data ReversedNames;
   input name & $32.;
   datalines;
Jones, Fred
Kavich, Kate
Turley, Ron
Dulix, Yolanda
Zhou, Tom
;
proc sql;
   create table names as
   select prxchange('s/(\w+), (\w+)/$2 $1/', -1, name) as name
   from ReversedNames;
quit;
proc print data=names;
run;

 

Reeza
Super User

If the data is well formatted you can easily use SCAN() as well.

 

catx(" ", scan(employee_name, 2, ","), scan(employee_name, 1, ",")) as new_name
novinosrin
Tourmaline | Level 20

@Reeza  is brilliant to answer at lightning speed making me jealous while i was testing the same with 50 million records on my machine

 

here is the log difference. Kudos Reeza

 

361  proc sql;
362     create table names as
363     select prxchange('s/(\w+), (\w+)/$2 $1/', -1, name) as name
364     from ReversedNames;
NOTE: Table WORK.NAMES created, with 50000000 rows and 1 columns.

365  quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           1:39.26
      cpu time            33.20 seconds


366  data names;
367   set ReversedNames;
368   name=catx(', ',scan(name,-1),scan(name,1));
369   run;

NOTE: There were 50000000 observations read from the data set WORK.REVERSEDNAMES.
NOTE: The data set WORK.WANT has 50000000 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           10.46 seconds
      cpu time            8.90 seconds

@mkeintz  mentioned PRX being very slow in another thread and the above is the proof. But Mark if you have a minute, do you know why so? Is there an implementation problem? I am just curious

PaigeMiller
Diamond | Level 26

Is it PRX that is slow, or is it doing this operation in SQL is the cause of the slowness? What happens if you try the PRX in a data step?

--
Paige Miller
novinosrin
Tourmaline | Level 20

@PaigeMiller   Like I always said, you are full of brains, much better

 

370  data names;
371   set ReversedNames;
372   name=prxchange('s/(\w+), (\w+)/$2 $1/', -1, name);
373   run;

NOTE: There were 50000000 observations read from the data set WORK.REVERSEDNAMES.
NOTE: The data set WORK.NAMES has 50000000 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           32.99 seconds
      cpu time            22.29 seconds
PaigeMiller
Diamond | Level 26

So PRX is still approximately 3X slower than doing it with the SCAN function.

--
Paige Miller
Reeza
Super User

@novinosrin I just use SCAN() because I find it's easier to remember the parameters, has nothing to do with efficiency in terms of computer, but efficiency in terms of typing and my time. I should really just sit down one day and learn them 😄

novinosrin
Tourmaline | Level 20

 

" it's easier to remember the parameters," sounds like invariably efficient mind. Like @PaigeMiller  pointed out "So PRX is still approximately 3X slower than doing it with the SCAN function." plus SQL processor makes it all the more time consuming. Jeez! what if we were to deal with 5 years transaction data as population for customer spends analysis for campaign targeting models. I can only imagine the size of the data would let us have lunch session and coffee session combines and process might still be running 🙂

 

I learned something from OP, You, draycut and ofcourse Mr. Quantmiller


@Reeza wrote:

@novinosrin I just use SCAN() because I find it's easier to remember the parameters, has nothing to do with efficiency in terms of computer, but efficiency in terms of typing and my time. I should really just sit down one day and learn them 😄


 

PaigeMiller
Diamond | Level 26

I can only imagine the size of the data would let us have lunch session and coffee session ... 

 


 

The first company to ever allow its employees to have coffee breaks is located about one mile from here in Buffalo, NY. It is the Barcalo Manufacturing Company, now known as Barcalounger (because they also invented the first recliner, the Barcalounger).

--
Paige Miller
novinosrin
Tourmaline | Level 20

Great presence of mind!!!!!!!!

novinosrin
Tourmaline | Level 20

Thank you. Looks interesting. Will take a serious look. 

 

For those who work on large datasets on prod environments, dealing with humongous data 

1. SQL processor doesn't seem to compete well with a datastep

2. PRX if not avoidable coz some manipulations are done best with PRX, otherwise stick to SAS functions

 

Disclaimer: Nonetheless,only extensive  testing with rich samples can only confirm

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 7708 views
  • 7 likes
  • 5 in conversation