BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
pcfriendly
Calcite | Level 5


Hi,

I'm trying to find the Function in SAS that will perform a sort on my Dataset of Job Roles & Start Dates.

I'm using Enterprise Guide 4.3 & I want to do this sorting in an Advanced Expression, that is via The Query Builder >> 'Computed Columns' >> 'New' >> 'Advanced Expression'

The Dataset has various Job Roles staff have had & the various Start Dates, there are multiple Job Roles & Start Dates for each Staff member.

I want to find the latest Start Date & Job Role for each Staff member.

In Teradata i would use the below;

------------------------------------------------------

Select

Job_Role,

Start_Date,

ROW_NUMBER OVER (PARTITION BY 'Job_Role' ORDER BY Start_Date DESC) AS Rank_No

From Data_Table

------------------------------------------------------

I could then filter the records for where the 'Rank_No' = 1 which would give the most current Job Role for that Staff member.

What can I use in SAS to do a similar task?

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
BrunoMueller
SAS Super FREQ

SAS Enterprise Guide also provides a task to do ranking under Tasks -> Data -> Rank

You will use start_date as "column to rank" and job_role as "rank by"

Unter options you check "Reverse Ranking from largest to smallest" so the youngest date gets the ranking 1.

You can then use a query to select only those with ranking value 1

View solution in original post

20 REPLIES 20
Reeza
Super User

Use a datastep and first/last processing is the easiest way via program.

First sort by staff member, job date descending and job role

data want;

set have;

by staff desc job_date job_role;

if first.staff then output;

run;

There are multiple ways to do this though, another is a sql join with a subselect of the maximum job_date/job_role per staff member. Another is to rank the dates using SAS Rank functions and then take the top 1. Both of the above can be done using EG point and click.

pcfriendly
Calcite | Level 5

Thanks Reeza,

So I would be entering that code into a new Program?

Where is the reference to the source table?

Thanks

Reeza
Super User

Actually the EG solution is easy, I'm just used to thinking in Programming Steps...

Use a Data>Sort Data step to sort your data first

By Member, Date, Job Role

Make sure to set date descending.

Then Sort the Output dataset again (Data>Sort Data)

But only by Member.

And only Options>Duplicate - select the option to keep only the first record.

pcfriendly
Calcite | Level 5

Hi Reeza,

I'm using EG 4.3 version, it doesn't have the option of 'Options>Duplicate' - select the option to keep only the first record.


Reeza
Super User

I'm using 4.2 and it does.

Tasks>Data>Sort Data

Left hand side > Options

Duplicate is in the second set of options. It could have been removed in 4.3 I suppose...

SAS EG Options.jpg

LarryWorley
Fluorite | Level 6

I see the same window as REEza in EG 4.3.0

pcfriendly
Calcite | Level 5

You're right Larry, I was in the wrong task icon, Rank instead of Sort... DUH !!:smileysilly: 

pcfriendly
Calcite | Level 5

Sorry Reeza, you're right, I was in a 'Rank' Icon instead of a Sort Icon.

My screen looks exactly like yours & does have the Duplicates option area.

Under 'Data', I've added the Start_Date filed in the 'Sort By' area descending & under 'Options' I have selected the 'Keep only the first record for each 'Sort By' group' option.

My results are showing more than one date for the Job_Role though?

Is it correct to just add one field in the Task Roles - Sort By area?

Also, what is 'Columns to be dropped for?

Thanks    

Reeza
Super User

I don't quite understand your specific requirements so my suggestion would be to try a bunch of different things until you get what you want....scientific, I know. But you probably have a better idea.

The logis is to sort first on your key fields and then to sort only on what you want to be the UNIQUE key and everything else that is duplicate will be winnowed out. 

Columns to be dropped probably allows you to not include all variables in your final dataset, for example if you only wanted unique members.

pcfriendly
Calcite | Level 5

OK, thanks Reeza,

I'm doing that now to see how I can just get the row that has the latest Start_Date.

Thanks for your help.

LarryWorley
Fluorite | Level 6

Reeza is correct in pointing your towards datastep, since SAS SQL does not support windowing sql functions.

I believe there is an alternative sql formulation which will generate the same result as the partition by.  You could try this:

Select

Staff

Job_Role,

max(Start_Date) as start_date

From Data_Table

group by staff

having start_date = max(start_date)

;

Or you could use select distinct and omit the having clause.

You might be able to implement that in the query builder.

BrunoMueller
SAS Super FREQ

SAS Enterprise Guide also provides a task to do ranking under Tasks -> Data -> Rank

You will use start_date as "column to rank" and job_role as "rank by"

Unter options you check "Reverse Ranking from largest to smallest" so the youngest date gets the ranking 1.

You can then use a query to select only those with ranking value 1

pcfriendly
Calcite | Level 5

Hi Bruno,

I wasn't aware of this before, thank you.

I'm getting some strange numbers in the Rank result.

I'm not getting a 1, 2 3 in the ranking which is what I was expecting.

I noticed I have the below options when I add the Rank node to the graph as below;

  • Sort By Variables under the Data area
  • Ranking Method under the Options area
  • If values tie, use;

What shoud I choose for these to get a 1,2,3 Ranking?

Thanks

BrunoMueller
SAS Super FREQ

pcfriendly

use the defaults for the data and assign the roles as stated earlier

the Ranking method should be "smallest to largest"

check the box "Reverse Ranking from smallest to largest"

leave the default for "if values tie"

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 20 replies
  • 46869 views
  • 3 likes
  • 8 in conversation