Hello to all, hope you are doing well.
I have a database TEST that has 5 columns. Here is an example:
ID_Customer | ID_Project | ID_Application | Nbre_Application | Creation_Application |
200200 | 1000 | 267 | 1 | 20JANV2020 |
200200 | 1000 | 256 | 1 | 10JANV2020 |
200200 | 1000 | 456 | 1 | 15JANV2020 |
400400 | 2000 | 120 | 2 | 10FEB2020 |
400400 | 2000 | 101 | 1 | 05JANV2020 |
For each client, I would like to classify his applications, from the oldest to the most recent with the number 1 for the oldest, 2 for the most recent...
Like this :
ID_Customer | ID_Project | ID_Application | Nbre_Application | Creation_Application |
200200 | 1000 | 256 | 1 | 1 |
200200 | 1000 | 456 | 1 | 2 |
200200 | 1000 | 267 | 1 | 3 |
400400 | 2000 | 101 | 1 | 1 |
400400 | 2000 | 120 | 2 | 2 |
I know there's a proc rank procedure but here how can I use it ? And is there a way with proc sql ?
Thank you !
That would not be a "rank" , that would be a sequence number and not ideal for Proc SQL.
The question might be what happens if a single ID_customer value has multiple Id_project values and how that gets treated. The example below resets the sequence for each project. If that is not desired use First.id_customer
Proc sort data=test; by Id_customer id_project creation_application; run; data want; set test; by id_customer id_project; retain sequence; if first.id_project then sequence=1; else sequence+1; run;
The BY statement creates boolean (true/false or 1/0) values for each variable that on the statement indicating whether the specific record is the first or last of a value group. These are accessed using the First. (see the dot) and Last. notation preceding the variable name.
Retain creates a variable whose value is kept across iteration boundaries of a data step so can be used to do sequences. But you need to reset it to a starting value when needed.
Bad juju results from changing the values of variables like Creation_application from a date to numeric, or was that just an accident in your "want"?
proc rank data=have out=want;
by id_customer;
var creation_application;
ranks ranked_creation_application;
run;
proc sort data=want;
by id_customer ranked_creation_application;
run;
Or alternatively
proc sort data=have;
by id_customer creation_application;
run;
data want;
set have;
by id_customer;
if first.id_customer then ranked_creation_application=0;
ranked_creation_application+1;
run;
Really no need for SQL here.
That would not be a "rank" , that would be a sequence number and not ideal for Proc SQL.
The question might be what happens if a single ID_customer value has multiple Id_project values and how that gets treated. The example below resets the sequence for each project. If that is not desired use First.id_customer
Proc sort data=test; by Id_customer id_project creation_application; run; data want; set test; by id_customer id_project; retain sequence; if first.id_project then sequence=1; else sequence+1; run;
The BY statement creates boolean (true/false or 1/0) values for each variable that on the statement indicating whether the specific record is the first or last of a value group. These are accessed using the First. (see the dot) and Last. notation preceding the variable name.
Retain creates a variable whose value is kept across iteration boundaries of a data step so can be used to do sequences. But you need to reset it to a starting value when needed.
Bad juju results from changing the values of variables like Creation_application from a date to numeric, or was that just an accident in your "want"?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.