Hello to all, hope you are doing well.
I have a database TEST that has 5 columns. Here is an example:
ID_Customer | ID_Project | ID_Application | Nbre_Application | Creation_Application |
200200 | 1000 | 267 | 1 | 20JANV2020 |
200200 | 1000 | 256 | 1 | 10JANV2020 |
200200 | 1000 | 456 | 1 | 15JANV2020 |
400400 | 2000 | 120 | 2 | 10FEB2020 |
400400 | 2000 | 101 | 1 | 05JANV2020 |
For each client, I would like to classify his applications, from the oldest to the most recent with the number 1 for the oldest, 2 for the most recent...
Like this :
ID_Customer | ID_Project | ID_Application | Nbre_Application | Creation_Application |
200200 | 1000 | 256 | 1 | 1 |
200200 | 1000 | 456 | 1 | 2 |
200200 | 1000 | 267 | 1 | 3 |
400400 | 2000 | 101 | 1 | 1 |
400400 | 2000 | 120 | 2 | 2 |
I know there's a proc rank procedure but here how can I use it ? And is there a way with proc sql ?
Thank you !
That would not be a "rank" , that would be a sequence number and not ideal for Proc SQL.
The question might be what happens if a single ID_customer value has multiple Id_project values and how that gets treated. The example below resets the sequence for each project. If that is not desired use First.id_customer
Proc sort data=test; by Id_customer id_project creation_application; run; data want; set test; by id_customer id_project; retain sequence; if first.id_project then sequence=1; else sequence+1; run;
The BY statement creates boolean (true/false or 1/0) values for each variable that on the statement indicating whether the specific record is the first or last of a value group. These are accessed using the First. (see the dot) and Last. notation preceding the variable name.
Retain creates a variable whose value is kept across iteration boundaries of a data step so can be used to do sequences. But you need to reset it to a starting value when needed.
Bad juju results from changing the values of variables like Creation_application from a date to numeric, or was that just an accident in your "want"?
proc rank data=have out=want;
by id_customer;
var creation_application;
ranks ranked_creation_application;
run;
proc sort data=want;
by id_customer ranked_creation_application;
run;
Or alternatively
proc sort data=have;
by id_customer creation_application;
run;
data want;
set have;
by id_customer;
if first.id_customer then ranked_creation_application=0;
ranked_creation_application+1;
run;
Really no need for SQL here.
That would not be a "rank" , that would be a sequence number and not ideal for Proc SQL.
The question might be what happens if a single ID_customer value has multiple Id_project values and how that gets treated. The example below resets the sequence for each project. If that is not desired use First.id_customer
Proc sort data=test; by Id_customer id_project creation_application; run; data want; set test; by id_customer id_project; retain sequence; if first.id_project then sequence=1; else sequence+1; run;
The BY statement creates boolean (true/false or 1/0) values for each variable that on the statement indicating whether the specific record is the first or last of a value group. These are accessed using the First. (see the dot) and Last. notation preceding the variable name.
Retain creates a variable whose value is kept across iteration boundaries of a data step so can be used to do sequences. But you need to reset it to a starting value when needed.
Bad juju results from changing the values of variables like Creation_application from a date to numeric, or was that just an accident in your "want"?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.