Hi everyone,
I have a data set consisting of two columns: column 1 is the item ID, and column 2 is the purchase price for that item ID in different transactions. I would like to add a third column that counts, within each item ID, the number of purchase prices that are less than the purchase price in the current row. For example, item ID 1 has 3 transaction prices that are less than 23.1 (20.4, 13, 14), hence column 3 for row 1 should be 3.
This looks like a very common question, but I have searched the forum and found nothing related to this topic. Is this practically doable, and how could I achieve it with SAS EG? Please send me the link if this has already been raised and resolved.
ID Price Column(3)
1 23.1 3
1 20.4 2
1 13 0
1 49.5 5
1 23.4 4
1 14 1
1 56.7 6
2 234 3
2 323 4
2 232.4 2
2 145.6 1
2 100 0
3 33.4 2
3 54 3
3 32.4 1
3 23.1 0
3 54 3
Thanks everyone for reading.
JK
I don't use EG, but you can easily write SAS code to do this.
data have ;
input id price expected @@;
cards;
1 23.1 3 1 20.4 2 1 13 0 1 49.5 5 1 23.4 4 1 14 1 1 56.7 6
2 234 3 2 323 4 2 232.4 2 2 145.6 1 2 100 0
3 33.4 2 3 54 3 3 32.4 1 3 23.1 0 3 54 3
;
You could use PROC SQL and have it count for you.
proc sql noprint ;
create table want1 as
select id
, price
, expected
, (select count(*) from have b where b.id=a.id and b.price < a.price)
as wanted
from have a
;
quit;
Or you could use PROC RANK (there might even be an EG wizard to do this for you), but the values will be one larger than you want, since ranks start at one instead of zero.
proc rank data=have out=want2 ties=low ;
by id ;
ranks wanted ;
var price;
run;
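To get the zero-based count from the PROC RANK output, one option is a follow-up DATA step that subtracts one from the rank. This is only a sketch: it assumes want2 was created by the step above with ties=low and that have is already sorted by id (the BY statement requires it), and the output name want2_zero is made up for illustration.

```sas
/* Sketch: shift the ranks so the lowest price in each ID gets 0. */
/* Assumes want2 comes from the PROC RANK step with ties=low.     */
data want2_zero;            /* hypothetical output data set name */
  set want2;
  wanted = wanted - 1;      /* rank 1 becomes count 0 */
run;

/* Optional check against the sample data: any rows listed are mismatches. */
proc print data=want2_zero;
  where wanted ne expected;
run;
```

With ties=low, tied prices share the lowest rank in the tie, so after the subtraction each row should hold the number of strictly lower prices within its ID.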
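BY-group processing will do the trick here. Sort by ID and price, then count the rows within each ID; tied prices share the count of strictly lower prices.
PROC SORT DATA=HAVE;
BY ID PRICE;
RUN;
DATA WANT;
SET HAVE;
BY ID PRICE;
RETAIN COLUMN3;
IF FIRST.ID THEN N = 0; /* N = rows already read for this ID */
IF FIRST.PRICE THEN COLUMN3 = N; /* tied prices keep the same count */
N + 1;
DROP N;
RUN;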
Thanks both, that will do the trick. The only issue is that my data set is quite large (2 million records), so the code takes quite a long time to run.
See if PROC RANK does what you want; it should be better optimized.
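For a large table, a rough way to compare the approaches is to turn on detailed timing in the log and run each one. The sketch below assumes the real data is not already ordered by id, so it sorts first; have_sorted is a made-up name, and actual run times will depend on your system.

```sas
/* Sketch only: have_sorted is a hypothetical data set name. */
options fullstimer;   /* write detailed time and memory statistics to the log */

/* The BY statement in PROC RANK needs the input ordered by id. */
proc sort data=have out=have_sorted;
  by id;
run;

proc rank data=have_sorted out=want2 ties=low;
  by id;
  ranks wanted;
  var price;
run;
```

Comparing the log notes for this step with those for the PROC SQL version should show which one is cheaper on your data.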