Hi everyone,
I have a data set consisting of two columns: column 1 is the item ID, and column 2 is the purchase price for that item ID in different transactions. I would like to add a third column that counts, within each item ID, the number of purchase prices that are less than the purchase price of the current row. For example, item ID 1 has 3 transaction prices that are less than 23.1 (20.4, 13, 14), so column 3 for row 1 should be 3.
This looks like a very common question, but I have searched the forum and found nothing related to it. Is this doable in practice, and how could I achieve it with SAS EG? If this has already been raised and resolved, please send me the link.
ID Price Column(3)
1 23.1 3
1 20.4 2
1 13 0
1 49.5 5
1 23.4 4
1 14 1
1 56.7 6
2 234 3
2 323 4
2 232.4 2
2 145.6 1
2 100 0
3 33.4 2
3 54 3
3 32.4 1
3 23.1 0
3 54 3
Thanks everyone for reading.
JK
I don't use EG, but you can easily write SAS code to do this.
data have ;
input id price expected @@;
cards;
1 23.1 3 1 20.4 2 1 13 0 1 49.5 5 1 23.4 4 1 14 1 1 56.7 6
2 234 3 2 323 4 2 232.4 2 2 145.6 1 2 100 0
3 33.4 2 3 54 3 3 32.4 1 3 23.1 0 3 54 3
;
You could use PROC SQL and have it count for you.
proc sql noprint ;
create table want1 as
select id
, price
, expected
, (select count(*) from have b where b.id=a.id and b.price < a.price)
as wanted
from have a
;
quit;
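A quick way to check the subquery result against the hand-computed values is to count the rows where the two disagree. This is only a sketch, assuming want1 was created by the step above and still carries the expected column from the sample data; n_mismatch is an illustrative name.

```sas
/* Count rows where the computed count differs from the hand-entered one. */
proc sql;
  select count(*) as n_mismatch   /* n_mismatch is a hypothetical label */
  from want1
  where wanted ne expected;
quit;
```

If the counts are right, the query should report no mismatches for the sample data.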
Or you could use PROC RANK (there might even be an EG wizard to do this for you), but the values will be one larger than you wanted since it starts counting with one instead of zero.
proc rank data=have out=want2 ties=low ;
by id ;
ranks wanted ;
var price;
run;
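Because TIES=LOW gives tied prices the lowest rank of their group, subtracting one from the rank should give the number of strictly lower prices within the ID. A minimal sketch of that adjustment, assuming want2 comes from the PROC RANK step above and that have is already sorted by id (which the BY statement requires); want2_zero is a hypothetical output name.

```sas
/* Shift the one-based ranks from PROC RANK down to zero-based counts. */
data want2_zero;          /* hypothetical output data set name */
  set want2;
  wanted = wanted - 1;    /* rank 1 (lowest price in the ID) becomes 0 */
run;
```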
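BY-group processing will do the trick here. Tied prices within an ID need to share one count, so the count is only updated on the first row of each price.
PROC SORT DATA=HAVE;
BY ID PRICE;
RUN;
DATA WANT;
SET HAVE;
BY ID PRICE;
RETAIN COLUMN3;
IF FIRST.ID THEN N = 0; /* rows already read for this ID */
IF FIRST.PRICE THEN COLUMN3 = N; /* tied prices share the same count */
N + 1; /* sum statement, so N is retained automatically */
DROP N;
RUN;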
Thanks both, that will do the trick. The only issue is that my data is quite large (2 million records), so the code takes quite a long time to run.
See if PROC RANK does what you want; it should be better optimized.