Hey guys, I have a further problem with SAS and I would really appreciate if someone could help me with this.
I have the following dataset:
data have;
input date $ product $ supermarket $ price $;
datalines;
01JAN01 1 x 1.00
01JAN01 1 y 1.20
01JAN01 2 x 2.00
01FEB01 1 x 3.00
01FEB01 1 b 3.50
01MAR01 1 y 1.00
01MAR01 2 z 1.50
01MAR01 3 z 2.00
;
run;
The dataset contains observations of the prices, some supermarkets have in a specific month for a specific product.
For example, the first row says, that in January, the price for product 1 in supermarket x was 1 Dollar.
Now I want to add an additional column, for example price_of_x which contains the price of supermarket x for the specific product AND the specific month, or missing if no price is available in the data set. This could look like this:
data want;
input date $ product $ supermarket $ price $ price_of_x $;
datalines;
01JAN01 1 x 1.00 1.00
01JAN01 1 y 1.20 1.00
01JAN01 2 x 2.00 2.00
01FEB01 1 x 3.00 3.00
01FEB01 1 b 3.50 3.00
01MAR01 1 y 1.00 .
01MAR01 2 z 1.50 .
01MAR01 3 z 2.00 .
;
run;
My attempt to do this was to create a subset of all observations of supermarket x and then to inner join it with the original dataset. Unfortunately, I have a really big dataset and need to do this join very often. Thus, I want to know if there is a more intelligent (in terms of performance) way to do this?
If it helps, in the next step I add 2 further columns agreement and disagreement. agreement gets the value 1 if the price is the same, otherwise its a disagreement. Maybe this could also be integrated in the solution 🙂
Thanks in advance 🙂
mrzlatan
data have;
input date $ product $ supermarket $ price $;
datalines;
01JAN01 1 x 1.00
01JAN01 1 y 1.20
01JAN01 2 x 2.00
01FEB01 1 x 3.00
01FEB01 1 b 3.50
01MAR01 1 y 1.00
01MAR01 2 z 1.50
01MAR01 3 z 2.00
;
run;
proc sort data=have;
by date product;
run;
data want;
merge have have( rename=(supermarket=_s price=price_x) where=(_s='x'));
by date product;
drop _s;
run;
data have;
input date $ product $ supermarket $ price $;
datalines;
01JAN01 1 x 1.00
01JAN01 1 y 1.20
01JAN01 2 x 2.00
01FEB01 1 x 3.00
01FEB01 1 b 3.50
01MAR01 1 y 1.00
01MAR01 2 z 1.50
01MAR01 3 z 2.00
;
run;
proc sort data=have;
by date product;
run;
data want;
merge have have( rename=(supermarket=_s price=price_x) where=(_s='x'));
by date product;
drop _s;
run;
Thank you very much, this is exactly what I was looking for
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.