BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mrzlatan91
Obsidian | Level 7

Hey guys, I have a further problem with SAS and I would really appreciate if someone could help me with this.

 

I have the following dataset: 

 

data have;
input date $ product $ supermarket $ price $;
datalines;
01JAN01 1 x 1.00
01JAN01 1 y 1.20
01JAN01 2 x 2.00
01FEB01 1 x 3.00
01FEB01 1 b 3.50
01MAR01 1 y 1.00
01MAR01 2 z 1.50
01MAR01 3 z 2.00
;
run;

 

The dataset contains observations of the prices, some supermarkets have in a specific month for a specific product.

For example, the first row says, that in January, the price for product 1 in supermarket x was 1 Dollar.

Now I want to add an additional column, for example price_of_x which contains the price of supermarket x for the specific product AND the specific month, or missing if no price is available in the data set. This could look like this:

 

data want;
input date $ product $ supermarket $ price $ price_of_x $;
datalines;
01JAN01 1 x 1.00 1.00
01JAN01 1 y 1.20 1.00
01JAN01 2 x 2.00 2.00
01FEB01 1 x 3.00 3.00
01FEB01 1 b 3.50 3.00
01MAR01 1 y 1.00 .
01MAR01 2 z 1.50 .
01MAR01 3 z 2.00 .
;
run;

 

My attempt to do this was to create a subset of all observations of supermarket x and then to inner join it with the original dataset. Unfortunately, I have a really big dataset and need to do this join very often. Thus, I want to know if there is a more intelligent (in terms of performance) way to do this?

If it helps, in the next step I add 2 further columns agreement and disagreement. agreement gets the value 1 if the price is the same, otherwise its a disagreement. Maybe this could also be integrated in the solution 🙂 

 

 

 

Thanks in advance 🙂

mrzlatan

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
input date $ product $ supermarket $ price $;
datalines;
01JAN01 1 x 1.00
01JAN01 1 y 1.20
01JAN01 2 x 2.00
01FEB01 1 x 3.00
01FEB01 1 b 3.50
01MAR01 1 y 1.00
01MAR01 2 z 1.50
01MAR01 3 z 2.00
;
run;
proc sort data=have;
by date product;
run;
data want;
 merge have have( rename=(supermarket=_s price=price_x) where=(_s='x'));
 by date product;
 drop _s;
run;

View solution in original post

2 REPLIES 2
Ksharp
Super User
data have;
input date $ product $ supermarket $ price $;
datalines;
01JAN01 1 x 1.00
01JAN01 1 y 1.20
01JAN01 2 x 2.00
01FEB01 1 x 3.00
01FEB01 1 b 3.50
01MAR01 1 y 1.00
01MAR01 2 z 1.50
01MAR01 3 z 2.00
;
run;
proc sort data=have;
by date product;
run;
data want;
 merge have have( rename=(supermarket=_s price=price_x) where=(_s='x'));
 by date product;
 drop _s;
run;
mrzlatan91
Obsidian | Level 7

Thank you very much, this is exactly what I was looking for

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 981 views
  • 0 likes
  • 2 in conversation