BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
togglefroggle
Fluorite | Level 6

Hello,

I'm trying to find a price value associated with upgrading arcade machines.

My datasets contain store inventory of arcade machines and their associated price. I'm trying to find each years increase in cost associated with UPGRADING the machines.

 

sample dataset, I have a dataset YEARLYSTATEMENT_yyyy from 2015-2020, there are around about 400-600 rows each year.

YEARLYSTATEMENT_2015
STORE_NUM MACHINE_INV PRICE
005 P006 301
001 P001 212
004 P016 351
012 P091 462
007 P092 504
005 P093 520
005 P090 480
012 P014 322
003 P015 412
etc.. ... ...

Each store keeps an inventory of each machine that they currently have and it reflects that here under machine_inv

 

Upgrading the machine is not simply buying a new one but buying a new model of the same kind of machine, I also have a small list of upgrades by the manufacturer that looks like this:

P014->P015->P016
P090->P091->P092->P093
P055->P056->P057
P080->P081->P082

Every year an arcade will buy new machines and upgrade old ones, i'm trying to find the cost associated with only upgrading to new machines year to year. For example in 2015 if 100,000 was our inventory worth on machines for all stores, and in 2016 130,000 was the new value, I want to see how much of that increase of 30,000 was solely because we upgraded the machines and not purchased a separate model.

 

Another factor that complicated this is that an upgrade can be from P014 -> P015 or it can also be P014 -> P016, any increase in the upgrade "chain" will count as an upgrade. When an arcade upgrades a machine, they will no longer carry any of the previous version.

 

An idea i had to tackle this was to single out each machine that has an upgrade path and then add the cost up and subtract by the number the previous year received. But by doing this I cant account for if the store had the previous model or not or if they just purchased one of these machines outright.

 

Is there a way in sas that I can easily code to look if a previous year carried a machine in previous part of the upgrade chain and calculate the cost associated with upgrading that year?

 

Sorry if this is kind of a complex problem, I'm just not sure how to tackle this at all.

I would really appreciate any help, insights, or ideas.

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Please run this code and see if the resulting Price_dif is what you expect.

proc sort data=arcade;
   by store_num  MACHINE_INV year;
run;

proc format;
value $macseries
'P014','P015','P016'        ='P014'
'P090','P091','P092','P093' ='P090'
'P055','P056','P057'        ='P055'
'P080','P081','P082'        ='P080'
;
run;


data temp;
   set arcade;
   macseries = put(machine_inv,$macseries.);
run; 

data want;
   set temp;
   by store_num macseries;
   dp= dif(price);
   if first.macseries then price_dif=0;
   else price_dif=dp;
   drop dp;
run;

There is an intermediate step that creates a variable indicating which series of upgrades, what I called “'basically the same thing' identifier”. That variable could be added with if/then/else statements but since these upgrade paths may not be ended it might be easier to simply add a value to the format and not have to change any other code.

 

If you haven't seen it before the DIF function returns the difference of the current value of variable from that on the previous record (actually you can use DIFx, x=1 to more than you should use generally to get differences from 1 to x values). Using BY statement on the group variable created allows us to test if this the first time that upgrade path or specific machine is encountered. We wouldn't want a difference from a different machine  so is set to 0 (possibly missing will make more sense later, depends on use) for the price_dif variable and that actual difference from the last year otherwise.

 

View solution in original post

6 REPLIES 6
ballardw
Super User

Show what your SAS data sets look like as data step code.

If you do not have a YEAR for each record of store, machine and price then that will be part of the solution.

Then you will want the data sets combined.

 

Likely the "upgrade" information would be used to create some sort of 'basically the same thing' identifier for cross year purposes UNLESS those upgrade paths cross each other somewhere.

 

"Easily" would depend on your experience.

togglefroggle
Fluorite | Level 6

Thanks for the reply!

So this is a more accurate example of my data in a data step.

I've combined the datasets and added the year variable.

DATA arcade; 
INPUT STORE_NUM $ MACHINE_INV $ PRICE YEAR;
CARDS;
005       P014        322         2015
001       P001        212         2015
004       P016        351         2015
012       P045        462         2015
007       P055        634         2015
005       P145        504         2015
005       P090        480         2015
012       P014        322         2015
003       P512        521         2015
005       P015        343         2016
001       P001        212         2016
004       P016        351         2016
005       P091        492         2016
007       P055        634         2016
005       P077        520         2016
005       P122        480         2016
012       P014        322         2016
003       P015        343         2016
005       P016        351         2017
001       P001        212         2017
004       P016        351         2017
012       P091        492         2017
007       P057        688         2017
005       P092        520         2017
005       P047        480         2017
012       P016        351         2017
003       P016        351         2017
RUN; 

In this data I have some examples of what I'm looking at. In store 005 upgrading from P014->P015->P016 each year, and store 007 went from p055 in 2015 and 2016, to p057 in 2017. I'm just trying to find the price variable difference by upgrading these machines year over year. One assumption is that upgrading a machine will always increase the price variable, so there should always be an increase when looking at upgrades every year.

 

I’m not sure what you mean by “'basically the same thing' identifier”, and my SAS experience is not too high so I'm really not sure how to continue from here.

ballardw
Super User

Please run this code and see if the resulting Price_dif is what you expect.

proc sort data=arcade;
   by store_num  MACHINE_INV year;
run;

proc format;
value $macseries
'P014','P015','P016'        ='P014'
'P090','P091','P092','P093' ='P090'
'P055','P056','P057'        ='P055'
'P080','P081','P082'        ='P080'
;
run;


data temp;
   set arcade;
   macseries = put(machine_inv,$macseries.);
run; 

data want;
   set temp;
   by store_num macseries;
   dp= dif(price);
   if first.macseries then price_dif=0;
   else price_dif=dp;
   drop dp;
run;

There is an intermediate step that creates a variable indicating which series of upgrades, what I called “'basically the same thing' identifier”. That variable could be added with if/then/else statements but since these upgrade paths may not be ended it might be easier to simply add a value to the format and not have to change any other code.

 

If you haven't seen it before the DIF function returns the difference of the current value of variable from that on the previous record (actually you can use DIFx, x=1 to more than you should use generally to get differences from 1 to x values). Using BY statement on the group variable created allows us to test if this the first time that upgrade path or specific machine is encountered. We wouldn't want a difference from a different machine  so is set to 0 (possibly missing will make more sense later, depends on use) for the price_dif variable and that actual difference from the last year otherwise.

 

togglefroggle
Fluorite | Level 6

Thank you so much for the help!

 

The solution you provided somewhat works, I encountered a few problems trying to implement it.

There is no distinction between stores using this code and it will just takes the previous rows price and uses that for the difference, for example the output looks like this:

 

 

STORE_NUM MACHINE_INV PRICE YEAR macseries price_dif
05 P012 322 2015 P014 0
05 P145 504 2015 P145 182
12 P095 390 2016 P095 -114

Here store 12 is calculating the difference between two stores instead of machines within the same store. It's easy to notice since the output is sorted by store and macseries, and each transition to the next store or macseries in the data there is a difference calculation going.

 

It does this throughout the output whenever the macseries value changes, pretty much just taking the difference from the previous row and ignoring things like store number and machine type.

 

Edit: looks like this issue was caused by some bad additions to the code, I've resolved this now and it works perfectly, sorry for the confusion.

ballardw
Super User

Show the actual code you used. Starting with the Proc Sort step. The example data and the code provided do treat stores as part of the group.

 

togglefroggle
Fluorite | Level 6

The code you provided performs as it should, my error was poorly implementing some restrictions in the code. I do have a few questions if you don't mind answering.

 

1. Some machine upgrades go in descending order ( ie 200->199->198), for this I created a descending order on a different set during the proc sort phase, but when I try to run the code it tells me that the BY variables are not properly sorted on the dataset. I'm not sure what the problem is here, where the only change is adding descending before the variable machine_inv in proc sort.

 

2. The calculation for machine upgrades only calculates it for one machine, for example:

STORE_NUM MACHINE_INV PRICE YEAR macseries price_dif
005                  P014                 322       2015  P014         0
005                  P014                 322       2015  P014         0
005                  P015                 343       2016  P014         21
005                  P015                 343       2016  P014         0
005                  P015                 343       2016  P014         0

For my situation I want to populate all the rows of that year where a difference was calculated, so the last 3 rows in this example should all show 21. I'm thinking of just using an if then in a datastep to do this, but I'm not sure if there is a simpler solution.

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 624 views
  • 0 likes
  • 2 in conversation