I have two tables, t1 is filled with all purchase records and is laid out like so:
CustomerID|ProductID|PurchaseAge|TimeSincePurchase|
A00001 |0001 |23 |7 |
A00001 |0002 |29 |1 |
A00002 |0001 |65 |13 |
t2 is in the layout of:
ProductID|PurchaseAge|FactorAtTime0|FactorAtTime1|...|FactorAtTime120|
0001 |18 | ... | ... |...| ... |
0001 |19 | ... | ... |...| ... |
... |... | ... | ... |...| ... |
0002 |18 | ... | ... |...| ... |
0002 |19 | ... | ... |...| ... |
... |... | ... | ... |...| ... |
I'm attempting to write a query that would left join on the ProductID and PurchaseAge characteristics and then pull in the factor from the "TimeSincePurchase + 3"-th column of t2 .
Is there a method to call columns by their ordinal position within the table as opposed to the name of the column?
If so, can a variable be used in place of a numerical constant?
EDIT: As per the below comments, it appears you would have to use a data step rather than sql in order to dynamically pick which column to pull from. The solution I ended up going with was transforming the second table like below then joining on ProductID, PurchaseAge, and TimeSincePurchase.
ProductID|PurchaseAge|TimeSincePurchase|Factor|
0001 |18 | 0 | .. |
0001 |18 | 1 | .. |
0001 |18 | 2 | .. |
0001 |19 | ... | .. |
... |... | ... | .. |
0002 |18 | ... | .. |
0002 |19 | ... | .. |
... |... | ... | .. |
So when TimeSincePurchase=7 you want to pull the factor from t2 from the variable FactorAtTime10 in the record for productID=0001, PurchaseAge=23, correct?
Can you show us an example of what your desired output looks like?
When TimeSincePurchase=7 I want to pull the factor from the 10th column in t2, FactorAtTime7.
For Example:
t1 record
CustomerID|ProductID|PurchaseAge|TimeSincePurchase|
A00001 |0001 |23 |7 |
A00001 |0002 |29 |1 |
A00002 |0001 |65 |13 |
and t2 record
ProductID|PurchaseAge|FactorAtTime0|FactorAtTime1|...|FactorAtTime7|
0001 |21 | 1.3 | 0.2 |...| 0.9 |
0001 |22 | 0.6 | 5.1 |...| 4.7 |
0001 |23 | 1.7 | 2.3 |...| 6.8 |
would create
CustomerID|ProductID|PurchaseAge|TimeSincePurchase|CurrentFactor
A00001 |0001 |23 |7 |6.8
A00001 |0002 |29 |1 |
A00002 |0001 |65 |13 |
An array is another option but note that you start at 0, so make sure it's +3 that you actually want which would return FactorAtTime 11 since SAS indexes from 1, not 0.
array factorT(*) factorAtTime0 - factorAtTime120;
Value = factort(timesincePurchase-1);
If you simply want to pull the factor from the variable FactorAtTime7 when TimeSincePurchase=7 in t1, then pulling this value by its relative position in the data set is not the way to go.
Use the VVALUEX Funtion as suggested by @Reeza, but do not set time=timeSincePurchase+3.
@PeterClemmensen wrote:
If you simply want to pull the factor from the variable FactorAtTime7 when TimeSincePurchase=7 in t1, then pulling this value by its relative position in the data set is not the way to go.
Use the VVALUEX Funtion as suggested by @Reeza, but do not set time=timeSincePurchase+3.
Good catch, thanks 🙂
@ProcLunchbreak use Arrays or VVALUEX. Both are illustrated, you can play around to ensure you have the right index set up.
I'm getting a syntax error immediately after VVALUEX with the following code:
PROC SQL; SELECT t1.CustomerID, t1.ProductID, t1.PurchaseAge, t1.TimeSincePurchase, (t2.VVALUEX(CATT('FactorAtTime', PUT(t1.TimeSincePurchase, 8., -l)))) AS CurrentFactor FROM WORK.PurchaseRecords t1 LEFT JOIN WORK.ProductFactors t2 ON (t1.ProductID = t2.ProductID) AND (t1.PurchaseAge = t2.PurchaseAge) ;QUIT;
Is this is something I should be trying in a datastep rather than SQL?
@ProcLunchbreak wrote:
Is this is something I should be trying in a datastep rather than SQL?
Yes, AFAIK SQL doesn't have a dynamic method of referencing columns without knowing it ahead of time.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.