About Rickie2

Rickie2 · ‎08-27-2024

Thanks for your answers, Tom. Here are the answers to your questions:

- What is the overall goal here? What is the meaning of the data and what is the analysis you are trying to do?
The data will be used for an ML model. A left outer join brings all of the tables together, and the resulting table is the dataset to be modeled. Each table represents measures from a different period P: P0 is the current period, P1 the previous period, P2 the period before P1, and so on until PX, the last period. That is why a loop is required:
1° P0 - P1
2° P1 - P2
3° P2 - P3
...
PX-1 - PX

- What does this P0, P1, etc. suffix on the variable names mean?
Periods of time, for example a year: the measurement for a given year.

- Why is the data in multiple datasets? Why not just store all of the data in one dataset?
No problem, this step can be done before the comparison step. In that case, the variable names have to keep P0, P1, P2, etc. at the end of the variable name.

- Why do the variable names change between the datasets?
The value is computed period by period. Table Have_P0 holds the results for the current year, Have_P1 the results for the previous year, Have_P2 the year before that, etc.

- Why not just have a separate variable with values like P0 or P1 (or perhaps a numeric variable with 0 and 1) to indicate which P value this observation is for?
It's a possibility.

- Do you always have the same 100 variables?
Yes, periods P0, P1, P2, ..., PX will always have the same variables.

- Do you have a SAS/IML license? Could you load the two datasets into matrices and just subtract them?
No.

- Why do you need to make so many different difference datasets?
For an ML model, to follow the evolution between periods. The tables are then combined with a left outer join.

If you have other questions, don't hesitate to ask. Your help is greatly appreciated.

Rickie2 · ‎08-26-2024

Hi everybody,

I would like to figure out the best way to solve this: create a macro with a loop that subtracts each variable (or, if the variable is character, indicates whether the values differ) between datasets of different periods (P0 - P1 through PX-1 - PX), and puts the result in a new dataset.

For example: Apple_P0 - Apple_P1, Orange_P0 - Orange_P1, then the same for Pineapple and TotalAB. From dataset P0 to dataset PX, increment by 1. The dataset contains about 1000 columns.

%let n = 0;
%let X = X; /* max period */

data HAVE_P0;
input ID Apple_P0 Orange_P0 Pineapple_P0 TotalAB_P0 $;
cards;
15427 10 100 1000 Machine
35894 20 200 2000 Hand
57842 40 400 4000 Hand
79432 75 750 7500 Machine
;
run;

data HAVE_P1;
input ID Apple_P1 Orange_P1 Pineapple_P1 TotalAB_P1 $;
cards;
15427 50 500 5000 Machine
35894 10 100 1000 Machine
57842 40 400 4000 Wind
;
run;

data HAVE_P2;
input ID Apple_P2 Orange_P2 Pineapple_P2 TotalAB_P2 $;
cards;
...

data HAVE_PX;
etc.

The rule for each variable:

(
case
if Var_P1 does not exist then .
if Var_P0 is character and P0 = P1 then 1
if Var_P0 is character and P0 <> P1 then 0
else Var_P0 - Var_P1
end
)

data WANT_P0_minus_P1;
15427 -40 -400 -4000 1
35894 10 100 1000 0
57842 0 0 0 0
79432 . . . .
;

data WANT_P1_minus_P2;
…
data WANT_PX-1_minus_PX;

Thanks in advance,
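One possible way to sketch the loop described above is a macro that, for each pair of consecutive periods, reads the variable names and types from DICTIONARY.COLUMNS and generates one data step per pair. This is only a sketch under several assumptions that are not in the original post: the datasets live in WORK, are sorted by ID, share the same variables in every period, and the last period number is passed as a number. The macro name period_diffs, the parameter last, and the output suffix _D followed by the period number (for example Apple_D0 for Apple_P0 - Apple_P1) are hypothetical names chosen for illustration.

```sas
%macro period_diffs(last=);
  %local n next i nvars;
  %do n = 0 %to %eval(&last - 1);
    %let next = %eval(&n + 1);

    /* Collect base names (suffix _P&n stripped) and types of all  */
    /* variables in HAVE_P&n except ID. Assumes library WORK.      */
    proc sql noprint;
      select substr(name, 1, length(name) - length("_P&n")), type
        into :base1 - :base9999, :type1 - :type9999
        from dictionary.columns
        where libname = 'WORK' and memname = "HAVE_P&n"
          and upcase(name) ne 'ID';
    quit;
    %let nvars = &sqlobs;

    /* Assumes both inputs are already sorted by ID. */
    data WANT_P&n._minus_P&next;
      merge HAVE_P&n (in = in_cur) HAVE_P&next (in = in_prev);
      by ID;
      if in_cur;  /* keep every ID of the current period (left join) */
      %do i = 1 %to &nvars;
        %if &&type&i = char %then %do;
          /* character: 1 if equal, 0 if different */
          if in_prev then &&base&i.._D&n = (&&base&i.._P&n = &&base&i.._P&next);
        %end;
        %else %do;
          /* numeric: current period minus previous period */
          if in_prev then &&base&i.._D&n = &&base&i.._P&n - &&base&i.._P&next;
        %end;
      %end;
      /* IDs missing from the previous period keep missing (.) results */
      keep ID %do i = 1 %to &nvars; &&base&i.._D&n %end; ;
    run;
  %end;
%mend period_diffs;

/* With only HAVE_P0 and HAVE_P1 available: */
%period_diffs(last=1)
```

With the two sample datasets posted above, this should produce WANT_P0_minus_P1 with Apple_D0 = -40 and TotalAB_D0 = 1 for ID 15427, and missing values for ID 79432, which has no row in HAVE_P1. Giving each difference a period-specific name is a design choice so that the WANT tables can later be joined by ID without name collisions; note that very long base names could push the new names past the 32-character limit.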


Re: Loop through tables at different periods and columns name

Loop through tables at different periods and columns name
