Hi all,
I have the following columns: Cum_Rev_0-Cum_Rev_24 Cum_NCO_0-Cum_NCO_24 Cum_Net_0-Cum_Net_24
Is there a way to join these arrays to help minimize the code being used here instead of duplicating each line? I know I can do this with a macro but looking to see if there are alternatives.
I was thinking there might be a way to use a wild card or something in the middle e.g. Cum_***_0-Cum_***_24 or is this not possible?
data cum_net_rev2;
set cum_net_rev1;
format today mmddyy.;
today = today();
array cum_rev_{25} cum_rev_0 - cum_rev_24;
do i = 1 to 25;
if intck('month', business_date, today) >= i then cum_rev_{i} = cum_rev_{i};
else cum_rev_{i} = .;
end;
/*Duplicate row for cum_nco_{i} rows*/
array cum_nco_{25} cum_nco_0 - cum_nco_24;
do i = 1 to 25;
if intck('month', business_date, today) >= i then cum_nco_{i} = cum_nco_{i};
else cum_nco_{i} = .;
end;
/*Duplicate row for cum_net_{i} rows*/
array cum_net_{25} cum_net_0 - cum_net_24;
do i = 1 to 25;
if intck('month', business_date, today) >= i then cum_net_{i} = cum_net_{i};
else cum_net_{i} = .;
end;
run;
What is it your code is trying to do? The action in the THEN clause of this seems strange:
if intck('month', business_date, today) >= i then cum_rev_{i} = cum_rev_{i};
else cum_rev_{i} = .;
You are essentially doing:
if not (intck('month', business_date, today) >= i) then cum_rev_{i} = .;
You could create a 2-D array and add another DO loop.
array cumm[3,25] Cum_Rev_0-Cum_Rev_24 Cum_NCO_0-Cum_NCO_24 Cum_Net_0-Cum_Net_24;
do i=1 to 25 ;
if not (intck('month', business_date, today) >= i) then do j=1 to 3;
cumm[j,i] = .;
end;
end;
Here is a reference that illustrates how to refer to variables and datasets in a short cut list:
https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html
@Krueger wrote:
Hi all,
I have the following columns: Cum_Rev_0-Cum_Rev_24 Cum_NCO_0-Cum_NCO_24 Cum_Net_0-Cum_Net_24
Is there a way to join these arrays to help minimize the code being used here instead of duplicating each line? I know I can do this with a macro but looking to see if there are alternatives.
I was thinking there might be a way to use a wild card or something in the middle e.g. Cum_***_0-Cum_***_24 or is this not possible?
data cum_net_rev2; set cum_net_rev1; format today mmddyy.; today = today(); array cum_rev_{25} cum_rev_0 - cum_rev_24; do i = 1 to 25; if intck('month', business_date, today) >= i then cum_rev_{i} = cum_rev_{i}; else cum_rev_{i} = .; end;
/*Duplicate row for cum_nco_{i} rows*/ array cum_nco_{25} cum_nco_0 - cum_nco_24; do i = 1 to 25; if intck('month', business_date, today) >= i then cum_nco_{i} = cum_nco_{i}; else cum_nco_{i} = .; end;
/*Duplicate row for cum_net_{i} rows*/ array cum_net_{25} cum_net_0 - cum_net_24; do i = 1 to 25; if intck('month', business_date, today) >= i then cum_net_{i} = cum_net_{i}; else cum_net_{i} = .; end; run;
Thanks I wasn't aware of alot of this stuff!
What is it your code is trying to do? The action in the THEN clause of this seems strange:
if intck('month', business_date, today) >= i then cum_rev_{i} = cum_rev_{i};
else cum_rev_{i} = .;
You are essentially doing:
if not (intck('month', business_date, today) >= i) then cum_rev_{i} = .;
You could create a 2-D array and add another DO loop.
array cumm[3,25] Cum_Rev_0-Cum_Rev_24 Cum_NCO_0-Cum_NCO_24 Cum_Net_0-Cum_Net_24;
do i=1 to 25 ;
if not (intck('month', business_date, today) >= i) then do j=1 to 3;
cumm[j,i] = .;
end;
end;
You are correct, I'm just over complicating things trying to learn and experiment with new things.
This is what I am doing.
if not (intck('month', business_date, today) >= i) then cum_rev_{i} = .;
Thanks I wasn't aware of 2-D arrays!
FYI - this is bad data structure, you should reformat your data to be in columns with time as a row.
Then your analysis can be simplified massively.
This would follow the principles of tidy data that allow for more efficient analysis.
https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html
Column headers that contain data, not just variables are problematic in this structure.
I think I understand what your saying here (I had to do some proc transposes earlier) but basically this analysis is breaking down a time period (in this instance it is the months) and doing cumulative counts on Revenue, Debt and Net from the month since the start of the loan. We then track this for each month from the start of the loan.
At the end we're grouping all of this data by loan into current time periods so it would like something like this...
MonthYear Campaign CUM_REV_1-CUM_REV_3 CUM_NCO_1-CUM_NCO_3 CUM_NET_1-CUM_NET_3
Hopefully that made sense?
Well, because the final output isn't that simple.
Below is a snippet with values removed showing some but not all of the columns. There are multiple groupings for Campaign_customer, customer_group and product_name which are all broken out by business_date. We then have the cum_: fields for each of these.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.