Hi. I have a dataset with a single record. It contains a count for each column. I need to output another dataset with one record, like this one, but that has Percentage for each day (and NULL) column instead of a count. Can anyone advise me on creating this?
TOTAL _08_01_2016 _07_30_2016 _07_29_2016 NULL
1917462 61 4813 18477 118420
null is a bad variable name, as it has other context within databases and programming.
You can use the _: wildcard, but you will need to know the amount of variables to do the percentages.
Storing the dates in your variable names isn't a good idea and violates best practices.
Here's a good article on managing data, from an R perspective but it holds.
https://www.jstatsoft.org/article/view/v059i10
I would *seriously* consider tranpsosing your data to a long format, doing your calculations and then retransposing if required.
Otherwise, determine the number of days needed for percent by querying the sashelp.vcolumn table and storing that number in a macro variable. That can be used to index your arrays.
proc sql;
select count(*) into :count_vars
from sashelp.vcolumn where upper(libname)='WORK' and upper(memname) = 'HAVE' and name like '_%';
quit;
array cts(*) _: null;
array pcts(*) pct1-pct&count_vars pct_null;
The rest of the code is the same.
Wide data is always a bit annoying to deal with
1. Declare two arrays, one for counts and one for percentages
array cts(*) _date_vars list;
array pcts(*) day1-day3;
2. Loop over array to calculate percentages
do I= 1 to dim(cts);
pcts(I)=cts(I) / total ;
end;
Thanks for the suggestion. I should've mentioned the number of days (day1-day3) is not static. It may vary from run to run. Can I handle that in some way? Also, what about handling the NULL variable?
Do you have a naming convention for the variables?
All the variables except NULL are named like a date with an underscore in front. Like _07_29_2016.
I create the above dataset using this code:
/* RULE: MATCHING + NO SCORE IMPACTING. Calculate the totals for this rule and then append this record to FinalData&QTR */
proc means data=QueryData&ZIP5 sum;
var total _: null;
output out=cal_sums&ZIP5 (drop=_type_ _freq_) sum=;
run;
As you can see it creates the _date looking variables (_08_01_2016)
using the _: wildcard on the VAR statement.
var total _: null;
Is there any way to create a dataset of Percentages in a similar way?
null is a bad variable name, as it has other context within databases and programming.
You can use the _: wildcard, but you will need to know the amount of variables to do the percentages.
Storing the dates in your variable names isn't a good idea and violates best practices.
Here's a good article on managing data, from an R perspective but it holds.
https://www.jstatsoft.org/article/view/v059i10
I would *seriously* consider tranpsosing your data to a long format, doing your calculations and then retransposing if required.
Otherwise, determine the number of days needed for percent by querying the sashelp.vcolumn table and storing that number in a macro variable. That can be used to index your arrays.
proc sql;
select count(*) into :count_vars
from sashelp.vcolumn where upper(libname)='WORK' and upper(memname) = 'HAVE' and name like '_%';
quit;
array cts(*) _: null;
array pcts(*) pct1-pct&count_vars pct_null;
The rest of the code is the same.
I will give this a shot. Thank you.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.