Hi. I have a dataset with a single record. It contains a count for each column. I need to output another dataset with one record, like this one, but that has Percentage for each day (and NULL) column instead of a count. Can anyone advise me on creating this?
TOTAL _08_01_2016 _07_30_2016 _07_29_2016 NULL
1917462 61 4813 18477 118420
null is a bad variable name, as it has other context within databases and programming.
You can use the _: wildcard, but you will need to know the amount of variables to do the percentages.
Storing the dates in your variable names isn't a good idea and violates best practices.
Here's a good article on managing data, from an R perspective but it holds.
https://www.jstatsoft.org/article/view/v059i10
I would *seriously* consider tranpsosing your data to a long format, doing your calculations and then retransposing if required.
Otherwise, determine the number of days needed for percent by querying the sashelp.vcolumn table and storing that number in a macro variable. That can be used to index your arrays.
proc sql;
select count(*) into :count_vars
from sashelp.vcolumn where upper(libname)='WORK' and upper(memname) = 'HAVE' and name like '_%';
quit;
array cts(*) _: null;
array pcts(*) pct1-pct&count_vars pct_null;
The rest of the code is the same.
Wide data is always a bit annoying to deal with
1. Declare two arrays, one for counts and one for percentages
array cts(*) _date_vars list;
array pcts(*) day1-day3;
2. Loop over array to calculate percentages
do I= 1 to dim(cts);
pcts(I)=cts(I) / total ;
end;
Thanks for the suggestion. I should've mentioned the number of days (day1-day3) is not static. It may vary from run to run. Can I handle that in some way? Also, what about handling the NULL variable?
Do you have a naming convention for the variables?
All the variables except NULL are named like a date with an underscore in front. Like _07_29_2016.
I create the above dataset using this code:
/* RULE: MATCHING + NO SCORE IMPACTING. Calculate the totals for this rule and then append this record to FinalData&QTR */
proc means data=QueryData&ZIP5 sum;
var total _: null;
output out=cal_sums&ZIP5 (drop=_type_ _freq_) sum=;
run;
As you can see it creates the _date looking variables (_08_01_2016)
using the _: wildcard on the VAR statement.
var total _: null;
Is there any way to create a dataset of Percentages in a similar way?
null is a bad variable name, as it has other context within databases and programming.
You can use the _: wildcard, but you will need to know the amount of variables to do the percentages.
Storing the dates in your variable names isn't a good idea and violates best practices.
Here's a good article on managing data, from an R perspective but it holds.
https://www.jstatsoft.org/article/view/v059i10
I would *seriously* consider tranpsosing your data to a long format, doing your calculations and then retransposing if required.
Otherwise, determine the number of days needed for percent by querying the sashelp.vcolumn table and storing that number in a macro variable. That can be used to index your arrays.
proc sql;
select count(*) into :count_vars
from sashelp.vcolumn where upper(libname)='WORK' and upper(memname) = 'HAVE' and name like '_%';
quit;
array cts(*) _: null;
array pcts(*) pct1-pct&count_vars pct_null;
The rest of the code is the same.
I will give this a shot. Thank you.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.