Hello,
Every month I want to create(update) a data set that check 4 variables.
The task is to create for each variable a distrubtion by 3 categores : 0, Null,Other.
The distribution calculate number of rows and Percentile From Total.
As you understand each month there are more 2 columns that are added to the data set.
The idea is to find if there is any change in distribution for any variable.
I would like to define a rule that if the difference in Percentile between 2 consecutive months is higher than 1% then need to color the row in red color.(It means that the check is done between columns 4 and 6, 6 and 8, 8 and 10 and so on....
What is the way to do it please?
Data have; Input Var_name $ category $ nr202101 pct202101 nr202102 pct202102 nr202103 pct202103 nr202104 pct202104 ; cards; Welath 0 80000 0.8 Welath Null 10000 0.1 Welath Other 10000 0.1 Age 0 5000 0.05 Age Null 8000 0.08 Age Other 87000 0.87 CustID 0 0 0 CustID Null 0 0 CustID Other 100000 1 Income 0 20000 0.2 Income Null 10000 0.1 Income Other 70000 0.7 ; Run;
You switch from talking about percent to talking about percentile. Which is it?
In either case, the data you present does not seem to offer a possibility to calculate percentile or percent. (Percent of what quantity? Percentile of what distribution?)
As I (and others) have said many times, putting calendar information into the variable names (wide data set) is a poor choice. A long data set would be a much better choice, resulting in easier programming. Surely, you are going to take this advice from now on.
Your code doesn't work, please fix it. Please test your code before you post it from now on.
At least give us some data to work with that will display both cases ... not red in some cases, red in some cases. Fake data is fine, but the data set you provided is unusable for this problem.
Also please SHOW us the desired output — you don't have to actually color the output red, but we need to know the layout of the report and you can tell us in words which cells should be red.
Hi:
Conceptually, I think what you want will be similar to this. I didn't bother with the percent or percentile, since that part was not clear. All I did was use LAG in a DATA step to get the height from the previous row in SASHELP.CLASS, then used a COMPUTE block in PROC REPORT to calculate the difference between the height on the current row and the height on the previous row (although I could have done this part in the DATA step too). Then I made a user-defined format to hold the colors I wanted. I used the format for a simple example of only formatting the background color of the calculated difference and then I did a more advanced example of changing the background color for the entire row.
Cynthia
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.