I often come across tasks where I need to present the drop off total counts as certain criteria are applied to the original base population.
And the table is then presented in this manner
input name $ count type $;
datalines;
Total Base 1000000 Base
first filter 900000 Filter
second filter 500000 Filter
third filter 400000 Filter;
The table shows the total number after each filter has been applied consecutively.
However, the waterfall chart http://blogs.sas.com/content/iml/2015/04/27/cascade-chart.html requires a slightly different
dataset.
datalines;
Total Base 1000000 Base
first filter -100000 Filter
second filter -400000 Filter
third filter -200000 Filter;
Note that the number of filter may varies from time to time. How can I generate the second table before applying the proc sgplot?
Try this:
data have;
infile datalines dlm=',';
input name :$20. count type $;
datalines;
Total Base,1000000,Base
first filter,900000,Filter
second filter,500000,Filter
third filter,400000,Filter
;
run;
data want;
set have (rename=(count=oldcount));
retain keepcount;
if type = 'Base'
then do;
keepcount = oldcount;
count = oldcount;
end;
if type = 'Filter'
then do;
count = oldcount - keepcount;
keepcount = keepcount + count;
end;
drop keepcount oldcount;
run;
proc print data=want noobs;
run;
Note that this produces a different value for the third filter:
name type count Total Base Base 1000000 first filter Filter -100000 second filter Filter -400000 third filter Filter -100000
Did I miss something, or was that a miscalculation in your example?
Try this:
data have;
infile datalines dlm=',';
input name :$20. count type $;
datalines;
Total Base,1000000,Base
first filter,900000,Filter
second filter,500000,Filter
third filter,400000,Filter
;
run;
data want;
set have (rename=(count=oldcount));
retain keepcount;
if type = 'Base'
then do;
keepcount = oldcount;
count = oldcount;
end;
if type = 'Filter'
then do;
count = oldcount - keepcount;
keepcount = keepcount + count;
end;
drop keepcount oldcount;
run;
proc print data=want noobs;
run;
Note that this produces a different value for the third filter:
name type count Total Base Base 1000000 first filter Filter -100000 second filter Filter -400000 third filter Filter -100000
Did I miss something, or was that a miscalculation in your example?
Terrific! Yes, you are right, the last value was a miscalculation. Apologize. Thanks @Kurt_Bremser!
And for lazy programmers:
data want; set have; dif=dif(count); if type ne 'Base' then count=dif; drop dif; run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.