Create an if-then loop to calculate percentage of multiple groups with...

mariko5797 · Posted 10-28-2021 11:41 AM

I want to make a macro loop to calculate percentages for each treatment group and any treatment group within the same study stage. I want to make a loop b/c I have more treatment groups than I care to repeatedly type, and I want a macro b/c I'll need to run the code with multiple datasets. If there is a better way than a loop and/or macro, please let me know!

I currently have the following:

*Code to get dataset;
proc sql;
 create table data_ as
 	/*Treatment Groups*/
 	select stage, 'Grp' as type, trt, count(distinct patid) as cnt
		from enroll_
		where not missing(trt)
		group by trt
	union corr

	/*All Subjects*/
	select stage, 'All' as type, trt, count(distinct patid) as cnt
		from enroll_
		where not missing(trt)
		group by stage
	order by stage, type, trt;
quit;

*Resulting dataset;
data have;
 input tab $ type $ trt $ cnt;
 cards;
 1	All	T1	9
 1	All	T2	9
 1	Grp	T1	5
 1	Grp T2	4
 2A	All	T3	20
 2A All	T4	20
 2A All	T5	20
 2A	All	T6	20
 2A	Grp	T3	5
 2A Grp	T4	5
 2A Grp	T5	7
 2A	Grp	T6	3
 2B All T7	12
 2B All T8	12
 2B Grp T7	8
 2B Grp T8	4
 ;
run;

I'm fine using the dataset I made, but I feel it may be easier to work with something more along the lines of this:

data want;
 input tab $  trt $ cnt;
 cards;
 1	A1	9
 1	T1	5
 1	T2	4
 2A	A2	20
 2A	T3	5
 2A T4	5
 2A T5	7
 2A	T6	3
 2B A3	12
 2B T7	8
 2B T8	4
 ;
run;

However, I am unsure if there's a direct way to do this. Only thing I can think of is replace TRT value with A1, A2, A3 and delete the extra rows.

I want to make a loop out of the following:

data data;
 set data_;
 if trt = 'A1' then pct = 100*cnt/&NA1;
 if trt = 'T1' then pct = 100*cnt/&NT1;
 if trt = 'T2' then pct = 100*cnt/&NT2;
 *and so on...;
run;

The total counts are currently saved as macros. Is there a simple way to make the process of referring to &TRT and calculating percent of N&TRT?

*Want this to loop through list of treatments;
if trt = &trt then pct = 100*cnt/N&trt;

Note: Macro variables NA1, NT1, etc. were made in project set-up file (i.e. I did not assign myself).

Thank you in advance!

mariko5797 · Posted 10-28-2021 11:42 AM

EDIT: TAB and STAGE are the same

Reeza · Posted 10-28-2021 11:55 AM

Can you show an example of what the raw data _enroll looks like and what you want as final output? Preferably a fully worked example so that the output values algin with the input data. This isn't a difficult calculation and I think you can do it in a single step rather than multiple using some summary procs instead of manual calculations.

mariko5797 · Posted 10-28-2021 12:06 PM

Here's the general set-up of the ENROLL_ dataset. There are a lot more variables, but I only listed the ones that matter.

data enroll_;
 input patid $ tab $ grp $;
 cards;
 CV001	.	.
 CV002	.	.
 CV003	1	T1
 CV004	1	T2
 CV005	2B	T8
 CV006	2A	T3
 CV007	1	T2
 CV008	2B	T8
 CV009	.	.
 CV010	2B	T7
 CV011	2A	T4
 CV012	2A	T4
 CV013	2B	T8
 CV014	2A	T5
 CV015	2A	T6
 CV016	1	T2
 CV017	.	.
 CV018	2B	T7
 CV019	2A	T5
 CV020	2A	T5
 ;
run;
/*NOTE: TAB is same as STAGE from SQL code*/
/*NOTE: counts will not match CNT from HAVE dataset since it's too much to type*/

mariko5797 · Posted 10-28-2021 12:13 PM

However, there are other datasets aside from ENROLL_ that I will have repeat ID's. In this event, I would be counting how many have at least one deviation (DVCODE) within each treatment group.

For example:

data dv_;
 input patid $ tab $ grp $ dvcode $;
 cards;
 CV003	1	T1	01
 CV003	1	T1	03
 CV005	2B	T8	01
 CV005	2B	T8	05
 CV005	2B	T8	02
 CV010	2B	T7	03
 CV020	2A	T5	01
 CV020	2A	T5	04
 ;
run;

Reeza · Posted 10-28-2021 12:25 PM

proc sql;
 create table data_ as
 	/*Treatment Groups*/
 	select tab as type, trt, count(distinct patid) as cnt
		from enroll_
		where not missing(trt)
		group by trt
	union corr

	/*All Subjects*/
	select 'All' as type, trt, count(distinct patid) as cnt
		from enroll_
		where not missing(trt)
		group by tab
	order by type, trt;
quit;

Let's deal with one question at a time, it seems like you're asking a few here.

This should get you that first table you wanted instead of the duplicates, does that work with your real data?

If so, what's the issue/relevance of the macro variables.

Then we'll come back to what you mean by multiple tables with multiple ID's.

mariko5797 · Posted 10-28-2021 12:35 PM

Is there a way to have type be different for each TAB, or would that need to be a separate chunk of code?

Update: I just realized that the macro variables and values were also saved in a table, so I'm just going to merge by TRT and calculate percent using the two columns. Figured this is a lot less computational power.

ballardw · Posted 10-28-2021 05:15 PM

Do you need a data set (for further manipulation) or a report (Humans read these)?

Quite often if a report is the actual need than one of the report procedures like Report or Tabulate will take care of the "dynamic" elements of changing group definitions if the variables are defined correctly and may well do it from the base data instead of using sql to partially summarize data first.

Quick example which creates two report tables of different layout.

proc tabulate data=sashelp.class;
   class sex age;
   tables  sex*(age all='All ages'),
       n pctn='% of students' pctn<sex>='% of sex';
   tables all='All students' sex,
      (age all='All ages')*(n pctn='% all students' rowpctn="% of sex" colpctn="% of age")
      ;
run;

This and Proc Report will also create data sets but the structure for use is a bit of a learning curve.

Create an if-then loop to calculate percentage of multiple groups with different denominators

Re: Create an if-then loop to calculate percentage of multiple groups with different denominators

Re: Create an if-then loop to calculate percentage of multiple groups with different denominators

Re: Create an if-then loop to calculate percentage of multiple groups with different denominators

Re: Create an if-then loop to calculate percentage of multiple groups with different denominators

Re: Create an if-then loop to calculate percentage of multiple groups with different denominators

Re: Create an if-then loop to calculate percentage of multiple groups with different denominators

Re: Create an if-then loop to calculate percentage of multiple groups with different denominators

Registration is open

SAS Training: Just a Click Away