Observation can be a part of multiple groups Proc Freq

ematznick · Posted 03-30-2021 01:48 PM

I have a data set where an observation can be a part of multiple groups within a variable. The variable has four options coded as 'red', 'green', 'yellow', 'white'. Observations with multiple are coded as group, group, ...etc. for example 'red, green'. When I run proc freq, proc freq sees this as 5 groups. I want to be able to have the multiple observation count into each respective group so for the example 'red, green' it adds a count to red and one to green instead of being run as it's own group.

%macro disc_tab (dset, var,grp1,grp2,ct,ct1,ct2,ct3,ct4,ctc,name,ord);
proc sort data= &dset out=sort1&dset;
	by &grp1;
proc freq data= sort1&dset;
	table &var/out =outtot&var sparse;
	where &var ne '';
proc freq data=sort1&dset;
	table &var/out=out1&var sparse;
	where &var ne '';
	by &grp1;
proc sort data= &dset out=sort2&dset;
	by &grp2;
proc freq data=sort2&dset;
	table &var/out=out2&var sparse;
	where &var ne '';
	by &grp2;
data outtot&var;
set outtot&var;
	npc=trim(left(count))||"/&ct ("||trim(left(put(percent,5.1)))||')';
data out1&var;
set out1&var;
if subcoh =1 then npc=trim(left(count))||"/&ct1 ("||trim(left(put(percent,5.1)))||')';
if subcoh =2 then npc=trim(left(count))||"/&ct2 ("||trim(left(put(percent,5.1)))||')';
if subcoh =3 then npc=trim(left(count))||"/&ct3 ("||trim(left(put(percent,5.1)))||')';
if subcoh =4 then npc=trim(left(count(*)))||"/&ct4 ("||trim(left(put(percent,5.1)))||')';
data out2&var;
set out2&var;
if comb =0 then npc=trim(left(count))||"/&ct1 ("||trim(left(put(percent,5.1)))||')';
if comb =1 then npc=trim(left(count))||"/&ctc ("||trim(left(put(percent,5.1)))||')';


data out&var;
	set outtot&var out1&var out2&var;
proc SORT; 
	BY  &var;
PROC TRANSPOSE data=out&var out=trans&var;
   VAR npc; 
	BY  &var;
run;

Data table&var;
	length stat $300.;
	FORMAT stat var $300.;
	set trans&var;
	stat=&var;
	drop COL6 ;
	var=&name;
	ord=&ord;

run;
%mend disc_tab;

%disc_tab(mh_analysis,mhterm2_display,subcoh,comb,&n_term2,&n1_term2,&n2_term2,&n3_term2,&n4_term2,&nc_term2,'color* [N (%)]',3);

PaigeMiller · Posted 03-30-2021 02:14 PM

This method you have described, if I am understanding properly, would cause the sum of the numbers in each group to greater than the number of original observations. Is that what you want?

--
Paige Miller

ematznick · Posted 03-30-2021 02:35 PM

Correct

mkeintz · Posted 03-30-2021 02:28 PM

I don't think I understand your objective. Can you show what the data looks like that you are submitting to the macro?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

ballardw · Posted 03-30-2021 02:37 PM

Plus show what the expected result for example data looks like.

ematznick · Posted 03-30-2021 02:50 PM

Expected for mhterm2_display
elbow 63/102 (61.8%)
knee 51/102 (50.0%)
ankle 3/102 (2.9%)
wrist 3/102 (2.9%)
foot 4/102 (3.9%)
neck 15/102 (14.7%)

PaigeMiller · Posted 03-30-2021 02:56 PM

@ematznick wrote:
Expected for mhterm2_display
elbow 63/102 (61.8%)
knee 51/102 (50.0%)
ankle 3/102 (2.9%)
wrist 3/102 (2.9%)
foot 4/102 (3.9%)
neck 15/102 (14.7%)

This adds to more than 100%, and that was specifically my first question. Your original description talked only about counts, not percents, and if you want percents, what is the denominator 102? Your answer to my first question was that the sum would be greater than the number of records, but if you run it through PROC FREQ using my code, you get a percent that totals 100%. So PROC FREQ doesn't work? Or does it?

@ematznick , you shouldn't be providing us the problem to be solved in bits and pieces, and only when we ask. Please take a few minutes and write a clear and complete description of the problem, so we don't have to keep asking questions and modifying code, so hopefully, we can solve it once.

--
Paige Miller

ematznick · Posted 03-30-2021 02:48 PM

The data running through looks like:
Subject mhterm2_display mhterm3_display mhterm4_display
cb01 'elbow' 'catheter' 'green, red'
cb02 'knee, elbow' 'pacemaker' 'blue'
...
cb354

PaigeMiller · Posted 03-30-2021 02:53 PM

You would probably have to create a new data set named INTERMEDIATE for each variable individually, and then run PROC FREQ for each variable individually. That part would require taking my code and turning it into a macro where the only thing changing is the name of the input variable, which I called STRING.

--
Paige Miller

PaigeMiller · Posted 03-30-2021 02:39 PM

Hmm, this is a great example of where macros are not needed. You can manipulate the text strings in a data set so that PROC FREQ will produce the results you want.

data have;
infile cards missover;
input string $64.;
cards;
red
green
blue
yellow
red,green
;
data intermediate;
    set have;
    length word $ 16;
    do i=1 to countw(string,',');
        word=scan(string,i,',');
        output;
    end;
    keep word;
run;
proc freq data=intermediate;
    tables word;
run;

--
Paige Miller

ematznick · Posted 03-30-2021 02:45 PM

Some of the variables have 16 combinations of 7 groups with a total of 354 observations. Would I do this for each variable that has multiples and would I list all 16 combinations?

PaigeMiller · Posted 03-30-2021 02:47 PM

It's time you gave specific examples.

--
Paige Miller

Observation can be a part of multiple groups Proc Freq

Re: Observation can be a part of multiple groups Proc Freq

Re: Observation can be a part of multiple groups Proc Freq

Re: Observation can be a part of multiple groups Proc Freq

Re: Observation can be a part of multiple groups Proc Freq

Re: Observation can be a part of multiple groups Proc Freq

Re: Observation can be a part of multiple groups Proc Freq

Re: Observation can be a part of multiple groups Proc Freq

Re: Observation can be a part of multiple groups Proc Freq

Re: Observation can be a part of multiple groups Proc Freq

Re: Observation can be a part of multiple groups Proc Freq

Re: Observation can be a part of multiple groups Proc Freq

Classroom Training Available!