- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I have the following dataset and I was wondering what would be the shortest, most elegant way to get an output similar to the one in screenshot attached? I'm thinking I can probably get it done the hard way by manipulating a bunch of proc freq statements but was hoping someone has suggestions for a more efficient/shorter way to obtain such a table? Total eligible column contains the number of participants in study=A, whereas Total Enrolled refers to number of participants in study=B. Basically, the idea behind this table is that I want to see, across the 2 age groups, how many of the participants from study A also enrolled in study B.
data have; input patid $ site $ agegrp study $; datalines; 1001 site1 1 A 1002 site1 1 A 1003 site1 2 A 1004 site1 2 A 1005 site1 2 A 1006 site1 2 A 1006 site1 2 B 2001 site2 1 A 2002 site2 1 A 2003 site2 1 A 2004 site2 1 A 2005 site2 1 A 2005 site2 1 B 2006 site2 1 A 2007 site2 1 A 2008 site2 1 A 2008 site2 1 B 2009 site2 1 A 2009 site2 1 B 2010 site2 2 A 2010 site2 2 B 2011 site2 2 A 2011 site2 2 B 2012 site2 2 A 2012 site2 2 B 2013 site2 2 A 2013 site2 2 B 2014 site2 2 A 2014 site2 2 B 2015 site2 2 A 2015 site2 2 B 2016 site2 2 A 3001 site3 1 A 3002 site3 1 A 3002 site3 1 B 3003 site3 1 A 3003 site3 1 B 3004 site3 2 A 3005 site3 2 A ; run;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Look at this:
data int; /* make numeric variables for summing */
set have;
eligible = (study = "A");
enrolled = (study = "B");
run;
proc summary data=int; /* summarize */
var enrolled eligible;
class site agegrp;
output
out=want1 (where=(_type_ in (2,3)))
sum()=
;
run;
data want2; /* create "total" values */
set want1;
if _type_ = 2 then agegrp = 99;
run;
proc format; /* nice text for across variable */
value agegrp
1 = "Age Group 1"
2 = "Age Group 2"
99 = "Total"
;
run;
proc report data=want2;
column site agegrp,(eligible enrolled percent);
define site /group;
define agegrp / "Age Group" across format=agegrp.;
define eligible / "Total Eligible N" analysis sum;
define enrolled / "Total Enrolled N" analysis sum;
define percent / "Enrolled Percent" computed format=percent7.2;
compute percent;
_c4_ = _c3_ / _c2_;
_c7_ = _c6_ / _c5_;
_c10_ = _c9_ / _c8_;
endcomp;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Maybe with proc tabulate, but you won't get count and percentage in one cell.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Look at this:
data int; /* make numeric variables for summing */
set have;
eligible = (study = "A");
enrolled = (study = "B");
run;
proc summary data=int; /* summarize */
var enrolled eligible;
class site agegrp;
output
out=want1 (where=(_type_ in (2,3)))
sum()=
;
run;
data want2; /* create "total" values */
set want1;
if _type_ = 2 then agegrp = 99;
run;
proc format; /* nice text for across variable */
value agegrp
1 = "Age Group 1"
2 = "Age Group 2"
99 = "Total"
;
run;
proc report data=want2;
column site agegrp,(eligible enrolled percent);
define site /group;
define agegrp / "Age Group" across format=agegrp.;
define eligible / "Total Eligible N" analysis sum;
define enrolled / "Total Enrolled N" analysis sum;
define percent / "Enrolled Percent" computed format=percent7.2;
compute percent;
_c4_ = _c3_ / _c2_;
_c7_ = _c6_ / _c5_;
_c10_ = _c9_ / _c8_;
endcomp;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks again!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I can't repeat your behavior; could it be that the unformatted values are different in your real data than the ones you posted here?
Anyway, use an ORDER= option:
proc report data=want2;
column site agegrp,(eligible enrolled percent);
define site /group;
define agegrp / "Age Group" across format=agegrp. order=formatted;
define eligible / "Total Eligible N" analysis sum;
define enrolled / "Total Enrolled N" analysis sum;
define percent / "Enrolled Percent" computed format=percent7.2;
compute percent;
_c4_ = _c3_ / _c2_;
_c7_ = _c6_ / _c5_;
_c10_ = _c9_ / _c8_;
endcomp;
run;
This will work as long as the formatted values sort correctly in lexical order.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
See this as a starting point.
Note: Do you want the must compact code or do you want code that will be easy to modify/update as needed? There's often a big trade off there.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Reeza, thank you for the link, very helpful! Ideally, I'd like to have a combination of both compact and easily updated to apply to other situations but I think that for this case, I'll settle for compactness.