BookmarkSubscribeRSS Feed
jag07g
Fluorite | Level 6

Hi all,

 

I am using Enterprise Guide 7.1 and am new to macro coding in SAS. I'm struggling with developing a macro to generate what would essentially be the output from multiple proc surveyfreq procedures. I would like the macro to output the row % and 95% CI as generated from proc surveyfreq in the Excel file attached for multiple bivariate relationships. For example, I need the program to output the row % and CIs for the relationship between age and drink and gender and drug--and continue this for many more indicators. At the end, I would like to have a data set that is able to be exported into MS Access that includes the row percent and 95% CIs for the various bivariate relationships. 

 

I have attached my code that I have been working on, but it does not run successfully. I get errors that the SUBPOPN command is not valid and then that the variables WSUM, ROWPER, etc has never been referenced and my Prev1 and Temp1 data sets have no observations. I have highlighted this in the code in red text.  I have been working on debugging the code for a long time, but I am still at a loss. I only included two outcome variables to test the program, but I have others that need to be included. Eventually, I would like the resulting data set with calculated row %s and CIs to be output into Access. I know I could get the information I need from running endless individual proc surveyfreq commands and copying them over, but with the numerous different relationships I have, I need a macro.

 

Thank you in advance for any help! 

data FILE1;
set documents.data2017;
ONE=1;
RUN;


DATA FILE2

(KEEP=ONE dataweight age gender race youth grade drink drug);  
SET FILE1;
ONE=1;
RUN;

PROC SORT; BY dataweight; RUN; 

******CREATE FILE WITH OUTCOME AS INDICATOR OF INTEREST*****;

%MACRO outcome (S=1, E=2);  ***STATE DATA***;
  %DO X=&S %TO &E %BY 1;
DATA FILE2;
SET FILE1;
If &X=1 then OUTCOME= drink;
If &X=2 then OUTCOME= drug;

 ******Add more variables as needed*****;
RUN;

PROC SORT; BY dataweight; RUN;

**OUTPUTS PREVALENCE RATES FOR CATEGORICAL INDICATORS TO WORK FILE***;

PROC SURVEYFREQ DATA=FILE2; 
WEIGHT dataweight;

SUBPOPN ONE=1 / NAME="REPORT OUTPUT";
SUBGROUP age gender race youth grade OUTCOME;
LEVELS    2    2     5     2    4        2 ;
TABLES (age gender race youth grade) * OUTCOME;

PRINT NSUM WSUM ROWPER SEROW LOWROW UPROW/ WSUMFMT=F9.0 NSUMFMT=F9.0;
OUTPUT  / FILENAME=PREV&X TABLECELL=DEFAULT REPLACE;

RUN;

DM OUTPUT 'clear' continue;

DATA PREV&X 
(KEEP=age gender race youth grade OUTCOME NSUM WSUM ROWPER LOWROW UPROW SEROW OUTLEV NSTRAT NALL VARID);
SET PREV&X;
IF (gender GE 0 OR age GE 1 OR race GE 1 OR youth GE 1 or grade GE 1);

VARID=&X;
IF gender=1 AND OUTCOME=2 THEN NALL=NSUM; *outcome=2 is no*;
IF gender=1 AND OUTCOME=1 THEN DO;**outcome=1 is yes*;
	OUTLEV=1;
	NSTRAT=NALL;
END;

IF gender=0 and outcome=2 THEN OUTLEV=0;
IF gender=1 THEN OUTLEV=1;
IF race=1 THEN OUTLEV=2;
IF race=2 THEN OUTLEV=3;
IF race=3 THEN OUTLEV=4;
IF race=4 THEN OUTLEV=5;
IF race=5 THEN OUTLEV=6;
if youth=1 then outlev=7;
if youth=2 then outlev=8;
if grade=1 then outlev=9;
if grade=2 then outlev=10;
if grade=3 then outlev=11;
if grade=4 then outlev=12;
if age=1 then outlev=11;
if age=2 then outlev=12;

RUN;


***ELIMINATES UNNECESSARY OBSERVATIONS, SETS NEG VALUES TO ., ESTABLISHES STRATUM N***;

DATA TEMP&X;
SET PREV&X;
IF OUTLEV GE 0 AND OUTCOME IN (1,2);

***CREATES STRATA N FOR SUPRESSION DETERMINATION***;
IF OUTCOME=2 THEN DO;
	IF gender=0 THEN NGENDER=NSUM;
	IF gender=1 THEN NGENDER=NSUM;

	IF race=1 THEN NGRADEID=NSUM;
	IF race=2 THEN NGRADEID=NSUM;
	IF race=3 THEN NGRADEID=NSUM;
	IF race=4 THEN NGRADEID=NSUM;
	IF race=5 THEN NGRADEID=NSUM;

	IF age=1 THEN NETHNICITYID=NSUM;
	IF age=2 THEN NETHNICITYID=NSUM;

	IF youth=1 THEN NETHNICITYID=NSUM;
	IF youth=2 THEN NETHNICITYID=NSUM;

	IF grade=1 THEN NGRADEID=NSUM;
	IF grade=2 THEN NGRADEID=NSUM;
	IF grade=3 THEN NGRADEID=NSUM;
	IF grade=4 THEN NGRADEID=NSUM;

END;

NSTRAT=NALL;
IF NALL GT . THEN NSTRAT=NALL;
IF NGENDER GT . THEN NSTRAT=NGENDER;
IF NRACE GT . THEN NSTRAT= NRACE;
IF NAGE GT . THEN NSTRAT= NAGE;
IF NYOUTH GT . THEN NSTRAT=NYOUTH;
IF NGRADE GT . THEN NSTRAT=NGRADE;


RUN;


****OUTPUTS STRATUM N VARIABLE FOR MERGE****;

DATA OUTMATCH&X (KEEP= VARID OUTLEV NSTRAT);
SET TEMP&X;
IF OUTLEV=0 THEN OUTLEV=1;
IF NSTRAT GT .;
RUN;

***REMOVES NSTRAT FROM PREV FILE FOR MERGE***;

DATA TEMP&X (DROP=NSTRAT);
SET TEMP&X;
RUN;

***SORTS FILES FOR MERGE***;
PROC SORT DATA=OUTMATCH&X;
BY VARID OUTLEV;
RUN;

PROC SORT DATA=TEMP&X;
BY VARID OUTLEV;
RUN;


****MERGES PREV FILE AND OUTMATCH TO POPULATE NSTRAT, DROPS UNNECESSARY VARS***;
DATA MERGE&X (DROP=NGENDER NGRADE NAGE NRACE NYOUTH NALL);
MERGE WORK.TEMP&X WORK.OUTMATCH&X;
BY VARID OUTLEV;
RUN;

%end;
%MEND outcome;

%OUTCOME;

***CREATES VARS - ID, VARID*****;

DATA FILEMERGE;
SET MERGE1  MERGE2;

IF OUTCOME=1;
MEASURE=ROWPER;
SE=SEROW;
LCL=LOWROW;
UCL=UPROW;
TOTALSIZE=NSTRAT;
YEAR=2017;
*ID=(99*10000)+(OUTLEV*100)+(VARID);
*STATEID=(OUTLEV*100)+VARID;
ID=(99*1000000)+(OUTLEV*10000)+(VARID);
STATEID=(OUTLEV*10000)+VARID;
RUN;

***CREATES FINAL DATA, DROPS UNNECESSARY VARS****;
DATA EXPORTFILE (KEEP=VARID NSUM WSUM SE MEASURE LCL UCL TOTALSIZE YEAR ID OUTLEV STATEID);
SET FILEMERGE;

if totalsize<30 then do;
WSUM=.;
MEASURE=.;
SEROW=.; 
SE=.;
LCL=.; 
UCL=.;
end;

RUN;

 

5 REPLIES 5
PaigeMiller
Diamond | Level 26

Step #1 for macro development: make the code work for one instance of your problem WITHOUT macros. Have you done this? Show us the code WITHOUT macros that is working.

--
Paige Miller
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Please show some example test data - in the form of a datastep following:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

And what you want out at the end.  I am looking at the first few steps in that program and can say that a lot of it is redundant straight off.  Starting with a simple program which achieves the required end from the input, is the way to start this process.  Also, please avoid coding all in uppper case, it reads like your shouting at us.

ballardw
Super User

First thing would be to get at least some familiarity with the syntax for Proc Surveyfreq:

These statements are from SUDAAN not SURVEYFREQ:

SUBPOPN ONE=1 / NAME="REPORT OUTPUT";
SUBGROUP age gender race youth grade OUTCOME;
LEVELS    2    2     5     2    4        2 ;
PRINT NSUM WSUM ROWPER SEROW LOWROW UPROW/ WSUMFMT=F9.0 NSUMFMT=F9.0;
OUTPUT  / FILENAME=PREV&X TABLECELL=DEFAULT REPLACE;

Output data sets from Surveyfreq use the ODS OUTPUT. You need to know which table created by surveyfreq contains the elements you want and then use a statement such as:

 

ODS OUTPUT CROSSTABS= mycrosstabdataset.

Crosstabs is the name of the row/column summary data

 

Optional statistics are on the TABLES statement. You likely want to have: CL for confidence limits of percentages, ROW for row percentages and standard errors,

 

Also the "post processing" of the output set will be significantly different than for SUDAAN output so wait and see what your output from Surveyfreq looks like.

If you have variables with coding that you do not want included in the analysis such as "unknown" for age you will want to have the variable set to missing as SAS does not do the "levels" the way that SUDAAN does and will include the other categories. So your percentages for the values of interest may not add to 100 with other coding.

 

And surveyfreq will let you compare MANY variables at one time. It may be easier to run all of your "outcome" variables by name:

TABLES (age gender race youth grade) * (drink drug othertopic);

jag07g
Fluorite | Level 6

Thanks all for the helpful suggestions. 

 

Here is what I have worked out so far using the surveyfreq command in SAS. This seems to give me what I want in terms of CIs and row percentages. 

 

proc surveyfreq data=2017data;
weight dataweight;
table (gender age youth race grade)*(drink drug [other vars])/ cl row nocellpct nostd nofreq nowt;
ods output CrossTabs=ResponseTable1;
run;

Would I be able to get the resulting cross tab data into Access so that I can further modify the names and labels of the columns? 

 

My next step I believe would be to continue developing the macro so that I would not have to list every outcome variable in the table statement as I have over 50 outcome variables? Thanks for all your help!

ballardw
Super User

@jag07g wrote:

Thanks all for the helpful suggestions. 

 

Here is what I have worked out so far using the surveyfreq command in SAS. This seems to give me what I want in terms of CIs and row percentages. 

 

proc surveyfreq data=2017data;
weight dataweight;
table (gender age youth race grade)*(drink drug [other vars])/ cl row nocellpct nostd nofreq nowt;
ods output CrossTabs=ResponseTable1;
run;

Would I be able to get the resulting cross tab data into Access so that I can further modify the names and labels of the columns? 

 

My next step I believe would be to continue developing the macro so that I would not have to list every outcome variable in the table statement as I have over 50 outcome variables? Thanks for all your help!


It is much easier to edit "names and labels" in SAS than in Access from my experience. Proc datasets is designed to do that without opening any existing data.

 

 

Again, though the output set gets a tad large you should be able to have all of your response variables in a single tables call.

I might in some circumstances use a separate call to surveyfreq to create separate output if the responses are "grouped" in some natural way such as if I have responses only related to male or female respondents such as gender specific medical tests and remove gender from the variables. But a  macro to loop through response variables is likely not needed.

 

Another approach may be to use SURVEYMEANS with domain analysis if response variables are 0/1 coded.

 

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1560 views
  • 0 likes
  • 4 in conversation