joebacon
Pyrite | Level 9

I have a fairly time-sensitive issue. I have daily data for about 616 participants, and I want to create an individual graph for each of them. When I broke the job up by study, it took a total of 11 minutes. That code is shown below:

 

libname TLFB  'S:\Alcstudy\Analysis\Time Horizons\Brice\BL-TLFB';
libname combonew 'S:\AlcStudy\ArcStudy\Dataset';

* Run to find drivers on your computer;
proc gdevice catalog=sashelp.devices nofs;
list _all_;
run;
quit;
proc gdevice catalog=sashelp.devices nofs;
list actximg;
run;
quit;

options orientation=landscape;
goptions reset=all
device=actximg/*SASWMF*/ gsfmode=replace 
xmax=8.3in xpixels=800 ymax=6.2in ypixels=600
ftext="arial" ctext=black;

******************** STUDY 2 ************************;

data ST2; set combonew.study2alcohol;rename Name=PID;
run;
* List of all the subject IDs;
proc sort data=ST2 out=Ref (keep=PID) nodupkey; by PID; run;
* Set values over 499 to 499;
data Plot1;  set ST2; if Alc>499 then Alc=499; by PID; run;

ods rtf file="S:\Alcstudy\Analysis\Time Horizons\Joe\BL-TLFB\alc plots\ST2_Trimester_ALCPLOTS.rtf" style=Styles.MyGrids;

* Macro to plot 1 subject's data;
%macro alcplot(ID); 
axis1 	order=1462 to 2445 by 121
		label=('Day')
		minor = none ;

axis2 order=0 to 500 by 25
		label=(angle=90 'mL/ethanol Consumed')
		minor=none;
symbol i=join value=none color=Black ;
title "Daily reported ethanol consumption for ST2: PID# &ID";

/* days to mark - 1462, 1826,1917,2008, 2099, 2191, 2445 */
proc gplot data=Plot1;
  plot Alc*Day/overlay href = (1462,1826,1947,2068,2191,2445) ctext=black chref=blue whref=1 lhref= 25 vref=90 cvref=red wvref=1
               haxis=axis1 vaxis=axis2  skipmiss;
where PID=&ID;
run; quit;
%mend alcplot;

goptions gsfmode=replace;
%alcplot(486);
goptions gsfmode=append;

data _null_;   set Ref; if PID ne 486 then call execute('%alcplot('||PID||')'); run;
ods rtf close;

******************** STUDY 3 ************************;

data ST3; set combonew.study3alcohol;rename Name=PID;
run;
* List of all the subject IDs;
proc sort data=ST3 out=Ref (keep=PID) nodupkey; by PID; run;
* Set values over 499 to 499;
data Plot2;  set ST3; if Alc>499 then Alc=499; by PID; run;

ods rtf file="S:\Alcstudy\Analysis\Time Horizons\Joe\BL-TLFB\alc plots\ST3_Trimester_ALCPLOTS.rtf" style=Styles.MyGrids;

* Macro to plot 1 subject's data;
%macro alcplot(ID); 
axis1 	order=1462 to 2445 by 121
		label=('Day')
		minor = none ;

axis2 order=0 to 500 by 25
		label=(angle=90 'mL/ethanol Consumed')
		minor=none;
symbol i=join value=none color=Black ;
title "Daily reported ethanol consumption for ST3: PID# &ID";

/* days to mark - 1462, 1826,1917,2008, 2099, 2191, 2445 */
proc gplot data=plot2;
  plot Alc*Day/overlay href = (1462,1826,1947,2068,2191,2445) ctext=black chref=blue whref=1 lhref= 25 vref=90 cvref=red wvref=1
               haxis=axis1 vaxis=axis2  skipmiss;
where PID=&ID;
run; quit;
%mend alcplot;

goptions gsfmode=replace;
%alcplot(2605);
goptions gsfmode=append;

data _null_;   set Ref; if PID ne 2605 then call execute('%alcplot('||PID||')'); run;
ods rtf close;

******************** STUDY 4 ************************;

data ST4; set combonew.study4alcohol; rename Name=PID;
if Name in (2917,2930,2933,2934,2945,2949,2957,2961,2970) then delete; 
run;
* List of all the subject IDs;
proc sort data=ST4 out=Ref (keep=PID) nodupkey; by PID; run;
* Set values over 499 to 499;
data Plot3;  set ST4; if Alc>499 then Alc=499; by PID; run;
proc print data=Ref; run;
ods rtf file="S:\Alcstudy\Analysis\Time Horizons\Joe\BL-TLFB\alc plots\ST4_Trimester_ALCPLOTS.rtf" style=Styles.MyGrids;

* Macro to plot 1 subject's data;
%macro alcplot(ID); 
axis1 	order=1462 to 2445 by 121
		label=('Day')
		minor = none ;

axis2 order=0 to 500 by 25
		label=(angle=90 'mL/ethanol Consumed')
		minor=none;
symbol i=join value=none color=Black ;
title "Daily reported ethanol consumption for ST4: PID# &ID";

/* days to mark - 1462, 1826,1917,2008, 2099, 2191, 2445 */
proc gplot data=plot3;
  plot Alc*Day/overlay href = (1462,1826,1947,2068,2191,2445) ctext=black chref=blue whref=1 lhref= 25 vref=90 cvref=red wvref=1
               haxis=axis1 vaxis=axis2  skipmiss;
where PID=&ID;
run; quit;
%mend alcplot;

goptions gsfmode=replace;
%alcplot(3037);
goptions gsfmode=append;

data _null_;   set Ref; if PID ne 3037 then call execute('%alcplot('||PID||')'); run;
ods rtf close;

******************** STUDY 5 ************************;

data ST5; set combonew.study5alcohol;rename Name=PID;
run;
* List of all the subject IDs;
proc sort data=ST5 out=Ref (keep=PID) nodupkey; by PID; run;
* Set values over 499 to 499;
data Plot4;  set ST5; if Alc>499 then Alc=499; by PID; run;

ods rtf file="S:\Alcstudy\Analysis\Time Horizons\Joe\BL-TLFB\alc plots\ST5_Trimester_ALCPLOTS.rtf" style=Styles.MyGrids;

* Macro to plot 1 subject's data;
%macro alcplot(ID); 
axis1 	order=1462 to 2445 by 121
		label=('Day')
		minor = none ;

axis2 order=0 to 500 by 25
		label=(angle=90 'mL/ethanol Consumed')
		minor=none;
symbol i=join value=none color=Black ;
title "Daily reported ethanol consumption for ST5: PID# &ID";

/* days to mark - 1462, 1826,1917,2008, 2099, 2191, 2445 */
proc gplot data=plot4;
  plot Alc*Day/overlay href = (1462,1826,1947,2068,2191,2445) ctext=black chref=blue whref=1 lhref= 25 vref=90 cvref=red wvref=1
               haxis=axis1 vaxis=axis2  skipmiss;
where PID=&ID;
run; quit;
%mend alcplot;

goptions gsfmode=replace;
%alcplot(5001);
goptions gsfmode=append;

data _null_;   set Ref; if PID ne 5001 then call execute('%alcplot('||PID||')'); run;
ods rtf close;

****************************************Study 6**********************;

libname TLFB  'S:\Alcstudy\Analysis\Time Horizons\Brice\BL-TLFB';
libname source5 'S:\Alcstudy\2013 Combo2345\2013 Combo2345 Analysis\SourceData_Study5';


data TH_GRAPHSET; set TLFB.TLFB_0mo_12mo_sex_quarterly;
run;

* List of all the subject IDs;
proc sort data=TH_GRAPHSET out=Ref (rename=(PID=SubjID_) keep=PID Mo6_DS) nodupkey; 
by PID;
where PID ge 6000;
 run;
* Set values over 499 to 499;

data Plot5;
  set TH_GRAPHSET(rename=(PID=SUBJID_));
if Alc>499 then Alc=499;
by SUBJID_;
where SUBJID_ ge 6000;
run;

proc univariate data= plot5;
var day;
run;


options orientation=landscape;
goptions reset=all
device=actximg/*SASWMF*/ gsfmode=replace 
xmax=8.3in xpixels=800 ymax=6.2in ypixels=600
ftext="arial" ctext=black;
 
ods rtf file="S:\Alcstudy\Analysis\Time Horizons\Joe\BL-TLFB\alc plots\ST6_Trimester_ALCPLOTS.rtf" style=Styles.MyGrids;

* Macro to plot 1 subject's data;
%macro alcplot2(ID,DS); 
axis1 	order=1462 to 2445 by 121
		label=('Day')
		minor = none ;

axis2 order=0 to 500 by 25
		label=(angle=90 'mL/ethanol Consumed')
		minor=none;
symbol i=join value=none color=Black ;
title "Daily reported ethanol consumption for PID# &ID";

/* days to mark - 1462, 1826,1917,2008, 2099, 2191, 2445 */
proc gplot data=Plot5 ;
  plot Alc*Day/overlay href = (1462,1826,1947,2068,2191,2445) ctext=black chref=blue whref=1 lhref= 25 vref=90 cvref=red wvref=1
               haxis=axis1 vaxis=axis2  skipmiss;
where SubjID_=&ID;
run; quit;
%mend alcplot2;

goptions gsfmode=replace;
%alcplot2(6003,1);


goptions gsfmode=append;

data _null_;
  set Ref;
if SubjID_ ne 6003 then call execute('%alcplot2('||SubjID_||','||Mo6_DS||')');
run;

ods rtf close;

However, when I ran it all together while filtering on a more specific criterion (Full_Alc_exp=1), which gives N=412, it takes HOURS.

Can someone help me remedy this issue or explain to me what I am missing?

 

The code for the second part is here:

 

data ALLdata;
set ST2 
ST3
ST4
ST5
TH_Graphset;
run;
libname Sub 'S:\Alcstudy\ComboDataEntry\Joe\Inflation\Subcategories';
data full_alc;
set sub.master_dataset053019;
where full_alc_exp =1;
run;

proc sort data= alldata;
by PID;
run;
Data AlcPlots;
merge  alldata Full_Alc (in=x);
by PID;
if x=1;
run;

proc sql;
create table new as 
select count(distinct(pid)) as PIDcount
from alcplots;
quit;



* List of all the subject IDs;
proc sort data=alcplots out=Ref (keep=PID) nodupkey; by PID; run;
* Set values over 499 to 499;
data PlotAll;  set alcplots; if Alc>499 then Alc=499; by PID; run;

ods rtf file="S:\Alcstudy\Analysis\Time Horizons\Joe\BL-TLFB\alc plots\AllStudies_Trimester_ALCPLOTS.rtf" style=Styles.MyGrids;

* Macro to plot 1 subject's data;
%macro alcplot(ID); 
axis1 	order=1462 to 2445 by 121
		label=('Day')
		minor = none ;

axis2 order=0 to 500 by 25
		label=(angle=90 'mL/ethanol Consumed')
		minor=none;
symbol i=join value=none color=Black ;
title "Daily reported ethanol consumption for all studies: PID# &ID";

/* days to mark - 1462, 1826,1917,2008, 2099, 2191, 2445 */
proc gplot data=Plotall;
  plot Alc*Day/overlay href = (1462,1826,1947,2068,2191,2445) ctext=black chref=blue whref=1 lhref= 25 vref=90 cvref=red wvref=1
               haxis=axis1 vaxis=axis2  skipmiss;
where PID=&ID;
run; quit;
%mend alcplot;

goptions gsfmode=replace;
%alcplot(486);
goptions gsfmode=append;

data _null_;   set Ref; if PID ne 486 then call execute('%alcplot('||PID||')'); run;
ods rtf close;

Am I storing too much in SAS's memory, and is that what is causing it to append so slowly?

ballardw
Super User

No actual example data so I'm not going to spend a lot of time trying to parse the code.

 

However, if the idea is to create a graph of the same variables for each person and/or study then the appropriate approach would more likely be

1) sort the data by Id

2) use BY ID in a plot procedure to create a graph for each Id.

 

3) a WHERE could be in the plot code to reduce the data to specific sets of records
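
As a rough sketch of steps 1) to 3), something like the following might replace the macro and CALL EXECUTE loop. It assumes the PlotAll data set and the PID, Alc and Day variables from the question, and keeps the same axis ranges and reference lines. The NOBYLINE option and #BYVAL in the title are my choice for putting the participant ID in each title, not something taken from the original code.

```sas
/* Sketch only: PlotAll, PID, Alc and Day come from the question. */
proc sort data=PlotAll;
  by PID;
run;

axis1 order=(1462 to 2445 by 121) label=('Day') minor=none;
axis2 order=(0 to 500 by 25) label=(angle=90 'mL/ethanol Consumed') minor=none;
symbol1 i=join value=none color=black;

/* Suppress the default BY line and put the current PID in the title instead */
options nobyline;
title "Daily reported ethanol consumption for all studies: PID# #byval(PID)";

/* One procedure call; BY PID produces one graph per participant */
proc gplot data=PlotAll;
  by PID;
  plot Alc*Day / href=(1462,1826,1947,2068,2191,2445) chref=blue whref=1 lhref=25
                 vref=90 cvref=red wvref=1
                 haxis=axis1 vaxis=axis2 skipmiss;
  /* a WHERE statement could go here to restrict the records plotted */
run;
quit;

options byline;
```

With this layout the data set is read once by a single procedure, instead of being filtered with WHERE once per participant.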

 

Some of your code is generally inefficient. Consider

data ST2; set combonew.study2alcohol;rename Name=PID;
run;
* List of all the subject IDs;
proc sort data=ST2 out=Ref (keep=PID) nodupkey; by PID; run;
* Set values over 499 to 499;
data Plot1;  set ST2; if Alc>499 then Alc=499; by PID; run;

Why not set the ALC value in the first data step?
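
A minimal sketch of what I mean, using the data set and variable names from your code. This assumes nothing else needs the uncapped values; your later step that stacks ST2 through ST5 would have to read these data sets instead.

```sas
/* One pass: rename Name to PID and cap Alc at 499 in the same step */
data Plot1;
  set combonew.study2alcohol(rename=(Name=PID));
  if Alc > 499 then Alc = 499;   /* missing values are left as missing */
run;

/* List of all the subject IDs, built from the same data set */
proc sort data=Plot1 out=Ref(keep=PID) nodupkey;
  by PID;
run;
```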

Also, since you are not using any of the features of BY processing in data Plot1 above, why have the BY statement? It can add a (likely minuscule) amount of time to the data step.

 

If you close all the ODS destinations except the target of interest, such as ODS RTF, things will likely run faster. Otherwise output is created for each open destination.
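
For example, something along these lines. The file name here is a placeholder, not your actual path, and which destinations are open by default depends on your SAS setup.

```sas
/* Close every open destination (LISTING, HTML, ...) before opening RTF */
ods _all_ close;

ods rtf file="alcplots.rtf";   /* placeholder file name */

/* ... plotting code goes here ... */

ods rtf close;

/* Reopen a destination afterwards if you still want regular output */
ods listing;
```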

 

If the goal is to create an RTF document, then you likely don't need to mess around with GSFMODE and such. If there is a specific reason for that option, you'll need to be a bit more explicit.

 

 

 
