I need to create the weekly report including the histograms for showing all the individual site info. Each week, new sites are added and I have to manually modify the code. I am wondering whether there is a way to program to create the data files including 13 sites in each histogram ordered by the Site ID.
This is the current code:
/* Create three datasets split by Site - 13 sites per graph*/
data plot_iq_hist_split1 plot_iq_hist_split2 plot_iq_hist_split3 plot_iq_hist_split4;
set plot_iq_hist;
if site IN ('001', '002', '004', '005', '008','013', '018','020', '021', '022', '025', '026','027') then output plot_iq_hist_split1;
else if site IN ('029', '030' '032', '033','035','038', '040', '041', '042', '043', '044','046', '051')
then output plot_iq_hist_split2;
else if site IN ('052', '054', '056', '057', '059', '061', '064', '065', '066', '068', '072', '074', '080') then output plot_iq_hist_split3;
else output plot_iq_hist_split4;
run;
Is it really necessary to split the data into four different datasets? Why not just make a variable named Split, with values 1-4?
You didn't provide sample data, but it could be done like:
data want ;
set plot_iq_hist ;
by site ;
if first.site then do ;
sitecounter++1 ;
if sitecounter=1 then split++1 ;
end ;
if last.site and sitecounter=13 then sitecounter=0 ;
run ;
If you have list of unique SITE values in a dataset then you can generate a SPLIT number easily with a DO loop around a SET statement. So if you have dataset name SITES with variable SITE in the order you want you can use this data step to make a dataset with the SPLIT value for each SITE.
data site_split;
split+1;
do i=1 to 13 ;
set site;
output;
end;
run;
Now you can use that dataset to generate your code.
filename code temp;
data _null_ ;
set site_split end=eof;
by split;
file code lrecl=80 ;
if _n_=1 then put 'data ' @;
if first.split then put 'plot_iq_hist_split' split @;
if eof then put ';' / 'set plot_iq_hist;' ;
run;
data _null_;
set site_split end=eof;
by split;
file code mod lrecl=80 ;
if first.split then do;
if _n_>1 then put 'else ' @;
put 'if site in (' @;
end;
put site :$quote. @;
if last.split then put ') then output plot_iq_hist_split' split ';' ;
if eof then put 'run;' ;
run;
%include code / source2;
Is it really necessary to split the data into four different datasets? Why not just make a variable named Split, with values 1-4?
You didn't provide sample data, but it could be done like:
data want ;
set plot_iq_hist ;
by site ;
if first.site then do ;
sitecounter++1 ;
if sitecounter=1 then split++1 ;
end ;
if last.site and sitecounter=13 then sitecounter=0 ;
run ;
The one I posted before made no sense whatsoever.
%macro split_histo_data(indata=, insheet=);
proc import out=plot_iq_hist dbms=xlsx replace
datafile="&indata";
sheet="&insheet";
run;
proc sql noprint;
select ceil(count(distinct site)/13) into :ndatasets trimmed from plot_iq_hist;
quit;
proc sort data=plot_iq_hist; by site; run;
DATA
%do i=1 %to &ndatasets;
plot_iq_hist_split&i
%end;
;
set plot_iq_hist;
by site;
sitenum+first.site;
%do i=1 %to &ndatasets;
if ceil(sitenum/13)=&i then output plot_iq_hist_split&i;
%end;
run;
%mend; *split_histo_data();
%split_histo_data(indata=%str(/path/to/data/exceldata.xlsx), insheet=histodata);
Instead of using SGPANEL, you could use SGPLOT with a BY statement, that will make one histogram per by-group.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.