I need to create the weekly report including the histograms for showing all the individual site info. Each week, new sites are added and I have to manually modify the code. I am wondering whether there is a way to program to create the data files including 13 sites in each histogram ordered by the Site ID.
This is the current code:
/* Create three datasets split by Site - 13 sites per graph*/
data plot_iq_hist_split1 plot_iq_hist_split2 plot_iq_hist_split3 plot_iq_hist_split4;
set plot_iq_hist;
if site IN ('001', '002', '004', '005', '008','013', '018','020', '021', '022', '025', '026','027') then output plot_iq_hist_split1;
else if site IN ('029', '030' '032', '033','035','038', '040', '041', '042', '043', '044','046', '051')
then output plot_iq_hist_split2;
else if site IN ('052', '054', '056', '057', '059', '061', '064', '065', '066', '068', '072', '074', '080') then output plot_iq_hist_split3;
else output plot_iq_hist_split4;
run;
Is it really necessary to split the data into four different datasets? Why not just make a variable named Split, with values 1-4?
You didn't provide sample data, but it could be done like:
data want ;
set plot_iq_hist ;
by site ;
if first.site then do ;
sitecounter++1 ;
if sitecounter=1 then split++1 ;
end ;
if last.site and sitecounter=13 then sitecounter=0 ;
run ;
If you have list of unique SITE values in a dataset then you can generate a SPLIT number easily with a DO loop around a SET statement. So if you have dataset name SITES with variable SITE in the order you want you can use this data step to make a dataset with the SPLIT value for each SITE.
data site_split;
split+1;
do i=1 to 13 ;
set site;
output;
end;
run;
Now you can use that dataset to generate your code.
filename code temp;
data _null_ ;
set site_split end=eof;
by split;
file code lrecl=80 ;
if _n_=1 then put 'data ' @;
if first.split then put 'plot_iq_hist_split' split @;
if eof then put ';' / 'set plot_iq_hist;' ;
run;
data _null_;
set site_split end=eof;
by split;
file code mod lrecl=80 ;
if first.split then do;
if _n_>1 then put 'else ' @;
put 'if site in (' @;
end;
put site :$quote. @;
if last.split then put ') then output plot_iq_hist_split' split ';' ;
if eof then put 'run;' ;
run;
%include code / source2;
Is it really necessary to split the data into four different datasets? Why not just make a variable named Split, with values 1-4?
You didn't provide sample data, but it could be done like:
data want ;
set plot_iq_hist ;
by site ;
if first.site then do ;
sitecounter++1 ;
if sitecounter=1 then split++1 ;
end ;
if last.site and sitecounter=13 then sitecounter=0 ;
run ;
The one I posted before made no sense whatsoever.
%macro split_histo_data(indata=, insheet=);
proc import out=plot_iq_hist dbms=xlsx replace
datafile="&indata";
sheet="&insheet";
run;
proc sql noprint;
select ceil(count(distinct site)/13) into :ndatasets trimmed from plot_iq_hist;
quit;
proc sort data=plot_iq_hist; by site; run;
DATA
%do i=1 %to &ndatasets;
plot_iq_hist_split&i
%end;
;
set plot_iq_hist;
by site;
sitenum+first.site;
%do i=1 %to &ndatasets;
if ceil(sitenum/13)=&i then output plot_iq_hist_split&i;
%end;
run;
%mend; *split_histo_data();
%split_histo_data(indata=%str(/path/to/data/exceldata.xlsx), insheet=histodata);
Instead of using SGPANEL, you could use SGPLOT with a BY statement, that will make one histogram per by-group.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.