Here is the "complete" program I think. You will still need to have proper SAS names when the extra text is added. The technique of creating a set of dummy observations to control how TRANSPOSE create variable names has been discussed here in the past. If you write many SAS program will will probably need it in future.
Edit:
It was brought to my attention that I was using the wrong _TYPE_ for the difference.
data fy11;
input Agency:$10. (Safety_Score Recreation_score Contacts_score)(:percent.);
cards;
Agency1 98% 95% 95%
Agency2 100% 96% 96%
Agency3 94% 91% 91%
Agency4 92% 92% 92%
Agency5 95% 95% 95%
Agency6 96% 83% 83%
Agency7 99% 87% 87%
Agency8 93% 88% 88%
Agency9 93% 86% 86%
Agency10 96% 87% 87%
;;;;
run;
proc sort sortseq=linguistic(numeric_collation=on);
by agency;
run;
data fy12;
input Agency:$10. (Safety_Score Recreation_score Contacts_score)(:percent.);
cards;
Agency1 98% 93% 93%
Agency2 100% 96% 96%
Agency3 94% 93% 91%
Agency4 93% 92% 100%
Agency5 95% 95% 95%
Agency6 96% 83% 83%
Agency7 99% 100% 87%
Agency8 93% 88% 88%
Agency9 93% 86% 93%
Agency10 100% 87% 87%
;;;;
run;
proc sort sortseq=linguistic(numeric_collation=on);
by agency;
run;
%let var=Safety_Score--Contacts_score;
proc compare noprint base=fy11 compare=fy12 out=outall outall;
by agency;
var &var;
run;
proc print;
run;
proc format;
value $tcode 'BASE'='FY11' 'COMPARE'='FY12' 'DIF'='PDIF';
run;
proc transpose data=outall out=tall;
where _type_ ne 'PERCENT';
by agency _type_ notsorted;
var &var;
run;
proc transpose data=outall(obs=0) out=varlist;
var &var;
run;
data varlist;
set varlist;
if 0 then set outall(keep=_type_);
do _type_ = 'BASE','COMPARE','DIF';
output;
end;
run;
data tallV / view=tallV;
set varlist tall;
run;
proc transpose
data=tallV
out=wide(where=(not missing(agency)))
delim=_;
by agency notsorted;
var col1;
id _type_ _name_;
format _type_ $tcode.;
run;
proc print;
run;
Message was edited by: data _null_
hi ... here's another method with one data step (assumes data sets are sorted by agency and the variables are in the same order in the two data sets)
it uses the SET BY idea posted by MKEINTZ (and the INPUT buffer idea from PGStats posted a couple weeks ago)
filename nosee dummy;
data ax (drop=x: y: j);
length type $4;
file nosee;
* set up input buffer ... from PGStats;
if _n_ eq 1 then input @;
do until (last.agency);
* set two data sets with by ... from mkeintz;
set fy11 fy12 indsname=ds;
by agency;
type = scan(ds,2);
output;
array z(*) _numeric_;
* write values to output buffer;
put (_numeric_) (:) @;
end;
* read values, compute difference;
_infile_ = _file_;
input @1 x1-x3 y1-y3 @@;
array x(3);
array y(3);
do j=1 to dim(z);
z(j) = y(j) - x(j);
end;
type='PDIF';
output;
* clear output buffer;
put;
datalines;
*
;
* place AGENCY10 at the end ... data _null_'s idea;
proc sort data=ax sortseq=linguistic(numeric_collation=on);
by agency;
run;
partial output ...
Safety_ Recreation_ Contacts_
Obs type agency Score score score
1 FY11 Agency1 0.98 0.95 0.95
2 FY12 Agency1 0.98 0.93 0.93
3 PDIF Agency1 0.00 -0.02 -0.02
4 FY11 Agency2 1.00 0.96 0.96
5 FY12 Agency2 1.00 0.96 0.96
6 PDIF Agency2 0.00 0.00 0.00
7 FY11 Agency3 0.94 0.91 0.91
8 FY12 Agency3 0.94 0.93 0.91
Very interesting, but you still need another step to have one wide row per agency which I think is what the OP was looking for.
But I think that the more important concern about the technique of writing to _FILE_ as a temporary storage location is that you need to be certain that no precision is lost in the PUT statement (but of course you could use the RB8.0 format to preserve precision) In this example of course there's no problem of that sort.
Also the OP mentioned 100's of vars, which I presume means either a much longer INPUT statement, or the creation of a macrovar to use in the INPUT statement. (Struck out this sentence - it is wrong - MK).
Hi,
the line "%put &list;" produced the error messages. it doesn't affect the outcome. I have deleted "%put &list4" from the code.
Since you have so many and so long variables, Maybe you need substr it to fit the requirement of 32 long variable .
data FY11; input Agency $ (Safety_Score Recreation_score Contacts_score) (: percent.) ; cards; Agency1 98% 95% 95% Agency2 100% 96% 96% ; run; data FY12; input Agency $ (Safety_Score Recreation_score Contacts_score) (: percent.) ; cards; Agency1 98% 93% 93% Agency2 100% 96% 96% ; run; proc sql noprint; select substr(cats('DIFF_',name),1,32) into : list separated by ' ' from dictionary.columns where libname='WORK' and memname eq 'FY11' and upcase(name) ne 'AGENCY' ; quit; data _null_; set sashelp.vcolumn(keep=libname memname name where=(libname='WORK' and memname in ('FY11' 'FY12') and upcase(name) ne 'AGENCY')) end=last; if _n_ eq 1 then call execute('proc datasets library=work nolist; '); if memname ne lag(memname) then call execute('modify '||memname||'; '); call execute('rename '||name||'='||substr(cats(memname,'_',name),1,32)||';'); if last then call execute('quit;'); run; data want; merge FY11 FY12; by agency; array _a{*} FY11: ; array _b{*} FY12: ; array _c{*} &list ; do _n_=1 to dim(_a); _c{_n_}=_a{_n_}-_b{_n_}; end; run;
Ksharp
I think this is taks where a good case can be made for the simplification of using SET with BY, as in:
%let vars=safety_score--contacts_score;
proc transpose data=fy11 (keep=&vars obs=0) out=vnames;
run;
data _null_;
set vnames end=end_of_names;
length allnames $32000;
retain allnames ' ';
allnames=catx(' ',allnames,cats('FY11_',_name_),cats('FY12_',_name_),cats('PDIF_',_name_));
if end_of_names then call symput('allvars',trim(allnames));
run;
data final (drop=&vars);
do until (last.agency);
set fy11 (in=in11) fy12 (in=in12);
by agency;
array av {*} &allvars;
array v {*} &vars;
if in11 then do _n_=1 to dim(v);
av{3*_n_-2}=v{_n_};
end;
else if in12 then do _n_=1 to dim(v);
av{3*_n_-1}=v{_n_};
av{3*_n_} = av{3*_n_-1} - av{3*_n_-2};
end;
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.