I thought I would control PROC TRANSPOSE well, but I'm having a problem with it.
Having NLEVELS=3 for the variable concept I would like to have create 3 columns LC, T and PET_COMM_ with the variable value.
But when executing the code I end up getting the error message
ERROR: The ID value "LC" occurs twice in el mismo grupo BY.
ERROR: The ID value "LC" occurs twice in el mismo grupo BY.
ERROR: The ID value "LC" occurs twice in el mismo grupo BY.
ERROR: The ID value "LC" occurs twice in el mismo grupo BY.
ERROR: The ID value "PET_COMM_" occurs twice in el mismo grupo BY.
ERROR: The ID value "PET_COMM_" occurs twice in el mismo grupo BY.
ERROR: The ID value "PET_COMM_" occurs twice in el mismo grupo BY.
ERROR: The ID value "PET_COMM_" occurs twice in el mismo grupo BY.
data WORK.TEST;
infile datalines dsd truncover;
input MODELO:$21. NAME_STAT:$33. _NAME_:$14. VALUE:32. CONCEPT:$30.;
datalines4;
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC12,-0.005,LC
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC24,-0.01,LC
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC36,-0.015,LC
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC48,-0.02,LC
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC60,-0.02,LC
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T12,-0.005,T
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T24,-0.01,T
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T36,-0.015,T
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T48,-0.02,T
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T60,-0.02,T
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_12,0.5642043742,PET_COMM_
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_24,0.5041646406,PET_COMM_
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_36,0.4441249367,PET_COMM_
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_48,0.3840852925,PET_COMM_
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_60,0.3290455588,PET_COMM_
ODUMMYA FL 2017,ODUMMYA DIESEL,LC12,-0.005,LC
ODUMMYA FL 2017,ODUMMYA DIESEL,LC24,-0.01,LC
ODUMMYA FL 2017,ODUMMYA DIESEL,LC36,-0.015,LC
ODUMMYA FL 2017,ODUMMYA DIESEL,LC48,-0.02,LC
ODUMMYA FL 2017,ODUMMYA DIESEL,LC60,-0.02,LC
ODUMMYA FL 2017,ODUMMYA DIESEL,T12,-0.005,T
ODUMMYA FL 2017,ODUMMYA DIESEL,T24,-0.01,T
ODUMMYA FL 2017,ODUMMYA DIESEL,T36,-0.015,T
ODUMMYA FL 2017,ODUMMYA DIESEL,T48,-0.02,T
ODUMMYA FL 2017,ODUMMYA DIESEL,T60,-0.02,T
ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_12,0.5853021073,PET_COMM_
ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_24,0.5252623736,PET_COMM_
ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_36,0.4652226698,PET_COMM_
ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_48,0.4051830255,PET_COMM_
ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_60,0.3501432919,PET_COMM_
;;;;
PROC SORT DATA=WORK.TEST;
BY MODELO NAME_STAT CONCEPT;
RUN;
PROC TRANSPOSE DATA=WORK.TEST(DROP=_NAME_);
BY MODELO NAME_STAT;
VAR VALUE;
ID CONCEPT;
RUN;
Not its not a rule, just good practice, like indenting lines within blocks, finishing macro variables with a point, ending blocks with run; etc. If I wrote my posts back in uppercase people would think I was rudely shouting at them. For your requirement, what I would suggest is an array and by group - you could do it in three transposes, and then merge together, but one datastep will suffice:
data WORK.TEST; infile datalines dsd truncover; input modelo:$21. name_stat:$33. _name_:$14. value:32. concept:$30.; datalines; ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC12,-0.005,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC24,-0.01,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC36,-0.015,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC48,-0.02,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC60,-0.02,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T12,-0.005,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T24,-0.01,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T36,-0.015,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T48,-0.02,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T60,-0.02,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_12,0.5642043742,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_24,0.5041646406,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_36,0.4441249367,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_48,0.3840852925,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_60,0.3290455588,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,LC12,-0.005,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC24,-0.01,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC36,-0.015,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC48,-0.02,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC60,-0.02,LC ODUMMYA FL 2017,ODUMMYA DIESEL,T12,-0.005,T ODUMMYA FL 2017,ODUMMYA DIESEL,T24,-0.01,T ODUMMYA FL 2017,ODUMMYA DIESEL,T36,-0.015,T ODUMMYA FL 2017,ODUMMYA DIESEL,T48,-0.02,T ODUMMYA FL 2017,ODUMMYA DIESEL,T60,-0.02,T ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_12,0.5853021073,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_24,0.5252623736,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_36,0.4652226698,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_48,0.4051830255,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_60,0.3501432919,PET_COMM_ ; run; data test; set test; ord=input(compress(_name_,,"kd"),best.); run; proc sort data=test; by modelo name_stat ord concept; run; data want (keep=modelo name_stat ord lc t pet_comm_); set test; retain lc t pet_comm_; by modelo name_stat ord; if first.ord then call missing(lc,t,pet_comm_); if concept="LC" then lc=value; if concept="T" then t=value; if concept="PET_COMM_" then pet_comm_=value; if last.ord then output; run;
Do note that I have created a new variable called ord, this is necessary otherwise there is not way of knowing which records should all be on the same line, I assume that LC12, should appear on the same row as T12 for instance.
Look at the data you posted:
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC12,-0.005,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC24,-0.01,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC36,-0.015,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC48,-0.02,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC60,-0.02,LC
You are transposing this data by
BY MODELO NAME_STAT;
As we can see modelo + name_stat has five records all with the same LC result. SAS cannot transpose that. You have not provided a want output so its hard to say, but perhaps a datastep before which adds an incremental number to the LC column so each are distinct would solve it:
data test; set test; by modelo name_stat concept; retain c; c=ifn(first.concept,1,c+1); concept=cats(concept,put(c,best.)); run;
Put that after your sort.
And please stop CODING ALL IN UPPERCASE!
Thanks @RW9.
Is it an unwritten rule that coding in UPPERCASE is a NO GO?
I want to have this output:
If your data are already in the order shown in your example, then you could use something like:
data need;
set test;
by modelo name_stat concept notsorted;
if first.concept then order=1;
else order+1;
run;
PROC SORT DATA=need;
BY MODELO NAME_STAT order;
RUN;
PROC TRANSPOSE DATA=WORK.need out=want(DROP=_NAME_ order);
BY MODELO NAME_STAT order;
VAR VALUE;
ID CONCEPT;
RUN;
Art, CEO, AnalystFinder.com
I like @art297 solution, but instead of assigning order values sequentially use the _NAME_ to get the order values. If there are missing values then assigning sequential numbers might result in different results.
data need;
set test;
order=substr(_name_,length(_name_)-1,2);
run;
With missing T12 row you get this result with order values from _name_:
If using sequential order then missing value will the at last record.
Thanks, very good
Not its not a rule, just good practice, like indenting lines within blocks, finishing macro variables with a point, ending blocks with run; etc. If I wrote my posts back in uppercase people would think I was rudely shouting at them. For your requirement, what I would suggest is an array and by group - you could do it in three transposes, and then merge together, but one datastep will suffice:
data WORK.TEST; infile datalines dsd truncover; input modelo:$21. name_stat:$33. _name_:$14. value:32. concept:$30.; datalines; ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC12,-0.005,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC24,-0.01,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC36,-0.015,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC48,-0.02,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC60,-0.02,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T12,-0.005,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T24,-0.01,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T36,-0.015,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T48,-0.02,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T60,-0.02,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_12,0.5642043742,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_24,0.5041646406,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_36,0.4441249367,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_48,0.3840852925,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_60,0.3290455588,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,LC12,-0.005,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC24,-0.01,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC36,-0.015,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC48,-0.02,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC60,-0.02,LC ODUMMYA FL 2017,ODUMMYA DIESEL,T12,-0.005,T ODUMMYA FL 2017,ODUMMYA DIESEL,T24,-0.01,T ODUMMYA FL 2017,ODUMMYA DIESEL,T36,-0.015,T ODUMMYA FL 2017,ODUMMYA DIESEL,T48,-0.02,T ODUMMYA FL 2017,ODUMMYA DIESEL,T60,-0.02,T ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_12,0.5853021073,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_24,0.5252623736,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_36,0.4652226698,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_48,0.4051830255,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_60,0.3501432919,PET_COMM_ ; run; data test; set test; ord=input(compress(_name_,,"kd"),best.); run; proc sort data=test; by modelo name_stat ord concept; run; data want (keep=modelo name_stat ord lc t pet_comm_); set test; retain lc t pet_comm_; by modelo name_stat ord; if first.ord then call missing(lc,t,pet_comm_); if concept="LC" then lc=value; if concept="T" then t=value; if concept="PET_COMM_" then pet_comm_=value; if last.ord then output; run;
Do note that I have created a new variable called ord, this is necessary otherwise there is not way of knowing which records should all be on the same line, I assume that LC12, should appear on the same row as T12 for instance.
Awesome. AWESOME 🙂
Ok, I understand your code and it's a fine solution.
And regarding the good coding practice I didn't realize that it could be offensive. For my eyes it's cleaner to have all in uppercase.
Bye, Arne
I post as well the proc sql alternative to your array solution:
data test;
set test;
ord=input(compress(_name_,,"kd"),best.);
run;
proc sort data=test;
by modelo name_stat ord concept;
run;
proc sql;
create table work.test3 as
select modelo, name_stat, ord,
sum (case when concept = "lc" then value else 0 end) as lc format percent9.2,
sum (case when concept = "t" then value else 0 end) as t format percent9.2,
sum (case when concept = "pet_comm_" then value else 0 end) as pet_comm_ format percent9.2
from test
group by modelo, name_stat, ord;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.