I thought I would control PROC TRANSPOSE well, but I'm having a problem with it.
Having NLEVELS=3 for the variable concept I would like to have create 3 columns LC, T and PET_COMM_ with the variable value.
But when executing the code I end up getting the error message
ERROR: The ID value "LC" occurs twice in el mismo grupo BY.
ERROR: The ID value "LC" occurs twice in el mismo grupo BY.
ERROR: The ID value "LC" occurs twice in el mismo grupo BY.
ERROR: The ID value "LC" occurs twice in el mismo grupo BY.
ERROR: The ID value "PET_COMM_" occurs twice in el mismo grupo BY.
ERROR: The ID value "PET_COMM_" occurs twice in el mismo grupo BY.
ERROR: The ID value "PET_COMM_" occurs twice in el mismo grupo BY.
ERROR: The ID value "PET_COMM_" occurs twice in el mismo grupo BY.
data WORK.TEST;
infile datalines dsd truncover;
input MODELO:$21. NAME_STAT:$33. _NAME_:$14. VALUE:32. CONCEPT:$30.;
datalines4;
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC12,-0.005,LC
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC24,-0.01,LC
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC36,-0.015,LC
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC48,-0.02,LC
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC60,-0.02,LC
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T12,-0.005,T
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T24,-0.01,T
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T36,-0.015,T
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T48,-0.02,T
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T60,-0.02,T
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_12,0.5642043742,PET_COMM_
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_24,0.5041646406,PET_COMM_
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_36,0.4441249367,PET_COMM_
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_48,0.3840852925,PET_COMM_
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_60,0.3290455588,PET_COMM_
ODUMMYA FL 2017,ODUMMYA DIESEL,LC12,-0.005,LC
ODUMMYA FL 2017,ODUMMYA DIESEL,LC24,-0.01,LC
ODUMMYA FL 2017,ODUMMYA DIESEL,LC36,-0.015,LC
ODUMMYA FL 2017,ODUMMYA DIESEL,LC48,-0.02,LC
ODUMMYA FL 2017,ODUMMYA DIESEL,LC60,-0.02,LC
ODUMMYA FL 2017,ODUMMYA DIESEL,T12,-0.005,T
ODUMMYA FL 2017,ODUMMYA DIESEL,T24,-0.01,T
ODUMMYA FL 2017,ODUMMYA DIESEL,T36,-0.015,T
ODUMMYA FL 2017,ODUMMYA DIESEL,T48,-0.02,T
ODUMMYA FL 2017,ODUMMYA DIESEL,T60,-0.02,T
ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_12,0.5853021073,PET_COMM_
ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_24,0.5252623736,PET_COMM_
ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_36,0.4652226698,PET_COMM_
ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_48,0.4051830255,PET_COMM_
ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_60,0.3501432919,PET_COMM_
;;;;
PROC SORT DATA=WORK.TEST;
BY MODELO NAME_STAT CONCEPT;
RUN;
PROC TRANSPOSE DATA=WORK.TEST(DROP=_NAME_);
BY MODELO NAME_STAT;
VAR VALUE;
ID CONCEPT;
RUN;
Not its not a rule, just good practice, like indenting lines within blocks, finishing macro variables with a point, ending blocks with run; etc. If I wrote my posts back in uppercase people would think I was rudely shouting at them. For your requirement, what I would suggest is an array and by group - you could do it in three transposes, and then merge together, but one datastep will suffice:
data WORK.TEST; infile datalines dsd truncover; input modelo:$21. name_stat:$33. _name_:$14. value:32. concept:$30.; datalines; ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC12,-0.005,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC24,-0.01,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC36,-0.015,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC48,-0.02,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC60,-0.02,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T12,-0.005,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T24,-0.01,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T36,-0.015,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T48,-0.02,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T60,-0.02,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_12,0.5642043742,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_24,0.5041646406,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_36,0.4441249367,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_48,0.3840852925,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_60,0.3290455588,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,LC12,-0.005,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC24,-0.01,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC36,-0.015,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC48,-0.02,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC60,-0.02,LC ODUMMYA FL 2017,ODUMMYA DIESEL,T12,-0.005,T ODUMMYA FL 2017,ODUMMYA DIESEL,T24,-0.01,T ODUMMYA FL 2017,ODUMMYA DIESEL,T36,-0.015,T ODUMMYA FL 2017,ODUMMYA DIESEL,T48,-0.02,T ODUMMYA FL 2017,ODUMMYA DIESEL,T60,-0.02,T ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_12,0.5853021073,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_24,0.5252623736,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_36,0.4652226698,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_48,0.4051830255,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_60,0.3501432919,PET_COMM_ ; run; data test; set test; ord=input(compress(_name_,,"kd"),best.); run; proc sort data=test; by modelo name_stat ord concept; run; data want (keep=modelo name_stat ord lc t pet_comm_); set test; retain lc t pet_comm_; by modelo name_stat ord; if first.ord then call missing(lc,t,pet_comm_); if concept="LC" then lc=value; if concept="T" then t=value; if concept="PET_COMM_" then pet_comm_=value; if last.ord then output; run;
Do note that I have created a new variable called ord, this is necessary otherwise there is not way of knowing which records should all be on the same line, I assume that LC12, should appear on the same row as T12 for instance.
Look at the data you posted:
ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC12,-0.005,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC24,-0.01,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC36,-0.015,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC48,-0.02,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC60,-0.02,LC
You are transposing this data by
BY MODELO NAME_STAT;
As we can see modelo + name_stat has five records all with the same LC result. SAS cannot transpose that. You have not provided a want output so its hard to say, but perhaps a datastep before which adds an incremental number to the LC column so each are distinct would solve it:
data test; set test; by modelo name_stat concept; retain c; c=ifn(first.concept,1,c+1); concept=cats(concept,put(c,best.)); run;
Put that after your sort.
And please stop CODING ALL IN UPPERCASE!
Thanks @RW9.
Is it an unwritten rule that coding in UPPERCASE is a NO GO?
I want to have this output:
If your data are already in the order shown in your example, then you could use something like:
data need;
set test;
by modelo name_stat concept notsorted;
if first.concept then order=1;
else order+1;
run;
PROC SORT DATA=need;
BY MODELO NAME_STAT order;
RUN;
PROC TRANSPOSE DATA=WORK.need out=want(DROP=_NAME_ order);
BY MODELO NAME_STAT order;
VAR VALUE;
ID CONCEPT;
RUN;
Art, CEO, AnalystFinder.com
I like @art297 solution, but instead of assigning order values sequentially use the _NAME_ to get the order values. If there are missing values then assigning sequential numbers might result in different results.
data need;
set test;
order=substr(_name_,length(_name_)-1,2);
run;
With missing T12 row you get this result with order values from _name_:
If using sequential order then missing value will the at last record.
Thanks, very good
Not its not a rule, just good practice, like indenting lines within blocks, finishing macro variables with a point, ending blocks with run; etc. If I wrote my posts back in uppercase people would think I was rudely shouting at them. For your requirement, what I would suggest is an array and by group - you could do it in three transposes, and then merge together, but one datastep will suffice:
data WORK.TEST; infile datalines dsd truncover; input modelo:$21. name_stat:$33. _name_:$14. value:32. concept:$30.; datalines; ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC12,-0.005,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC24,-0.01,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC36,-0.015,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC48,-0.02,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,LC60,-0.02,LC ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T12,-0.005,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T24,-0.01,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T36,-0.015,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T48,-0.02,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,T60,-0.02,T ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_12,0.5642043742,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_24,0.5041646406,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_36,0.4441249367,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_48,0.3840852925,PET_COMM_ ODUMMYA COMBI FL 2017,ODUMMYA COMBI DIESEL,PET_COMM_60,0.3290455588,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,LC12,-0.005,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC24,-0.01,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC36,-0.015,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC48,-0.02,LC ODUMMYA FL 2017,ODUMMYA DIESEL,LC60,-0.02,LC ODUMMYA FL 2017,ODUMMYA DIESEL,T12,-0.005,T ODUMMYA FL 2017,ODUMMYA DIESEL,T24,-0.01,T ODUMMYA FL 2017,ODUMMYA DIESEL,T36,-0.015,T ODUMMYA FL 2017,ODUMMYA DIESEL,T48,-0.02,T ODUMMYA FL 2017,ODUMMYA DIESEL,T60,-0.02,T ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_12,0.5853021073,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_24,0.5252623736,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_36,0.4652226698,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_48,0.4051830255,PET_COMM_ ODUMMYA FL 2017,ODUMMYA DIESEL,PET_COMM_60,0.3501432919,PET_COMM_ ; run; data test; set test; ord=input(compress(_name_,,"kd"),best.); run; proc sort data=test; by modelo name_stat ord concept; run; data want (keep=modelo name_stat ord lc t pet_comm_); set test; retain lc t pet_comm_; by modelo name_stat ord; if first.ord then call missing(lc,t,pet_comm_); if concept="LC" then lc=value; if concept="T" then t=value; if concept="PET_COMM_" then pet_comm_=value; if last.ord then output; run;
Do note that I have created a new variable called ord, this is necessary otherwise there is not way of knowing which records should all be on the same line, I assume that LC12, should appear on the same row as T12 for instance.
Awesome. AWESOME 🙂
Ok, I understand your code and it's a fine solution.
And regarding the good coding practice I didn't realize that it could be offensive. For my eyes it's cleaner to have all in uppercase.
Bye, Arne
I post as well the proc sql alternative to your array solution:
data test;
set test;
ord=input(compress(_name_,,"kd"),best.);
run;
proc sort data=test;
by modelo name_stat ord concept;
run;
proc sql;
create table work.test3 as
select modelo, name_stat, ord,
sum (case when concept = "lc" then value else 0 end) as lc format percent9.2,
sum (case when concept = "t" then value else 0 end) as t format percent9.2,
sum (case when concept = "pet_comm_" then value else 0 end) as pet_comm_ format percent9.2
from test
group by modelo, name_stat, ord;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.