Solved: Re: proc transpose on three columns

SASdevAnneMarie · Posted 03-26-2022 09:07 AM

Hello Experts,

My data is :

Prod	NOM_Mode de gestion	Mode de gestion	Taux de retro mg	Nombre_de_Profil
TF1	PROFIL SOFT	0,0225	0,0082	2
TF1	PROFIL TONIC	0,0224	0,0079	2
TF3	PROFIL DYN	0,0225	0,0082	2
TF3	PROFIL TONIC	0,0224	0,0079	2

I would like to transpose this data and include the number in column name :

Prod	NOM_Mode de gestion n1	Mode de gestion n1	Taux de retro mg n1	NOM_Mode de gestion n2	Mode de gestion n2	Taux de retro mg n2	Nombre_de_Profil
TF1	PROFIL SOFT	0,0225	0,0082	PROFIL TONIC	0,0224	0,0079	2
TF3	PROFIL DYN	0,0225	0,0082	PROFIL TONIC	0,0224	0,0079	2

I tested the proc transpose in different ways, but don't have the wanted result.

Do you know please, if I can have this type of wanted data via proc transpose or I need to create a macro programme ?

My code is :



proc sort data=ACCORD7bis;
	by Prod "NOM_Mode de gestion"n;
run;


data ACCORD8;
	set ACCORD7bis;
	by Prod "NOM_Mode de gestion"n;
	retain N;

	if first.Prod then
		do;
			N = 1;
			output;
		end;
	else
		do;
			N = N + 1;
			output;
		end;
run;


data ACCORD9;
	set ACCORD8;
	/***New columns with values 
	NOM_Mode de gestion n1	
	Mode de gestion n1	
	Taux de retro mg n1	NOM_Mode de gestion n2.......***/
	"NOM_Mode de gestion_bis"n=("NOM_Mode de gestion n")||compress(N);
	"Mode de gestion_bis"n=("Mode de gestion n")||compress(N);
	"Taux de retro mg_bis"n=("Taux de retro mg n")||compress(N);
run;



proc transpose data=ACCORD9 out=ACCORD10(drop=_LABEL_ _NAME_);
  by Prod ;
  id "NOM_Mode de gestion_bis"n;
  var "NOM_Mode de gestion"n;
  
run;


proc transpose data=ACCORD9 out=ACCORD11(drop=_LABEL_ _NAME_);
  by Prod ;
  id "Mode de gestion_bis"n;
  var "Mode de gestion"n;  
run;


proc transpose data=ACCORD9 out=ACCORD12(drop=_LABEL_ _NAME_);
  by Prod ;
  id "Taux de retro mg_bis"n;
  var "Taux de retro mg"n;  
run;

proc sort data=ACCORD10;
	by Prod;
run;

proc sort data=ACCORD11;
	by Prod;
run;

proc sort data=ACCORD12;
	by Prod;
run;

data ACCORD13;
	merge ACCORD10 ACCORD11 ACCORD12;
	by Prod;
run;

Thank you for your guidance !

Tom · Posted 03-27-2022 11:37 AM

It is a lot easier to just use PROC SUMMARY instead. https://support.sas.com/resources/papers/proceedings10/102-2010.pdf

First let's convert that listing back into an actual dataset so we have something program with. Let's remove the spaces from the variable names so they are a lot easier to work with. Note if you set the VALIDVARNAME option to V7 instead of ANY then tools like PROC IMPORT won' generate such goofy names to begin with.

data have;
  input Prod $ NOM_Mode_de_gestion $ Mode_de_gestion $ Taux_de_retro_mg  Nombre_de_Profil;
cards;
TF1 PROFIL SOFT 0.0225 0.0082 2
TF1 PROFIL TONIC 0.0224 0.0079 2
TF3 PROFIL DYN 0.0225 0.0082 2
TF3 PROFIL TONIC 0.0224 0.0079 2
;

To use PROC SUMMARY to generate the wide dataset you need to know how many repetitions there are. You can use PROC SQL to count for you.

proc sql noprint;
select max(count) into :ngroups
from (select prod,count(*) as count from have group by prod)
;
quit;

Then just use that NGROUPS macro variable where you tell PROC SUMMARY how many copies of the variables you want per by group.

proc summary data=have ;
  by prod;
  output out=want idgroup(out[&ngroups] (NOM_Mode_de_gestion Mode_de_gestion Taux_de_retro_mg  Nombre_de_Profil)=);
run;

View solution in original post

Sajid01 · Posted 03-26-2022 07:05 PM

Hello @SASdevAnneMarie
Can you please post your code and the log?

SASdevAnneMarie · Posted 03-27-2022 08:02 AM

Thank you Sajid01,
I forgt to dd the code, the question is updated.

ballardw · Posted 03-27-2022 03:01 AM

Transpose doesn't do well with transposing character and numeric variables in a single call.

Are there only ever 2 records in each Prod variable group, or will it vary a lot depending on the actual Prod?

If you know exactly how many groups there might be then perhaps an array based solution is feasible.

data example;
   input group $ textvar $ numvar ;
   
datalines;
AA  bb  1.23
AA  cc  44
BB  abc 0.222
BB  pdq 99
;

data want;
   set example;
   by group;
   length textvar1-textvar2 $ 3;
   Retain index textvar1-textvar2  numvar1-numvar2;
   array t (*) textvar1-textvar2 ;
   array n (*) numvar1-numvar2 ;
   if first.group then call missing(index, of t(*), of n(*));
   index+1;
   t[index]= textvar;
   n[index]= numvar ;
   if last.group then output;
   drop index textvar numvar;
run;

You would need an array for each variable that you need to create the 1,2,3 versions.

However, this only works if you know the maximum number of observations that might appear in all of the groups.

Also, the order of the variables will be different. If the order of the variables is important than search the forum for the many examples.

Does Nomber_de_profil ever vary between records in the same Prod group? If so, you need to provide examples and the rule(s) for determining which value is kept.

And just what is to be done with that somewhat ugly resulting data set?

SASdevAnneMarie · Posted 03-27-2022 08:07 AM

Thank you ballardw,

Sorry, I forgot to add the code, the question is updated.

I wondering how to reduce the 3 proc transpose to one ?

I do this with my data for the odsout.object() print later : I would like to have one line per observation and replace all the &NOM_Mode_de_gest_n1. by loop on n%i


%if &NOM_Mode_de_gest_n1. ne "" %then %do;
            obj.row_start();        
            obj.format_cell(text: "&NOM_Mode_de_gest_n1. (moyenne)",  overrides: "just=left color=black font_size=10pt font_weight=medium  cellheight=0.4cm leftmargin=1cm cellwidth=8.9cm fontfamily=arial");
			obj.format_cell(data: "&Mode_de_gestion_n1.", overrides: "color=black font_size=10pt font_weight=medium just=left  cellheight=0.4cm fontfamily=arial");
            obj.row_end();
			obj.row_start();  
			obj.format_cell(data: "&Taux_de_retro_mg_n1.", overrides: "color=black font_size=10pt font_weight=medium just=left font_style=italic cellheight=0.4cm fontfamily=arial");
            obj.row_end();
%end;
%if &NOM_Mode_de_gest_n2. ne "" %then %do;
           obj.row_start();        
            obj.format_cell(text: "&NOM_Mode_de_gest_n2. (moyenne)",  overrides: "just=left color=black font_size=10pt font_weight=medium  cellheight=0.4cm leftmargin=1cm cellwidth=8.9cm fontfamily=arial");
			obj.format_cell(data: "&Mode_de_gestion_n2.", overrides: "color=black font_size=10pt font_weight=medium just=left cellheight=0.4cm fontfamily=arial");
            obj.row_end();

			obj.row_start();  
 			obj.format_cell(data: "&Taux_de_retro_mg_n2.", overrides: "color=black font_size=10pt font_weight=medium just=left font_style=italic cellheight=0.4cm fontfamily=arial");
            obj.row_end();
%end;

Ksharp · Posted 03-27-2022 06:48 AM

data example;
   input group $ textvar $ numvar ;
datalines;
AA  bb  1.23
AA  cc  44
BB  abc 0.222
BB  pdq 99
;

data temp;
 set example;
 by group;
 if first.group then n=0;
 n+1;
run;

proc sql noprint;
select distinct catt('temp(where=(n=',n,') rename=(
textvar=textvar_n',n,' numvar=numvar_n',n,'))')  into : merge separated by ' '
 from temp ;
quit;

data want;
 merge &merge. ;
 by group;
 drop n;
run;

SASdevAnneMarie · Posted 03-27-2022 08:10 AM

Thank you Ksharp,
Sorry, I forgot to add the code, the question is updated.

Could you explain please your proc sql step ?

Ksharp · Posted 03-27-2022 06:53 AM

Or could check my paper about " Transpose data by Merge ".

http://support.sas.com/resources/papers/proceedings15/2785-2015.pdf

Tom · Posted 03-27-2022 11:37 AM

It is a lot easier to just use PROC SUMMARY instead. https://support.sas.com/resources/papers/proceedings10/102-2010.pdf

First let's convert that listing back into an actual dataset so we have something program with. Let's remove the spaces from the variable names so they are a lot easier to work with. Note if you set the VALIDVARNAME option to V7 instead of ANY then tools like PROC IMPORT won' generate such goofy names to begin with.

data have;
  input Prod $ NOM_Mode_de_gestion $ Mode_de_gestion $ Taux_de_retro_mg  Nombre_de_Profil;
cards;
TF1 PROFIL SOFT 0.0225 0.0082 2
TF1 PROFIL TONIC 0.0224 0.0079 2
TF3 PROFIL DYN 0.0225 0.0082 2
TF3 PROFIL TONIC 0.0224 0.0079 2
;

To use PROC SUMMARY to generate the wide dataset you need to know how many repetitions there are. You can use PROC SQL to count for you.

proc sql noprint;
select max(count) into :ngroups
from (select prod,count(*) as count from have group by prod)
;
quit;

Then just use that NGROUPS macro variable where you tell PROC SUMMARY how many copies of the variables you want per by group.

proc summary data=have ;
  by prod;
  output out=want idgroup(out[&ngroups] (NOM_Mode_de_gestion Mode_de_gestion Taux_de_retro_mg  Nombre_de_Profil)=);
run;

SASdevAnneMarie · Posted 03-27-2022 12:20 PM

Thank you very much Tom,

Could you explain me please what means idgroup(out[&ngroups] ?

Tom · Posted 03-27-2022 01:16 PM

How many replicates of each variable to create. Read the linked article and the documentation of PROC SUMMARY.

SASdevAnneMarie · Posted 03-27-2022 01:37 PM

Thank you, Tom !

mkeintz · Posted 03-27-2022 02:22 PM

IDGROUP is one of the most beneficial (and likely underutilized) features of proc summary. Not the least of its advantages is the intuitive renaming of the group of variables being transposed.

It reduces this task, which I'll term "vargroup transpose" to very simple code, requiring the dataset to be read only twice and written only once - more efficient than other techniques I have seen published. And if you know the number of groups needed in advance, then you don't need to read the data into the preliminary PROC SQL, making this the absolute minimum usage of input/output resources.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SASdevAnneMarie · Posted 03-28-2022 04:40 AM

Thank you, mkeintz!

Ready to join fellow brilliant minds for the SAS Hackathon?

Classroom Training Available!