BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Emjay
Obsidian | Level 7

I have a table in the following form: (Actually there are 10 Years and the columns don't end after Pieces, but I hope for the question it's fine.)

 

 

data have;
input Year Name_ID Type $ Price Pieces;
datalines;
2017 1 Aa 100 .
2017 1 Bb 200 3
2017 2 Cc 500 2
2018 1 Aa 30 11
2018 1 Bb 40 32
2018 2 Aa 50 5
2018 2 Cc 10 6
;
run;

What I want is to transpose the data to the following form:

 

 

 

data want;
input Name_ID Type $ Price_2017 Pieces_2017 Price_2018 Pieces_2018;
datalines;
1 Aa 100 . 30 11
1 Bb 200 3 40 32
2 Aa . . 50 5
2 Cc 500 2 10 6
;
run;

 

 

I've done this before on a different dataset with a transpose for every single column and merged them together. But now I have a lot of columns and I don't want to do this manually. And another downside is, that after the merge, the columns aren't in the proper order either and the retain would need a lot of typing.

 

Can you please help me in finding a smarter way?

1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @Emjay 

 

Here is an approach to do this:

proc sort data=have out=have_sorted;
	by Name_ID Type Year;
run;

/* Create macrovariables */
proc sql noprint;
	select min(Year), max(Year) into: min_year trimmed,:max_year trimmed from have;
	select name into:name1-  from dictionary.columns where libname="WORK" and memname= "HAVE"
		   and name not in('Year', 'Name_ID', 'Type');
	select name into:variables_sp separated by ' ' from dictionary.columns where libname="WORK" and memname= "HAVE"
		   and name not in('Year', 'Name_ID', 'Type');
	select count(name) into:nb from dictionary.columns where libname="WORK" and memname= "HAVE"
		   and name not in('Year', 'Name_ID', 'Type');
run;
		/* &min_year = 2017 */
		/* &max_year = 2018 */
		/* &name1 = price */
		/* &name2 = pieces */
		/* &variables_sp = Price Pieces*/
		/* &nb = 2*/

%macro _tr();
	
	/* Transpose data */
	
	data have_tr (drop=counter year &variables_sp);
	
		set have_sorted;
		by Name_ID Type;

		if first.Type then call missing (counter);
	
		%do i=1 %to &nb;
	
			array _&&name&i. (&min_year:&max_year) &&name&i.._&min_year - &&name&i.._&max_year;
			retain _&&name&i.;
 			if first.Type then call missing (counter, of _&&name&i.(*));
 			counter+1;	
			_&&name&i.(year) = &&name&i.;

		%end;
	
		if last.Type then output;
	run;
	
	/* Re-order variables */
	
	proc sql noprint;
		select name into:var_keep separated by ',' from dictionary.columns where libname="WORK" and memname= "HAVE_TR" and name not in('Year', 'Name_ID', 'Type') order by scan(name,2,'_'), name desc; 
		create table want as select Name_ID, Type, &var_keep. from have_tr;
	quit;
%mend;

%_tr;

 

 Capture d’écran 2020-02-24 à 12.35.20.png

 

View solution in original post

17 REPLIES 17
PeterClemmensen
Tourmaline | Level 20

"the columns aren't in the proper order". This makes me wonder if this is for reporting purposes or you actually need a data set for this? 

Emjay
Obsidian | Level 7

I have to provide the output to my colleagues in Excel and there it has to be like:

 

Price_2017 Pieces_2017 Price_2018 Pieces_2018

and not:

 

Price_2017 Price_2018 Pieces_2017 Pieces_2018

 That's what I mean with the not propper order.

ed_sas_member
Meteorite | Level 14

Hi @Emjay 

 

Here is an approach to do this:

proc sort data=have out=have_sorted;
	by Name_ID Type Year;
run;

/* Create macrovariables */
proc sql noprint;
	select min(Year), max(Year) into: min_year trimmed,:max_year trimmed from have;
	select name into:name1-  from dictionary.columns where libname="WORK" and memname= "HAVE"
		   and name not in('Year', 'Name_ID', 'Type');
	select name into:variables_sp separated by ' ' from dictionary.columns where libname="WORK" and memname= "HAVE"
		   and name not in('Year', 'Name_ID', 'Type');
	select count(name) into:nb from dictionary.columns where libname="WORK" and memname= "HAVE"
		   and name not in('Year', 'Name_ID', 'Type');
run;
		/* &min_year = 2017 */
		/* &max_year = 2018 */
		/* &name1 = price */
		/* &name2 = pieces */
		/* &variables_sp = Price Pieces*/
		/* &nb = 2*/

%macro _tr();
	
	/* Transpose data */
	
	data have_tr (drop=counter year &variables_sp);
	
		set have_sorted;
		by Name_ID Type;

		if first.Type then call missing (counter);
	
		%do i=1 %to &nb;
	
			array _&&name&i. (&min_year:&max_year) &&name&i.._&min_year - &&name&i.._&max_year;
			retain _&&name&i.;
 			if first.Type then call missing (counter, of _&&name&i.(*));
 			counter+1;	
			_&&name&i.(year) = &&name&i.;

		%end;
	
		if last.Type then output;
	run;
	
	/* Re-order variables */
	
	proc sql noprint;
		select name into:var_keep separated by ',' from dictionary.columns where libname="WORK" and memname= "HAVE_TR" and name not in('Year', 'Name_ID', 'Type') order by scan(name,2,'_'), name desc; 
		create table want as select Name_ID, Type, &var_keep. from have_tr;
	quit;
%mend;

%_tr;

 

 Capture d’écran 2020-02-24 à 12.35.20.png

 

Emjay
Obsidian | Level 7

Hi @ed_sas_member thanks for your help!

 

I'm not really familiar with the array-function, so I don't understand all of your code, but it looks fine. 🙂

 

Is there no easy way to do it with a transpose, or is array just nicer?

 

The only problem is, that I have columns that already contain a "_" in the beginning, or even more than one. So the scan is not really working. I may have to change the original names then...

 

ed_sas_member
Meteorite | Level 14

Hi @Emjay 

You're welcome.

You can also obtain a similar output using multiple PROC TRANSPOSE (one transpose pieces, another to transpose price, ...) and then merge the results. The only 'difficulty' is to label the new columns (price2017, ...).

 

Regarding, the last part of the code, using a proc sql to rearrange columns, I suggest that you use another function than the SCAN function to identify variable ending with the year (e.. SUBSTR, etc.)

 

Best,

 

Emjay
Obsidian | Level 7

Well labeling was not the problem, but sorting was. 🙂 I just thought there may be a nicer way to it, than writing a proc transpose for every single column.

I didn't think about substr. I'll try that later, thanks. 

novinosrin
Tourmaline | Level 20

Hi @Emjay  Good morning and Sorry my late entry to the part as I have been away for a few days missing all the excitement here. If I understand your requirement correctly, all you need is a very simple double transpose. Of course, your data must be sorted.

 

So,in SAS words

 

data have;
input Year Name_ID Type $ Price Pieces;
datalines;
2017 1 Aa 100 .
2017 1 Bb 200 3
2017 2 Cc 500 2
2018 1 Aa 30 11
2018 1 Bb 40 32
2018 2 Aa 50 5
2018 2 Cc 10 6
;
run;

proc sort data=have out=_have;
by name_id type year;
run;
proc transpose data=_have out=temp;
by name_id type year;
var price pieces;
run;

proc transpose data=temp out=want(drop=_:);
by  name_id type;
id _name_ year;
run;

 

Emjay
Obsidian | Level 7

Hi @novinosrin 

this looks like the thing I had in mind. I never really manage to get my head around when to use ID and when it's not needed. 😕

 

Any idea on how I can get a "_" between the names in the transpose? So that there is Price_2017 and not Price2017.

novinosrin
Tourmaline | Level 20

HI @Emjay  Oh sorry I forgot to add the delimiter option,  Please notice the Revised with delimiter option in the 2nd transpose step

proc sort data=have out=_have;
by name_id type year;
run;
proc transpose data=_have out=temp;
by name_id type year;
var price pieces;
run;

proc transpose data=temp out=want(drop=_:) delimiter=_;
by  name_id type;
id _name_ year;
run;

 

 

Emjay
Obsidian | Level 7
Cool, thanks!
Emjay
Obsidian | Level 7

Sorry to bother you again @novinosrin but I came across an issue, when I ran your code on my whole data set.

 

I've made a small adaption and added line number 3 with the year 2016. Since the year is not present for Name_ID 1 the order in the final data set ist not ascending anymore.

 

data have;
input Year Name_ID Type $ Price Pieces;
datalines;
2017 1 Aa 100 .
2017 1 Bb 200 3
2016 2 Cc 80 4
2017 2 Cc 500 2
2018 1 Aa 30 11
2018 1 Bb 40 32
2018 2 Aa 50 5
2018 2 Cc 10 6
;
run;

 

If I change the first sort and start with year, it will fail in the end due to wrong sorting. I thought about maybe adding some fake data with all years and deleting it later. Can you maybe think of something more elegant?


 

novinosrin
Tourmaline | Level 20

Hi @Emjay  No bother at all and don't have to be so formal. Can you also illustrate/show the expected output for the modified one. I have some assumptions, but before playing with assumptions I believe it's better to know what you want as output for quicker solutions

Emjay
Obsidian | Level 7

Hi @novinosrin 

I want the final set to like that:

 

data want;
input Name_ID Type $ Price_2016 Pieces_2016 Price_2017 Pieces_2017 Price_2018 Pieces_2018;
datalines;
1 Aa . . 100 . 30 11
1 Bb . . 200 3 40 32
2 Aa . . . . 50 5
2 Cc 80 4 500 2 10 6
;
run;

So, persist the order of the base-columns but start with the smallest year. 

novinosrin
Tourmaline | Level 20

I am gonna seek Guru @data_null__'s help if he doesn't mind..

 

Guru @data_null__ , If you don;t mind, can you give us a hand in getting the correct combination in TYPES/WAYS in Proc Summary to get all year values for each ID TYPE _NAME_ independent group.  The idea is to avoid a cartesian(SQL) or any datastep programming fix rather to use Proc summary options before the proc transpose so that we get the missing years and the years in order just how OP wants.   Once you could help this part, then it's the same transpose after. The completetypes pulls the unintended types and ID's and my mind is not working well. May i request your time plz . Thank you*1e6 in advance. 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 17 replies
  • 1821 views
  • 5 likes
  • 5 in conversation