Solved: Re: How to transpose a file partially?

PierreYvesILY · Posted 11-15-2019 09:00 AM

dear SAS-Experts,

I have such a following dataset:

data Beispiel2(label='Beispiel2');

input ID:8. Reason:$35. Code:5.;

datalines;

20180101 AAA 25

20190101 AAA 25

20190101 BBB 12

20180102 AAA 25

20180105 CCC 84

20190105 BBB 12

20190105 AAA 25

20180101 CCC 84

20190102 BBB 12

20180105 AAA 25

20190102 AAA 25

20190101 CCC 84

;

run;

proc sort data=Beispiel2; by id Reason; run;

I want to get this final result:

ID	Reason_1	Code_1	Reason_2	Code_2	Reason_3	Code_3
20180101	AAA	25	CCC	84
20180102	AAA	25
20180105	AAA	25	CCC	84
20190101	AAA	25	BBB	12	CCC	84
20190102	AAA	25	BBB	12
20190105	AAA	25	BBB	12

rules:

from the 1st dataset, for each ID, I copy each (code + reason) into 2 new variables Code_n and reason_n

- if I have the same ID 2 times in the dataset, n=2, I have 2 blocks and create 2 sets of new variables

- if I have the same ID 3 times in the dataset, n=3, I have 2 blocks and create 3 sets of new variables

...

n max is 10.

how should I proceed?

thanks for your help

regards

PY

PaigeMiller · Posted 11-15-2019 09:30 AM

proc transpose data=beispiel2 out=transp prefix=id_;
    by id;
	var code;
run;
proc transpose data=beispiel2 out=transp1 prefix=reason_;
    by id;
	var reason;
run;
data transpose;
	merge transp transp1;
	by id;
run;

Adding: the macro mentioned by @art297 works very well, I have used it, and if there were a lot of variables to be transposed (in this example, there are only two variables to transpose), it's probably easier and faster (in terms of programming time) to use that macro.

--
Paige Miller

View solution in original post

art297 · Posted 11-15-2019 09:27 AM

Task is easy if you use the %transpose macro. e.g.:

data Beispiel2(label='Beispiel2');
  input ID:8. Reason:$35. Code:5.;
  datalines;
20180101 AAA 25
20190101 AAA 25
20190101 BBB 12
20180102 AAA 25
20180105 CCC 84
20190105 BBB 12
20190105 AAA 25
20180101 CCC 84
20190102 BBB 12
20180105 AAA 25
20190102 AAA 25
20190101 CCC 84
;
run;

proc sort data=Beispiel2;
  by id Reason;
run;


filename tr url 'https://raw.githubusercontent.com/art297/transpose/master/transpose.sas';
%include tr ;

%transpose(data=Beispiel2, out=want, by=id, Guessingrows=1000,
  var=reason code)

Art, CEO, AnalystFinder.com

PierreYvesILY · Posted 11-15-2019 09:30 AM

hallo,

thank you, but I cannot use external URLs in our Environment, and thus I cannot use this transpose function 😞

PaigeMiller · Posted 11-15-2019 09:34 AM

@PierreYvesILY wrote:

thank you, but I cannot use external URLs in our Environment, and thus I cannot use this transpose function 😞

Yes but you could paste that URL into your web browser, and then copy (as text) the SAS program and save it locally or on a server you can access, and do the %INCLUDE from that local file.

--
Paige Miller

PierreYvesILY · Posted 11-15-2019 09:38 AM

Hallo Paige,

not even that is possible... Almost everything that Looks like importing anything is blocked here.

PaigeMiller · Posted 11-15-2019 09:43 AM

@PierreYvesILY wrote:

Hallo Paige,

not even that is possible... Almost everything that Looks like importing anything is blocked here.

Not sure what you mean by "importing", as I did not suggest any "importing". Its just a copy and paste from a web page ... is that blocked at your site?

--
Paige Miller

Kurt_Bremser · Posted 11-15-2019 09:36 AM

@PierreYvesILY wrote:

hallo,

thank you, but I cannot use external URLs in our Environment, and thus I cannot use this transpose function 😞

Load it in your browser, and copy/paste the code to your SAS program editor, then save it in your local environment for including.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

PaigeMiller · Posted 11-15-2019 09:30 AM

proc transpose data=beispiel2 out=transp prefix=id_;
    by id;
	var code;
run;
proc transpose data=beispiel2 out=transp1 prefix=reason_;
    by id;
	var reason;
run;
data transpose;
	merge transp transp1;
	by id;
run;

Adding: the macro mentioned by @art297 works very well, I have used it, and if there were a lot of variables to be transposed (in this example, there are only two variables to transpose), it's probably easier and faster (in terms of programming time) to use that macro.

--
Paige Miller

PierreYvesILY · Posted 11-15-2019 09:37 AM

Thanks a lot Paige!
It works perfectly.

PierreYvesILY · Posted 11-15-2019 10:30 AM

I want to use the solution proposed by Paige, which I find easier for me to understand.

When I use it fr my real data, I Encounter the Problem that the variables Code and Reason have 16 different values.

Thus, I obtain a file where 99% of the IDs have less than 3 columns filled for both variables Code and Reason, and less than 1% more than 3 columns.

How can I Limit the number of columns created by the transpose procedure for each variable (ex: reason)

un max of 4 columns is enough, thus having in the result dataset the variables Code_1 to Code_4?

art297 · Posted 11-15-2019 12:18 PM

Since you find @PaigeMiller's solution easiest to follow, I'll answer your question using a slightly modified version of his code.

However, since you asking how to omit some of your data from the output, is that really something that you want/need to do? And, since you are re-ordering your data by Reason within ID, in those cases where you currently have more than four Reason/Code entries, you will systematically be dropping Reason/Code combinations that occur later in the sort.

I changed the code so that it produces Reason_1-Reason_4 rather than ID_!-ID_4 which @PaigeMiller's code produced.

proc transpose data=beispiel2 out=transp1 (keep=id Reason_1-Reason_4) prefix=Reason_;
    by id;
	var Reason;
run;
proc transpose data=beispiel2 out=transp2 (keep=id Code_1-Code_4) prefix=Code_;
    by id;
	var Code;
run;
data transpose;
	merge transp1 transp2;
	by id;
run;

Art, CEO, AnalystFinder.com

PierreYvesILY · Posted 11-18-2019 04:03 AM

yes, I need to keep only the 4 first reasons occuring.

I checked the data and find until 16 reasons for each ID-case, but most of the time if the number of reasons is > 5, it's irrelevant, because it's due to technical problems (obviously). And it's less than 1% of all ID-cases.

So, the result of the analysis will be clearer and still accurate.

However, I just said the solution of Paige Miller is the easiest for me to understand. I'm new to SAS and don't understand how the other solution is built.

novinosrin · Posted 11-15-2019 09:52 AM

Something I learned from King of SAS @data_null__ IDGROUP transpose.



data Beispiel2(label='Beispiel2');
  input ID:8. Reason:$35. Code:5.;
  datalines;
20180101 AAA 25
20190101 AAA 25
20190101 BBB 12
20180102 AAA 25
20180105 CCC 84
20190105 BBB 12
20190105 AAA 25
20180101 CCC 84
20190102 BBB 12
20180105 AAA 25
20190102 AAA 25
20190101 CCC 84
;
run;
proc sql;
 select max(obs) into :obs 
from  (select count(*) as obs from Beispiel2 group by id )  ;
quit;

proc summary nway data=Beispiel2 missing;
 class id; 
 output out =want(drop=_type_ _freq_)
 idgroup(out[&obs](Reason Code)=) ;
run;

Registration is open