BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PierreYvesILY
Pyrite | Level 9

dear SAS-Experts,

 

I have such a following dataset:

 

data Beispiel2(label='Beispiel2');

input ID:8. Reason:$35. Code:5.;

datalines;

 

20180101 AAA 25

20190101 AAA 25

20190101 BBB 12

20180102 AAA 25

20180105 CCC 84

20190105 BBB 12

20190105 AAA 25

20180101 CCC 84

20190102 BBB 12

20180105 AAA 25

20190102 AAA 25

20190101 CCC 84

;

run;

proc sort data=Beispiel2; by id Reason; run;

 

I want to get this final result:

 

IDReason_1Code_1Reason_2Code_2Reason_3Code_3
20180101AAA25CCC84  
20180102AAA25    
20180105AAA25CCC84  
20190101AAA25BBB12CCC84
20190102AAA25BBB12  
20190105AAA25BBB12  

 

rules:

from the 1st dataset, for each ID, I copy each (code + reason) into 2 new variables Code_n and reason_n

- if I have the same ID 2 times in the dataset, n=2, I have 2 blocks and create 2 sets of new variables

- if I have the same ID 3 times in the dataset, n=3, I have 2 blocks and create 3 sets of new variables

...

 

n max is 10.

 

how should I proceed?

 

thanks for your help

regards

PY

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26
proc transpose data=beispiel2 out=transp prefix=id_;
    by id;
	var code;
run;
proc transpose data=beispiel2 out=transp1 prefix=reason_;
    by id;
	var reason;
run;
data transpose;
	merge transp transp1;
	by id;
run;

Adding: the macro mentioned by @art297 works very well, I have used it, and if there were a lot of variables to be transposed (in this example, there are only two variables to transpose), it's probably easier and faster (in terms of programming time) to use that macro.

--
Paige Miller

View solution in original post

12 REPLIES 12
art297
Opal | Level 21

Task is easy if you use the %transpose macro. e.g.:

 

data Beispiel2(label='Beispiel2');
  input ID:8. Reason:$35. Code:5.;
  datalines;
20180101 AAA 25
20190101 AAA 25
20190101 BBB 12
20180102 AAA 25
20180105 CCC 84
20190105 BBB 12
20190105 AAA 25
20180101 CCC 84
20190102 BBB 12
20180105 AAA 25
20190102 AAA 25
20190101 CCC 84
;
run;

proc sort data=Beispiel2;
  by id Reason;
run;


filename tr url 'https://raw.githubusercontent.com/art297/transpose/master/transpose.sas';
%include tr ;

%transpose(data=Beispiel2, out=want, by=id, Guessingrows=1000,
  var=reason code)

Art, CEO, AnalystFinder.com

 

PierreYvesILY
Pyrite | Level 9

hallo,

 

thank you, but I cannot use external URLs in our Environment, and thus I cannot use this transpose function 😞

 

 

PaigeMiller
Diamond | Level 26

@PierreYvesILY wrote:

 

thank you, but I cannot use external URLs in our Environment, and thus I cannot use this transpose function 😞

 

 


Yes but you could paste that URL into your web browser, and then copy (as text) the SAS program and save it locally or on a server you can access, and do the %INCLUDE from that local file.

--
Paige Miller
PierreYvesILY
Pyrite | Level 9

Hallo Paige,

 

not even that is possible... Almost everything that Looks like importing anything is blocked here.

PaigeMiller
Diamond | Level 26

@PierreYvesILY wrote:

Hallo Paige,

 

not even that is possible... Almost everything that Looks like importing anything is blocked here.


Not sure what you mean by "importing", as I did not suggest any "importing". Its just a copy and paste from a web page ... is that blocked at your site?

--
Paige Miller
Kurt_Bremser
Super User

@PierreYvesILY wrote:

hallo,

 

thank you, but I cannot use external URLs in our Environment, and thus I cannot use this transpose function 😞

 

 


Load it in your browser, and copy/paste the code to your SAS program editor, then save it in your local environment for including.

PaigeMiller
Diamond | Level 26
proc transpose data=beispiel2 out=transp prefix=id_;
    by id;
	var code;
run;
proc transpose data=beispiel2 out=transp1 prefix=reason_;
    by id;
	var reason;
run;
data transpose;
	merge transp transp1;
	by id;
run;

Adding: the macro mentioned by @art297 works very well, I have used it, and if there were a lot of variables to be transposed (in this example, there are only two variables to transpose), it's probably easier and faster (in terms of programming time) to use that macro.

--
Paige Miller
PierreYvesILY
Pyrite | Level 9
Thanks a lot Paige!
It works perfectly.
PierreYvesILY
Pyrite | Level 9

I want to use the solution proposed by Paige, which I find easier for me to understand.

 

When I use it fr my real data, I Encounter the Problem that the variables Code and Reason have 16 different values.

Thus, I obtain a file where 99% of the IDs have less than 3 columns filled for both variables Code and Reason, and less than 1% more than 3 columns.

 

How can I Limit the number of columns created by the transpose procedure for each variable (ex: reason)

un max of 4 columns is enough, thus having in the result dataset the variables Code_1 to Code_4?

art297
Opal | Level 21

Since you find @PaigeMiller's solution easiest to follow, I'll answer your question using a slightly modified version of his code.

 

However, since you asking how to omit some of your data from the output, is that really something that you want/need to do? And, since you are re-ordering your data by Reason within ID, in those cases where you currently have more than four Reason/Code entries, you will systematically be dropping Reason/Code combinations that occur later in the sort.

 

I changed the code so that it produces Reason_1-Reason_4 rather than ID_!-ID_4 which @PaigeMiller's code produced.

 

proc transpose data=beispiel2 out=transp1 (keep=id Reason_1-Reason_4) prefix=Reason_;
    by id;
	var Reason;
run;
proc transpose data=beispiel2 out=transp2 (keep=id Code_1-Code_4) prefix=Code_;
    by id;
	var Code;
run;
data transpose;
	merge transp1 transp2;
	by id;
run;

Art, CEO, AnalystFinder.com

 

 

PierreYvesILY
Pyrite | Level 9

yes, I need to keep only the 4 first reasons occuring.

I checked the data and find until 16 reasons for each ID-case, but most of the time if the number of reasons is > 5, it's irrelevant, because it's due to technical problems (obviously). And it's less than 1% of all ID-cases.

 

So, the result of the analysis will be clearer and still accurate.

 

However, I just said the solution of Paige Miller is the easiest for me to understand. I'm new to SAS and don't understand how the other solution is built.

novinosrin
Tourmaline | Level 20

Something I learned from King of SAS @data_null__   IDGROUP transpose. 

 



data Beispiel2(label='Beispiel2');
  input ID:8. Reason:$35. Code:5.;
  datalines;
20180101 AAA 25
20190101 AAA 25
20190101 BBB 12
20180102 AAA 25
20180105 CCC 84
20190105 BBB 12
20190105 AAA 25
20180101 CCC 84
20190102 BBB 12
20180105 AAA 25
20190102 AAA 25
20190101 CCC 84
;
run;
proc sql;
 select max(obs) into :obs 
from  (select count(*) as obs from Beispiel2 group by id )  ;
quit;

proc summary nway data=Beispiel2 missing;
 class id; 
 output out =want(drop=_type_ _freq_)
 idgroup(out[&obs](Reason Code)=) ;
run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 12 replies
  • 1437 views
  • 6 likes
  • 5 in conversation