dear SAS-Experts,
I have such a following dataset:
data Beispiel2(label='Beispiel2');
input ID:8. Reason:$35. Code:5.;
datalines;
20180101 AAA 25
20190101 AAA 25
20190101 BBB 12
20180102 AAA 25
20180105 CCC 84
20190105 BBB 12
20190105 AAA 25
20180101 CCC 84
20190102 BBB 12
20180105 AAA 25
20190102 AAA 25
20190101 CCC 84
;
run;
proc sort data=Beispiel2; by id Reason; run;
I want to get this final result:
ID | Reason_1 | Code_1 | Reason_2 | Code_2 | Reason_3 | Code_3 |
20180101 | AAA | 25 | CCC | 84 | ||
20180102 | AAA | 25 | ||||
20180105 | AAA | 25 | CCC | 84 | ||
20190101 | AAA | 25 | BBB | 12 | CCC | 84 |
20190102 | AAA | 25 | BBB | 12 | ||
20190105 | AAA | 25 | BBB | 12 |
rules:
from the 1st dataset, for each ID, I copy each (code + reason) into 2 new variables Code_n and reason_n
- if I have the same ID 2 times in the dataset, n=2, I have 2 blocks and create 2 sets of new variables
- if I have the same ID 3 times in the dataset, n=3, I have 2 blocks and create 3 sets of new variables
...
n max is 10.
how should I proceed?
thanks for your help
regards
PY
proc transpose data=beispiel2 out=transp prefix=id_;
by id;
var code;
run;
proc transpose data=beispiel2 out=transp1 prefix=reason_;
by id;
var reason;
run;
data transpose;
merge transp transp1;
by id;
run;
Adding: the macro mentioned by @art297 works very well, I have used it, and if there were a lot of variables to be transposed (in this example, there are only two variables to transpose), it's probably easier and faster (in terms of programming time) to use that macro.
Task is easy if you use the %transpose macro. e.g.:
data Beispiel2(label='Beispiel2');
input ID:8. Reason:$35. Code:5.;
datalines;
20180101 AAA 25
20190101 AAA 25
20190101 BBB 12
20180102 AAA 25
20180105 CCC 84
20190105 BBB 12
20190105 AAA 25
20180101 CCC 84
20190102 BBB 12
20180105 AAA 25
20190102 AAA 25
20190101 CCC 84
;
run;
proc sort data=Beispiel2;
by id Reason;
run;
filename tr url 'https://raw.githubusercontent.com/art297/transpose/master/transpose.sas';
%include tr ;
%transpose(data=Beispiel2, out=want, by=id, Guessingrows=1000,
var=reason code)
Art, CEO, AnalystFinder.com
hallo,
thank you, but I cannot use external URLs in our Environment, and thus I cannot use this transpose function 😞
@PierreYvesILY wrote:
thank you, but I cannot use external URLs in our Environment, and thus I cannot use this transpose function 😞
Yes but you could paste that URL into your web browser, and then copy (as text) the SAS program and save it locally or on a server you can access, and do the %INCLUDE from that local file.
Hallo Paige,
not even that is possible... Almost everything that Looks like importing anything is blocked here.
@PierreYvesILY wrote:
Hallo Paige,
not even that is possible... Almost everything that Looks like importing anything is blocked here.
Not sure what you mean by "importing", as I did not suggest any "importing". Its just a copy and paste from a web page ... is that blocked at your site?
@PierreYvesILY wrote:
hallo,
thank you, but I cannot use external URLs in our Environment, and thus I cannot use this transpose function 😞
Load it in your browser, and copy/paste the code to your SAS program editor, then save it in your local environment for including.
proc transpose data=beispiel2 out=transp prefix=id_;
by id;
var code;
run;
proc transpose data=beispiel2 out=transp1 prefix=reason_;
by id;
var reason;
run;
data transpose;
merge transp transp1;
by id;
run;
Adding: the macro mentioned by @art297 works very well, I have used it, and if there were a lot of variables to be transposed (in this example, there are only two variables to transpose), it's probably easier and faster (in terms of programming time) to use that macro.
I want to use the solution proposed by Paige, which I find easier for me to understand.
When I use it fr my real data, I Encounter the Problem that the variables Code and Reason have 16 different values.
Thus, I obtain a file where 99% of the IDs have less than 3 columns filled for both variables Code and Reason, and less than 1% more than 3 columns.
How can I Limit the number of columns created by the transpose procedure for each variable (ex: reason)
un max of 4 columns is enough, thus having in the result dataset the variables Code_1 to Code_4?
Since you find @PaigeMiller's solution easiest to follow, I'll answer your question using a slightly modified version of his code.
However, since you asking how to omit some of your data from the output, is that really something that you want/need to do? And, since you are re-ordering your data by Reason within ID, in those cases where you currently have more than four Reason/Code entries, you will systematically be dropping Reason/Code combinations that occur later in the sort.
I changed the code so that it produces Reason_1-Reason_4 rather than ID_!-ID_4 which @PaigeMiller's code produced.
proc transpose data=beispiel2 out=transp1 (keep=id Reason_1-Reason_4) prefix=Reason_;
by id;
var Reason;
run;
proc transpose data=beispiel2 out=transp2 (keep=id Code_1-Code_4) prefix=Code_;
by id;
var Code;
run;
data transpose;
merge transp1 transp2;
by id;
run;
Art, CEO, AnalystFinder.com
yes, I need to keep only the 4 first reasons occuring.
I checked the data and find until 16 reasons for each ID-case, but most of the time if the number of reasons is > 5, it's irrelevant, because it's due to technical problems (obviously). And it's less than 1% of all ID-cases.
So, the result of the analysis will be clearer and still accurate.
However, I just said the solution of Paige Miller is the easiest for me to understand. I'm new to SAS and don't understand how the other solution is built.
Something I learned from King of SAS @data_null__ IDGROUP transpose.
data Beispiel2(label='Beispiel2');
input ID:8. Reason:$35. Code:5.;
datalines;
20180101 AAA 25
20190101 AAA 25
20190101 BBB 12
20180102 AAA 25
20180105 CCC 84
20190105 BBB 12
20190105 AAA 25
20180101 CCC 84
20190102 BBB 12
20180105 AAA 25
20190102 AAA 25
20190101 CCC 84
;
run;
proc sql;
select max(obs) into :obs
from (select count(*) as obs from Beispiel2 group by id ) ;
quit;
proc summary nway data=Beispiel2 missing;
class id;
output out =want(drop=_type_ _freq_)
idgroup(out[&obs](Reason Code)=) ;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.