hello All.
I have below data set
..
data have ;
input id a $2. b $2.;
datalines;
1 A S1
1 A S2
1 B S3
2 A S1
2 A S2
;
run;
Want like below
1 A B S1 S2 S3
2 A S1 S2
I got solution using transpose but problem is count of rows are 80lakh so lack of space issue I cant use transpose so is there any way to get same output using another way.Please suggest way so that I can code accrodingly.
Thanks in advance.!
So you are wanting a dataset with 80,000 columns? Its not going to work.
Rows are 80lakh not columns.
there are only 3 columns on which we need to do this transformation.
In which case, proc transpose which you already have working is likely going to be the fastest method.
Is A B S1 S2 S3 a string in a single variable or several variables?
several variables.
each value will in different columns.
So how many variables could you potentially end up with? This does not sound like a good idea at all. Most often it is best to keep your data sets in long formats.
@ draycut
Yes, Its not good idea to expand columns but this could end up with additional 3 - 4 columns only.
data have ;
input id a $2. b $2.;
datalines;
1 A S1
1 A S2
1 B S3
2 A S1
2 A S2
;
run;
data want;
array x1{999} $ 100 _temporary_;
array x2{999} $ 100 _temporary_;
call missing(of x1{*} x2{*} );
i1=0;i2=0;
do until(last.id);
set have;
by id;
if a not in x1 then do; i1+1;x1{i1}=a;end;
if b not in x2 then do; i2+1;x2{i2}=b;end;
end;
new_a=catx(' ',of x1{*});
new_b=catx(' ',of x2{*});
drop a b i1 i2 ;
run;
A few notes, if you want answers that actually address your problem ...
The data you begin with is useful. The data you are trying to create is virtually useless. Why would you want to do this? What is the final goal here? There may be other ways to get there, other than transforming your data.
In a SAS data set, each variable has a name. If you are describing your output, you need to supply variable names as part of the description. Unless you want text as output, rather than a SAS data set. But you haven't specified that.
Why can't you use transpose? Do you not know how to use it? Were you told not to use it? It's a good tool for the job, probably the best tool for the job. If someone told you to bake a cake, but you can't use an oven, what would you do?
Hello,
I propose this solution; it is optimal.
The idea is to work column a after column b.
At the end group the data
data test ;
input id a $2. b $2.;
datalines;
1 A S1
1 A S2
1 B S3
2 A S1
2 A S2
;
run;
/*A*/
proc sql;
create table test1 as select distinct id, a from test
;
quit;
data test2(keep=id a tmp);
set test1;
by id ;
if first.id then tmp=0;
tmp+1;
run;
proc sql noprint;
select max(tmp) into : maxia from
test2;
data _null_;
set test2;
call symputx('maxi1',&maxia);
run;
data test3(keep=id col: );
array col(*) $ col1-col&maxi1;
do until (last.id);
set test2;
by id ;
col(tmp)=A ;
end;
run;
/*B*/
proc sql;
create table test11 as select distinct id, b from test
;
quit;
data test22(keep=id b tmp);
set test11;
by id ;
if first.id then tmp=0;
tmp+1;
run;
proc sql noprint;
select max(tmp) into : maxib from
test22;
data _null_;
set test22;
call symputx('maxi2',&maxib+&maxia);
call symputx('maxi1n',&maxi1+1);
;run;
%put &maxi2.;
data test33(keep=id col: );
array col(*) $ col&maxi1n-col&maxi2;
do until (last.id);
set test22;
by id ;
col(tmp)=b ;
end;
run;
data fin;
retain id col:;
merge test3 test33;
by id;
run;
data fin1(drop=rc i j);
set fin;
array coll(*) $ col:;
do i=1 to dim(coll);
if coll(i)="" then do;
rc=i;
do j=rc to dim(coll);
if coll(j) ne "" then do;
coll(i)=coll(j);
coll(j)="";
leave;
end;
end;
end;
end;
run;
Thank you all for suggesting solutions and giving time for query.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.