Re: Creating data in one line with multiple columns using two columns

india2016 · Posted 08-23-2018 08:45 AM

hello All.

I have below data set

..

data have ;
input id a $2. b $2.;
datalines;
1 A S1
1 A S2
1 B S3
2 A S1
2 A S2
;
run;

Want like below

1 A B S1 S2 S3

2 A S1 S2

I got solution using transpose but problem is count of rows are 80lakh so lack of space issue I cant use transpose so is there any way to get same output using another way.Please suggest way so that I can code accrodingly.

Thanks in advance.!

RW9 · Posted 08-23-2018 08:59 AM

So you are wanting a dataset with 80,000 columns? Its not going to work.

india2016 · Posted 08-23-2018 09:04 AM

Rows are 80lakh not columns.

there are only 3 columns on which we need to do this transformation.

RW9 · Posted 08-23-2018 09:16 AM

In which case, proc transpose which you already have working is likely going to be the fastest method.

PeterClemmensen · Posted 08-23-2018 08:59 AM

Is A B S1 S2 S3 a string in a single variable or several variables?

The DATA to DATA Step Macro
Blog: SASnrd

india2016 · Posted 08-23-2018 09:05 AM

several variables.

each value will in different columns.

PeterClemmensen · Posted 08-23-2018 09:07 AM

So how many variables could you potentially end up with? This does not sound like a good idea at all. Most often it is best to keep your data sets in long formats.

The DATA to DATA Step Macro
Blog: SASnrd

india2016 · Posted 08-23-2018 09:09 AM

@ draycut

Yes, Its not good idea to expand columns but this could end up with additional 3 - 4 columns only.

Ksharp · Posted 08-23-2018 09:27 AM

 
data have ;
input id a $2. b $2.;
datalines;
1 A S1
1 A S2
1 B S3
2 A S1
2 A S2
;
run;
data want;
 array x1{999} $ 100 _temporary_;
 array x2{999} $ 100 _temporary_;
 call missing(of x1{*} x2{*} );
 i1=0;i2=0;
 do until(last.id);
    set have;
    by id;
    if a not in x1 then do; i1+1;x1{i1}=a;end;
	if b not in x2 then do; i2+1;x2{i2}=b;end;
 end;
new_a=catx(' ',of x1{*});
new_b=catx(' ',of x2{*});
drop a b  i1 i2 ;
run;

Astounding · Posted 08-23-2018 03:26 PM

A few notes, if you want answers that actually address your problem ...

The data you begin with is useful. The data you are trying to create is virtually useless. Why would you want to do this? What is the final goal here? There may be other ways to get there, other than transforming your data.

In a SAS data set, each variable has a name. If you are describing your output, you need to supply variable names as part of the description. Unless you want text as output, rather than a SAS data set. But you haven't specified that.

Why can't you use transpose? Do you not know how to use it? Were you told not to use it? It's a good tool for the job, probably the best tool for the job. If someone told you to bake a cake, but you can't use an oven, what would you do?

india2016 · Posted 08-24-2018 03:29 AM

@Astounding

Thanks for advice,

Will keep in mind while asking any query.

Thanks.

mansour_ib_sas · Posted 08-23-2018 07:22 PM

Hello,

I propose this solution; it is optimal.

The idea is to work column a after column b.

At the end group the data

data test ;
input id a $2. b $2.;
datalines;
1 A S1
1 A S2
1 B S3
2 A S1
2 A S2
;
run;

/*A*/


proc sql;
create table test1 as select distinct id, a from test
; 
quit;

data test2(keep=id a tmp);
set test1;
by id ;
if first.id then tmp=0;
tmp+1;
run;

proc sql noprint; 
select max(tmp) into : maxia from 
test2;


data _null_;
set test2;
call symputx('maxi1',&maxia);
run;



data  test3(keep=id col: );
array col(*) $ col1-col&maxi1;
  do until (last.id);
    set test2;
    by  id ;
    col(tmp)=A ;
end;
run;

/*B*/

proc sql;
create table test11 as select distinct id, b from test
; 
quit;

data test22(keep=id b tmp);
set test11;
by id ;
if first.id then tmp=0;
tmp+1;
run;

proc sql  noprint; 
select max(tmp) into : maxib from 
test22;


data _null_;
set test22;
call symputx('maxi2',&maxib+&maxia);
call symputx('maxi1n',&maxi1+1);
;run;

%put &maxi2.;



data  test33(keep=id col: );
array col(*) $ col&maxi1n-col&maxi2;
  do until (last.id);
    set test22;
    by  id ;
    col(tmp)=b ;
end;
run;



data fin;
retain id col:;
merge test3 test33;
by id;
run; 



data fin1(drop=rc i j);
set fin;
array coll(*) $ col:;
do i=1 to dim(coll);
   if coll(i)="" then do;
   rc=i;
   do j=rc to dim(coll);
   if coll(j) ne "" then do;
   coll(i)=coll(j);
   coll(j)="";
   leave;
 
   end;  
  end;
 end;
end;
run;

india2016 · Posted 08-24-2018 03:34 AM

@Ksharp @mansour_ib_sas

@RW9 @PeterClemmensen

Thank you all for suggesting solutions and giving time for query.

Ready to join fellow brilliant minds for the SAS Hackathon?

Classroom Training Available!