DATA Step, Macro, Functions and more

re arranging rows

Reply
New Contributor
Posts: 3

re arranging rows

[ Edited ]

I am having data 

like below

Variable RC is continued by other 10 more variables but main goal is to re arrange AC and ID please let me know 

 

looking to re arrange rows not variable arrangement 


data aa;
input ac 3. id 3. rc $2.;

datalines;
12 45 qq
78 0 da
56 34 ad
45 56 gh
34 78 cc
;
run;

 

Wanted output is 

ac id rc
12 45 qq
45 56 gh
56 34 ad
34 78 cc
78 0 da

 

I have 20 Million records using sas 9

 

suggest me without using Arrays and hash if possible (please suggest in Proc sql ) or other methods 

 

 

Super Contributor
Posts: 472

Re: re arranging rows

What are you trying to re-arrange, variables or rows?

 

If variables are needed to rearrange use RETAIN statement and for rows use PROC SORT

Thanks,
Suryakiran
Super User
Posts: 13,066

Re: re arranging rows

Please explicitly and in great detail describe how we derive the order of the output.

It appears that you want id on one row to be followed by an AC value on the next.

But how do you handle multiple values of AC that would match such as:

12 45 qq

78 0 da

56 34 ad

45 56 gh

34 78 cc

45 33 zz

Please do not tell me that in 20 million rows that will not occur (unless your AC and ID variables have many more digits).

 

Also what happens if you have an ID with NO matching AC?

 

If RC does not play any role in the exercise than don't include it. If it does have some importance explain how it and the other variables might affect the order.

New Contributor
Posts: 3

Re: re arranging rows

RC is the most important key for my analysis, AC is current ID and RC is old ID at a instance(date)

 

AC RC date

11  22 today

22 33  yesterday

33 44 so on

44 55 so on

55 0 so on

 

at the beginning I have 55 and 0 initially

 

55 changed to 44 --33 -- 22 finally to 11 

 

Yes If I sort by date I can get it but if all changes  happens on same day I cannot track it if it is unsorted 

 

Please all the numbers are not in sequence like 11 --55 its random number  I have to track

 

Super User
Posts: 13,066

Re: re arranging rows


@bobi wrote:

RC is the most important key for my analysis, AC is current ID and RC is old ID at a instance(date)

 

AC RC date

11  22 today

22 33  yesterday

33 44 so on

44 55 so on

55 0 so on

 

 


Supplying an actual date in the initial question would have helped a whole lot.

 

 

Yes If I sort by date I can get it but if all changes  happens on same day I cannot track it if it is unsorted 

 


Is there any guarantee that the order you have is correct then?

PROC Star
Posts: 1,340

Re: re arranging rows

data aa;
input ac 3. id  rc $2.;
datalines;
12 45 qq
78 0 da
56 34 ad
45 56 gh
34 78 cc
;
run;

data want;
if _n_=1 then do;
if 0 then set aa;
  dcl hash h(dataset:'aa');
   h.definekey('ac');
   h.definedata('ac','id','rc');
   h.definedone();
end;
set aa;
retain _id;
if _n_>1 then do;
if h.find(key:_id)=0;
end;
_id=id;
drop _:;
run;
Super User
Posts: 10,618

Re: re arranging rows

data aa;
input ac 3. id  rc $2.;
datalines;
12 45 qq
78 0 da
56 34 ad
45 56 gh
34 78 cc
;
run;
proc sql;
create table ancient as
 select * from aa 
  where ac not in (select id from aa);
run;
data want;
if _n_=1 then do;
if 0 then set aa;
  dcl hash h(dataset:'aa');
   h.definekey('ac');
   h.definedata(all:'y');
   h.definedone();
end;
set ancient;
output;
do while(h.find(key:id)=0);
 output;
end;
run;
proc print;run;

Notice: there is only one to one match .

New Contributor
Posts: 3

Re: re arranging rows

please suggest using Proc sql 

 

For hash object I am getting following error 

ERROR: Hash object added xxxxx items when memory failure occurred.
FATAL: Insufficient memory to execute DATA step program. Aborted during the EXECUTION phase.
ERROR: The SAS System stopped processing this step because of insufficient memory.

Super User
Posts: 10,618

Re: re arranging rows

No. SQL can't do it due to don't know how many hierarchy/strata it is .

How big size of your table is it ?

PROC Star
Posts: 228

Re: re arranging rows

Here is a solution using SET with KEY=. I assume that you want to start with the first record, keep buildng the thread from there, and when the thread stops, read the next unused record:

data aa;                           
input ac id  rc $;                 
datalines;                         
12 45 qq                           
78 0 da                            
56 34 ad                           
45 56 gh                           
34 78 cc                           
;                                  
run;                               
                                   
data temp(index=(ac));
  set aa;
run;

data want temp;
  set temp(cntllev=rec);
  do until(0);
    modify temp(cntllev=rec) key=ac;
    if _iorc_ then leave;
    output want;
    remove temp;
    ac=id;
    end;
  _error_=0;
run;

Unlike the solution proposed by @novinosrin, this solution is able to accomodate more than one thread, and unlike the solution proposed by @Ksharp, if you have two records with the same AC, it is possible for the second one to start a new thread.

Ask a Question
Discussion stats
  • 9 replies
  • 159 views
  • 4 likes
  • 6 in conversation