BookmarkSubscribeRSS Feed
bobi
Fluorite | Level 6

I am having data 

like below

Variable RC is continued by other 10 more variables but main goal is to re arrange AC and ID please let me know 

 

looking to re arrange rows not variable arrangement 


data aa;
input ac 3. id 3. rc $2.;

datalines;
12 45 qq
78 0 da
56 34 ad
45 56 gh
34 78 cc
;
run;

 

Wanted output is 

ac id rc
12 45 qq
45 56 gh
56 34 ad
34 78 cc
78 0 da

 

I have 20 Million records using sas 9

 

suggest me without using Arrays and hash if possible (please suggest in Proc sql ) or other methods 

 

 

9 REPLIES 9
SuryaKiran
Meteorite | Level 14

What are you trying to re-arrange, variables or rows?

 

If variables are needed to rearrange use RETAIN statement and for rows use PROC SORT

Thanks,
Suryakiran
ballardw
Super User

Please explicitly and in great detail describe how we derive the order of the output.

It appears that you want id on one row to be followed by an AC value on the next.

But how do you handle multiple values of AC that would match such as:

12 45 qq

78 0 da

56 34 ad

45 56 gh

34 78 cc

45 33 zz

Please do not tell me that in 20 million rows that will not occur (unless your AC and ID variables have many more digits).

 

Also what happens if you have an ID with NO matching AC?

 

If RC does not play any role in the exercise than don't include it. If it does have some importance explain how it and the other variables might affect the order.

bobi
Fluorite | Level 6

RC is the most important key for my analysis, AC is current ID and RC is old ID at a instance(date)

 

AC RC date

11  22 today

22 33  yesterday

33 44 so on

44 55 so on

55 0 so on

 

at the beginning I have 55 and 0 initially

 

55 changed to 44 --33 -- 22 finally to 11 

 

Yes If I sort by date I can get it but if all changes  happens on same day I cannot track it if it is unsorted 

 

Please all the numbers are not in sequence like 11 --55 its random number  I have to track

 

ballardw
Super User

@bobi wrote:

RC is the most important key for my analysis, AC is current ID and RC is old ID at a instance(date)

 

AC RC date

11  22 today

22 33  yesterday

33 44 so on

44 55 so on

55 0 so on

 

 


Supplying an actual date in the initial question would have helped a whole lot.

 

 

Yes If I sort by date I can get it but if all changes  happens on same day I cannot track it if it is unsorted 

 


Is there any guarantee that the order you have is correct then?

novinosrin
Tourmaline | Level 20
data aa;
input ac 3. id  rc $2.;
datalines;
12 45 qq
78 0 da
56 34 ad
45 56 gh
34 78 cc
;
run;

data want;
if _n_=1 then do;
if 0 then set aa;
  dcl hash h(dataset:'aa');
   h.definekey('ac');
   h.definedata('ac','id','rc');
   h.definedone();
end;
set aa;
retain _id;
if _n_>1 then do;
if h.find(key:_id)=0;
end;
_id=id;
drop _:;
run;
Ksharp
Super User
data aa;
input ac 3. id  rc $2.;
datalines;
12 45 qq
78 0 da
56 34 ad
45 56 gh
34 78 cc
;
run;
proc sql;
create table ancient as
 select * from aa 
  where ac not in (select id from aa);
run;
data want;
if _n_=1 then do;
if 0 then set aa;
  dcl hash h(dataset:'aa');
   h.definekey('ac');
   h.definedata(all:'y');
   h.definedone();
end;
set ancient;
output;
do while(h.find(key:id)=0);
 output;
end;
run;
proc print;run;

Notice: there is only one to one match .

bobi
Fluorite | Level 6

please suggest using Proc sql 

 

For hash object I am getting following error 

ERROR: Hash object added xxxxx items when memory failure occurred.
FATAL: Insufficient memory to execute DATA step program. Aborted during the EXECUTION phase.
ERROR: The SAS System stopped processing this step because of insufficient memory.

Ksharp
Super User

No. SQL can't do it due to don't know how many hierarchy/strata it is .

How big size of your table is it ?

s_lassen
Meteorite | Level 14

Here is a solution using SET with KEY=. I assume that you want to start with the first record, keep buildng the thread from there, and when the thread stops, read the next unused record:

data aa;                           
input ac id  rc $;                 
datalines;                         
12 45 qq                           
78 0 da                            
56 34 ad                           
45 56 gh                           
34 78 cc                           
;                                  
run;                               
                                   
data temp(index=(ac));
  set aa;
run;

data want temp;
  set temp(cntllev=rec);
  do until(0);
    modify temp(cntllev=rec) key=ac;
    if _iorc_ then leave;
    output want;
    remove temp;
    ac=id;
    end;
  _error_=0;
run;

Unlike the solution proposed by @novinosrin, this solution is able to accomodate more than one thread, and unlike the solution proposed by @Ksharp, if you have two records with the same AC, it is possible for the second one to start a new thread.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 2739 views
  • 4 likes
  • 6 in conversation