Bromlem
Calcite | Level 5

I am trying to add a unique row identifier to a large data set that is loaded into memory on a LASR server distributed over 32 nodes.

 

When I use _N_, each thread counts only the rows it sees, so the end result is a non-unique row number. Also, _ThreadID_ does not seem to be a valid variable in LASR server DATA steps the way it is in CAS.

 

Current program:

data mylasr.out;
  set mylasr.in;
  rownum = _N_;
run;

 

I searched and couldn't find this info anywhere...

 

Thoughts?

SASKiwi
PROC Star

The only solution I can think of would be to go back to your non-distributed source table, add the unique row Identifier, then reload the table into distributed LASR.
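A minimal sketch of that approach, under a few assumptions: WORK.SOURCE is a hypothetical name for the non-distributed copy of the table, MYLASR is the same LASR libref used in the question, and IN_WITH_ID is a hypothetical name for the reloaded table. Because the first step runs as an ordinary single-threaded DATA step outside LASR, _N_ counts every row exactly once there.

```sas
/* Assumption: WORK.SOURCE is a hypothetical non-distributed copy of the data. */
/* This step runs outside LASR, so _N_ increments once per row and is unique.  */
data work.source_id;
  set work.source;
  rownum = _n_;
run;

/* Assumption: MYLASR is the LASR libref from the question.                    */
/* IN_WITH_ID is a hypothetical new table name, chosen so that the existing    */
/* in-memory table is not overwritten.                                         */
data mylasr.in_with_id;
  set work.source_id;
run;
```

Whether reloading is practical depends on the size of the table and on how it was originally loaded into the server, so treat this as a starting point rather than a tested recipe.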

hashman
Ammonite | Level 13

@Bromlem:

I am not sure whether using the CUROBS= option instead of _N_ will help in your situation, but I suspect it might. _N_ has nothing to do with the input data set per se: it is just populated at the top of the implied DATA step loop with the number of the current iteration from an internal counter. The variable assigned to CUROBS=, by contrast, is tied to the physical row numbers of the data set named in the SET statement. In fact, it reflects them accurately even when there is a WHERE clause or there are observations marked for deletion. So, what the heck, give it a shot (the variable Q will be auto-dropped):

 

 

data mylasr.out ;
  set mylasr.in curobs = q ;
  rownum = q ;
run ;

For the sake of curiosity, you can also try the MONOTONIC() function, though I'd be very surprised if it worked:

 

 

data mylasr.out ;
  set mylasr.in ;
  rownum = monotonic() ;
run ;

It will be interesting to see what you discover.

 

Kind regards

Paul D.

 

Bromlem
Calcite | Level 5

@hashman 

 

Thanks for the ideas!  

 

Unfortunately, it seems the CUROBS= option in the SET statement is not supported by the LASR engine.

 

Also, the MONOTONIC() function is not recognized when the step attempts to execute in LASR, so the data is brought back to the grid workspace and the step executes there instead.

 

 

hashman
Ammonite | Level 13

@Bromlem: Thanks. I wonder what else doesn't work there ;). 

