Bromlem
Calcite | Level 5

I am trying to add a unique row identifier to a large data set that is loaded into memory on a LASR server distributed over 32 nodes.

 

When I use _N_, each thread counts the rows as it sees them, so the end result is a non-unique row number. Also, it doesn't seem like _ThreadID_ is a valid variable in LASR server DATA steps the way it is in CAS.

 

Current program:

data mylasr.out;
  set mylasr.in;
  rownum = _N_;
run;

 

I searched and couldn't find this info anywhere...

 

Thoughts?

4 REPLIES
SASKiwi
PROC Star

The only solution I can think of would be to go back to your non-distributed source table, add the unique row identifier there, then reload the table into distributed LASR.
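A minimal sketch of that approach, under stated assumptions: the libref SRC and the table names IN_WITH_ID are hypothetical, and MYLASR is assumed to be the LASR libref from the original program. How the table actually gets loaded into the distributed server may differ at your site, so treat the second step as a placeholder for whatever load process you normally use.

```sas
/* Assumption: SRC.IN is the original, non-distributed copy of the table.
   The libref SRC is hypothetical. */
data work.in_with_id;
  set src.in;
  rownum = _N_;  /* one SET statement, no subsetting: _N_ is the row number here */
run;

/* Assumption: MYLASR is the libref already assigned to the LASR server.
   A new table name is used so the existing in-memory table is not touched. */
data mylasr.in_with_id;
  set work.in_with_id;
run;
```

Because the first step runs outside LASR as an ordinary single-threaded DATA step, _N_ increments once per row read, so ROWNUM is unique before the data is ever distributed.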

hashman
Ammonite | Level 13

@Bromlem:

I am not sure whether using the CUROBS= option instead of _N_ will help in your situation, but I suspect it might. _N_ has nothing to do with the input data set per se: it is just populated at the top of the implied DATA step loop with the number of the current iteration from an internal counter. The variable assigned to CUROBS=, by contrast, is linked to the physical row numbers of the data set specified in the SET statement. In fact, it reflects them accurately even when there is a WHERE clause or there are observations marked for deletion. So, what the heck, give it a shot (the variable Q will be auto-dropped):

 

 

data mylasr.out ;
  set mylasr.in curobs = q ;
  rownum = q ;
run ;

For the sake of curiosity, you can also try the MONOTONIC() function, though I'd be very surprised if it worked:

 

 

data mylasr.out ;
  set mylasr.in ;
  rownum = monotonic() ;
run ;

It will be interesting to see what you discover.

 

Kind regards

Paul D.

 

Bromlem
Calcite | Level 5

@hashman 

 

Thanks for the ideas!  

 

Unfortunately, it seems the CUROBS= option in the SET statement is not supported by the LASR engine.

 

Also, the MONOTONIC() function is not recognized when the step attempts to execute in LASR, so the data is brought back to the grid workspace and the step executes there.

 

 

hashman
Ammonite | Level 13

@Bromlem: Thanks. I wonder what else doesn't work there ;). 
