MikeZdeb
Rhodochrosite | Level 12

hi ... I use it as shorthand for "if I had given that a few more moments, I would have understood"

from Google (the source of all knowledge in the known universe) ... "the moment you realize that you don't know something or are just learning something most everyone knows"

P.S. Is "known universe" redundant?

art297
Opal | Level 21

I really liked your approach to solving this problem, but I just read a 2007 SGF paper by Jane Stroupe (http://support.sas.com/rnd/papers/sgf07/arrays1780.pdf) that taught me something I didn't know before tonight. Not only was it applicable to your suggested code, but it made the code run 16 percent faster. Since the OP is dealing with 40 million records, I thought he would be interested.

NOTE: Based on subsequent feedback from Ksharp, it turned out that the temporary array is not reinitialized with each iteration of the implicit DATA step loop; its values are retained. As such, the code only works correctly as modified below, and with that modification the performance difference is reversed. However, I've elected not to delete the post, as I now find the new information even more important to know.

However, I thought everyone else would be equally interested, as it is a useful bit of information and, for me, took all the guesswork out of what your code was doing.

When one declares an array for variables that don't exist in the PDV, one doesn't have to assign any dummy variable names.  I took that one step further and declared the array as _temporary_ so that it doesn't even have to be dropped.

Thus the code I ended up with was:

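/* Create some test data */
data have;
  infile datalines dsd truncover;
  input person $ phone1-phone3;
  datalines;
A,111,222,111
B,,444,444
C,111,,222,
;
run;

/* Keep each person's distinct phone numbers, shifted to the left */
data want(drop=i j);
  set have;
  array _p{*} phone1-phone3;
  array _pp{3} _temporary_;  /* scratch array: never written to WANT, so nothing to drop */
  j=0;
  /* copy each value not already seen into the next free slot of _pp */
  do i=1 to dim(_p);
    if _p{i} not in _pp then do;
      j+1;
      _pp{j}=_p{i};
    end;
  end;
  /* write the de-duplicated values back to phone1-phone3 */
  do i=1 to dim(_p);
    _p{i}=_pp{i};
  end;
  /* temporary arrays are retained, so clear _pp before the next observation */
  call missing (of _pp(*));
run;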

Ksharp
Super User

Arthur,

Yes. You are right. Using a temporary array is faster and better.

But I think you need to call missing on its elements at the end of the data step, because they are retained across iterations of the data step.

Ksharp
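The retention behavior Ksharp describes can be seen in a small sketch. The data set names (three, retained_demo) and variable names (n, t_value) are made up for illustration and are not from the thread. The step compares a _temporary_ array with an ordinary array whose variable is created by the ARRAY statement itself.

```sas
/* Three observations with n = 1, 2, 3 (hypothetical test data) */
data three;
  do n = 1 to 3;
    output;
  end;
run;

data retained_demo;
  set three;
  array _t{1} _temporary_;  /* temporary array: values are retained */
  array _v{1};              /* no names given, so SAS creates the variable _v1 */
  _t{1} = sum(_t{1}, n);    /* carries over between observations: 1, 3, 6 */
  _v{1} = sum(_v{1}, n);    /* _v1 is reset to missing each iteration: 1, 2, 3 */
  t_value = _t{1};          /* copy out, since temporary elements are not written */
run;

proc print data=retained_demo noobs;
run;
```

Because _t keeps accumulating while _v1 starts from missing on every observation, any per-observation scratch use of a temporary array needs an explicit reset such as call missing.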

art297
Opal | Level 21

You are absolutely correct.  I've modified my post to reflect that after adding the necessary statement to reinitialize the array, the performance benefit is not only lost, but reversed.  However, I didn't delete the post as I think following the thought process is important in itself.

Patrick
Opal | Level 21

Hi to all

I'm really impressed by how many quality answers you've given me.

Ksharp: You nailed it! That's exactly what I need.

Like Hai Kuo, I wasn't aware that an array can be addressed in the way you've done it. So besides solving my problem, you've also taught me something very valuable.

I couldn't mark all posts as helpful, and I ended up selecting different approaches. I consider every single answer in this thread really interesting and helpful, and I'm a bit disappointed that I can't express this through my marking.

Thanks to all of you.

Patrick
