BookmarkSubscribeRSS Feed
Paari
Calcite | Level 5

dataset k;

input a b c d;

datalines;

1 2 4 4

1 3 3 4

2 3 4 4

so on ;

run;

data k;

set k;

if( a=xyz and b=lmn and c=hjk and d=cde) then;

flag1=x1;

flag2=x2;

flag3=x3;

else;

flag1=y1;

flag2=y2;

flag3=y3;

run;

I am worried about the execution time of the above. Could anybody help me optimizing the above problem. One solution I have is to index the four columns. Please provide some useful solution to optimize it in best possible manner.

Thanks in advance.

8 REPLIES 8
jakarman
Barite | Level 11

1M records  4 numeric of length 8 bytes that is 16Mbyte having 16 numeric 16Mbyte something so small  it will be cached in memory.  Nothing to bother about. Do you have specs that is more big?

---->-- ja karman --<-----
Paari
Calcite | Level 5

Thank you for the reply. But the dataset is much bigger more 4 Millions observations and we have limited resource( Executing Capacity). So I am thinking to optimize in some way. If there is any possible way to reduce the execution time please let me know.

Paari
Calcite | Level 5

if dataset has 4 millions row  then there will 128Mbyte data to cached and I don't know how well will SAS be able to tackle it with its cached memory.

jakarman
Barite | Level 11

Check your hardware settings, even UE is coming with the option set on about 2-3Gb. Your desktop is coming with at least 4Gb and can be easily 12G. Mostly server sessions even far higher.

With those low values (128 Mb) the IO OS systemcache will do. Optimizing buffersize at eg 128kb  and there will a 1000 buffers needs. Unless you are needing split second response time looking not a real issue.
Your selection looks simple (not complicated) your data looks simple (not complicated) unless you are having some challenges there I do not see a lot to optimize with all sequential processing on all data (fastest was)

---->-- ja karman --<-----
art297
Opal | Level 21

Like Jaap said, 4 million is NOT a large dataset for SAS.

I'd be more concerned about whether your code is doing what you want. You have:

if( a=xyz and b=lmn and c=hjk and d=cde) then;

flag1=x1;

flag2=x2;

flag3=x3;

else;

flag1=y1;

flag2=y2;

flag3=y3;

What do xya, lmn, hjk and cde represent? If they're not variables, the above code will fail.

Also, the code doesn't make sense as written. Don't you really want something like?:

if( a=xyz and b=lmn and c=hjk and d=cde) then do;

  flag1=x1;

  flag2=x2;

  flag3=x3;

end;

else do;

  flag1=y1;

  flag2=y2;

  flag3=y3;

end;

ven
Calcite | Level 5 ven
Calcite | Level 5

Forgetting the fact that your dataset is too small for a significant performance improvement in SAS , fix your code by including where clause instead of if statements, which will give you a better performance gain theoretically.

KachiM
Rhodochrosite | Level 12

Do all records receive the same set of flags when match occurs and similarly other set of flags when match fails? Do the flags are CONSTANT?

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1885 views
  • 0 likes
  • 6 in conversation