BookmarkSubscribeRSS Feed
Miracle
Barite | Level 11

Hi there.

I'd like to ask if I could achieve from A to B in just one single datastep instead of what I've done below( a datastep and sql)?

Thank you very much.


data A;

  set tmp;

  by id;

  n+1;

  if first.idthen n=1;

run;

proc sql;

  create table B as

  select *, count(id) as total

  from A

  group by id;

run;

quit;

A:

id val n

1 11 1

1 22 2

1 33 3

2 22 1

B:

id val n total

1 11 1  3

1 22 2  3

1 33 3  3

2 22 1  1

12 REPLIES 12
LinusH
Tourmaline | Level 20

Can't really see that you could, maybe using some advanced read-ahead programming, or using hash objects.

But why, is there a real problem, or optimizing for fun?

Data never sleeps
Miracle
Barite | Level 11

Hi Linus. Thanks for your prompt reply.

I'm just wondering may be there are other more elegant coding ways to achieve this than what I did.

Astounding
PROC Star

Well, you can in a way.  You still have to read the data twice, but you can accomplish that in a single DATA step:

data B;

   n=0;

   total=0;

   do until (last.id);

       set A (keep=id);

       by id;

       total + 1;

   end;

   do until (last.id);

       set A;

       by id;

       n + 1;

       output;

   end;

run;

Each SET statement reads the entire data set independently of the other SET statement.  So for each ID, the top loop counts the number of records, and the bottom loop reads the same records, calculates N, and outputs.

Miracle
Barite | Level 11

Hi Astounding.

Thanks for your input.
It's always great to know other coding ways.

robertrao
Quartz | Level 8

I have a question...

Why the upper half is taking only the last occurance number.......

whereas  the lower half of the code is taking all the counts even though we said do until(last.id) for both of them??

IS IT THE OUTPUT STATEMENT WHICH IS MAKING THE DIFFERENCE FOR THE BOTTOM HALF ALONE TO DISPLAY THE 1,2,3 instead of just 3!!1

Haikuo
Onyx | Level 15

It probably takes more than a brief post to explain the whole nine yard. The structure that Astounding used is call DOW, there are lots of histories around it. Google SAS DOW, you will know.

Here is one paper by Dorfman, who participated this part of history:

http://support.sas.com/resources/papers/proceedings09/038-2009.pdf

Haikuo

Peter_C
Rhodochrosite | Level 12

robertrao

read the paper offered by Haikuo   to find the explanation.

http://support.sas.com/resources/papers/proceedings09/038-2009.pdf

Peter_C
Rhodochrosite | Level 12

Although it appears that the data set is read twice, working with a BY statement, it is probable that the required buffers or pages of the data set, will be in memory when the second SET is performed - so unless BY groups are very large, performance is unlikely to be affected.

Tal
Pyrite | Level 9 Tal
Pyrite | Level 9

so after SAS loops through the 1st by group it moves to the second loop,right,is that how DOWs work?

art297
Opal | Level 21

Yes!  Then, at the end of the last loop, it goes back to the beginning to get the next level of the by variable(s).

Peter_C
Rhodochrosite | Level 12

Tal

what you describe is the way this program works ("after SAS loops through the 1st by group it moves to the second loop")

In general, DoW loops place a DO loop around a SET statement

The program here is more complex, having two interrelated DoW loops.

peterC

Tal
Pyrite | Level 9 Tal
Pyrite | Level 9

thanks guys

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1304 views
  • 1 like
  • 8 in conversation