rkumar23
Calcite | Level 5

I need to count the number of observations per group, so I used the method below. However, it looks like it is messing up the MAX value, as you can see in the output. Any idea how to get around it?

Also, if you have any ideas for counting the observations per group (say by DISTNAME in the example below) within PROC REPORT itself, that would be helpful as well. Thanks.

DATA LAB1;
    INFILE DATALINES truncover;
    INPUT DATE DATE9. GLIBSEQN $ CLUSTER $ DLIB $ DISTNAME $ DEFERR THRESHOLD;
    DATALINES;
01OCT14  10000  04  10015  XXXXXX15      2        8
01OCT14  10000  04  10015  XXXXXX15      8        8
01OCT14  10000  04  10015  XXXXXX15      7        8
01OCT14  10000  02  10018  yyyyyy18      3        9
01OCT14  10000  02  10018  yyyyyy18      3        10
01OCT14  10000  02  10018  yyyyyy18      8        8
01OCT14  10000  05  10016  ZZZZZZ16      9        8
01OCT14  10000  00  10017  MMMMMM17      10       9
01OCT14  20000  03  1001A  KKKKKK1A      12       9
01OCT14  20000  02  10018  yyyyyy18      19       9
01OCT14  20000  05  10016  ZZZZZZ16      20       0
01OCT14  20000  00  10017  MMMMMM17      1        9
;

proc sort data=lab1;
    by distname;

DATA LAB2;
    SET LAB1;
    BY DISTNAME;
    if first.distname then last = 0;
    last+1;
    if last.distname then output;
RUN;

PROC REPORT DATA=LAB2 NOWD;
    COLUMN DATE GLIBSEQN CLUSTER DLIB DISTNAME deferr THRESHOLD last;
    DEFINE DATE      / GROUP 'DATE' ID;
    define glibseqn  / group 'glib';
    define cluster   / group 'cluster';
    define dlib      / group 'dlib';
    define distname  / group 'distname';
    define deferr    / max 'defer value';
    define threshold / max 'threshold value';
    define Last      / display 'no. of counts';
run;

The output produced is:

                                                   DEFER  THRESHOLD     NO. OF
DATE   GLIB      CLUSTER   DLIB      DISTNAME      VALUE      VALUE     COUNTS
19997  10000     04        10015     XXXXXX15          7          8          3
       20000     00        10017     MMMMMM17          1          9          2
                 02        10018     YYYYYY18         19          9          4
                 03        1001A     KKKKKK1A         12          9          1
                 05        10016     ZZZZZZ16         20          0          2

RW9
Diamond | Level 26

In your data step:

DATA LAB2;
    SET LAB1;
    retain last;   /* add this */

Also, please format your code so that it is readable (consistent indentation and casing, every step finished, statements on separate lines, etc.). That code is very difficult to read.

rkumar23
Calcite | Level 5

RW9, thanks. I tried it with RETAIN, but the output is the same. DEFERR should show the max value, but it is still showing the value from the last observation.

It looks like the formatting of the code changed after it was uploaded to the site. I will try to do better next time.

RW9
Diamond | Level 26

Sorry, I thought you were asking about the counts. You would be better off summarizing your data first and then running PROC REPORT on the result:

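proc sql;
     /* DEFERR and THRESHOLD are not selected as detail columns: listing them */
     /* next to the aggregates would make SAS remerge the statistics back     */
     /* onto every input row instead of returning one row per group.          */
     create table DATA_TO_REPORT as
     select    DATE,
               GLIBSEQN,
               CLUSTER,
               DLIB,
               DISTNAME,
               MAX(DEFERR)    as MAX_DEFERR,
               MAX(THRESHOLD) as MAX_THRESHOLD,
               COUNT(*)       as NUM_RECORDS
     from      HAVE   /* the unsummarized input data set, e.g. LAB1 */
     group by  DATE,
               GLIBSEQN,
               CLUSTER,
               DLIB,
               DISTNAME;
quit;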
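
If you would rather stay in PROC REPORT alone, the following is a minimal sketch, assuming the unsummarized LAB1 from the original post is used as input instead of LAB2. LAB2 keeps only the last row of each DISTNAME, so MAX there only ever sees one value per group. The N statistic in the COLUMN statement counts the observations behind each report row. The DATE9. format on DATE is my own addition so the date does not print as a raw number.

```sas
/* Sketch only: report straight from the detail data (LAB1), not LAB2 */
proc report data=lab1 nowd;
    column date glibseqn cluster dlib distname deferr threshold n;
    define date      / group format=date9. 'date';
    define glibseqn  / group 'glib';
    define cluster   / group 'cluster';
    define dlib      / group 'dlib';
    define distname  / group 'distname';
    define deferr    / analysis max 'defer value';      /* max over all rows in the group */
    define threshold / analysis max 'threshold value';
    define n         / 'no. of counts';                 /* N = rows per group */
run;
```

Note that a report row is formed for each combination of the GROUP variables, so a DISTNAME that occurs under more than one GLIBSEQN (for example yyyyyy18 in the sample data) gets one row per GLIBSEQN. If you want a single count per DISTNAME, leave the other GROUP variables out of the COLUMN statement.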
