BookmarkSubscribeRSS Feed
gsnidow
Obsidian | Level 7

Greetings all.  I am needing to access a mainframe dataset, and I am running into issues with memory allocated to my user_id on the host.  I am thinking this is because I am trying to grab the whole file to download to my local machine, but the file is almost 3000 characters wide, and there are tens of millions of obs.  I would like to just sum in one step.  Below is the code I am trying (and I know it works on other datasets, because I use it on other smaller datasets, and it works fine). 

rsubmit;

    filename gl "PPPCTK.ZZZ.BDRXXXXX.MNTHEND.DSKO745(0)" DISP=SHR;

        data glrecs ;

            infile gl obs=100 ;

             input

                @16        DT_DB                $10.

                @48        AT_BILLED            PD6.2   /* need to sum this and group by dt_db and state */

                @2650    STATE                $2. ;   

        run ; quit ;

        proc download data=glrecs out=work.glrecs;

        proc datasets library=work;

endrsubmit;

Now, if I were doing this with SQL, I would want the remote query to be...

select state, dt_db, sub(at_billed) as at_billed

  from dataset

group by state, dt_db

Then I would only be creating a download file that is a few hundred obs instead of many millions.  I sure would appreciate any help or advice.  Thank you.

Greg

9 REPLIES 9
data_null__
Jade | Level 19

Why don you add the SQL or other PROC etc to the RSUBMIT and PROC DOWNLOAD the summary.

gsnidow
Obsidian | Level 7

I'm not sure how I would reference the fields with SQL, since they are not defined other than my knowing the starting positions.  I also don't know how I would reference the dataset in the FROM clause, or if it is even possible to run proc sql on a raw file.

data_null__
Jade | Level 19

After the data step GLRECS executes.  You will have a SAS data WORK.GLRECS on the server, with three variable DT_BD AT_BILLED and STATE.  Instead of downloading the data run PROC SQL an the server as part of the RSUBMIT.

       data glrecs ;

            infile gl obs=100 ;

             input

                @16        DT_DB                $10.

                @48        AT_BILLED            PD6.2   /* need to sum this and group by dt_db and state */

                @2650    STATE                $2. ;  

        run ; quit ;

gsnidow
Obsidian | Level 7

Ah, yes, and therein lies the problem.  I probably could have explained better at the beginning.  The space available on the mainframe to each end user is limited.  In most cases we have sufficient resources on the host to create the datasets we need to download.  However, in this case I cannot even create the dataset because it's size exceeds the memory available to my user_id.  That is why I'm looking for a way to maybe sum from the buffer and just write the summed values to the dataset before downloading.  I guess I could try using retain.

gsnidow
Obsidian | Level 7

Wow, Tom, that is going to take me a while to digest it all.  In the mean time, I have managed to have my resources temporarily raised, so I am able to create the dataset.  Now my problem is that, for some reason, proc sql does not run on the mainframe.  I think I can use proc means to do what I want, but there are so many options and examples out there that I can't pin down what I need.  If my dataset has only 2 variables, i'll call them date_column and number_column, is there another way to do what the below sql does?

proc sql ;

     create table sum_table as

     select date_column, sum(number_column) as sum

     from my_table

     group by date_column ;

quit ;

All I need is a dataset that sums the number_column by the date_column.  I don't need any other stats.

data_null__
Jade | Level 19

Another possibility would be a VIEW and PROC SUMMARY.  It may not work if SAS needs to make big swap files, but it might be work a try.

rsubmit;
filename gl "PPPCTK.ZZZ.BDRXXXXX.MNTHEND.DSKO745(0)" DISP=SHR;
data glrecsV / view=glrecsV;
   infile gl obs=100 ;
    input
       @
16        DT_DB                $10.
       @
48        AT_BILLED            PD6.2   /* need to sum this and group by dt_db and state */
       @
2650    STATE                $2. ;  
run; quit;
proc summary data=glrecsV nway;
  
class state dt_db;
   output out=glrecs(drop=_:) sum(at_billed);
   run;
proc download data=glrecs out=work.glrecs;
   run;

endrsubmit;
gsnidow
Obsidian | Level 7

data_null_, thank you for that.  I did not know about proc summary.  That did just what I needed to do.  One thing, I was getting an "expecting an =." error, and found that I had to provide an alias for the sum variable.  I changed this...

output out=glrecs(drop=_:) sum(at_billed);


to


output out=glrecs(drop=_:) sum(at_billed) = at_billed;


Thank you


Greg

data_null__
Jade | Level 19

Sorry yes that was untested.  You can also use sum(at_billed)= to use the same name.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1193 views
  • 6 likes
  • 3 in conversation