Greetings all. I am needing to access a mainframe dataset, and I am running into issues with memory allocated to my user_id on the host. I am thinking this is because I am trying to grab the whole file to download to my local machine, but the file is almost 3000 characters wide, and there are tens of millions of obs. I would like to just sum in one step. Below is the code I am trying (and I know it works on other datasets, because I use it on other smaller datasets, and it works fine).
rsubmit;
filename gl "PPPCTK.ZZZ.BDRXXXXX.MNTHEND.DSKO745(0)" DISP=SHR;
data glrecs ;
infile gl obs=100 ;
input
@16 DT_DB $10.
@48 AT_BILLED PD6.2 /* need to sum this and group by dt_db and state */
@2650 STATE $2. ;
run ; quit ;
proc download data=glrecs out=work.glrecs;
proc datasets library=work;
endrsubmit;
Now, if I were doing this with SQL, I would want the remote query to be...
select state, dt_db, sub(at_billed) as at_billed
from dataset
group by state, dt_db
Then I would only be creating a download file that is a few hundred obs instead of many millions. I sure would appreciate any help or advice. Thank you.
Greg
Why don you add the SQL or other PROC etc to the RSUBMIT and PROC DOWNLOAD the summary.
I'm not sure how I would reference the fields with SQL, since they are not defined other than my knowing the starting positions. I also don't know how I would reference the dataset in the FROM clause, or if it is even possible to run proc sql on a raw file.
After the data step GLRECS executes. You will have a SAS data WORK.GLRECS on the server, with three variable DT_BD AT_BILLED and STATE. Instead of downloading the data run PROC SQL an the server as part of the RSUBMIT.
data glrecs ;
infile gl obs=100 ;
input
@16 DT_DB $10.
@48 AT_BILLED PD6.2 /* need to sum this and group by dt_db and state */
@2650 STATE $2. ;
run ; quit ;
Ah, yes, and therein lies the problem. I probably could have explained better at the beginning. The space available on the mainframe to each end user is limited. In most cases we have sufficient resources on the host to create the datasets we need to download. However, in this case I cannot even create the dataset because it's size exceeds the memory available to my user_id. That is why I'm looking for a way to maybe sum from the buffer and just write the summed values to the dataset before downloading. I guess I could try using retain.
Look at using HASH objects.
Wow, Tom, that is going to take me a while to digest it all. In the mean time, I have managed to have my resources temporarily raised, so I am able to create the dataset. Now my problem is that, for some reason, proc sql does not run on the mainframe. I think I can use proc means to do what I want, but there are so many options and examples out there that I can't pin down what I need. If my dataset has only 2 variables, i'll call them date_column and number_column, is there another way to do what the below sql does?
proc sql ;
create table sum_table as
select date_column, sum(number_column) as sum
from my_table
group by date_column ;
quit ;
All I need is a dataset that sums the number_column by the date_column. I don't need any other stats.
Another possibility would be a VIEW and PROC SUMMARY. It may not work if SAS needs to make big swap files, but it might be work a try.
data_null_, thank you for that. I did not know about proc summary. That did just what I needed to do. One thing, I was getting an "expecting an =." error, and found that I had to provide an alias for the sum variable. I changed this...
output out=glrecs(drop=_:) sum(at_billed);
to
output out=glrecs(drop=_:) sum(at_billed) = at_billed;
Thank you
Greg
Sorry yes that was untested. You can also use sum(at_billed)= to use the same name.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.