Help using Base SAS procedures

Not enough memory for proc transpose

Accepted Solution Solved
Reply
Valued Guide
Posts: 860
Accepted Solution

Not enough memory for proc transpose

Good Morning All,

 

I have a solution that I've used in the past with smaller datasets, the current dataset I am working with has 350 'rules' and about 1 million records, so the transpose step isn't working due to the wide dataset that is produced.  I tried to split the original dataset up into 3rds but the 300,000 columns are still too much.  Below is a short example of what I am doing, any suggestions or different solutions are greatly appreciated.

 

data have;
infile cards dsd;
informat run_date reporting_month mmddyy10.;
format run_date reporting_month mmddyy10.;
input run_date reporting_month    _rule1 _rule2 _rule3;
cards;
04/20/2016,03/31/2016,.,1,0
04/20/2016,03/31/2016,1,1,1
04/20/2016,03/31/2016,1,1,1
04/20/2016,03/31/2016,1,1,0
04/20/2016,03/31/2016,.,0,0
;run;

proc transpose data=have out=tran_have;by Run_Date reporting_month;var _:;

data want(keep=Rule Fail_Total Run_Date reporting_month);
set tran_have;
Rule = substr(_NAME_,2,5);
Fail_Total = sum(of colSmiley Happy;
run;


Accepted Solutions
Solution
‎04-20-2016 10:46 AM
Super User
Posts: 5,498

Re: Not enough memory for proc transpose

Posted in reply to Steelers_In_DC

It looks like your data set is sorted, so the BY statement should work:

 

proc summary data=have;

by run_date reporting month;

var _rule1 _rule2 _rule3;

output out=want (drop=_type_ _freq_) sum=;

run;

 

Of course, the task of typing out the VAR statement is not so easy in real life.

View solution in original post


All Replies
Super User
Posts: 5,498

Re: Not enough memory for proc transpose

Posted in reply to Steelers_In_DC

Surely there will be another way to program this, without transposing the data.  It looks like you need to get some totals or subtotals for each run_date/reporting_month combination.  SAS has lots of ways to accomplish this.  It's probably best to start with a clean slate and describe the end result you would like, based on the original data.

Valued Guide
Posts: 860

Re: Not enough memory for proc transpose

Posted in reply to Astounding

I have several datasets that have edits that need to be applied for governmental checks.  I'm working on the largest one now which is 1st lien mortgage data.  I wrote out 350+ rules like this:

 

'_1_All Failures'n =1;
'_2_All Failures'n = missing(loan_closing_date);
'_2.1_All Failures'n = loan_closing_date > reporting_month;
if not missing(loan_closing_date) then do;
'_2.2_All Failures'n = loan_closing_date < Forty_years_ago;end;

 

I chose the name of the rules to coincide with a name provided by the government so I can join and get additional information downstream.  This produces a dataset with 558 variables by 987,144 rows.

Super User
Posts: 5,498

Re: Not enough memory for proc transpose

Posted in reply to Steelers_In_DC

OK, so what is your intent from that point?  It looks like you are getting the sum (for each reporting month/date) of each of your rule variables.  Is it more complex than that?

Valued Guide
Posts: 860

Re: Not enough memory for proc transpose

Posted in reply to Astounding

Nope, that's it.  Sounds remedial when I say it but I'm not sure what to do.

Valued Guide
Posts: 860

Re: Not enough memory for proc transpose

Posted in reply to Steelers_In_DC

Here's what I get when I limit the population in the first step.  This is what I want:

 

run_date reporting_month Rule Fail_Total
04/20/2016 01/31/2016 1_All Failures  
04/20/2016 01/31/2016 2_All Failures 0
04/20/2016 01/31/2016 2.1_All Failures 0
04/20/2016 01/31/2016 2.2_All Failures 0
04/20/2016 01/31/2016 2.4_All Failures 0
04/20/2016 01/31/2016 3_All Failures 11
04/20/2016 01/31/2016 3.1_All Failures 0
04/20/2016 01/31/2016 3.2_All Failures 0
04/20/2016 01/31/2016 3.3_All Failures 11
04/20/2016 01/31/2016 3.5_All Failures 0
04/20/2016 01/31/2016 4.1_All Failures  
04/20/2016 01/31/2016 5.1_All Failures  
04/20/2016 01/31/2016 5.2_All Failures 0
04/20/2016 01/31/2016 6.2_All Failures 145
04/20/2016 01/31/2016 6.3_All Failures 0
04/20/2016 01/31/2016 7.2_All Failures 0
04/20/2016 01/31/2016 8.1_All Failures 23
04/20/2016 01/31/2016 8.2_All Failures  
04/20/2016 01/31/2016 8.4_All Failures 1595
04/20/2016 01/31/2016 9_All Failures 0
04/20/2016 01/31/2016 9.1_All Failures  
04/20/2016 01/31/2016 10.2_All Failures 0
04/20/2016 01/31/2016 11_All Failures 98919
04/20/2016 01/31/2016 11.1_All Failures 0
04/20/2016 01/31/2016 11.3_All Failures 51
04/20/2016 01/31/2016 12_All Failures 98765
Solution
‎04-20-2016 10:46 AM
Super User
Posts: 5,498

Re: Not enough memory for proc transpose

Posted in reply to Steelers_In_DC

It looks like your data set is sorted, so the BY statement should work:

 

proc summary data=have;

by run_date reporting month;

var _rule1 _rule2 _rule3;

output out=want (drop=_type_ _freq_) sum=;

run;

 

Of course, the task of typing out the VAR statement is not so easy in real life.

Super User
Posts: 5,498

Re: Not enough memory for proc transpose

Posted in reply to Astounding

We're crossing paths on our posts, but a little additional processing would be needed:

 

data really_want;

set want;

array sums {350} /* long list of names here */;

length rule_name $ 50;

do _n_=1 to 350;

   rule_name = vname(sums{_n_});

   total = sums{_n_};

   output;

end;

keep reporting_month run_date rule_name total;

run;

 

Where it is appropriate, you may need to fill in some missing values with zero.

Valued Guide
Posts: 860

Re: Not enough memory for proc transpose

[ Edited ]
Posted in reply to Astounding

That is awesome, thank you so much for your help.  Much appreciated.  I will definitely work to become more familiar with proc summary!!!  By the way, no need to list the vars, that's why I started with an underscore.  The final product which works perfectly looks like this:

 

proc summary data=FL_Rules;
by run_date reporting_month;
var _:;
output out=want (drop=_type_ _freq_ ) sum=;
run;

proc transpose data=want out=want_tran (rename=(col1=Fail_Total));by Run_Date reporting_month;var _:;

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 456 views
  • 0 likes
  • 2 in conversation