Hi friends,
Being a newbie in SAS sure do need your help because of the system reporting:
ERROR: Hash object added 65520 items when memory failure occurred.
FATAL: Insufficient memory to execute data step program. Aborted during the EXECUTION phase.
Most probably because of extended DataSet with numerous fields. I was using the following code: to group datasets in different monthly files.
data _null_;
if 0 then set Temp2.Allprod2;
dcl hash groups ( ordered: 'a');
groups.definekey ('MIS_Month','_n_');
groups.definedata ('MIS_Month', 'MOB', 'Customer_No', 'Customer_Name', 'CNIC', 'Loan_Number', 'Disbursment_Date', 'Vintage_Date', 'Rehab_Date', 'Product_Desc', 'Product', 'Loan_Type', 'Rehab_Desc', 'Rehab_Cat', 'Bucket', 'Dpd', 'POS', 'Loan_Status', 'WriteOff_Date', 'Provision_Status', 'Provision_Charge', 'Provision_Balance', 'Loan_Amount', 'Subsegmentall', 'Account_Tagging' );
groups.definedone ();
do _n_ = 1 by 1 until (last.MIS_Month);
set Temp2.Allprod2;
by MIS_Month;
groups.add();
end;
groups.output (dataset: compress('Temp2.Data_'||YEAR(MIS_Month)||'_'||MONTH(MIS_Month)));
run;
Any recommendation how to use the MIS_Month, currently representing last day of each month, but this time breaking data on yearly basis so as to later on carryout month wise breakup to avoid halt of procedure.
Thanx in advance.
WHY do you want to split the dataset into pieces?
Is it for some reporting purpose? Some analysis?
You do not need to use HASH to do this.
If you know the range of years then just write the code to do it.
data y2011 y2012 y2013 other;
set Temp2.Allprod2;
select (year(mis_month)) ;
when (2011) output y2011;
when (2012) output y2012;
when (2013) output y2013;
otherwise output other;
end;
run;
If you need you can add some more code to query the data to determine the number actual values of year so that it can generate the code.
Looks like your need to upgrade your RAM.
Or use a macro and SQL/Data step approach instead.
http://www.sascommunity.org/wiki/Split_Data_into_Subsets
EDIT: What does your date field look like, formatted as? Also, are you trying to resolve the hash error or change the code to split the file?
Thanx for your interest in the topic Reeza.
Currently the dataset as a whole reflects the date field (i.e. labeled as MIS_Month) as:
01JAN2007
01FEB2007
...
01AUG2013
Firstly I wanted to break the data based on Date basis as a whole because of only single first day of every month. But now, I think it would be better to consider the same variable for breaking the dataset on yearly basis instead of date as a whole.
You could create a year variable using the year() function.
Richard
Sorry for a late reply Richard.
However Couldn't I use any other way to determine the year from the date variable without adding an additional field/function of yr?
Had to use the SQL procedure of creating datasets but for different years
Any idea how to achieve the same with a looping code instead of running similar SQL codes multiple times, for different years?
Thanx again
Why do you need/want to split the data set? Did you consider creating an index instead?
Thanx for your suggestion Patrick
Please see if u can guide me in understanding the role of index and way to use the same in this condition.
Can you please tell us WHY you want to split the data set? What do you want to achieve with this?
There are often better ways than physically splitting data set and replicating data. So in giving us your reasons we might come up with appropriate suggestions.
WHY do you want to split the dataset into pieces?
Is it for some reporting purpose? Some analysis?
You do not need to use HASH to do this.
If you know the range of years then just write the code to do it.
data y2011 y2012 y2013 other;
set Temp2.Allprod2;
select (year(mis_month)) ;
when (2011) output y2011;
when (2012) output y2012;
when (2013) output y2013;
otherwise output other;
end;
run;
If you need you can add some more code to query the data to determine the number actual values of year so that it can generate the code.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.