Split DataSets In Years Based On Date Variables

Accepted Solution Solved
Reply
Contributor
Posts: 51
Accepted Solution

Split DataSets In Years Based On Date Variables

Hi friends,

Being a newbie in SAS sure do need your help because of the system reporting:

ERROR: Hash object added 65520 items when memory failure occurred.

FATAL: Insufficient memory to execute data step program. Aborted during the EXECUTION phase.

Most probably because of extended DataSet with numerous fields. I was using the following code:  to group datasets in different monthly files.

data _null_;

if 0 then set Temp2.Allprod2;

dcl hash groups ( ordered: 'a');

groups.definekey ('MIS_Month','_n_');

groups.definedata ('MIS_Month', 'MOB', 'Customer_No', 'Customer_Name', 'CNIC', 'Loan_Number', 'Disbursment_Date', 'Vintage_Date', 'Rehab_Date', 'Product_Desc', 'Product', 'Loan_Type', 'Rehab_Desc', 'Rehab_Cat', 'Bucket', 'Dpd', 'POS', 'Loan_Status', 'WriteOff_Date', 'Provision_Status', 'Provision_Charge', 'Provision_Balance', 'Loan_Amount', 'Subsegmentall', 'Account_Tagging' );

groups.definedone ();

do _n_ = 1 by 1 until (last.MIS_Month);

  set Temp2.Allprod2;

  by MIS_Month;

  groups.add();

end;

groups.output (dataset: compress('Temp2.Data_'||YEAR(MIS_Month)||'_'||MONTH(MIS_Month)));

run;

Any recommendation how to use the MIS_Month, currently representing last day of each month, but this time breaking data on yearly basis so as to later on carryout month wise breakup to avoid halt of procedure.

Thanx in advance.


Accepted Solutions
Solution
‎09-06-2013 02:50 PM
Super User
Super User
Posts: 6,499

Re: Split DataSets In Years Based On Date Variables

WHY do you want to split the dataset into pieces?

Is it for some reporting purpose?  Some analysis?

You do not need to use HASH to do this.

If you know the range of years then just write the code to do it.

data y2011 y2012 y2013 other;

set Temp2.Allprod2;

select (year(mis_month)) ;

   when (2011) output y2011;

   when (2012) output y2012;

   when (2013) output y2013;

   otherwise output other;

end;

run;

If you need you can add some more code to query the data to determine the number actual values of year so that it can generate the code.

View solution in original post


All Replies
Respected Advisor
Posts: 3,124

Re: Split DataSets In Years Based On Date Variables

Looks like your need to upgrade your RAM.

Super User
Posts: 17,818

Re: Split DataSets In Years Based On Date Variables

Or use a macro and SQL/Data step approach instead.

http://www.sascommunity.org/wiki/Split_Data_into_Subsets

EDIT: What does your date field look like, formatted as? Also, are you trying to resolve the hash error or change the code to split the file?

Contributor
Posts: 51

Re: Split DataSets In Years Based On Date Variables

Thanx for your interest in the topic Reeza.

Currently the dataset as a whole reflects the date field (i.e. labeled as MIS_Month) as:

     01JAN2007

     01FEB2007

     ...

     01AUG2013

Firstly I wanted to break the data based on Date basis as a whole because of only single first day of every month. But now, I think it would be better to consider the same variable for breaking the dataset on yearly basis instead of date as a whole.

Super Contributor
Posts: 644

Re: Split DataSets In Years Based On Date Variables

You could create a year variable using the year() function.

Richard

Contributor
Posts: 51

Re: Split DataSets In Years Based On Date Variables

Sorry for a late reply Richard.

However Couldn't  I use any other way to determine the year from the date variable without adding an additional field/function of yr?

Had to use the SQL procedure of creating datasets but for different years

Any idea how to achieve the same with a looping code instead of running similar SQL codes multiple times, for different years?

Thanx again

Respected Advisor
Posts: 3,889

Re: Split DataSets In Years Based On Date Variables

Why do you need/want to split the data set? Did you consider creating an index instead?

Contributor
Posts: 51

Re: Split DataSets In Years Based On Date Variables

Thanx for your suggestion Patrick

Please see if u can guide me in understanding the role of index and way to use the same in this condition.

Respected Advisor
Posts: 3,889

Re: Split DataSets In Years Based On Date Variables

Can you please tell us WHY you want to split the data set? What do you want to achieve with this?

There are often better ways than physically splitting data set and replicating data. So in giving us your reasons we might come up with appropriate suggestions.

Solution
‎09-06-2013 02:50 PM
Super User
Super User
Posts: 6,499

Re: Split DataSets In Years Based On Date Variables

WHY do you want to split the dataset into pieces?

Is it for some reporting purpose?  Some analysis?

You do not need to use HASH to do this.

If you know the range of years then just write the code to do it.

data y2011 y2012 y2013 other;

set Temp2.Allprod2;

select (year(mis_month)) ;

   when (2011) output y2011;

   when (2012) output y2012;

   when (2013) output y2013;

   otherwise output other;

end;

run;

If you need you can add some more code to query the data to determine the number actual values of year so that it can generate the code.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 1197 views
  • 0 likes
  • 6 in conversation