DATA Step, Macro, Functions and more

Data step : Split data sets according to month

Accepted Solution Solved
Reply
Contributor
Posts: 32
Accepted Solution

Data step : Split data sets according to month

Hi,

Could you please help me to split the data in to multiple data sets based on the Year and month.

For e.g My input data contains two year data, I want to split it in to 24 small data sets based on the month and year.

I used the below query to get month and year.

data

set claim_file

informat x_date date8.

format x_date date8.

M = month(x_date)

Y = year(x_date)

run

Out put should have a data set name like Jan_2011, Feb_2011 etc..

Please help.

Thanks.


Accepted Solutions
Solution
‎03-30-2012 08:19 AM
Respected Advisor
Posts: 3,124

Data step : Split data sets according to month

Here is a version works on your real data based on Art's code, you may need to change libref location though.

libname myxl 'h:\data.xls';

data temp;

set myxl.'sheet1$'n;

format dateoffill date9.;

dateoffill=input(put(dateoffill,8.),yymmdd8.);

month=catx('_',put(dateoffill,monname3.),year(dateoffill));

run;

proc sort data=temp;

by month;

run;

proc sql noprint;

  select quote(name)

    into :vars separated by ','

      from dictionary.columns

        where libname="WORK" and

              memname="TEMP" and

              name ne "month"

  ;

quit;

data _null_ ;

  dcl hash hh   () ;

  hh.definekey  ('k') ;

  hh.definedata (&vars.) ;

  hh.definedone () ;

  do k = 1 by 1 until ( last.month ) ;

    set temp ;

    by month;

    hh.add () ;

  end ;

  hh.output (dataset: month) ;

run ;

Regards,

Haikuo

View solution in original post


All Replies
Regular Contributor
Posts: 184

Data step : Split data sets according to month

See http://www.sascommunity.org/wiki/Split_Data_into_Subsets

Be sure to notice the advisory against doing this.

skallamp wrote:

Hi,

Could you please help me to split the data in to multiple data sets based on the Year and month.

For e.g My input data contains two year data, I want to split it in to 24 small data sets based on the month and year.

I used the below query to get month and year.

data

set claim_file

informat x_date date8.

format x_date date8.

M = month(x_date)

Y = year(x_date)

run

Out put should have a data set name like Jan_2011, Feb_2011 etc..

Please help.

Thanks.

Super User
Posts: 9,691

Data step : Split data sets according to month

data have;
 input month : yymmn. no;
 format month yymmn.;
 cards;
201001     1
201001     2
201002     1
201002     3
201003     4
201003     5
201004     4
201004     5
201004     6
201005     1
201005     5
201006     1
201006     5
201006     7
201006     8
201006     9
201007     5
201007     7
201008     9
201009     4
201010     10
201010     11
201011     1
201011     12
201012     7
201101     1
201102     11
201103     12
;
run;

proc sort data=have;by month;run;
data _null_ ;
  dcl hash hh   () ;
  hh.definekey  ('k') ;
  hh.definedata ('month','no') ;
  hh.definedone () ;
  do k = 1 by 1 until ( last.month ) ;
    set have ;
    by month groupformat ;
    format month monyy.;
    hh.add () ;
  end ;
  hh.output (dataset: put(month,monyy5.)) ;
run ;

Ksharp

Contributor
Posts: 32

Data step : Split data sets according to month

Thanks for the thoughts. I am getting the below error while applying it to my data.

15 data _null_ ;

16 dcl hash hh () ;

17 hh.definekey ('k') ;

18 hh.definedata ('RX_DATE','no') ;

19 hh.definedone () ;

20 do k = 1 by 1 until ( last.RX_DATE ) ;

21 set date ;

22 by rx_date groupformat ;

23 format rx_date monyy.;

24 hh.add () ;

25 end ;

26 hh.output(dataset: put(rx_date,monyy5.)) ;

27 run ;

ERROR: Undeclared data symbol no for hash object at line 19 column 3.

ERROR: DATA STEP Component Object failure. Aborted during the EXECUTION phase.

Contributor
Posts: 32

Data step : Split data sets according to month

If I have more variables, how do I put everything with out typing it one by one.

Here hh.definedata('rx_date') is giving me only one variable. I need all of them which is present in my data set.

Super Contributor
Posts: 1,636

Data step : Split data sets according to month

add all your variables:

hh.definedata('rx_date','your_var1','your_var2')

Contributor
Posts: 32

Data step : Split data sets according to month

My Variables are keep changing from one work to another.

Is there a way I can add all variables using a simple statement? somethin like *..

PROC Star
Posts: 7,366

Re: Data step : Split data sets according to month

I just learned something new from Hai.kuo but, since I created a new field in my suggested code, used proc sql to build a list of all the variables EXCEPT the one I created.  While a I agree with Howles that you probably DON'T want to do this, would something like the following suffice?:

data date;

  informat rx_date mmddyy8.;

  input rx_date no value;

  format rx_date date8.;

  cards;

01012010     1 1

01012010     2 2

02012010     1 3

02012010     3 4

03012010     4 3

03012010     5 2

04012010     4 1

04012010     5 0

04012010     6 7

05012010     1 6

05012010     5 5

06012010     1 4

06012010     5 3

06012010     7 2

06012010     8 1

06012010     9 2

07012010     5 3

07012010     7 4

08012010     9 5

09012010     4 6

10012010     10 7

10012010     11 8

11012010     1 9

11012010     12 10

12012010     7 33

01012011     1 32

02012011     11 47

03012011     12 24

;

data temp;

  set date;

  month=catx('_',put(rx_date,monname3.),year(rx_date));

run;

proc sort data=temp;

  by month;

run;

proc sql noprint;

  select quote(name)

    into :vars separated by ','

      from dictionary.columns

        where libname="WORK" and

              memname="TEMP" and

              name ne "month"

  ;

quit;

data _null_ ;

  dcl hash hh   () ;

  hh.definekey  ('k') ;

  hh.definedata (&vars.) ;

  hh.definedone () ;

  do k = 1 by 1 until ( last.month ) ;

    set temp ;

    by month;

    hh.add () ;

  end ;

  hh.output (dataset: month) ;

run ;

Contributor
Posts: 32

Data step : Split data sets according to month

Hi Art,

Its giving me the below error.

36 dcl hash hh () ;

37 hh.definekey ('k') ;

38 hh.definedata (&vars.) ;

_

386

200

76

WARNING: Apparent symbolic reference VARS not resolved.

ERROR 386-185: Expecting an arithmetic expression.

ERROR 200-322: The symbol is not recognized and will be ignored.

ERROR 76-322: Syntax error, statement will be ignored.

39 hh.definedone () ;

ERROR: DATA STEP Component Object failure. Aborted during the COMPILATION phase.

NOTE: The SAS System stopped processing this step because of errors.

PROC Star
Posts: 7,366

Data step : Split data sets according to month

Like Linlin mentioned, there was an error in my sql statement.  However, it should not have been "date" but, rather, "temp".  i.e.,

proc sql noprint;

  select quote(name)

    into :vars separated by ','

      from dictionary.columns

        where libname="WORK" and

              memname="TEMP" and

              name ne "month"

  ;

quit;

Since you ran my original code suggesting using "need", you should have gotten an error when you ran the proc sql code.

Art

Super Contributor
Posts: 1,636

Data step : Split data sets according to month

Art,

Why 'temp' not 'date'?  Thank you!

Contributor
Posts: 32

Data step : Split data sets according to month

Hi This is what I am using...still i m getting error

data temp;

  set claim_detail;

  month=catx('_',put(rx_date,monname3.),year(rx_date));

run;

proc sort data=temp;

  by month;

run;

proc sql noprint;

  select quote(name)

    into :vars separated by ','

      from dictionary.columns

        where libname="WORK" and

              memname="NEED" and

              name ne "month"

  ;

quit;

data _null_ ;

  dcl hash hh   () ;

  hh.definekey  ('k') ;

  hh.definedata (&vars.) ;

  hh.definedone () ;

  do k = 1 by 1 until ( last.month ) ;

    set temp ;

    by month;

    hh.add () ;

  end ;

  hh.output (dataset: month) ;

run ;

Super Contributor
Posts: 1,636

Data step : Split data sets according to month

change

memname="NEED"

to

memname="TEMP"

Contributor
Posts: 32

Data step : Split data sets according to month

My error is b'cse of this statement.

Temp output has blank value for month...my rx_date format is yyyymmdd format

month=catx(

),year(rx_date));

Super Contributor
Posts: 1,636

Re: Data step : Split data sets according to month

Hi Art,

I think ' memname="NEED" and' in your code should be ' memname="TEMP" and'

Thank you! - Linlin

Updated after Art's comments

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 24 replies
  • 2349 views
  • 6 likes
  • 7 in conversation