statz
Obsidian | Level 7

Hi, 

 

I am struggling to create a macro that splits a data set into pieces of different sizes. For example, if I have 84 observations in my current data, I want to split them into 4 data sets with sizes (10, 20, 30, 24).

 

For example, I have the following:

 

data new;
do i = 1 to 84;
    output;
    end;
run;

 

How do I get the following data sets?

dataset1: i = 1, 2, ..., 10

dataset2: i = 11, 12, ..., 30

dataset3: i = 31, 32, ..., 60

dataset4: i = 61, 62, ..., 84

 

Then, for each data set, I want to output the following: mean, standard deviation, and a histogram.

 

I'm thinking of creating a macro, since the number of data sets to create and their sizes will change for every original data set.

 

Thanks..

1 ACCEPTED SOLUTION

PGStats
Opal | Level 21

To follow @Astounding's suggestion, you should do something like this:

 

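/* Example data: 84 observations, i = 1..84 */
data new;
do i = 1 to 84;
    output;
    end;
run;

/* dsn   = input data set
   sizes = comma-separated group sizes, wrapped in %str() */
%macro mySplit(dsn,sizes);
data split;
do s = &sizes.;          /* one pass per requested group size */
    set+1;               /* group counter: 1, 2, 3, ... */
    do j = 1 to s;       /* read the next s observations into this group */
        set &dsn;
        output;
        end;
    end;
drop s j;
run;

/* Histogram, mean and standard deviation of i for each group */
ods graphics / imagename="&dsn._graph";
proc univariate data=split;
by set;
var i;
histogram;
output out=out_&dsn. mean=mi std=stdi;
run;
%mend mySplit;

%mySplit(new,%str(10,20,30,24));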
PG
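
A quick way to confirm that the groups came out with the requested sizes is to tabulate the group variable. The sketch below is only an illustration: it rebuilds the split data set with the sizes written out literally (rather than through the macro parameter) so that it runs on its own, then counts the observations for each value of set.

```sas
/* Example data from the question: i = 1..84 */
data new;
do i = 1 to 84;
    output;
    end;
run;

/* Same grouping logic as the macro, with the sizes hard-coded */
data split;
do s = 10, 20, 30, 24;
    set+1;
    do j = 1 to s;
        set new;
        output;
        end;
    end;
drop s j;
run;

/* One row per group, showing how many observations it received */
proc freq data=split;
tables set / nocum nopercent;
run;
```

If the sizes do not add up to the number of observations in the input, the counts in this table will not match the list you passed in, so it is worth running this check whenever the sizes change.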


5 REPLIES
Astounding
PROC Star

Most likely, the best advice would be this:  Don't do it!  Instead of splitting up the data, just add a new variable to your existing data set.  The new variable could be "1" for the first 10 observations, "2" for the next 20, "3" for the next 30, etc.

 

You can always process with a BY statement later to get statistics for each group, or possibly with a WHERE statement to select just a single group.
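
As a rough sketch of that idea, assuming the 84-observation example from the question: the data set name grouped and the variable name grp are made up for illustration, and the cut points 10, 30 and 60 are simply the running totals of the sizes 10, 20 and 30.

```sas
/* Example data from the question: i = 1..84 */
data new;
do i = 1 to 84;
    output;
    end;
run;

/* Add a group variable instead of creating four data sets */
data grouped;
set new;
if _n_ <= 10 then grp = 1;        /* first 10 observations */
else if _n_ <= 30 then grp = 2;   /* next 20 */
else if _n_ <= 60 then grp = 3;   /* next 30 */
else grp = 4;                     /* remaining 24 */
run;

/* BY statement: statistics for every group in one step */
proc means data=grouped n mean std;
by grp;
var i;
run;

/* WHERE statement: statistics for a single group */
proc means data=grouped n mean std;
where grp = 2;
var i;
run;
```

The BY statement works here without a PROC SORT because grp is assigned in ascending order as the observations are read.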

 

You'll save a lot of headaches trying to come up with data set names and tracking which is which.

 

Good luck.

statz
Obsidian | Level 7

Thank you, PGStats! This is perfect! 🙂

LinusH
Tourmaline | Level 20
Kinda like @Astounding said, but don't do this.
A data set with 84 variables is unlikely to be normalized. If you normalize, you get a robust structure that seldom needs to change. You also minimize the maintenance of variable-specific code and avoid the need for macro coding.
Data never sleeps
statz
Obsidian | Level 7

Thanks. My main goal is to check whether the data in each segment is normally distributed. In reality, the number of observations will vary: if it is small, I might have only one or two segments, and if it is large, I may have many segments. Thanks!
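
For that goal, one possible sketch is to keep a single data set with a segment variable and request normality tests for each segment. Everything specific here is an assumption for illustration: the simulated variable x, the fixed segment length seg_size, and the data set names segmented and norm_tests. With a fixed segment length, the number of segments follows from the number of observations, so no list of sizes is needed.

```sas
%let seg_size = 25;   /* hypothetical segment length */

/* Simulated data: 84 draws from a normal distribution,
   with segment number = ceil(observation number / segment length) */
data segmented;
call streaminit(2024);
do obs = 1 to 84;
    x = rand('normal', 50, 10);
    seg = ceil(obs / &seg_size.);
    output;
    end;
run;

/* NORMAL requests tests for normality; the ODS OUTPUT statement
   saves the test results for all segments in one data set */
ods output TestsForNormality=norm_tests;
proc univariate data=segmented normal;
by seg;
var x;
histogram x / normal;   /* histogram with a fitted normal curve */
run;

proc print data=norm_tests;
run;
```

Because the segments are handled with a BY statement, the same code runs unchanged whether the data yields one segment or many.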
