- Subsetting Dataset based on number of observations

2 weeks ago

Hi Experts,

I need to automate the creation of subsets based on the number of observations in a master dataset. For example, if my master dataset has 100 observations, I need to create 4 subsets of 25 observations each. If it has 200 observations, I need 4 datasets of 50 observations each. I can't work out the logic for this. Please help me write it.

Thank you,

Gurpreet

Posted in reply to gurpreetkaur

2 weeks ago

Get the number of observations first, then split on _n_ in a data step, e.g.:

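proc sql noprint;
  /* libname and memname are stored in uppercase in dictionary.tables */
  select nobs/4 into :cnt
    from dictionary.tables
    where libname="<yourlib>" and memname="<yourdataset>";
quit;

data want1 want2 want3 want4;
  set <yourlib>.<yourdataset>;
  /* use <= so the first subset gets a full quarter (25 of 100, not 24) */
  if _n_ <= (1 * &cnt.) then output want1;
  else if _n_ <= (2 * &cnt.) then output want2;
  ...;
run;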

Posted in reply to gurpreetkaur

2 weeks ago

If your data source is a SAS data set, you can pull the number of observations and divide them up in the same step:

data subset1 subset2 subset3 subset4;

set have nobs=_nobs_;

if _n_ <= _nobs_ / 4 then output subset1;

else if _n_ <= _nobs_ / 2 then output subset2;

else if _n_ <= _nobs_ * 3 / 4 then output subset3;

else output subset4;

run;
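If the number of subsets also needs to change, the same idea can be wrapped in a macro. This is only a rough sketch: the macro name split_n and its parameters data=, n= and prefix= are made up for illustration. It computes a group index with ceil(_n_ * n / nobs), which gives the same cut points as the if/else chain above when n=4.

```sas
/* Hypothetical helper: split &data into &n consecutive, near-equal subsets */
%macro split_n(data=have, n=4, prefix=subset);
  %local i;

  /* DATA statement lists &prefix.1 ... &prefix.&n as output datasets */
  data %do i = 1 %to &n; &prefix&i %end; ;
    set &data nobs=_nobs_;

    /* group index 1..&n for the current row */
    _grp_ = ceil(_n_ * &n / _nobs_);

    select (_grp_);
      %do i = 1 %to &n;
        when (&i) output &prefix&i;
      %end;
      otherwise;
    end;

    drop _grp_;
  run;
%mend split_n;

/* Example call: four subsets named subset1-subset4 */
%split_n(data=sashelp.class, n=4, prefix=subset)
```

When the observation count is not a multiple of n, the subsets differ in size by at most one row, so it is worth checking the log to see how the rows were distributed.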

Posted in reply to gurpreetkaur

2 weeks ago

```
/* GROUPS= randomly assigns rows to 4 groups and adds a GroupID variable to OUT= */
proc surveyselect data=sashelp.air out=want groups=4;
run;
```
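Note that this gives one output dataset with a group identifier, not four separate datasets, and the assignment is random rather than by row order. If separate datasets are needed, a follow-up data step can split on the identifier. A minimal sketch, assuming the identifier variable is named GroupID (as I understand the GROUPS= option names it); the dataset names grouped and want1-want4 and the seed value are just placeholders:

```sas
/* SEED= makes the random assignment reproducible; the value is arbitrary */
proc surveyselect data=sashelp.air out=grouped groups=4 seed=12345;
run;

/* Route each row to its own dataset based on the assigned group */
data want1 want2 want3 want4;
  set grouped;
  select (GroupID);
    when (1) output want1;
    when (2) output want2;
    when (3) output want3;
    when (4) output want4;
    otherwise;
  end;
  drop GroupID;
run;
```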

Posted in reply to Ksharp

2 weeks ago

Nice, I always forget PROC SURVEYSELECT, even though I posted about it some time back myself :)

2 weeks ago - last edited 2 weeks ago

Me too! I remember and forget something every day.