Solved
New Contributor
Posts: 4

Prioritising a variable whilst doing a 50/50 Split in EG

Hi am currently trying to split a cell 50/50 but I need to prioritise a variable to appear in the first cell. Example:

Table A has priority 1 & 2 cells, which I need to split 50/50 yet it's currently moving all priority 1 into cell 1 and then splitting priority 2 50/50 between cell 1 & 2. Therefore cell 1 has 20k of priority 1 and 35k of priority 2 and cell 2 just has 35k priority 2. It's easy enough to do this if I know what volumes are involved by a count but I would prefer a random split.

What I want to achieve:

Table A = 90k

cell1 = 20k priority 1

cell1 = 25k priority 2

cell2 = 45k priority 2

total = 90k

Currently have:

Table A = 90k

cell1 = 20k priority 1

cell1 = 35k priority 2

cell2 = 35k priority 2

total = 90k

The information is very sensitive so difficult to insert code.

Thanks

Accepted Solutions
Solution
‎07-12-2018 06:07 AM
Super User
Posts: 10,571

Re: Prioritising a variable whilst doing a 50/50 Split in EG

Try this (automatically retrieve the observation count):

``````data cell002;
set sashelp.class (rename=(sex=priority));
run;

proc sort data=cell002;
by priority;
run;

proc sql noprint;
select int(nobs/2) into :nobs_half
from dictionary.tables
where libname = 'WORK' and memname = 'CELL002';
quit;

data cell002_003;
set cell002;
if _n_ <= &nobs_half
then cellid = '002';
else cellid = '003';
run;``````
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code

All Replies
Super User
Posts: 10,571

Re: Prioritising a variable whilst doing a 50/50 Split in EG

Make up some example data, post it here in a data step (see my footnotes for how to post code), and also post the expected output from that.

Since SAS does not have the concept of cells, but works (like any database) with rows and columns, it's hard to get a mental picture from what you wrote.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
PROC Star
Posts: 1,334

Re: Prioritising a variable whilst doing a 50/50 Split in EG

First question: Are you trying to do this with the EG point-and-click facilities, or are you using SAS code in a code window?

Second: I'm afraid that I don't understand your requirements. Could you elaborate a little more on what the data stores look like?

Tom

New Contributor
Posts: 4

Re: Prioritising a variable whilst doing a 50/50 Split in EG

Morning, I am actually using SAS code in a code window rather than point and click in EG. My data contains 89,382 records which I need to split out into two separate cells but I need to prioritise one of the cells above the other. I was doing the following code to randomly sort them out but this wouldn't allow me to prioritise the data.

``````proc sort data = cell002;
by priority;
run;

/* ALSO CELL 002 SPLIT 50% WITH CELL 003 */
data cell002_003;
set cell002;
if priority = '2' and mod (_n_,2) = 1
then do;
cellid = '003';
end;
run;``````

I found the only way I could get around this was to use the count feature but this required me to know the exact volumes before hand.

``````proc sort data = cell002;
by priority;
run;

/* ALSO CELL 002 SPLIT 50% WITH CELL 003 */
data cell002_003;
set cell002;
if _n_ <= 44691 then cellid = '002';
else if _n_ > 44691 then cellid = '003';
run;``````

Am not sure if this help you at all as it's not the easiest to explain. But I need cell's 002 & 003 to hold equal volumes but I also need to prioritise 19,176 records that have the variable Priority that's equal to 1 into cell002 and then the remaining split out.

Solution
‎07-12-2018 06:07 AM
Super User
Posts: 10,571

Re: Prioritising a variable whilst doing a 50/50 Split in EG

Try this (automatically retrieve the observation count):

``````data cell002;
set sashelp.class (rename=(sex=priority));
run;

proc sort data=cell002;
by priority;
run;

proc sql noprint;
select int(nobs/2) into :nobs_half
from dictionary.tables
where libname = 'WORK' and memname = 'CELL002';
quit;

data cell002_003;
set cell002;
if _n_ <= &nobs_half
then cellid = '002';
else cellid = '003';
run;``````
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
New Contributor
Posts: 4

Re: Prioritising a variable whilst doing a 50/50 Split in EG

Hi it seems to be falling over immediately.

``````32         proc sql noprint;
33         	select int(nobs/2) into :nobs_half
34         	from dictionary.tables
35         	where libname = 'work' and memname = 'cell002';
NOTE: No rows were selected.
36         quit;
NOTE: PROCEDURE SQL used (Total process time):
real time           0.00 seconds
user cpu time       0.00 seconds
system cpu time     0.00 seconds
memory              5160.37k
OS Memory           38316.00k
Timestamp           12/07/2018 08:51:55 AM
Step Count                        82  Switch Count  34
Page Faults                       0
Page Reclaims                     13
Page Swaps                        0
Voluntary Context Switches        84
Involuntary Context Switches      1
Block Input Operations            0
Block Output Operations           0

37
38         data cell002_003;
39         	set work.cell002;
40         		if _n_ <= &nobs_half
WARNING: Apparent symbolic reference NOBS_HALF not resolved.
40         		if _n_ <= &nobs_half
_
22
ERROR 22-322: Syntax error, expecting one of the following: a name,
a quoted string, a numeric constant, a datetime constant,
2                                The SAS System    07:45 Thursday, July 12, 2018

a missing value, bitstring, INPUT, PUT.

41         		then cellid = '002';
______ _
180    180
ERROR 180-322: Statement is not valid or it is used out of proper order.

42         		else cellid = '003';
43         run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.CELL002_003 may be incomplete.  When this step was
stopped there were 0 observations and 74 variables.
WARNING: Data set WORK.CELL002_003 was not replaced because this step was
stopped.``````
New Contributor
Posts: 4

Re: Prioritising a variable whilst doing a 50/50 Split in EG

thanks you, just changed some of the code and works perfectly
Super User
Posts: 10,571

Re: Prioritising a variable whilst doing a 50/50 Split in EG

As you probably noticed, library and member names in SAS are always uppercase. Only variable names can have mixed case.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
☑ This topic is solved.