Make two datasets based on specific statements

mmea · Posted 02-06-2021 04:37 AM

Hi

I have a data set looking like this:

Test place	Method	Capacity
Test place A	Normal	450
Test place A	Close contact	23
Test place B	Normal	500
Test place B	Close contact	900
Test place C	Normal	60
Test place C	Close contact	458
Test place D	Normal	47
Test place E	close contact	4

I need to have to datasets based on the value of capacity and method

If capacity is over or equal to 400 for method Normal, then this testplace should be in a dataset called OVER, ALSO the matching testplace but with method close contact should follow with, even though it has a capacity below 400. (e.g. example test place A and B)

For the case of test place C which has a capacity below 400 and its matching test place C close contact has over 400 - these corresponding places should be in the dataset UNDER.

We only consider whether the normal test place is over or under 400 - the close contact for that same test place shall always follow.

In the case for test place D, where it does not have a close contact place - then it should be in the UNDER dataset, as it is under 400.

For a case as test place E, where there is only a close contact, but it is over 400, then it should be in the UNDER dataset, as I dont want a close contact without a normal test place in the over dataset

Can someone help me please?

Kurt_Bremser · Posted 02-06-2021 04:57 AM

That's a nice example for using a hash object:

data
  over
  under
;
set have;
if _n_ = 1
then do;
  declare hash h (dataset:"have (where=(method = 'Normal' and capacity ge 400))");
  h.definekey("test_place");
  h.definedone();
end;
if h.check() = 0
then output over;
else output under;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

mmea · Posted 02-06-2021 05:22 AM

Hi.

This code only gives me a dataset with all testplaces over 400 and the other dataset with under 400.

It does not take into consideration the rules I showed before:)

If a test places - normal is over 400, but its corresponding close contact is under 400 - it should follow with in the OVER dataset.

if a test place - normal is under 400, but its corresponding close contact is over 400 - they should both be in the under dataset.

Each test place - normal is the one that determine where its should go with its corresponding close contact

Kurt_Bremser · Posted 02-06-2021 01:26 PM

Quote from your initial post:

"We only consider whether the normal test place is over or under 400 - the close contact for that same test place shall always follow."

So, in your own words, only the "normal" observation is to be considered.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

mmea · Posted 02-06-2021 01:42 PM

Yes the Normal testplace is the main thing that we look at to confirm whether it is over or under.

the normals, close contact should just always follow with the normal testplace, no matter the capacity of the close contact.

It is only if the close contact do NOT have a Normal test place then you will look at the capacityhere

Kurt_Bremser · Posted 02-06-2021 01:49 PM

Post example data in usable and unambiguous form (data step with datalines, DO NOT SKIP THIS), and the exact result expected out of this.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Tom · Posted 02-06-2021 05:24 PM

Transpose the data so that both capacity values are on the same observation.

Then it is easier to write your logical conditions.

data have;
  infile cards dlm='|' dsd truncover;
  input place :$30. Method :$30. Capacity ;
cards;
Test place A|Normal|450
Test place A|Close contact|23
Test place B|Normal|500
Test place B|Close contact|900
Test place C|Normal|60
Test place C|Close contact|458
Test place D|Normal|47
Test place E|Close contact|4
;

proc transpose data=have out=wide(drop=_name_);
  by place;
  id method ;
  var capacity;
run;

data want ;
  set wide ;
  if normal < 400 then target='UNDER';
  else target='OVER';
run;

proc print;
run;

                                  Close_
Obs       place        Normal    contact    target

 1     Test place A      450        23      OVER
 2     Test place B      500       900      OVER
 3     Test place C       60       458      UNDER
 4     Test place D       47         .      UNDER
 5     Test place E        .         4      UNDER

Make two datasets based on specific statements

Re: Make two datasets based on specific statements

Re: Make two datasets based on specific statements

Re: Make two datasets based on specific statements

Re: Make two datasets based on specific statements

Re: Make two datasets based on specific statements

Re: Make two datasets based on specific statements

Make two datasets based on specific statements

Re: Make two datasets based on specific statements

Re: Make two datasets based on specific statements

Re: Make two datasets based on specific statements

Re: Make two datasets based on specific statements

Re: Make two datasets based on specific statements

Re: Make two datasets based on specific statements

Registration is open