Hi if I have a data set like this:
Test center | Capacity |
Test center Toronto | 460 |
Test center Toronto - close contact | 80 |
Test center Melbourne | 6566 |
Test center Melbourne - close contact | 70 |
Test Center Copenhagen | 666 |
Test Center Vaan | 70 |
Test Center Thisted | 152 |
Test Center Thisted - close contact | 12 |
Test Center A | 200 |
Test Center A - close contact | 12 |
I want to make two new datasets out of this dataset (I have more observations in the real one)
The first dataset should contain all test center who has capacity over 400, but the matching "close contact" center should follow with, even though it can have below 400.
The second dataset should only contain test centers below 400 and the matching close contact center should follow with.
How can I do these statements?
Do this
data have;
input Testplace $ 1-23 Method $ 24-37 Capacity;
datalines;
Test place Toronto Normal 599
Test place Toronto Close contact 23
Test place Copenhagen Normal 699
Test place Copenhagen Close contact 23
Test Place Thisted Normal 159
Test Place Thisted Close contact 15
;
data over under;
set have;
if Method = "Close contact" | Capacity > 400 then output over;
else if Capacity <= 400 then output under;
run;
You mention that the "Close Contact" values should be in both data sets, is that correct?
No, if a test place is over 400
the matching close contact center should be in this dataset despite their capacity value.
it should NOT not appear in the dataset for under 400.
Ok. Then do this.
data have;
input TestCenter $ 1-39 Capacity;
datalines;
Test center Toronto 460
Test center Toronto - close contact 80
Test center Melbourne 6566
Test center Melbourne - close contact 70
Test Center Copenhagen 666
Test Center Vaan 70
Test Center Thisted 152
Test Center Thisted - close contact 12
Test Center A 200
Test Center A - close contact 12
;
data over under;
set have;
if find(TestCenter, 'close contact', 'i') | Capacity > 400 then output over;
else if Capacity <= 400 then output under;
run;
Thanks!!
In a case where the data set can look like this, how will I then do the same:
Test place | Method | Capacity |
Test place Toronto | Normal | 599 |
Test place Toronto | Close contact | 23 |
Test place Copenhagen | Normal | 699 |
Test place Copenhagen | Close contact | 23 |
Test Place Thisted | Normal | 159 |
Test Place Thisted | Close contact | 15 |
So a data set for all "normal" over 400 but the matching "close contact" shall follow.
second data set for all "normal" under 400 and with their matching close contact
Do this
data have;
input Testplace $ 1-23 Method $ 24-37 Capacity;
datalines;
Test place Toronto Normal 599
Test place Toronto Close contact 23
Test place Copenhagen Normal 699
Test place Copenhagen Close contact 23
Test Place Thisted Normal 159
Test Place Thisted Close contact 15
;
data over under;
set have;
if Method = "Close contact" | Capacity > 400 then output over;
else if Capacity <= 400 then output under;
run;
Hi
I did this approach
data over under;
set test;
if method= "close contact" | capacity> 400 then output over;
else if capacity<= 400 then output under;
run;
But I still get some close contact places that has a capacity over 400 in the over dataset. But it should be really be in the under dataset, as the normal place for that close contact place has a capacity under 400?
So if i have this kindof data:
Test place | Method | capacity |
test place toronto | Normal | 23 |
test place toronto | close contact | 455 |
then these to testplaces should be in the UNDER dataset as the normal is under 400.
how can I correct for that?
This should work:
data have;
infile datalines dlm=",";
length Test_place $100 Method $20;
input Test_place $ method $ capacity;
datalines;
Test place Toronto,Normal,599
Test place Toronto,Close contact,23
Test place Copenhagen,Normal,699
Test place Copenhagen,Close contact,23
Test Place Thisted,Normal,159
Test Place Thisted,Close contact,15
;
run;
proc sql;
create table want1 as
select have.*, flag
from have as t1 left join (select test_place, 1 as flag
from have
where method="Normal" and capacity ge 400) as t2
on t1.test_place=t2.test_place;
quit;
data over under;
set want1;
if flag=1 then output over;
else output under;
drop flag;
run;
proc sql;
drop table want1;
run;
The first proc sql creates a flag that indicates the test places that meet the condition (capacity of normal method=>400).
The data step uses this flag to create the two data sets.
Thank you so much for both of your answers they both worked - Thank you!
Hi again I revised my question if anyone can help me here
if I have a data set like this:
Test center | Capacity |
Test center Toronto | 460 |
Test center Toronto - close contact | 80 |
Test center Melbourne | 6566 |
Test center Melbourne - close contact | 70 |
Test Center Copenhagen | 666 |
Test Center Vaan | 70 |
Test Center Thisted | 152 |
Test Center Thisted - close contact | 500 |
Test Center A | 200 |
Test Center A - close contact | 400 |
I want to make two new datasets out of this dataset (I have more observations in the real one)
The first dataset should contain all test center who has capacity over 400, but the matching "close contact" center should follow with, even though it can have below 400.
The second dataset should only contain test centers below 400 and the matching close contact center should follow with, even though the close contract can be over 400 (e.g. test center A and test center A close contact)
How can I do these statements?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.