BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mmea
Quartz | Level 8

Hi if I have a data set like this:

 

Test center  Capacity
Test center Toronto 460
Test center Toronto - close contact 80
Test center Melbourne 6566
Test center Melbourne - close contact 70
Test Center Copenhagen 666
Test Center Vaan 70
Test Center Thisted 152
Test Center Thisted - close contact 12
Test Center A 200
Test Center A - close contact 12

 

I want to make two new datasets out of this dataset (I have more observations in the real one)

The first dataset should contain all test center who has capacity over 400, but the matching "close contact" center should follow with, even though it can have below 400.

 

The second dataset should only contain test centers below 400 and the matching close contact center should follow with.

 

How can I do these statements?

 

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Do this

 

data have;
input Testplace $ 1-23 Method $ 24-37 Capacity;
datalines;
Test place Toronto     Normal         599 
Test place Toronto     Close contact  23  
Test place Copenhagen  Normal         699 
Test place Copenhagen  Close contact  23  
Test Place Thisted     Normal         159 
Test Place Thisted     Close contact  15  
;

data over under;
    set have;
    if Method = "Close contact" | Capacity >  400 then output over;
    else if Capacity <= 400 then output under;
run;

View solution in original post

10 REPLIES 10
PeterClemmensen
Tourmaline | Level 20

You mention that the "Close Contact" values should be in both data sets, is that correct?

mmea
Quartz | Level 8

No, if a test place is over 400

the matching close contact center should be in this dataset despite their capacity value.

it should NOT not appear in the dataset for under 400.

 

 

PeterClemmensen
Tourmaline | Level 20

Ok. Then do this.

 

data have;
input TestCenter $ 1-39 Capacity;
datalines;
Test center Toronto                    460 
Test center Toronto - close contact    80  
Test center Melbourne                  6566
Test center Melbourne - close contact  70  
Test Center Copenhagen                 666 
Test Center Vaan                       70  
Test Center Thisted                    152 
Test Center Thisted - close contact    12  
Test Center A                          200 
Test Center A - close contact          12  
;

data over under;
    set have;
    if find(TestCenter, 'close contact', 'i') | Capacity >  400 then output over;
    else if Capacity <= 400 then output under;
run;
mmea
Quartz | Level 8

Thanks!!

 

In a case where the data set can look like this, how will I then do the same:

 

Test place Method Capacity
Test place Toronto Normal 599
Test place Toronto Close contact 23
Test place Copenhagen Normal 699
Test place Copenhagen Close contact 23
Test Place Thisted Normal 159
Test Place Thisted Close contact 15

 

So a data set for all "normal" over 400 but the matching "close contact" shall follow.

second data set for all "normal" under 400 and with their matching close contact

PeterClemmensen
Tourmaline | Level 20

Do this

 

data have;
input Testplace $ 1-23 Method $ 24-37 Capacity;
datalines;
Test place Toronto     Normal         599 
Test place Toronto     Close contact  23  
Test place Copenhagen  Normal         699 
Test place Copenhagen  Close contact  23  
Test Place Thisted     Normal         159 
Test Place Thisted     Close contact  15  
;

data over under;
    set have;
    if Method = "Close contact" | Capacity >  400 then output over;
    else if Capacity <= 400 then output under;
run;
mmea
Quartz | Level 8

Hi 

I did this approach 

data over under;
    set test;
    if method= "close contact" | capacity>  400 then output over;
    else if capacity<= 400 then output under;
run;

But I still get some close contact places  that has a capacity over 400 in the over dataset. But it should be really be in the under dataset, as the normal place for that close contact place has a capacity under 400?

 

So if i have this kindof data:

Test place Method capacity
test place toronto Normal 23
test place toronto  close contact 455

 

then these to testplaces should be in the UNDER dataset as the normal is under 400.

how can I correct for that?

Angel_Larrion
SAS Employee

This should work:

 

data have;
infile datalines dlm=",";
length Test_place $100 Method $20;
input Test_place $ method $ capacity;
datalines;
Test place Toronto,Normal,599
Test place Toronto,Close contact,23
Test place Copenhagen,Normal,699
Test place Copenhagen,Close contact,23
Test Place Thisted,Normal,159
Test Place Thisted,Close contact,15
;
run;


proc sql;
create table want1  as
select have.*, flag
from have as t1 left join (select test_place, 1 as flag
							from have
							where method="Normal" and capacity ge 400) as t2 

				on t1.test_place=t2.test_place;
quit;

data over under;
set want1;
if flag=1 then output over;
else output under;
drop flag;
run;

proc sql;
drop table want1;
run;

 

 

Angel_Larrion
SAS Employee

The first proc sql creates a flag that indicates the test places that meet the condition (capacity of normal method=>400).

The data step uses this flag to create the two data sets.

mmea
Quartz | Level 8

Thank you so much for both of your answers they both worked - Thank you!

mmea
Quartz | Level 8

Hi  again I revised my question if anyone can help me here

 

if I have a data set like this:

 

Test center  Capacity
Test center Toronto 460
Test center Toronto - close contact 80
Test center Melbourne 6566
Test center Melbourne - close contact 70
Test Center Copenhagen 666
Test Center Vaan 70
Test Center Thisted 152
Test Center Thisted - close contact 500
Test Center A 200
Test Center A - close contact 400

 

I want to make two new datasets out of this dataset (I have more observations in the real one)

The first dataset should contain all test center who has capacity over 400, but the matching "close contact" center should follow with, even though it can have below 400.

 

The second dataset should only contain test centers below 400 and the matching close contact center should follow with, even though the close contract can be over 400 (e.g. test center A and test center A close contact)

 

How can I do these statements?

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 1619 views
  • 0 likes
  • 3 in conversation