BookmarkSubscribeRSS Feed
mayasak
Quartz | Level 8
data Exposure;
input Hospital $ Bug $ Drug $ it_n ps_n ns_n;
datalines;
ABC Ecoli MC 22 0.5 11
ABC EcoliES MC 13 0.7 9.1
XYZ Kleb MN 100 0.6 60
XYZ KlebES MN 33 1 33
RTY Ecoli FL 50 0.4 20
RTY EcoliES FL 20 0.1 2
;;;
run;

Hi,

I have the above data sample. The data contains the following variables Hospital, drug, bug, it_n (number of tests, ps_n (%susceptibility) ns_n (number of susceptible tests).

Some hospitals as in the dataset have values for 2 bugs instead of one,  we need to keep values for Ecoli and Kleb and remove EcoliES and KlebES based on this formula:  Total number of tests susceptible to Drug/ Total number of tests, for example, we'll end up with Ecoli only and calculate the values as (11 + 9.1 / 22+ 13) = 60%, so 0.6 for ps_n 20.1  for ns_n and 35 for it_n.

Is there a way to do this in SAS?

I appreciate your help.

Thank you,

Razina

2 REPLIES 2
whymath
Lapis Lazuli | Level 10

You may want a DOW-Loop way:

data want;
  do until(last.drug);
    set exposure;
    by hospital drug notsorted;
    sum_it_n=sum(sum_it_n,it_n);
    sum_ns_n=sum(sum_ns_n,ns_n);
  end;

  do until(first.drug);
    set exposure;
    by hospital drug notsorted;
    it_n=sum_it_n;
    ns_n=sum_ns_n;
    ps_n=ns_n/it_n;
  end;
run;

Dow-Loop: 052-2012: The DOW Loop: A Smarter Approach to Your Existing Code (sas.com)

mkeintz
PROC Star

Your sample data leaves a lot to presume.  I presume that

  1. Each hospital either has just one value for bug, or two values for bug, (one of the form xxxES and the other of the form xxx).
  2. Each hospital has a constant value for DRUG.
  3. You want one record per hospital with
    1. aggregate counts (it_n and ns_n)
    2. recalculated ps_n
    3. a bug value without the trailing "ES".
  4. The data are already grouped by hospital.

 

data Exposure;
  input Hospital $ Bug $ Drug $ it_n ps_n ns_n;
datalines;
ABC Ecoli   MC  22 0.5 11
ABC EcoliES MC  13 0.7  9.1
XYZ Kleb    MN 100 0.6 60
XYZ KlebES  MN  33 1   33
RTY Ecoli   FL  50 0.4 20
RTY EcoliES FL  20 0.1  2
run; 

data want (drop=sum_:);
  set exposure;
  by hospital notsorted;
  if first.hospital then call missing(sum_itn,sum_nsn);
  sum_itn+it_n;
  sum_nsn+ns_n;
  if last.hospital;
  it_n=sum_itn;
  ns_n=sum_nsn;
  ps_n=ns_n/it_n;
  if find(cats(bug,'!'),'ES!') then bug=substr(bug,1,length(bug)-2);
run;

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

 

Secure your spot at the must-attend AI and analytics event of 2024: SAS Innovate 2024! Get ready for a jam-packed agenda featuring workshops, super demos, breakout sessions, roundtables, inspiring keynotes and incredible networking events.

 

Register by March 1 to snag the Early Bird rate of just $695! Don't miss out on this exclusive offer. 

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 268 views
  • 0 likes
  • 3 in conversation