Hi, I'm working with bacteria names and here is what I would like to do. Each animal may have up to 5 types of bacteria. I want to count all the staphylococci that are not aureus. My first step would surely be to separate the genus and species into two variables... but I'm not sure how... Or is it possible to simply count all the samples that have "Staphylococcus" as the first word and NOT "aureus" as the second word?
My data are sensitive, but I'm going to create a little example. In the end, I want to create a binary variable: if one of the 5 species of the cow was a non-aureus Staphylococcus, the variable is 1.
data r_mam;
input cow SP_1$ SP_2$ SP_3$ SP_4$ SP_5$;
cards;
1 Streptoccous dysgalactiae Klebsiella pneumoniae Staphylococcus chromogenes
2 Staphylococcus aureus Staphylococcus xylosus Escherichia coli
3 Streptococcus uberis
;
run;
Thank you very much for your help!!!
Accepted Solutions
Like this?
data cows;
set work.r_mam;
length all $ 200 sna 8;
all = catx(' ', of sp_1-sp_5);
sna = prxmatch('/staphylococcus (?!aureus)/i', all) and not prxmatch('/staphylococcus aureus/i', all);
drop all;
run;
@Annie_Fréchette wrote:
Hi, I'm working with bacteria names and here is what I would like to do. Each animal may have up to 5 types of bacteria. I want to count all the staphylococci that are not aureus. My first step would surely be to separate the genus and species in two variables... but I'm not sure how....
data r_mam;
infile cards truncover;
/* Read up to six words per row: three genus/species pairs */
input cow (SP_1-SP_6) (:$16.);
cards;
1 Streptoccous dysgalactiae Klebsiella pneumoniae Staphylococcus chromogenes
2 Staphylococcus aureus Staphylococcus xylosus Escherichia coli
3 Streptococcus uberis
;
run;
data want;
set r_mam;
array a sp_1 sp_3 sp_5; /* genus columns */
array b sp_2 sp_4 sp_6; /* species columns */
/* Write one row per genus/species pair */
do i=1 to dim(a);
if not missing(a(i)) then do;
genus=a(i);
species=b(i);
output;
end;
end;
drop sp_: i;
run;
Paige Miller
Hi Paige, thanks, but it is not doing the right thing... Each SP_ variable contains both pieces of information (genus and species).
I would like to have, on one line, Genus_1 species_1 Genus_2 species_2, etc.
Thanks!
The provided data-step does not work as expected: text is truncated ...
Is this as close as possible to what you have:
data r_mam;
length
cow 8
SP_1-SP_5 $ 40
;
infile datalines4 delimiter=';' missover;
input cow SP_1 SP_2 SP_3 SP_4 SP_5;
datalines4;
1;Streptoccous dysgalactiae;Klebsiella pneumoniae;Staphylococcus chromogenes
2;Staphylococcus aureus;Staphylococcus xylosus;Escherichia coli
3;Streptococcus uberis
;;;;
run;
yes this is correct!
Cynthia
Hi! I just wanted to give you an example of what I'm dealing with. My data come from an Access file, and there is a space between the genus and the species (in SP_1, SP_2, etc.)... Not sure how to explain it better... sorry!
@Annie_Fréchette wrote:
yes this is correct!
You can use scan to look at each word or regular expressions.
What do you expect as result? The cows with staphylococcus, but not aureus? Or just the count?
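To illustrate the scan suggestion, here is a minimal sketch. It assumes each SP_ variable holds "Genus species" separated by a single space, as in the semicolon-delimited data step above. The dataset names demo and demo_flag and the helper variables genus and species are made up for this example, and sna = 1 here means "at least one non-aureus Staphylococcus", which is only one possible reading of the requirement.

```sas
/* Hypothetical sample data: one "Genus species" value per SP_ variable */
data demo;
  length cow 8 SP_1-SP_5 $ 40;
  infile datalines4 delimiter=';' missover;
  input cow SP_1 SP_2 SP_3 SP_4 SP_5;
  datalines4;
1;Streptococcus dysgalactiae;Klebsiella pneumoniae;Staphylococcus chromogenes
2;Staphylococcus aureus;Staphylococcus xylosus;Escherichia coli
3;Streptococcus uberis
;;;;
run;

data demo_flag;
  set demo;
  array sp {*} SP_1-SP_5;
  length genus species $ 40;
  sna = 0;
  do i = 1 to dim(sp);
    genus   = scan(sp{i}, 1, ' ');  /* first word  */
    species = scan(sp{i}, 2, ' ');  /* second word */
    /* Assumption: a Staphylococcus with no species recorded also counts as non-aureus */
    if upcase(genus) = 'STAPHYLOCOCCUS' and upcase(species) ne 'AUREUS' then sna = 1;
  end;
  drop i genus species;
run;
```

With these three rows, cows 1 and 2 get sna = 1 and cow 3 gets sna = 0. To keep the genus and species of every sample as separate columns instead, the same scan calls could fill two extra arrays rather than temporary variables.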
What I want in the end is: if one of SP_1, SP_2, etc. is of the genus Staphylococcus but not the species aureus, then SNA (a new binary variable) = 1.
Like this?
data cows;
set work.r_mam;
length all $ 200 sna 8;
all = catx(' ', of sp_1-sp_5);
sna = prxmatch('/staphylococcus (?!aureus)/i', all) and not prxmatch('/staphylococcus aureus/i', all);
drop all;
run;
Thank you very much! Can I ask you one more question? I don't see in the code how SAS knows to assign "1" to SNA. Where is that written?
Thanks again!
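On where the "1" comes from: in SAS, a comparison or logical expression evaluates to 1 when true and 0 when false, so assigning such an expression to a variable creates the binary flag directly. A minimal sketch (the variable names pos and flag are only for illustration):

```sas
data _null_;
  all  = 'Staphylococcus xylosus Escherichia coli';
  /* prxmatch returns the start position of the first match, or 0 if there is none */
  pos  = prxmatch('/staphylococcus (?!aureus)/i', all);
  /* the comparison itself evaluates to 1 (true) or 0 (false) */
  flag = (pos > 0);
  put pos= flag=;
run;
```

The same applies to `and` and `not`: any non-zero, non-missing number counts as true, and the result of the whole expression is again 1 or 0.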
Hi @andreas_lds!
@Cynthia_sas raised a good point... that is a problem with the code at the moment. If a cow has a Staphylococcus aureus AND a Staphylococcus xylosus, I would like SNA=1... presently this is not the case.
@Annie_Fréchette wrote:
Hi @andreas_lds!
@Cynthia_sas raised a good point... that is a problem with the code at the moment. If a cow has a Staphylococcus aureus AND a Staphylococcus xylosus, I would like SNA=1... presently this is not the case.
Of course not, because you have defined the flag variable differently. But do you still want sna = 0 if Staph. aureus and, e.g., Staph. epidermidis (don't know if that is possible) are both found? Please be precise! Maybe defining sna = 0 is easier: if I understood all your posts, you want sna = 0 if
a) the only Staphylococcus found is aureus, or
b) no Staph. is found at all.
Right?
Hi Andreas! Sorry if my thoughts were not clear about what I wanted. I used the code from @SASJedi and it worked as I needed!
Thanks again for your help!
Annie
Hi:
What if a cow has THIS row of data with 2 types of Staphylococcus values, one aureus and one not:
2 Staphylococcus aureus Staphylococcus xylosus Escherichia coli
Then what would your binary variable look like? SNA=1 ?
What about this?
9 Staphylococcus aureus Staphylococcus xylosus Staphylococcus chromogenes Escherichia coli
Would SNA=1 or would SNA=2?
Cynthia
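For the flag-versus-count question, a sketch that computes both may help in deciding which one is needed. It assumes one "Genus species" value per SP_ variable. The dataset names demo9 and demo9_count and the variable n_sna are hypothetical, and the sample rows are the two from this post.

```sas
data demo9;
  length cow 8 SP_1-SP_5 $ 40;
  infile datalines4 delimiter=';' missover;
  input cow SP_1 SP_2 SP_3 SP_4 SP_5;
  datalines4;
2;Staphylococcus aureus;Staphylococcus xylosus;Escherichia coli
9;Staphylococcus aureus;Staphylococcus xylosus;Staphylococcus chromogenes;Escherichia coli
;;;;
run;

data demo9_count;
  set demo9;
  array sp {*} SP_1-SP_5;
  n_sna = 0;
  do i = 1 to dim(sp);
    /* 1 if the value starts with "Staphylococcus" followed by a species other than aureus */
    n_sna = n_sna + (prxmatch('/^staphylococcus\s+(?!\s*aureus\b)\S/i', sp{i}) > 0);
  end;
  sna = (n_sna > 0);  /* binary flag derived from the count */
  drop i;
run;
```

Here cow 2 gets n_sna = 1 and cow 9 gets n_sna = 2, while sna = 1 for both. Unlike a check on the concatenated string, testing each SP_ variable separately means an aureus in one column does not hide a non-aureus Staphylococcus in another.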