BookmarkSubscribeRSS Feed
Dhana18
Obsidian | Level 7
I have a data like this
IncidentID	CollDate	Specimen	TestType	Result
233	7/6/2017	Urethral	Culture	GC Positive
233	7/17/2017	Urethral	Culture	GC Negative
394	7/17/2017	Rectal	Culture	GC Negative
394	7/17/2017	Pharyngeal	Culture	GC Positive
394	7/24/2017	Pharyngeal	Culture	GC Negative
878	8/1/2017	       Endocervical	Culture	GC Negative
878	8/1/2017	       Pharyngeal	Culture	GC Positive
906	11/27/2017	Urethral	        Culture	GC Positive
906	11/27/2017	Pharyngeal	Culture	GC Negative
906	12/11/2017	Urethral	        Culture	GC Negative
916	11/29/2017	Endocervical	Culture	GC Positive
916	11/29/2017	Pharyngeal	Culture	GC Positive
916	1/10/2018	Endocervical	Culture	GC Negative
916	1/10/2018	Pharyngeal	Culture	GC Negative
830	11/29/2017	Pharyngeal	Culture	GC Positive
830	11/29/2017	Rectal	        Culture	GC Positive
830	11/29/2017	Endocervical	Culture	GC Negative
182	12/12/2017	Urethral	       Culture	GC Positive
182	12/12/2017	Rectal	      Culture	GC Positive

I am trying to analyze, how many of these id had more than one specimen collected? How many of them have more than one and or more specimen positive? How many days after same specimen tested negative for the same ID?

I want the table like this:


IncidentID	CollDate	Site tested	Site_infected	Days between Neg and Pos test
233	7/6/2017	       Urethral	                    urethra	11
394	7/17/2017	Rectal&Pharyngeal	    Pharyngeal	7
878	8/1/2017	       EndocervicalPharyngeal	Pharyngeal	NA
906	11/27/2017	Urethral&Pharyngeal	Urethral	14
916	11/29/2017	Endocervical&Pharyngeal	Pharyngeal	42
830	11/29/2017	Pharyngeal&Rectal&Endocervical	Pharyngeal& rectal	NA
182	12/12/2017	Urethral&Rectal	Urethral&Rectal	NA


I tried this code, but did not work
 data LABCHTLK.LAB_JAN20_CULTURE_TEST_D;
set LABCHTLK.LAB_JAN20_CULTURE_TEST_C;
LENGTH SITE_TESTED $20; 
IF SPECIMEN = "Urethral" then SITE_TESTED="URETHRAL_ONLY";
IF SPECIMEN = "Endocervical" then SITE_TESTED="Endocerv_Only";
IF SPECIMEN = "Pharyngeal" then SITE_TESTED="PHAR_only";
IF SPECIMEN = "Rectal" then SITE_TESTED="RECTAL_only" ;
IF SPECIMEN IN("Pharyngeal"  , "Rectal")then SITE_TESTED="PHAR&RECT";
IF SPECIMEN IN("Endocervical" , "Pharyngeal")then SITE_TESTED="ENDO&PHAR";
IF SPECIMEN IN("Pharyngeal" , "Urethral")then SITE_TESTED="URETH&PHAR";
IF SPECIMEN IN("Endocervical" , "Pharyngeal" ,"Rectal") then SITE_TESTED="ENDO&PHAR&RECT";
IF SPECIMEN IN("Pharyngeal" , "Rectal" , "Urethral") then SITE_TESTED="PHAR&RECT&URETH";
RUN;
6 REPLIES 6
PaigeMiller
Diamond | Level 26

Please explain "did not work". Are you getting the wrong results (if so, be specific about what is wrong)? Or is there an error in the log? In either case, please show us, and please provide the data following these instructions: How to create a data step version of your data AKA generate sample data for forums

--
Paige Miller
Dhana18
Obsidian | Level 7

log says

NOTE: There were 359 observations read from the data set LABCHTLK.LAB_JAN20_CULTURE_TEST_A.

NOTE: The data set LABCHTLK.LAB_JAN20_CULTURE_TEST_BAA has 359 observations and 18 variables.

NOTE: DATA statement used (Total process time):

real time 0.10 seconds

cpu time

The out put is this
IncidentID	CollDate	Specimen	TestType	Result	SITE_TESTED
233	7/6/2017	Urethral	Culture	GC Positive	PHAR&RECT&URETH
233	7/17/2017	Urethral	Culture	GC Negative	PHAR&RECT&URETH
394	7/17/2017	Rectal	Culture	GC Negative	PHAR&RECT&URETH
394	7/17/2017	Pharyngeal	Culture	GC Positive	PHAR&RECT&URETH
394	7/24/2017	Pharyngeal	Culture	GC Negative	PHAR&RECT&URETH
878	8/1/2017	Endocervical	Culture	GC Negative	ENDO&PHAR&RECT
878	8/1/2017	Pharyngeal	Culture	GC Positive	PHAR&RECT&URETH
906	11/27/2017	Urethral	Culture	GC Positive	PHAR&RECT&URETH
906	11/27/2017	Pharyngeal	Culture	GC Negative	PHAR&RECT&URETH
906	12/11/2017	Urethral	Culture	GC Negative	PHAR&RECT&URETH
916	11/29/2017	Endocervical	Culture	GC Positive	ENDO&PHAR&RECT
916	11/29/2017	Pharyngeal	Culture	GC Positive	PHAR&RECT&URETH
916	1/10/2018	Endocervical	Culture	GC Negative	ENDO&PHAR&RECT
916	1/10/2018	Pharyngeal	Culture	GC Negative	PHAR&RECT&URETH
830	11/29/2017	Pharyngeal	Culture	GC Positive	PHAR&RECT&URETH
830	11/29/2017	Rectal	Culture	GC Positive	PHAR&RECT&URETH
830	11/29/2017	Endocervical	Culture	GC Negative	ENDO&PHAR&RECT
182	12/12/2017	Urethral	Culture	GC Positive	PHAR&RECT&URETH
182	12/12/2017	Rectal	Culture	GC Positive	PHAR&RECT&URETH

It is not the way I like to have my table. I want the table I mentioned earlier. 

Shmuel
Garnet | Level 18

Next code is not tested and I believe it is not complete, which means that I have the filling

that it does not cover all possible combinations but it may give you some idea of how to deal with what you want.

Try it and check the output:

data have;
infile cards;
input IncidentID 
      CollDate mmddyy10.	
	  Specimen $	
	  TestType $ 
	  Result   $;
cards;
233	7/6/2017	Urethral	Culture	GC Positive
233	7/17/2017	Urethral	Culture	GC Negative
394	7/17/2017	Rectal	Culture	GC Negative
394	7/17/2017	Pharyngeal	Culture	GC Positive
394	7/24/2017	Pharyngeal	Culture	GC Negative
878	8/1/2017	Endocervical	Culture	GC Negative
878	8/1/2017	Pharyngeal	Culture	GC Positive
906	11/27/2017	Urethral	        Culture	GC Positive
906	11/27/2017	Pharyngeal	Culture	GC Negative
906	12/11/2017	Urethral	        Culture	GC Negative
916	11/29/2017	Endocervical	Culture	GC Positive
916	11/29/2017	Pharyngeal	Culture	GC Positive
916	1/10/2018	Endocervical	Culture	GC Negative
916	1/10/2018	Pharyngeal	Culture	GC Negative
830	11/29/2017	Pharyngeal	Culture	GC Positive
830	11/29/2017	Rectal	        Culture	GC Positive
830	11/29/2017	Endocervical	Culture	GC Negative
182	12/12/2017	Urethral	       Culture	GC Positive
182	12/12/2017	Rectal	      Culture	GC Positive
;
run;
proc sort data=have; 
  by IncidentID Specimen CollDate;
run;

data want;
 set have;
  by IncidentID Specimen;
     length Site_tested $40 Site_infected $40 days $3;
     retain save_colldate Site_tested Site_infected Days save_result;
     label days = "Days between Neg and Pos test";
	 
     if first.IncidentID then do;
	site_tested = ' ';
	Site_infected = ' ';
     end;
	
     if first.Specimen then do;
	save_colldate = colldate;
	if Site_tested = ' ' then Site_tested = Specimen;
	else site_tested = catx('&',site_tested,Specimen);
	save_result = result;
	if locase(result) = 'positive' then do;
	   if site_infected = ' ' then site_infected = Specimen;
	   else site_infected = catx('&',site_infected, Specimen);
	end;
     end;
     else if result ne save_result 
	  then days = put(colldate - save_colldate, 3.);
	  else days = 'NA';
	
    if last.IncidentID then output;
run;	    

I suggest to create a separate record for each IncidentID Specimen instead per IncidentID.

 

 

Dhana18
Obsidian | Level 7
Hi Thank you for taking time to help me.
The put put was like this after I run the codes you sent to me.
[cid:image001.png@01D5D5ED.3F031630]
The log says this

NOTE: The data set WORK.HAVE has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds


926 ;
927 run;

928 proc sort data=have;
929 by IncidentID Specimen CollDate;
930 run;

NOTE: There were 19 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.HAVE has 19 observations and 5 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds


931 data want;
932 set have;
933 by IncidentID Specimen;
934 length Site_tested $40 Site_infected $40 days $3;
935 retain save_colldate Site_tested Site_infected Days save_result;
936 label days = "Days between Neg and Pos test";
937
938 if first.IncidentID then do;
939 site_tested = ' ';
940 Site_infected = ' ';
941 end;
942
943 if first.Specimen then do;
944 save_colldate = colldate;
945 if Site_tested = ' ' then Site_tested = Specimen;
946 else site_tested = catx('&',site_tested,Specimen);
947 save_result = result;
948 if locase(result) = 'positive' then do;
------
68
ERROR 68-185: The function LOCASE is unknown, or cannot be accessed.

949 if site_infected = ' ' then site_infected = Specimen;
950 else site_infected = catx('&',site_infected, Specimen);
951 end;
952 end;
953 else if result ne save_result
954 then days = put(colldate - save_colldate, 3.);
955 else days = 'NA';
956
957 if last.IncidentID then output;
958 run;

NOTE: Character values have been converted to numeric values at the places given by:
(Line):(Column).
948:25
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.WANT may be incomplete. When this step was stopped there were 0
observations and 10 variables.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.06 seconds


959 data want;
960 set have;
961 by IncidentID Specimen;
962 length Site_tested $40 Site_infected $40 days $3;
963 retain save_colldate Site_tested Site_infected Days save_result;
964 label days = "Days between Neg and Pos test";
965
966 if first.IncidentID then do;
967 site_tested = ' ';
968 Site_infected = ' ';
969 end;
970
971 if first.Specimen then do;
972 save_colldate = colldate;
973 if Site_tested = ' ' then Site_tested = Specimen;
974 else site_tested = catx('&',site_tested,Specimen);
975 save_result = result;
976 if locase(result) = 'positive' then do;
------
68
ERROR 68-185: The function LOCASE is unknown, or cannot be accessed.

977 if site_infected = ' ' then site_infected = Specimen;
978 else site_infected = catx('&',site_infected, Specimen);
979 end;
980 end;
981 else if result ne save_result
982 then days = put(colldate - save_colldate, 3.);
983 else days = 'NA';
984
985 if last.IncidentID then output;
986 run;

NOTE: Character values have been converted to numeric values at the places given by:
(Line):(Column).
976:25
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.WANT may be incomplete. When this step was stopped there were 0
observations and 10 variables.
WARNING: Data set WORK.WANT was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time
Shmuel
Garnet | Level 18

Please replace locase into lowcase. It was a typo.

ballardw
Super User

You need to decide what you actually want.

Example: You are testing for the value "Endocervical" in 3 separate places, all with equality comparisons (that is what IN does: is the variable equal to any of the listed values) and attempting to assign the same variable Site_tested to different result values. So the last one encountered is going to be set. I suspect that you have way more "PHAR&RECT&URETH" values than you expected to see.

 

Also you have a "Days between Neg and Pos test" requirement. What do you want if there are multiple positive tests? Multiple negative tests? Is there a "within x days" requirement? And since you have multiple tests on the same day what are the rules involving one or more positive with one or more negative on the same day? These rules need a lot of fleshing out.

 

And if there are multiple positive tests what should Site_infected look like?

 

If you want to compare records across visits you need to either transpose the data to a wide format so all of the specimens and results are on one record (likely to be very messy) or create and maintain cumulative variables (Retain) and then keep the last record as the desired outcome. Since you have that "days between" requirement you will need to maintain a "first infection date" for comparisons with other collection dates.

 

I know that my testing data on similar topics can have 15 or more tests inside a 6 month period for some clients. So rules on any date limits need to be very carefully considered.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 606 views
  • 0 likes
  • 4 in conversation