BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rcleven2
Obsidian | Level 7

 

I have a data set with patient disease diagnosis and treatment information. Ultimately I am trying to break it down into which providers are treating their patients appropriately. Each disease has a different recommended treatment so while some treatments will work for multiple disease other treatments are not as encompassing. 

 

My problem is this, I have diseases that are being treated adequately but are not being read into the correct data set based on my code. Below I have listed my code. I have an Adequate data set, an Inadequate data set and a data set for the observations that do not meet the criteria as a safety net. Disease 200 gets read in and placed appropriately depending on the treatment given. Disease 300 does not. Disease 300 is being split between inade and unknown. 

 

I decided to troubleshoot using a proc SQL and found that the appropriate treatment for disease 300 is not recognized by any statement I use ( in(...),   = '....'   ,  etc.) but another treatment, which is inadequate for that disease it being recognized. 

 

Is there some character within ...

'ROCEPHIN/CEFTRIAXONE 250 MG IM X ONCE PLUS AZITHROMYCIN/ZITHROMAX 1 G PO'

... that is keeping it from being recognized and read into the appropriate data set?

 

I have tried to rearrange the order of which disease I reference first. I have sorted previous data sets by disease and by treatment. I have even used and made completely different data sets and nothing works. What is the mystery behind why I can not get this treatment to sort to the correct data set!?

 


data ade inade unknown;
set work.gccl;
if test_disea = 300 and
rx in('ROCEPHIN/CEFTRIAXONE 250 MG IM X ONCE PLUS AZITHROMYCIN/ZITHROMAX 1 G PO',
'ROCEPHIN/CEFTRIAXONE 250 MG IM X ONCE PLUS DOXYCYCLINE/VIBRAMYCIN 100 MG')
then output ade;
else if test_disea = 200 and
rx in('AZITHROMYCIN/ZITHROMAX 1 G PO X ONCE (RECOMMENDED)',
'DOXYCYCLINE/VIBRAMYCIN 100 MG BID X 7 DAYS (RECOMMENDED)', 
'ROCEPHIN/CEFTRIAXONE 250 MG IM X ONCE PLUS AZITHROMYCIN/ZITHROMAX 1 G PO',
'ROCEPHIN/CEFTRIAXONE 250 MG IM X ONCE PLUS DOXYCYCLINE/VIBRAMYCIN 100 MG')
then output ade;
else if rx =' ' or inad ='Y'
then output inade;
else output unknown;
run; 

/* I have also use the OR statement between the "test_disea = 200 and rx in() OR test_disea = 300 and rx in()" combination. Its yields the same results*/



proc sql;
create table work.adequatetreatment as
select patient_id, event_id, provider, month, year, test_disea, rx, inad
from work.districthds
where test_disea = 300
and RX = 'ROCEPHIN/CEFTRIAXONE 250 MG IM X ONCE PLUS AZITHROMYCIN/ZITHROMAX 1 G PO'
;
/* No data is read into the table */


proc sql;
create table work.adequatetreatment as
select patient_id, event_id, provider, month, year, test_disea, rx, inad
from work.districthds
where test_disea = 300
and RX = 'AZITHROMYCIN/ZITHROMAX 1 G PO X ONCE (RECOMMENDED)'
;
/* 4 Observations are read in */



 

 

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
rcleven2
Obsidian | Level 7

Someone helped me with the issue. We discovered that the treatment for whatever reason was being truncated even though I had still changed the Length in troubleshooting. We found what the true observation was by using...

FORMAT rx;

INFORMAT rx:

IF test_disea in(200,300,700) THEN put RX= RX $HEX300.;

... to write the actual value to the log. Now that the existing format has been stripped away and I can see what the actual value is it has solved the problem. 

View solution in original post

7 REPLIES 7
RW9
Diamond | Level 26 RW9
Diamond | Level 26

 

Post some test data in the form of a datastep and example of what you want out at the end.  I cannot tell anything from a snippet of code.  Follow this post if you need help creating some test data;

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

PeterClemmensen
Tourmaline | Level 20

Provide some sample of what your data looks like.

 

That makes it so much easier to help you 🙂

ballardw
Super User

Did you run something like Proc Freq to find all of the values of your rx variable? Are the search strings you are using coming from such output? Copy from the output may work to capture the proper values. Also if proc freq shows to similar strings in the output that might be the place to look.

 

Probable causes for not matching:

  the displayed length (and value used in code) does not match actual value of the variable. A display format may be cutting off characters not seen at the end of the variable.

  one or more leading spaces in the variable may be an issue but not noticed depending on how you determined the values of rx.

  Zero ( 0 ) and capital O confusion in either the coded search string or the value

  2 or more consecutive blanks in data but not in the coded search string (or vice versa)

  one or more characters other than actual blank a the end of the variable such a NULL character (ASCI 255)

  Unicode values in the data

 

Invisible characters are possible if you copied any strings from a document such as Word or other formatted text.

rcleven2
Obsidian | Level 7

Yes I have run proc freqs on the data set showing all possible treatment options and how many times each has been entered. There are not duplicates or variations for the same treatments. Treatments and medications are not entered by free typing. they are selected from a drop down menu within a website. I export the set from the website and put it into SAS. 

ChrisNZ
Tourmaline | Level 20

On easy way to address most of @ballardw 's very valid list of possible issues it to compare the compressed (for example only keep letters and digits) string values. 

Reeza
Super User

Is there some character within ...

 

Not that we can see with the naked eye, but there can invisible characters that mess things up. 

 

As someone else mentioned, run a PROC FREQ and see what the results are, do they all get categorized together or are they separate groups? And if it still doesn't match, you can use a hex format to display the values and pay close attention. There are '09'x and 0A that are invisible spaces (not sure on the 0A). 

rcleven2
Obsidian | Level 7

Someone helped me with the issue. We discovered that the treatment for whatever reason was being truncated even though I had still changed the Length in troubleshooting. We found what the true observation was by using...

FORMAT rx;

INFORMAT rx:

IF test_disea in(200,300,700) THEN put RX= RX $HEX300.;

... to write the actual value to the log. Now that the existing format has been stripped away and I can see what the actual value is it has solved the problem. 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 580 views
  • 3 likes
  • 6 in conversation