Sorting observations from one Data set into 3 Data Sets

Accepted Solution Solved
Reply
New Contributor
Posts: 4
Accepted Solution

Sorting observations from one Data set into 3 Data Sets

 

I have a data set with patient disease diagnosis and treatment information. Ultimately I am trying to break it down into which providers are treating their patients appropriately. Each disease has a different recommended treatment so while some treatments will work for multiple disease other treatments are not as encompassing. 

 

My problem is this, I have diseases that are being treated adequately but are not being read into the correct data set based on my code. Below I have listed my code. I have an Adequate data set, an Inadequate data set and a data set for the observations that do not meet the criteria as a safety net. Disease 200 gets read in and placed appropriately depending on the treatment given. Disease 300 does not. Disease 300 is being split between inade and unknown. 

 

I decided to troubleshoot using a proc SQL and found that the appropriate treatment for disease 300 is not recognized by any statement I use ( in(...),   = '....'   ,  etc.) but another treatment, which is inadequate for that disease it being recognized. 

 

Is there some character within ...

'ROCEPHIN/CEFTRIAXONE 250 MG IM X ONCE PLUS AZITHROMYCIN/ZITHROMAX 1 G PO'

... that is keeping it from being recognized and read into the appropriate data set?

 

I have tried to rearrange the order of which disease I reference first. I have sorted previous data sets by disease and by treatment. I have even used and made completely different data sets and nothing works. What is the mystery behind why I can not get this treatment to sort to the correct data set!?

 


data ade inade unknown;
set work.gccl;
if test_disea = 300 and
rx in('ROCEPHIN/CEFTRIAXONE 250 MG IM X ONCE PLUS AZITHROMYCIN/ZITHROMAX 1 G PO',
'ROCEPHIN/CEFTRIAXONE 250 MG IM X ONCE PLUS DOXYCYCLINE/VIBRAMYCIN 100 MG')
then output ade;
else if test_disea = 200 and
rx in('AZITHROMYCIN/ZITHROMAX 1 G PO X ONCE (RECOMMENDED)',
'DOXYCYCLINE/VIBRAMYCIN 100 MG BID X 7 DAYS (RECOMMENDED)', 
'ROCEPHIN/CEFTRIAXONE 250 MG IM X ONCE PLUS AZITHROMYCIN/ZITHROMAX 1 G PO',
'ROCEPHIN/CEFTRIAXONE 250 MG IM X ONCE PLUS DOXYCYCLINE/VIBRAMYCIN 100 MG')
then output ade;
else if rx =' ' or inad ='Y'
then output inade;
else output unknown;
run; 

/* I have also use the OR statement between the "test_disea = 200 and rx in() OR test_disea = 300 and rx in()" combination. Its yields the same results*/



proc sql;
create table work.adequatetreatment as
select patient_id, event_id, provider, month, year, test_disea, rx, inad
from work.districthds
where test_disea = 300
and RX = 'ROCEPHIN/CEFTRIAXONE 250 MG IM X ONCE PLUS AZITHROMYCIN/ZITHROMAX 1 G PO'
;
/* No data is read into the table */


proc sql;
create table work.adequatetreatment as
select patient_id, event_id, provider, month, year, test_disea, rx, inad
from work.districthds
where test_disea = 300
and RX = 'AZITHROMYCIN/ZITHROMAX 1 G PO X ONCE (RECOMMENDED)'
;
/* 4 Observations are read in */



 

 

 

 

 

 


Accepted Solutions
Solution
‎01-16-2018 08:34 AM
New Contributor
Posts: 4

SOLVED!!!

Someone helped me with the issue. We discovered that the treatment for whatever reason was being truncated even though I had still changed the Length in troubleshooting. We found what the true observation was by using...

FORMAT rx;

INFORMAT rx:

IF test_disea in(200,300,700) THEN put RX= RX $HEX300.;

... to write the actual value to the log. Now that the existing format has been stripped away and I can see what the actual value is it has solved the problem. 

View solution in original post


All Replies
Super User
Super User
Posts: 8,634

Re: Sorting observations from one Data set into 3 Data Sets

 

Post some test data in the form of a datastep and example of what you want out at the end.  I cannot tell anything from a snippet of code.  Follow this post if you need help creating some test data;

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

PROC Star
Posts: 1,070

Re: Sorting observations from one Data set into 3 Data Sets

Provide some sample of what your data looks like.

 

That makes it so much easier to help you Smiley Happy

Super User
Posts: 12,148

Re: Sorting observations from one Data set into 3 Data Sets

Did you run something like Proc Freq to find all of the values of your rx variable? Are the search strings you are using coming from such output? Copy from the output may work to capture the proper values. Also if proc freq shows to similar strings in the output that might be the place to look.

 

Probable causes for not matching:

  the displayed length (and value used in code) does not match actual value of the variable. A display format may be cutting off characters not seen at the end of the variable.

  one or more leading spaces in the variable may be an issue but not noticed depending on how you determined the values of rx.

  Zero ( 0 ) and capital O confusion in either the coded search string or the value

  2 or more consecutive blanks in data but not in the coded search string (or vice versa)

  one or more characters other than actual blank a the end of the variable such a NULL character (ASCI 255)

  Unicode values in the data

 

Invisible characters are possible if you copied any strings from a document such as Word or other formatted text.

New Contributor
Posts: 4

Re: Sorting observations from one Data set into 3 Data Sets

Yes I have run proc freqs on the data set showing all possible treatment options and how many times each has been entered. There are not duplicates or variations for the same treatments. Treatments and medications are not entered by free typing. they are selected from a drop down menu within a website. I export the set from the website and put it into SAS. 

PROC Star
Posts: 1,922

Re: Sorting observations from one Data set into 3 Data Sets

On easy way to address most of @ballardw 's very valid list of possible issues it to compare the compressed (for example only keep letters and digits) string values. 

Super User
Posts: 21,481

Re: Sorting observations from one Data set into 3 Data Sets

Is there some character within ...

 

Not that we can see with the naked eye, but there can invisible characters that mess things up. 

 

As someone else mentioned, run a PROC FREQ and see what the results are, do they all get categorized together or are they separate groups? And if it still doesn't match, you can use a hex format to display the values and pay close attention. There are '09'x and 0A that are invisible spaces (not sure on the 0A). 

Solution
‎01-16-2018 08:34 AM
New Contributor
Posts: 4

SOLVED!!!

Someone helped me with the issue. We discovered that the treatment for whatever reason was being truncated even though I had still changed the Length in troubleshooting. We found what the true observation was by using...

FORMAT rx;

INFORMAT rx:

IF test_disea in(200,300,700) THEN put RX= RX $HEX300.;

... to write the actual value to the log. Now that the existing format has been stripped away and I can see what the actual value is it has solved the problem. 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 127 views
  • 3 likes
  • 6 in conversation