Hi everyone,
I am not sure this question fits here; if not, admins please feel free to delete it.
I would appreciate everyone's input on this question.
As you know, more than 80% of the data out there is unstructured. To make sense of it, you need to convert it to structured data for analysis.
I suspect this is relevant to many people, so let me give you my own version of the problem.
I am a physician and a researcher. Working in a busy hospital, we store gigabytes of data every day in medical notes, images, etc.
Medical notes are stored in text format (basically free text in SQL Server).
There are different types of notes, and you expect a certain type of information to be stored in the text depending on the type of the note.
Let us imagine a document describing a simple endoscopic procedure. You would expect the following information to be scattered through the text of the document:
The name of the surgeon
The name of the patient
The age of the patient
The indication for the procedure
The date, time and duration of the procedure.
Findings in the oesophagus, stomach, duodenum, colon
Therapy done during the procedure
Complications
Follow up
This information is entered as free text, natural human language.
There are tens of thousands of these documents, and transforming them into structured, analysable data is a huge (but very tempting) challenge.
I tried doing this using different approaches on a small sample (≈500 reports), and the best results I managed to obtain were with regular expressions.
Even though the results were impressive (one of my colleagues who went through some of the cases said, "I didn't know a computer could be this good"), they are far, far from good enough: if the writer deviates excessively from the pattern I program into the regular expression, the code fails spectacularly.
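To make the approach concrete, here is a minimal sketch of the kind of heading-anchored regex extraction I mean, written in Python for illustration (SAS has equivalent PRX functions). The report text, field names, and patterns are all invented for this example; real notes would need many more patterns and fallbacks, which is exactly where this approach breaks down.

```python
import re

# A toy endoscopy report (invented for illustration -- not a real note).
report = """Endoscopist: Dr. Smith
Patient: John Doe, age 67
Indication: iron deficiency anaemia
Findings: small hiatus hernia in the oesophagus; antral gastritis
Complications: none"""

# One pattern per field; each captures the text after an assumed heading.
patterns = {
    "surgeon": r"Endoscopist:\s*(.+)",
    "patient": r"Patient:\s*([^,]+)",
    "age": r"age\s*(\d+)",
    "indication": r"Indication:\s*(.+)",
    "complications": r"Complications:\s*(.+)",
}

def extract(text, patterns):
    """Return a dict of field -> first match (or None if the pattern fails)."""
    record = {}
    for field, pat in patterns.items():
        m = re.search(pat, text, flags=re.IGNORECASE)
        record[field] = m.group(1).strip() if m else None
    return record

record = extract(report, patterns)
print(record["surgeon"])  # Dr. Smith
print(record["age"])      # 67
```

The weakness is visible in the design: each pattern assumes a fixed heading, so a report that says "Operator:" instead of "Endoscopist:" silently yields None for that field.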
I realise I have two issues here:
1. Large data volumes where computing power is needed, but this is not the main question here
2. Processing unstructured data into structured data which is my main focus here.
I have been looking into text mining, but I am not sure it can do the job; this is more a matter of natural language processing.
I looked outside SAS: R seems to have some (limited?) packages to deal with this kind of issue:
https://cran.r-project.org/web/views/NaturalLanguageProcessing.html
Others seem to suggest Morphline, or Hadoop, etc.
So my question is:
Has anyone done this through SAS?
Is SAS at all an appropriate tool to do this?
If yes, then how?
If not, then could you please share your approach of dealing with this kind of problem?
As we store more and more data, and as the volume of stored data increases exponentially, this is going to become a more and more important problem to deal with. And if SAS (or whoever) comes up with a good solution, it will definitely be a very sought-after one ... maybe the Holy Grail of data management in the future.
Kind regards
AM