About BrianB4233

BrianB4233 · ‎06-23-2025

Can confirm that option #2 has worked for me in the past.

BrianB4233 · ‎04-08-2024

Paige's example will work just fine. Here are a couple more (I added a couple of records on my own):

data have;
  input gpa cars;
  datalines;
17 4
3 1
7 3
4 1
18 1
5 2
;
run;

** Long and unnecessarily convoluted;
data want1;
  set have end=lastrow;
  retain gpa_max cars_max;

  if _N_ = 1 then do;      /* if 1st record in HAVE ... */
    gpa_max = gpa;         /* set GPA_MAX and CARS_MAX to values of GPA & CARS */
    cars_max = cars;
  end;
  /* if it's not the 1st record .... */
  else do;
    if gpa > gpa_max then gpa_max = gpa;
    if cars > cars_max then cars_max = cars;
  end;

  if lastrow;              /* Output if last record in file */
  keep gpa_max cars_max;
run;

** Simpler and get the same result;
proc sql;
  create table want2 as
  select max(gpa) as gpa_max,
         max(cars) as cars_max
  from have
  ;
quit;
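For comparison, a third route is PROC MEANS. This is only a sketch against the same HAVE dataset; the output name `want3` is made up for illustration.

```sas
/* Column maxima via PROC MEANS; want3 is a hypothetical output name */
proc means data=have noprint;
  var gpa cars;
  /* MAX= names the output variables in the same order as the VAR list */
  output out=want3(drop=_type_ _freq_) max=gpa_max cars_max;
run;
```

Like the PROC SQL version, this returns a single row holding the maximum of each column.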

BrianB4233 · ‎01-11-2024

Hello all, I've been working with free text - medical notes - that have been uploaded into a SAS dataset. Below are 2 examples that I've copied and pasted, as well as a screenshot of the text within Notepad++ that shows the Carriage Returns/Line Feeds.

The keyword is 'carbapenem' and I've tried, without any luck, to extract only the line in the text that contains the keyword using some Regex code. Ideally, I'd like to pull out only the 1st line in example #1 and the 2nd line in example #2. I'm looking for any tips or advice on how to pull out the keyword line using Regex or non-Regex code.

"Organism is a confirmed carbapenem. If infection is present, preferred treatment is with a fluoroquinolone. "

"COLONY TYPE 2 CARBAPENEM PRODUCER: RECOMMEND A TREATMENT FOR COMPLICATED UTI OR SYSTEMIC INFECTION"
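A minimal non-Regex sketch of one way to approach this: treat the carriage return and line feed characters as delimiters, walk through the note one line at a time with SCAN, and keep the first line where FIND locates the keyword. The dataset and variable names (`notes`, `note`, `keyword_line`) are hypothetical, and the positions of the line breaks in the sample text are assumptions, since the pasted examples above do not show them.

```sas
/* Hypothetical sample data: where the CR/LF breaks fall is an assumption */
data notes;
  length note $300;
  note = catt('Organism is a confirmed carbapenem.', '0D0A'x,
              'If infection is present, preferred treatment is with a fluoroquinolone.');
  output;
  note = catt('COLONY TYPE 2', '0D0A'x,
              'CARBAPENEM PRODUCER: RECOMMEND A TREATMENT FOR COMPLICATED UTI OR SYSTEMIC INFECTION');
  output;
run;

data keyword_lines;
  set notes;
  length line keyword_line $300;

  /* CR ('0D'x) and LF ('0A'x) are the only delimiters, so each "word" is a line */
  do i = 1 to countw(note, '0D0A'x);
    line = scan(note, i, '0D0A'x);
    /* the 'i' modifier makes FIND case-insensitive */
    if find(line, 'carbapenem', 'i') > 0 then do;
      keyword_line = strip(line);
      leave;   /* stop at the first line that contains the keyword */
    end;
  end;

  drop i line;
run;
```

If a note can contain the keyword on more than one line, the LEAVE statement could be replaced with an OUTPUT to write one row per matching line.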

BrianB4233 · ‎09-23-2022

Thank you ballardw, much more efficient and straightforward than where I was going. And yes, the initial record will always be equal to '0' and the 2nd record - if it exists - will always be considered the follow-up. It took countless hours and research to get to this point from the raw datasets, but this was a huge and important piece.

BrianB4233 · ‎09-22-2022

Hi all, Once again I've had great difficulty getting the logic correct in my head in regards to implementing, and correctly using, the RETAIN statement. I have a dataset where I need to identify the date (FU_DT) from a patient's next record and the changed score (BIRAD2). A majority of the time these will occur on the 2nd record but, at times, the score change will occur more downstream than the 2nd record. And this is where I've admitted failure. And I only need to process these records until I've identified the changed score, i.e., all other subsequent rows can be ignored.

What I'm getting with my current code:

exam_dt    id  exam_cnt  birad_score_radi  birad1  birad2  fu_dt      birad_dt   outp
26-Nov-18  1   1         0                 0       .       .          .          .
1-Mar-19   1   2         4                 0       4       1-Mar-19   1-Mar-19   1
24-Jun-20  1   3         .                 0       4       1-Mar-19   1-Mar-19   .
14-Jan-20  2   1         0                 0       .       .          .          .
7-Feb-20   2   2         .                 0       .       7-Feb-20   .          .
27-Feb-20  2   3         .                 0       .       7-Feb-20   .          .
8-Jun-20   2   4         2                 0       2       8-Jun-20   8-Jun-20   1
11-Aug-21  2   5         2                 0       2       11-Aug-21  11-Aug-21  1
3-Dec-19   3   1         0                 0       .       .          .          .
14-Jan-20  3   2         .                 0       .       14-Jan-20  .          .
30-Jan-20  3   3         2                 0       2       30-Jan-20  30-Jan-20  1

And here's what I ideally would like:

id  fu_dt      birad_dt   birad1  birad2
1   1-Mar-19   1-Mar-19   0       4
2   7-Feb-20   8-Jun-20   0       2
3   14-Jan-20  30-Jan-20  0       2

The code that generates the 1st table of output:

data have;
  format exam_dt date9.;
  input id exam_dt :date9. exam_cnt birad_score_radi;
  infile datalines missover;
  datalines;
1 26Nov18 1 0
1 01Mar19 2 4
1 24Jun20 3 .
2 14Jan20 1 0
2 07Feb20 2 .
2 27Feb20 3 .
2 08Jun20 4 2
2 11Aug21 5 2
3 03Dec19 1 0
3 14Jan20 2 .
3 30Jan20 3 2
;
run;

data want;
  *format id id_1 id_last exam_dt;
  set have;
  by id exam_dt;

  *+++++++++++++++++++++++++
  * Assessment of 0 scores
  *+++++++++++++++++++++++++;

  * Scenario 1:
    --> Initial screening BIRAD_SCORE_RADI = 0
      ^ 1st FU will be the FU date
      ^ If 1st FU includes the changed BIRAD score, note the date (FU_DT)
      ^ If 1st FU <> have changed BIRAD score then find the next record & note date (BIRAD_DT)
  ;

  retain birad1 birad2 fu_dt birad_dt;
  format fu_dt birad_dt date9.;

  * TYPE:
    1: initial score = 0 w/ FU
    2: amended score from 0 w/o FU
    3: initial score = 0 w/o FU
  ;

  /* Reset retained vars to missing when onto next patient */
  if first.id then do;
    birad1 = .;
    birad2 = .;
    fu_dt = .;
    birad_dt = .;
  end;

  /* Identify & output when initial BIRAD = 0 */
  if first.id & birad_score_radi = 0 then do;
    birad1 = birad_score_radi;
  end;
  else if not first.id & birad1 ^= . then do;
    if birad_score_radi ^= . then do;
      birad2 = birad_score_radi;
      birad_dt = exam_dt;
      fu_dt = exam_dt;
      outp = 1;
    end;
    if birad_score_radi = . then do;
      if exam_cnt = 2 then fu_dt = exam_dt;
    end;
    else if birad_score_radi ^= . then do;
      birad2 = birad_score_radi;
      birad_dt = exam_dt;
      outp = 1;
    end;
  end;

  /*
  else if birad1 = 0 then do;
    if birad_score_radi ^= . then do;
      fu_dt = exam_dt;
      birad_dt = exam_dt;
      birad2 = birad_score_radi;
      outp = 1;
    end;
  end;
  */
run;

I'm open to any suggestions or alternative ways to do this.

Kind regards,
Brian.
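One possible way to produce the desired table is sketched below. It is not the solution from the replies. It assumes, as the 09-23-2022 reply states, that the first record per patient always has a score of 0 and that the 2nd record, if it exists, is the follow-up. It further assumes that the first non-missing score after the initial record is the changed score. The dataset name `want_sketch` and the flag `found` are hypothetical helpers, not part of the original code.

```sas
data want_sketch;
  set have;
  by id exam_dt;
  retain birad1 birad2 fu_dt birad_dt found;
  format fu_dt birad_dt date9.;

  if first.id then do;
    /* new patient: store the initial score and clear everything else */
    birad1 = birad_score_radi;
    call missing(birad2, fu_dt, birad_dt);
    found = 0;
  end;
  else if not found then do;
    /* the first record after the initial one supplies the follow-up date */
    if missing(fu_dt) then fu_dt = exam_dt;

    /* the first non-missing score after the initial record is the changed score */
    if not missing(birad_score_radi) then do;
      birad2 = birad_score_radi;
      birad_dt = exam_dt;
      found = 1;   /* later rows for this patient are ignored */
      output;      /* one row per patient */
    end;
  end;

  keep id fu_dt birad_dt birad1 birad2;
run;
```

With the HAVE data above, patient 2 gets FU_DT from the 07Feb20 record and BIRAD_DT from the 08Jun20 record, which is the split the desired table shows. Patients with no non-missing follow-up score produce no row under this sketch.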

BrianB4233 · ‎05-06-2022

Hi all, I apologize in advance if this isn't the correct forum or place to ask this question but I've found sparse info elsewhere. Anyway, I'm a programmer that is on multiple projects with separate tasks for each, and often, it gets muddy for me. So I'm querying this community to ask what resources/software others use to keep track of these things, e.g., Excel, something SAS includes that I'm not aware of or haven't utilized, Microsoft Project. As a note, I do have Microsoft Project but the learning curve seems to be a bit steep and it seems to be aimed more towards a project team rather than an individual. Thanks, Brian.

BrianB4233 · ‎12-20-2021

Thank you yabwon, your solution makes much more sense and is less cumbersome.

BrianB4233 · ‎12-16-2021

Hi all, I'm attempting to locate & output a date using some Regex that follows a keyword. The keyword in my scenario is 'Addendum' and it can occur multiple times. I'm looking for these keyword(s) in some free text that is loaded with line feeds/carriage returns etc. However I've worked on some Regex code that is successful - to a point. I'm using PRXNEXT to capture every instance of 'Addendum', and then up to the next 15 words/non-word, and followed by a date.

if _n_ = 1 then do;
  retain dt_pattern;
  dt_pattern = prxparse("/(addend\w+(\W+\w+){0,15})\W+(\d{1,2}\s?(\.|\/|-)\s?\d{1,2}\s?(\.|\/|-)\s?\d{2,4})/i");
end;

start = 1;
stop = length(imp_rep_concat);
call prxnext(dt_pattern,start,stop,imp_rep_concat,pos,len);

array comm[8] $150 addend1-addend8;
array comm1[8] $30 amend_out1-amend_out8;

do i = 1 to 8 while (pos > 0);
  comm(i) = upcase(substr(imp_rep_concat,pos,len));
  comm1(i) = prxPosn(dt_pattern, 3, imp_rep_concat);
  call prxnext(dt_pattern,start,stop,imp_rep_concat,pos,len);
end;

However, in the example attached, there are 3 instances of 'Addendum' and this code is only picking up two. I've tried adjusting the (\W+\w+){0,15}) in the code to expand it from 15 but then it subsumes the next date. Any ideas/advice? Thank you.

Sample is saved here as well: https://regex101.com/r/0eWTV9/1

BrianB4233 · ‎11-30-2021

It looks like you need to add the name of the variable that indicates what the game is, as in:

proc sort;
  by game_type date;
run;

data CountRoseBowl;
  set NewRoseBowl;
  by game_type date;

  /* set up counter */
  if first.date then game_cnt = 1;
  else game_cnt + 1;

  /* output last record by GAME_TYPE to get total sum */
  if last.game_type;
run;

BrianB4233 · ‎11-08-2021

Thank you ChrisNZ - not sure how I missed this but it was indeed #3.

BrianB4233 · ‎11-02-2021

Hi all, I've written what I thought was some solid code to extract multiple dates, and the preceding 18 words/non-words, from a free text field. I'm using PRXNEXT b/c there are often multiple dates within the text field and I'd like to extract all of them. However, testing this in https://regex101.com/ and then viewing the results doesn't result in a match. It is correctly identifying, and outputting, the date using PRXPOSN but it's not including all of the words/non-words preceding the date.

What is being output in the temp dataset is this:

year: 2.9 %..........[average woman <1.67%] NCI Lifetime: 15.1 %..........[average woman <10%] A

Whereas in regex101 it's showing this: https://regex101.com/r/LWRcqN/1

data data_chk1;
  length dt_1-dt_12 $150
         dt_out1-dt_out12 $30
         imp_rep_concat $11000
  ;
  set work.birad_score_0_3(drop=cht_in impressiontext reporttext obs=max);

  /* Combine impression & report text together to search as one */
  imp_rep_concat = catx(' REPORT_TEXT ',impression_copy,report_copy);

  *** Identifies ddOctdd or dOctdddd or ddOctdddd as well if there is a space/hyphen/whatever between the day & month or month & year;
  if _n_ = 1 then do;
    retain dt_pattern;
    dt_pattern = prxparse("/(?:\w+\W+){0,18}(\d{1,2}(\.|\/|-)\d{1,2}(\.|\/|-)\d{2,4})/i");
  end;

  /*if prxmatch(dt_pattern,impression_copy) then do;*/
  /*match = 1;*/
  /* date_out = prxposn(dt_pattern,1,impression_copy);*/
  /*end;*/

  start = 1;
  stop = length(imp_rep_concat);
  call prxnext(dt_pattern,start,stop,imp_rep_concat,pos,len);

  array comm[12] $dt_1-dt_12;
  array comm1[12] $dt_out1-dt_out12;

  do i = 1 to 12 while (pos > 0);
    comm(i) = substr(imp_rep_concat,pos,len);
    comm1(i) = prxPosn(dt_pattern, 1, imp_rep_concat);
    call prxnext(dt_pattern,start,stop,imp_rep_concat,pos,len);
  end;

  *drop dt_1-dt_12 dt_pattern: start: stop: pos len i;
run;

Any ideas what is causing the inconsistency? Thank you.

BrianB4233 · ‎07-08-2021

Thank you for your reply, and agreed, my wording is a bit vague. What I'm looking for is your 2nd condition: "the score is > treatment date but not > minimum of the other transposed treatment dates".

BrianB4233 · ‎07-08-2021

Hello all, I've been trying, and failing, to construct an array that compares dates across a row and sets a flag on that record to keep it or not. The 1st three date columns (trandt1-trandt3) are the transposed treatment dates for a patient (this patient had 3, on 26Dec17, 22Jan19, and 07Feb20), the next date column is the treatment date itself, and the last column is the date a score was assigned. Rows where keeper is equal to '1' are the rows I want to keep, b/c either 1) the treatment date is equal to the score date or 2) the score date is > the treatment date but not > the other transposed treatment dates. Make sense?

data visits;
  input id trandt1 :date9. trandt2 :date9. trandt3 :date9. visit_dt :date9. score_dt :date9. keeper;
  format trandt1-trandt3 visit_dt score_dt date9.;
  datalines;
1 26Dec2017 22Jan2019 07Feb2020 26Dec2017 26Dec2017 1
1 26Dec2017 22Jan2019 07Feb2020 26Dec2017 03Jan2018 1
1 26Dec2017 22Jan2019 07Feb2020 26Dec2017 22Jan2019 0
1 26Dec2017 22Jan2019 07Feb2020 26Dec2017 07Feb2020 0
1 26Dec2017 22Jan2019 07Feb2020 26Dec2017 13Feb2020 0
1 26Dec2017 22Jan2019 07Feb2020 22Jan2019 22Jan2019 1
1 26Dec2017 22Jan2019 07Feb2020 22Jan2019 07Feb2020 0
1 26Dec2017 22Jan2019 07Feb2020 22Jan2019 13Feb2020 0
1 26Dec2017 22Jan2019 07Feb2020 07Feb2020 07Feb2020 1
1 26Dec2017 22Jan2019 07Feb2020 07Feb2020 13Feb2020 1
;
run;
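A rough sketch of how an array could express the rule, under one interpretation that reproduces the KEEPER column in the sample rows: keep a row when the score date equals the treatment date, or when the score date falls after the treatment date and strictly before the next later treatment date (or there is no later one). The names `flagged`, `next_trt`, and `keeper_chk` are hypothetical.

```sas
data flagged;
  set visits;
  array trt[3] trandt1-trandt3;

  /* earliest transposed treatment date that falls after this row's treatment date */
  next_trt = .;
  do i = 1 to dim(trt);
    /* MIN ignores missing values, so the first qualifying date replaces the initial missing */
    if trt[i] > visit_dt then next_trt = min(next_trt, trt[i]);
  end;

  /* a true comparison evaluates to 1 and a false one to 0 */
  keeper_chk = (score_dt = visit_dt)
            or (score_dt > visit_dt and (missing(next_trt) or score_dt < next_trt));

  format next_trt date9.;
  drop i;
run;
```

Comparing `keeper_chk` against the hand-coded `keeper` column is a quick way to check whether this reading of the rule matches what is intended.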

BrianB4233 · ‎06-23-2021

Thank you all for your input, it's greatly appreciated.

BrianB4233 · ‎06-21-2021

Hello community, I've been implementing some regex code to capture BIRAD scores (scores that assess risk of breast cancer that can range from 0-6) and some of them, unfortunately, are buried within free text notes. While I have regex code (bi.?rads?\D*(.|\D+)?\d*) working reasonably well, I'm having difficulty limiting the return of the digit after the BIRAD keyword. Some examples:

1) BI-RAD category: 1 -- Regex code will capture entire string
2) BI-RADs 3 -- Regex code will capture entire string
3) BI-RADS CATEGORY EXPLANATION density date assessed: 9/24/2018 -- Regex captures up until the "9", or the September

I obviously do not want to capture the '9' in the 3rd example. My initial thought was to limit the # of words that occur after the BIRAD keyword using some kind of word boundary count, but I've had difficulty operationalizing that, and I'm probably not thinking of a simpler approach. Any advice? Code example is attached.

Thanks,
Brian.
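One simple way to limit the reach, sketched here as an illustration rather than the answer given in the thread, is to cap the number of non-digit characters allowed between the keyword and the score, and to accept only a standalone digit from 0 to 6. The cap of 15 characters is an arbitrary assumption, and the names `birads`, `birads_out`, `note`, `rx`, and `birad` are hypothetical.

```sas
data birads;
  infile datalines truncover;
  input note $100.;
  datalines;
BI-RAD category: 1
BI-RADs 3
BI-RADS CATEGORY EXPLANATION density date assessed: 9/24/2018
;
run;

data birads_out;
  set birads;
  retain rx;

  /* keyword, then at most 15 non-digit characters, then a single digit 0-6 on its own */
  if _n_ = 1 then rx = prxparse('/bi.?rads?\D{0,15}\b([0-6])\b/i');

  /* capture group 1 holds the score; birad stays missing when nothing matches */
  if prxmatch(rx, note) then birad = input(prxposn(rx, 1, note), 1.);

  drop rx;
run;
```

In the 3rd example more than 15 non-digit characters separate the keyword from the date, so nothing is captured. A date that sits within 15 characters of the keyword and starts with a single digit from 0 to 6 would still slip through, so the cap would need tuning against real notes.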


Re: ODS excel -WARNING: Unsupported device 'SVG' for EXCEL destination...

Re: Array with Do Until?

Re: Reporting the SAS Code Lines by Program

Re: Difficulty with RETAIN and comparing rows

Re: Searching for Ranges for character values

Re: Matching observations by ID and date

Re: ODS excel -WARNING: Unsupported device 'SVG' for EXCEL destination...

Re: How do I pull max of multiple columns in SAS?

Extracting line from free text that contains a keyword

Re: Difficulty with RETAIN and comparing rows

Difficulty with RETAIN and comparing rows

Tools for managing multiple projects/tasks

Re: Using Regex to look for a date after a keyword

Using Regex to look for a date after a keyword

Re: Cumulative Counter Variable

Re: Extracting a phrase using Regex (PRXNEXT)

Extracting a phrase using Regex (PRXNEXT)

Re: Using Array to compare and keep records based upon dates

Using Array to compare and keep records based upon dates

Re: Limiting # of words in a Regex

Limiting # of words in a Regex

SAS Inner Circle Panel

SAS Analytics Explorers