Hi SAS forum,

I have strings like this:

Heart failure, unspecified (I50.9) Osteomyelitis, unspecified (M86.9) Cardiogenic shock (R57.0)

I want to split each string into separate variables according to the parentheses. I saw a similar question:
https://communities.sas.com/t5/SAS-Data-Management/splitting-a-variable-into-several-depending-on-parenthesis/td-p/209649

and modified its code as follows:

data have;
  infile cards dsd;
  length name $100.;
  input name $;
cards;
"Heart failure, unspecified (I50.9) Osteomyelitis, unspecified (M86.9) Cardiogenic shock (R57.0)"
run;

data want;
  set have;
  DIAGNOSIS1 = scan(name,1,'(,)');
  DIAGNOSISCODE1 = scan(name,2,'(,)');
  DIAGNOSIS2 = scan(name,3,'(,)');
  DIAGNOSISCODE2 = scan(name,4,'(,)');
  DIAGNOSIS3 = scan(name,5,'(,)');
  DIAGNOSISCODE3 = scan(name,6,'(,)');
run;

But it seems I have to remove the comma from the delimiter argument '(,)' to get the desired results, like this:

data want;
  set have;
  DIAGNOSIS1 = scan(name,1,'()');
  DIAGNOSISCODE1 = scan(name,2,'()');
  DIAGNOSIS2 = scan(name,3,'()');
  DIAGNOSISCODE2 = scan(name,4,'()');
  DIAGNOSIS3 = scan(name,5,'()');
  DIAGNOSISCODE3 = scan(name,6,'()');
run;

So I have 2 questions:

1. What is the difference between these two statements (with and without the comma between the parentheses)?

DIAGNOSIS1 = scan(name,1,'(,)');
DIAGNOSIS1 = scan(name,1,'()');

2. Some of my data contains Chinese characters and fullwidth commas (,), none of which are inside the parentheses. Would it be okay to process that data with the same code?

Thanks in advance!