Solved: Counting character variable in array with substr

Cruise · Posted 07-05-2017 01:04 PM

I'd like to count first three digits of Qcodes by Facility name. I don't need other codes for now. Below shows my attempt to substract first three digits of Q codes then count by facility name.

proc import datafile="....\patient_nodups.csv" 
out=have
dbms=csv replace;
getnames=yes;
run;

data qcodes; set Patients_NODUPS; 
 array qcodes {*} Other_Diagnosis_Code_1-Other_Diagnosis_Code_24;
 do i = 1 to dim(qcodes);
 if substr(left(qcodes),1,3){i} in ('Q:') then output qcodes;
 end;
run;

Thanks for being such a guradian angel.

art297 · Posted 07-05-2017 08:29 PM

Sound like you really want something like the following:

proc import datafile="/folders/myfolders/patient_nodups.csv" 
     out=have
     dbms=csv replace;
  getnames=yes;
run;

proc sql noprint;
  select catx(' ','length',name, '$6;')
    into :lengths separated by ' '
      from dictionary.columns
        where libname=upcase('work') and
              memname=upcase('have') and
              substr(name,1,5) eq 'Other'
  ;
quit;

data qcodes (keep=facility_name qcode);
  &lengths.;
  set have;
  array qcodes{*} $ Other_Diagnosis_Code_24-Other_Diagnosis_Code_1;
  do i = 1 to dim(qcodes);
    if first(qcodes(i)) eq 'Q' then do;
      qcode=substr(qcodes(i),1,3);
      output;
    end;
  end;
run;

proc sort data=qcodes;
  by facility_name;
run;

proc freq data=qcodes;
  by facility_name;
  tables qcode/out=want (drop=percent);
run;

proc transpose data=want out=want (drop=_:);
  by facility_name;
  var count;
  id qcode;
run;

Art, CEO, AnalystFinder.com

View solution in original post

art297 · Posted 07-05-2017 01:32 PM

Not clear what you are trying to achieve. If you want to eliminate any of the other_diagnostic_codes that don't start with "Q", then the following will work:

proc import datafile="/folders/myfolders/patient_nodups.csv" 
     out=have
     dbms=csv replace;
  getnames=yes;
run;

proc sql noprint;
  select catx(' ','length',name, '$6;')
    into :lengths separated by ' '
      from dictionary.columns
        where libname=upcase('work') and
              memname=upcase('have') and
              substr(name,1,5) eq 'Other'
  ;
quit;

data qcodes;
  &lengths.;
  set have;
  array qcodes{*} $ Other_Diagnosis_Code_24-Other_Diagnosis_Code_1;
  do i = 1 to dim(qcodes);
    if first(qcodes(i)) ne 'Q' then call missing(qcodes(i));
  end;
  call sortc(of qcodes(*));
run;

The reason the proc sql code is there is because proc import set different lengths for each of the other_diagnostic_code variables and the proc sql code standardizes them all to have a length of 6.

Art, CEO, AnalystFinder.com

Cruise · Posted 07-05-2017 05:01 PM

@art297

Thank you very much.

The code:

options nolabel;
proc sql noprint;
  select catx(' ','length',name, '$6;')
    into :lengths separated by ' '
      from dictionary.columns
        where libname=upcase('work') and
              memname=upcase('patients_nodups') and
              substr(name,1,3) eq 'Other'
  ;
quit;

data qcodes;
  &lengths.;
  set patients_nodups;
  array qcodes{*} $ Other_Diagnosis_Code_24-Other_Diagnosis_Code_1;
  do i = 1 to dim(qcodes);
    if first(qcodes(i)) ne 'Q' then call missing(qcodes(i));
  end;
  call sortc(of qcodes(*));
run;

worked fine with warning:

WARNING: Multiple lengths were specified for the variable Other_Diagnosis_Code_1 by input data
         set(s). This can cause truncation of data.
thru 

WARNING: Multiple lengths were specified for the variable Other_Diagnosis_Code_24 by input data
         set(s). This can cause truncation of data.

I changed

substr(name,1,5) to substr(name,1,3)

in order to substract first three digits of q codes. However, I have 5 digit Q codes in the output data. The reason why is that I'd like to count number of distinct 3 digit qcodes by facility name for the next step. I apologize for ambiguity in my question. Hope image helps.

Am I missing something here?

Cruise · Posted 07-05-2017 01:46 PM

I just tried below code wondering if I'm misplacinf end;. But no success yet.

data qcodes; set have; 
        array qcodes {*} Other_Diagnosis_Code_1-Other_Diagnosis_Code_24;
            do i = 1 to dim(qcodes);
			end;
            if (substr(left(qcodes),1,3)) in ('Q:') then output qcodes;
run;

Log:

NOTE: Numeric values have been converted to character values at the places given by:
      (Line):(Column).
      1060:29
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.QCODES may be incomplete.  When this step was stopped there were 0
         observations and 36 variables.
WARNING: Data set WORK.QCODES was not replaced because this step was stopped.

Astounding · Posted 07-05-2017 01:50 PM

If I've figured out what you are after, here's a way to get all the Q codes for each facility.

data qcodes;

array codes {24} $ 4 Other_diagnosis_code_1 - Other_diagnosis_code_24;

set have (keep=facility other_diagnosis_code_: );

do i=1 to 24;

if codes{i} =: 'Q' then do;

qcode = codes{i};

output;

end;

keep facility qcode;

run;

proc freq data=want;

tables facility * qcode / missing list;

run;

Defining the array with length of $4 for each element ahead of time means you get Q plus the three digits that follow.

slchen · Posted 07-05-2017 02:40 PM

If I understand correctly, try this:

data qcodes;

set Patients_NODUPS;

array qcodes {*} Other_Diagnosis_Code_1-Other_Diagnosis_Code_24;

do i = 1 to dim(qcodes);

if substr(left(qcodes),1,1){i}='Q' then do;

output;

return;

end;

run;

art297 · Posted 07-05-2017 08:29 PM

Sound like you really want something like the following:

proc import datafile="/folders/myfolders/patient_nodups.csv" 
     out=have
     dbms=csv replace;
  getnames=yes;
run;

proc sql noprint;
  select catx(' ','length',name, '$6;')
    into :lengths separated by ' '
      from dictionary.columns
        where libname=upcase('work') and
              memname=upcase('have') and
              substr(name,1,5) eq 'Other'
  ;
quit;

data qcodes (keep=facility_name qcode);
  &lengths.;
  set have;
  array qcodes{*} $ Other_Diagnosis_Code_24-Other_Diagnosis_Code_1;
  do i = 1 to dim(qcodes);
    if first(qcodes(i)) eq 'Q' then do;
      qcode=substr(qcodes(i),1,3);
      output;
    end;
  end;
run;

proc sort data=qcodes;
  by facility_name;
run;

proc freq data=qcodes;
  by facility_name;
  tables qcode/out=want (drop=percent);
run;

proc transpose data=want out=want (drop=_:);
  by facility_name;
  var count;
  id qcode;
run;

Art, CEO, AnalystFinder.com

Counting character variable in array with substr

Re: Counting character variable in array with substr

Re: Counting character variable in array with substr

Re: Counting character variable in array with substr

Re: Counting character variable in array with substr

Re: Counting character variable in array with substr

Re: Counting character variable in array with substr

Re: Counting character variable in array with substr

SAS Innovate 2025: Call for Content

Classroom Training Available!