DATA Step, Macro, Functions and more

Counting character variable in array with substr

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 125
Accepted Solution

Counting character variable in array with substr

[ Edited ]

I'd like to count first three digits of Qcodes by Facility name. I don't need other codes for now. Below shows my attempt to substract first three digits of Q codes then count by facility name. 

 

proc import datafile="....\patient_nodups.csv" 
out=have
dbms=csv replace;
getnames=yes;
run;

data qcodes; set Patients_NODUPS; 
array qcodes {*} Other_Diagnosis_Code_1-Other_Diagnosis_Code_24;
do i = 1 to dim(qcodes);
if substr(left(qcodes),1,3){i} in ('Q:') then output qcodes;
end;
run;

 

Thanks for being such a guradian angel.

 

Attachment

Accepted Solutions
Solution
‎07-07-2017 05:35 PM
PROC Star
Posts: 7,356

Re: Counting character variable in array with substr

Sound like you really want something like the following:

proc import datafile="/folders/myfolders/patient_nodups.csv" 
     out=have
     dbms=csv replace;
  getnames=yes;
run;

proc sql noprint;
  select catx(' ','length',name, '$6;')
    into :lengths separated by ' '
      from dictionary.columns
        where libname=upcase('work') and
              memname=upcase('have') and
              substr(name,1,5) eq 'Other'
  ;
quit;

data qcodes (keep=facility_name qcode);
  &lengths.;
  set have;
  array qcodes{*} $ Other_Diagnosis_Code_24-Other_Diagnosis_Code_1;
  do i = 1 to dim(qcodes);
    if first(qcodes(i)) eq 'Q' then do;
      qcode=substr(qcodes(i),1,3);
      output;
    end;
  end;
run;

proc sort data=qcodes;
  by facility_name;
run;

proc freq data=qcodes;
  by facility_name;
  tables qcode/out=want (drop=percent);
run;

proc transpose data=want out=want (drop=_:);
  by facility_name;
  var count;
  id qcode;
run;

Art, CEO, AnalystFinder.com

View solution in original post


All Replies
PROC Star
Posts: 7,356

Re: Counting character variable in array with substr

Not clear what you are trying to achieve. If you want to eliminate any of the other_diagnostic_codes that don't start with "Q", then the following will work:

proc import datafile="/folders/myfolders/patient_nodups.csv" 
     out=have
     dbms=csv replace;
  getnames=yes;
run;

proc sql noprint;
  select catx(' ','length',name, '$6;')
    into :lengths separated by ' '
      from dictionary.columns
        where libname=upcase('work') and
              memname=upcase('have') and
              substr(name,1,5) eq 'Other'
  ;
quit;

data qcodes;
  &lengths.;
  set have;
  array qcodes{*} $ Other_Diagnosis_Code_24-Other_Diagnosis_Code_1;
  do i = 1 to dim(qcodes);
    if first(qcodes(i)) ne 'Q' then call missing(qcodes(i));
  end;
  call sortc(of qcodes(*));
run;

The reason the proc sql code is there is because proc import set different lengths for each of the other_diagnostic_code variables and the proc sql code standardizes them all to have a length of 6.

 

Art, CEO, AnalystFinder.com

 

Frequent Contributor
Posts: 125

Re: Counting character variable in array with substr

@art297

Thank you very much. 

The code:

 

options nolabel;
proc sql noprint;
  select catx(' ','length',name, '$6;')
    into :lengths separated by ' '
      from dictionary.columns
        where libname=upcase('work') and
              memname=upcase('patients_nodups') and
              substr(name,1,3) eq 'Other'
  ;
quit;

data qcodes;
  &lengths.;
  set patients_nodups;
  array qcodes{*} $ Other_Diagnosis_Code_24-Other_Diagnosis_Code_1;
  do i = 1 to dim(qcodes);
    if first(qcodes(i)) ne 'Q' then call missing(qcodes(i));
  end;
  call sortc(of qcodes(*));
run;

 

worked fine with warning:

WARNING: Multiple lengths were specified for the variable Other_Diagnosis_Code_1 by input data
         set(s). This can cause truncation of data.
thru 
WARNING: Multiple lengths were specified for the variable Other_Diagnosis_Code_24 by input data set(s). This can cause truncation of data.

I changed 

substr(name,1,5) to substr(name,1,3)

in order to substract first three digits of q codes. However, I have 5 digit Q codes in the output data. The reason why is that I'd like to count number of distinct 3 digit qcodes by facility name for the next step. I apologize for ambiguity in my question. Hope image helps. 

Am I missing something here? 

 

N of distinct q codes by hospitals.png

Frequent Contributor
Posts: 125

Re: Counting character variable in array with substr

I just tried below code wondering if I'm misplacinf end;. But no success yet.

data qcodes; set have; 
        array qcodes {*} Other_Diagnosis_Code_1-Other_Diagnosis_Code_24;
            do i = 1 to dim(qcodes);
			end;
            if (substr(left(qcodes),1,3)) in ('Q:') then output qcodes;
run;

Log:

NOTE: Numeric values have been converted to character values at the places given by:
      (Line):(Column).
      1060:29
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.QCODES may be incomplete.  When this step was stopped there were 0
         observations and 36 variables.
WARNING: Data set WORK.QCODES was not replaced because this step was stopped.
Super User
Posts: 5,071

Re: Counting character variable in array with substr

[ Edited ]

If I've figured out what you are after, here's a way to get all the Q codes for each facility. 

 

data qcodes;

array codes {24} $ 4 Other_diagnosis_code_1 - Other_diagnosis_code_24;

set have (keep=facility other_diagnosis_code_: );

do i=1 to 24;

   if codes{i} =: 'Q' then do;

      qcode = codes{i};

      output;

   end;

end;

keep facility qcode;

run;

 

proc freq data=want;

tables facility * qcode / missing list;

run;

 

Defining the array with length of $4 for each element ahead of time means you get Q plus the three digits that follow.

Super Contributor
Posts: 275

Re: Counting character variable in array with substr

If I understand correctly, try this:

 

data qcodes;

     set Patients_NODUPS;

     array qcodes {*} Other_Diagnosis_Code_1-Other_Diagnosis_Code_24;

     do i = 1 to dim(qcodes);

     if substr(left(qcodes),1,1){i}='Q' then do;

          output;

          return;

     end;

     end;

run;

Solution
‎07-07-2017 05:35 PM
PROC Star
Posts: 7,356

Re: Counting character variable in array with substr

Sound like you really want something like the following:

proc import datafile="/folders/myfolders/patient_nodups.csv" 
     out=have
     dbms=csv replace;
  getnames=yes;
run;

proc sql noprint;
  select catx(' ','length',name, '$6;')
    into :lengths separated by ' '
      from dictionary.columns
        where libname=upcase('work') and
              memname=upcase('have') and
              substr(name,1,5) eq 'Other'
  ;
quit;

data qcodes (keep=facility_name qcode);
  &lengths.;
  set have;
  array qcodes{*} $ Other_Diagnosis_Code_24-Other_Diagnosis_Code_1;
  do i = 1 to dim(qcodes);
    if first(qcodes(i)) eq 'Q' then do;
      qcode=substr(qcodes(i),1,3);
      output;
    end;
  end;
run;

proc sort data=qcodes;
  by facility_name;
run;

proc freq data=qcodes;
  by facility_name;
  tables qcode/out=want (drop=percent);
run;

proc transpose data=want out=want (drop=_:);
  by facility_name;
  var count;
  id qcode;
run;

Art, CEO, AnalystFinder.com

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 137 views
  • 1 like
  • 4 in conversation