Defining variables based on multiple lines per person.

Contributor
Posts: 29

Hi SAS Experts

Normally I would have one line per patient per admission. In this case, however, that is too complicated, and instead I have multiple lines per patient per admission. In my normal case I would use an array and then define my diagnosis variables (please see the example code below).

 

data indlagte_main;
set indlagte_main;
addiction=0;
alcohol=0;
/* all diagnosis variables for the admission */
array x{*} diagindH1-diagindH2 diagindB1-diagindB10;
do i=1 to dim(x);
  /* reference the array by its declared name, x */
  if substr(x{i},1,3) in ('DF1') then addiction=1;
  if substr(x{i},1,4) in ('DF10') then alcohol=1;
end;
run;

 

This is what I have now

 

Record_id  admissionnumber  Admission date  Discharge date  date of diagnosis  diagnosis
1          1                01-01-2010      01-02-2010      02-01-2010         df200
1          1                01-01-2010      01-02-2010      02-03-2010         df100
1          2                03-03-2010      31-03-2010      04-03-2010         df147
1          2                03-03-2010      31-03-2010      04-03-2010         df200

 

This is what I want. Addiction and alcohol definitions should be the same as in the above array.

 

Record_id  admissionnumber  Admission date  Discharge date  Addiction  Alcohol
1          1                01-01-2010      01-02-2010      1          1
1          2                03-03-2010      31-03-2010      1          0

 

I hope this makes sense.

 

Kind regards from

 

Solvej

Super User
Posts: 10,534

Re: Defining variables based on multiple lines per person.

Please supply example data in a usable form (see my footnotes), as already mentioned in https://communities.sas.com/t5/Base-SAS-Programming/Restriction-of-data-based-on-dates/m-p/454719

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 29

Re: Defining variables based on multiple lines per person.

Posted in reply to KurtBremser

I apologize for this. I hope this data step is sufficient. Please remember that I have thousands of patients with many admissions. The date format is incorrect, but maybe you have an idea of which one to use. They all seem to fail when I try to use them.

 

data have;
input Record_id admissionnumber Admission_date DATE10. Discharge_date date10. date_of_diagnosis date10. diagnosis $;
format Admission_date date10. Discharge_date date10. date_of_diagnosis date10.;
datalines;
1 1 01-01-2010 01-02-2010 02-01-2010 df200
1 1 01-01-2010 01-02-2010 02-03-2010 df100
1 2 03-03-2010 31-03-2010 04-03-2010 df147
1 2 03-03-2010 31-03-2010 04-03-2010 df200
; run;

 

Kind regards

 

Solvej

 

Super User
Posts: 10,534

Re: Defining variables based on multiple lines per person.

Use BY-group processing and retained variables:

data have;
input
  Record_id
  admissionnumber
  (Admission_date Discharge_date date_of_diagnosis) (:ddmmyy10.)
  diagnosis $
;
format Admission_date Discharge_date date_of_diagnosis ddmmyyd10.;
datalines;
1 1 01-01-2010 01-02-2010 02-01-2010 df200
1 1 01-01-2010 01-02-2010 02-03-2010 df100
1 2 03-03-2010 31-03-2010 04-03-2010 df147
1 2 03-03-2010 31-03-2010 04-03-2010 df200
;
run;

data want;
set have;
by record_id admissionnumber;
retain addiction alcohol;
if first.admissionnumber
then do;
  addiction = 0;
  alcohol = 0;
end;
if substr(diagnosis,1,3) = 'df1' then addiction = 1;
if substr(diagnosis,1,4) = 'df10' then alcohol = 1;
if last.admissionnumber then output;
drop diagnosis date_of_diagnosis;
run;

proc print data=want noobs;
run;

Result:

Record_                       Admission_    Discharge_
   id      admissionnumber       date          date       addiction    alcohol

   1              1           01-01-2010    01-02-2010        1           1   
   1              2           03-03-2010    31-03-2010        1           0   
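
Two things to watch with real data. The BY statement in the DATA step needs the input sorted (or indexed) by record_id and admissionnumber, and the comparisons are case-sensitive, so 'DF100' would not match 'df1'. As a rough sketch of how that could be handled, assuming the HAVE data set from above: the output name want_sql is made up for illustration, and the upper-casing is an assumption about how the real diagnosis codes may be stored.

```sas
/* Sketch only: sort first if the DATA step with BY is used on unsorted data */
proc sort data=have;
  by record_id admissionnumber;
run;

/* Possible alternative without RETAIN: aggregate per admission in PROC SQL. */
/* want_sql is a hypothetical name, not part of the original answer.        */
proc sql;
  create table want_sql as
  select
    record_id,
    admissionnumber,
    admission_date,
    discharge_date,
    /* a comparison evaluates to 1 (true) or 0 (false);       */
    /* MAX keeps 1 if any diagnosis row of the admission matches */
    max(upcase(substr(diagnosis, 1, 3)) = 'DF1')  as addiction,
    max(upcase(substr(diagnosis, 1, 4)) = 'DF10') as alcohol
  from have
  group by record_id, admissionnumber, admission_date, discharge_date
  order by record_id, admissionnumber;
quit;
```

PROC SQL does the grouping itself, so it does not depend on the sort order of HAVE. This version assumes the admission and discharge dates are constant within an admission, as they are in the example data.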