Solved: Re: data management with transpose HELP please

chuie · Posted 02-27-2017 10:17 AM

Hi

i have a claim data of encounter type of all clinic A users.How I have identified clinic A users is from the encounter type = clinic A. so I would like to count number of hopsitalization, ED, BEFORE AND AFTER his first encounter(by date) to clinic A. So basically the first visit date to clinic A is the intervention date.In the data set below, ID 1 started using clinic A is from 2/3 so any enconter before that date is considered before and after that date is after.

data is like

ID dateofvisit encountertype

1 2/1 ED

1 2/2 hospitalization

1 2/3 clinic A

1 2/4 ED

2

3

etc

The want data is:

ID beforeED Beforehospitalization afterED afterhospitalization

1 1 1 1 0

Thank you so much in advance. I have spend 2 weeks to figure out by myself ... any help would be highly appriciated.

PGStats · Posted 02-27-2017 10:12 PM

Assuming your data is sorted by id and dates:

data have;
input ID    dateofvisit :$8.    encountertype &:$20.;
datalines;
1       2/1                       ED
1       2/2                       hospitalization
1      2/3                         clinic A
1      2/4                          ED
2       2/1                       ED
2      2/2                         clinic A
2       2/3                       hospitalization
2      2/4                          ED
;

data temp;
length col $32;
state = "before";
dum = 1;  /* Needed for transpose */
do until (last.id);
    set have; by id;
    if encountertype = "clinic A" then state = "after";
    else do;
        col = cats(state, encountertype);
        output;
        end;
    end;
keep id col dum;
run;

proc transpose data=temp out=almost(drop=_name_);
by id;
var dum;
id col;
run;

/* Replace missing with zeros */
proc stdize data=almost out=want reponly missing=0;
run;

PG

View solution in original post

PGStats · Posted 02-27-2017 10:12 PM

Assuming your data is sorted by id and dates:

data have;
input ID    dateofvisit :$8.    encountertype &:$20.;
datalines;
1       2/1                       ED
1       2/2                       hospitalization
1      2/3                         clinic A
1      2/4                          ED
2       2/1                       ED
2      2/2                         clinic A
2       2/3                       hospitalization
2      2/4                          ED
;

data temp;
length col $32;
state = "before";
dum = 1;  /* Needed for transpose */
do until (last.id);
    set have; by id;
    if encountertype = "clinic A" then state = "after";
    else do;
        col = cats(state, encountertype);
        output;
        end;
    end;
keep id col dum;
run;

proc transpose data=temp out=almost(drop=_name_);
by id;
var dum;
id col;
run;

/* Replace missing with zeros */
proc stdize data=almost out=want reponly missing=0;
run;

PG

chuie · Posted 02-28-2017 12:32 PM

This is excellent. Thank you so much. You have definately made my day...

I would really appricate if you could explain the steps as data doesnot look as simple as it in this example like i have separate clinic data and ED, hospitalization etc are from claims data.

PGStats · Posted 02-28-2017 02:54 PM

Here is a commented version:

data have;
input ID    dateofvisit :$8.    encountertype &:$20.;
datalines;
1       2/1                       ED
1       2/2                       hospitalization
1      2/3                         clinic A
1      2/4                          ED
2       2/1                       ED
2      2/2                         clinic A
2       2/3                       hospitalization
2      2/4                          ED
;

/* Data is assumed to be sorted by id and date */
data temp;
length col $32;
/* Define initial state at the beginning of each id, 
   to be changed later when clinic name is detected */
state = "before";
/* Define dummy variable needed for transpose */
dum = 1;
/* Loop on observations from a single id */
do until (last.id);
    set have; by id;
    /* If clinic name is detected, change state from before to after. 
       No output needed in this case */
    if encountertype = "clinic A" then state = "after";
    else do;
        /* Form a future variable name by appending state and encountertype */
        col = cats(state, encountertype);
        output;
        end;
    end;
keep id col dum;
run;

/* Transpose creates the new variables whose names were created in previous step */
proc transpose data=temp out=almost(drop=_name_);
by id;
var dum;
id col;
run;

/* Transpose leaves missing values for variables that were not present 
   for a given id. This final step replaces missing values with zeros */
proc stdize data=almost out=want reponly missing=0;
run;

PG

chuie · Posted 02-28-2017 03:44 PM

You are AWESOME....

Thank you so much

chuie · Posted 03-31-2017 01:53 PM

Hi PG,

I got this error message when i tyirend the code in my actual database. I appricate your input in advance.

proc transpose data=temp1 out=almost11;
by MRN;
var dum;
id col;
run;

ERROR: The ID value "afterE" occurs twice in the same BY group.
ERROR: The ID value "afterE" occurs twice in the same BY group.
ERROR: The ID value "afterE" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
MRN=123

DavidGhan · Posted 03-31-2017 02:17 PM

Hi,

The problem you have is that the values of COL will be used for the variable names in the output dataset produced by PROC TRANSPOSE. If you have duplicate values of COL within any of your BY GROUPS (within all the rows for a unique value of MRN), this produces an error because you cannot have two variables with the same name in the output dataset.

this example might help:

data temp;

MRN='a'; col='1'; dum=100; output;

MRN='a'; col='3'; dum=87; output;

MRN='b'; col='1'; dum=234; output;

MRN='b'; col='3'; dum=21; output;

run;

data temp1;

MRN='a'; col='1'; dum=100; output;

MRN='a'; col='3'; dum=87; output;

MRN='a'; col='1'; dum=22; output;

MRN='b'; col='1'; dum=234; output;

MRN='b'; col='3'; dum=21; output;

run;

proc print data=temp;

run;

proc print data=temp1;

run;

proc transpose data=temp out=temptrans;

by MRN;

var dum;

id col;

run;

proc print data=temptrans;

run;

proc transpose data=temp1 out=temp1trans;

by MRN;

var dum;

id col;

run;

If you execute this you will see that the first transpose on the data temp works because within each group for MRN values of col are not duplicated. In the dataset temp2 an extra row is added with a duplicate value for COL in the group where MRN='a'. Note in this example that since the first character of ID is numeric, PROC TRANSPOSE adds an '_' character as a prefix when it generates the column names from values of COL.

If you want to transpose your data like this you will need to first remove duplicate values for COL within each group of MRN. One way to do this (although it may not be the way you need to do it for your data) is to use PROC SORT, sort on MRN and COL and add the NODUPKEY option to your PROC SORT statement. Then you can run the transpose on the de-duplicated output from PROC SORT:

proc sort data=temp1 nodupkey out=tempdedup;

by mrn col;

run;

proc transpose data=tempdedup out=temp1trans;

by MRN;

var dum;

id col;

run;

chuie · Posted 03-31-2017 03:02 PM

This worked as a magic.

Thank you David

chuie · Posted 05-13-2017 03:35 PM

David,

I would like to count the number of ED after clinic date. In this example we only have one ed encounter so this code worked. But data contains multiple visits to visit after clinic visit so how do I count the number of ED encounter?

Please advise

Thank you

DavidGhan · Posted 05-30-2017 11:24 AM

can you provide some representative of your data that contains records with multiple values you describe and what you need the result to be for that representative of data? That will help me to understand exacly what you have and what you need to produce as output.

chuie · Posted 08-08-2017 12:23 PM

Thank you David. Sorry for late reply

Objective is to count the numbe ro of ED and hospitalization BEFORE AND AFTER his first visit to clinic A

here is my orginal data

ID Date Visit type

1       2/1                       ED
1       2/2                       hospitalization
1      2/3                         clinic A
1      2/4                          ED

1 5/4 ED

1 6/10 hospitalization
2 2/1 ED

2       2/5                       ED
2      2/10                         clinic A
2       2/13                       hospitalization
2      21/4                          ED

For ID 1 the visit date of clinic A is 2/3 so count the visit to ED and hospitalization before and after this date like so,

ID before_ED before_hospitalization after_ed After_hospitalization

1 1 1 2 1

2 2 0 1 1

there were 1 ED in 2/1 that is before his viist to clinnic A in 2/3 so it was calcuated to before ED.

Does it make sence? Please HELP

Thanks

DavidGhan · Posted 08-30-2017 09:59 AM

Well, it took me a few different steps but I did come up with a solution:

data original;

length id 8 date 8 visittype $20;

infile cards dlm=',';

input ID dateval $ visittype $;

year=2017;

month=input(scan(dateval,1, '/'), 2.);

day=input(scan(dateval,2, '/'), 2.);

date = mdy(month,day, year);

format date mmddyy5.;

drop dateval year month day;

cards;

1, 2/1, ED

1, 2/2, hospitalization

1, 2/3, clinic A

1, 2/4, ED

1, 5/4, ED

1, 6/10, hospitalization

2, 2/1, ED

2, 2/5, ED

2, 2/10, clinic A

2, 2/13, hospitalization

2, 4/21, ED

;

run;

proc print data=original;

run;

proc sql;

create table firstdates as

select id, min(date) as fdate format = mmddyy5.

from original

where visittype='clinic A'

group by id;

create table firstdates2 as

select o.id, date, fdate, visittype,

case

when date <= fdate then 'Before'

when date > fdate then 'After'

end as BeforAft,

cat(calculated beforaft,visittype) as group

from original as o, firstdates as f

where o.id=f.id;

quit;

proc sql;

create table counts as

select id, group, count(*) as count

from firstdates2

where visittype ne 'clinic A'

group by id, group;

quit;

proc transpose data=counts out=c2(drop=_name_);

by id;

var count;

id group;

run;

proc print data=c2;

run;

chuie · Posted 08-31-2017 11:13 AM

David,

You are my "SAS GOD" 🙂

I cannot thank you enough....

Catch up on SAS Innovate 2026