BookmarkSubscribeRSS Feed
LoisHNewMexico
Calcite | Level 5
I have a dataset with multiple lines per record. Each record is a healthcare encounter, identified using a combination of health record ID and date of encounter, which appear on the first line, only. I would like to add an Encounter ID variable to all lines of the encounter, and then use proc transpose BY EncounterID to get my diagnosis and procedure data onto the firstline. Here is basically what I've got (hopefully this will format to something legible):

LineNum HlthRecordID City ServiceDate Diagnosis Procedure

1 mc21898 Cty, NM 4/28/2007 4.74 368.47
2 . . . 86.35 326.87
3 mc59002 Cty, UT 8/8/2007 88.61 966.07
4 . . . 49.22 625.9
5 . . . 333.18
6 . . . 922.09
7 . . . 847.55
8 mc21898 Cty, NM 6/4/2007 41.17 869.38
9 . . . 43.26 193.39
10 . . . 86.35 799.5
11 . . . 4.06 641.92
12 . . . 536.27
13 . . . 971.87
14 mc14473 Cty, AZ 8/16/2007 64.11 60.2
15 . . . 91.56 663.38
16 . . . 91.58 800.32
17 . . . 684.24
18 . . . 529.19
19 . . . 747.11
20 . . . 849.6

format data in message

Message was edited by: LoisH New Mexico

Message was edited by: LoisH New Mexico oops, I meant proc transpose!


Message was edited by: LoisH New Mexico
6 REPLIES 6
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Suggest looking at a SAS DATA step with "BY GROUP PROCESSING" where you will have a "counter variable" incremented with each new "condition" which you would define, based on your SAS variables. In your code, you will want to consider using a code piece like:

IF FIRST. THEN DO;
* your assignment logic here ;
END;

possibly using a RETAIN statement to increment your Encounter_ID variable, finally resetting it back to some initial-value with each new "condition".

You will find the SAS support http://support.sas.com/ website, with SAS-hosted DOC and supplemental technical / conference, can provide guidance and code examples. As well, there are forum archives on this topic of creating an "ID variable" for tracking purposes.

Scott Barry
SBBWorks, Inc.

Suggested Google advanced search arguments, this topic / post:

data step programming by group processing site:sas.com

proc transpose id variable site:sas.com
data_null__
Jade | Level 19
This is similar to a program I have seen sometimes with data read from EXCEL spread sheets.

See if this example works for you.

[pre]
data test;
infile cards missover;
input LineNum HlthRecordID:$8. City &$16. ServiceDate:mmddyy. Diagnosis Procedure;
format servicedate mmddyy.;
cards;
1 mc21898 Cty, NM 4/28/2007 4.74 368.47
2 . . . 86.35 326.87
3 mc59002 Cty, UT 8/8/2007 88.61 966.07
4 . . . 49.22 625.9
5 . . . 333.18
6 . . . 922.09
7 . . . 847.55
8 mc21898 Cty, NM 6/4/2007 41.17 869.38
9 . . . 43.26 193.39
10 . . . 86.35 799.5
11 . . . 4.06 641.92
12 . . . 536.27
13 . . . 971.87
14 mc14473 Cty, AZ 8/16/2007 64.11 60.2
15 . . . 91.56 663.38
16 . . . 91.58 800.32
17 . . . 684.24
18 . . . 529.19
19 . . . 747.11
20 . . . 849.6
;;;;
run;
proc print;
run;
data filled;
if 0 then set test;
set test(rename=(HlthRecordID=_1 City=_2 ServiceDate=_3));
HlthRecordID = coalesceC(_1,HlthRecordID);
City = coalesceC(_2,City);
ServiceDate = coalesce(_3,ServiceDate);
drop _:;
run;
proc print;
run;
[/pre]
Ksharp
Super User
Hi.
Be honest,I can not understant what your mean totally.

>to get my diagnosis and procedure data onto the firstline

You said first line.But there are two variables (Diagnosis and Procedure) ,namely it will appear two lines not one line after proc transpose.


[pre]
data test;
infile cards missover dsd delimiter=' ';
input LineNum HlthRecordID : $8. City : $16. ServiceDate : mmddyy10. Diagnosis Procedure;
format servicedate mmddyy10.;
cards;
1 mc21898 Cty,NM 4/28/2007 4.74 368.47
2 86.35 326.87
3 mc59002 Cty,UT 8/8/2007 88.61 966.07
4 49.22 625.9
5 333.18
6 922.09
7 847.55
8 mc21898 Cty,NM 6/4/2007 41.17 869.38
9 43.26 193.39
10 86.35 799.5
11 4.06 641.92
12 536.27
13 971.87
14 mc14473 Cty,AZ 8/16/2007 64.11 60.2
15 91.56 663.38
16 91.58 800.32
17 684.24
18 529.19
19 747.11
20 849.6
;;;;
run;
data temp(keep=EncounterID Diagnosis Procedure);
set test;
retain EncounterId ;
if not missing(HlthRecordID) then EncounterId=catx('_',HlthRecordID,put(ServiceDate,mmddyy10.));
run;
proc sort data=temp;
by EncounterID;
proc transpose data=temp out=result;
by EncounterID;
run;
[/pre]



Ksharp
LoisHNewMexico
Calcite | Level 5
Each health system encocunter has several lines of data, with the identifying information, the first diagnosis code and the first procedure code on the first line, while the remaining lines have only the additional diagnosis and procedure codes. I'd like the diagnosis code on the first line to populate a variable called Diag1, the diagnosis code on the second line to populate a variable called Diag2, etc. And likewise with the procedure codes. The final result will be to have the identifying information AND all the diag and proc variables (up to Diag18 and Proc18) on one line per health system encounter.

I haven't had time to try any of the suggestions, yet. Thank you to everyone who has responded!
Ksharp
Super User
Hi.
OK. That would not be too difficult.
See whether it is what you want.

[pre]
data test;
infile cards missover dsd delimiter=' ';
input LineNum HlthRecordID : $8. City : $16. ServiceDate : mmddyy10. Diagnosis Procedure;
format servicedate mmddyy10.;
cards;
1 mc21898 Cty,NM 4/28/2007 4.74 368.47
2 86.35 326.87
3 mc59002 Cty,UT 8/8/2007 88.61 966.07
4 49.22 625.9
5 333.18
6 922.09
7 847.55
8 mc21898 Cty,NM 6/4/2007 41.17 869.38
9 43.26 193.39
10 86.35 799.5
11 4.06 641.92
12 536.27
13 971.87
14 mc14473 Cty,AZ 8/16/2007 64.11 60.2
15 91.56 663.38
16 91.58 800.32
17 684.24
18 529.19
19 747.11
20 849.6
;;;;
run;
data temp(keep=EncounterID Diagnosis Procedure);
set test;
retain EncounterId ;
if not missing(HlthRecordID) then EncounterId=catx('_',HlthRecordID,put(ServiceDate,mmddyy10.));
run;
data op(keep=encounterid varname value);
set temp;
array a{2} diagnosis procedure;
if encounterid ne lag(encounterid) then count=0;
count+1;
do i=1 to 2;
varname=cats(vname(a{i}),count);
value= a{i};
output;
end;
run;
proc sort data=op;
by encounterid;
run;
proc transpose data=op out=result(drop = _name_);
by encounterid;
id varname;
var value;
run;
proc print;run;
[/pre]



Ksharp
deleted_user
Not applicable
Hello,

my solution is based on reading data twice, first in order to create the maximum number of observation per id, and second to generate the transposed data:

data inp;
infile datalines missover;
input @;
if anyalpha(substr(_infile_,3,4)) then do;
input @3 id:$8. City $ 11-18 date:mmddyy. Diagnosis Procedure;
end;
else do;
input @30 Diagnosis Procedure;
end;
format date mmddyy.;
datalines;
1 mc21898 City, NM 4/28/2007 4.74 368.47
2 . . . 86.35 326.87
3 mc59002 City, UT 8/8/2007 88.61 966.07
4 . . . 49.22 625.9
5 . . . 333.18
6 . . . 922.09
7 . . . 847.55
8 mc21898 City, NM 6/4/2007 41.17 869.38
9 . . . 43.26 193.39
10 . . . 86.35 799.5
11 . . . 4.06 641.92
12 . . . 536.27
13 . . . 971.87
14 mc14473 Cty , AZ 8/16/2007 64.11 60.2
15 . . . 91.56 663.38
16 . . . 91.58 800.32
17 . . . 684.24
18 . . . 529.19
19 . . . 747.11
20 . . . 849.6
;
run;

data out ;
retain rec_id city_from date_service ;

do until (last);
set inp (keep=id) end=last;
by id notsorted;
if first.id and ^missing(id) then count=1;
else count+1;
end;

call symput('new',put(count,7.));

array diag{&new};
array proc{&new};

do until (n_last);
set inp (rename=(id= old_id City =old_City date=old_date)) end=n_last;
by old_id notsorted;

if first.old_id and ^missing(old_id) and ^missing(lag(Diagnosis)) then output;

if first.old_id and ^missing(old_id) then do;
n_count=1;
rec_id=old_id;
city_from=old_City;
date_service=old_date;
end;
else n_count+1;
Diag{n_count}=Diagnosis;
Proc{n_count}=Procedure;
if n_last then output;
end;

format date_service mmddyy10.;

drop old_id old_City old_date id count n_count Diagnosis Procedure;

run;

Marius

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1321 views
  • 0 likes
  • 5 in conversation