Hello,
I am trying to think of marking a record with 'Y' if the next line in the file starts with 'NTE' and 'N' f the next line in the file does not start with 'NTE'.
I am using the following program to create a dataset of OBX records and I want to add a column named 'has_NTE' which would basically say whether a specific OBX record has an NTE a trailing segment and/or record.
* get the max column count of non-missing columns;
data _null_;
retain column_count;
count=0;
infile obx end=eof;
input;
if _infile_ =: "OBX" then
do i=1 to countc(_infile_,"|^")+1;
count+1;
column_count=max(column_count,count);
end;
if eof then call symputx("column_count",column_count);
run;
%put &=column_count;
data OBX;
infile obx lrecl=32767;
ID=_N_;
length var1-var&column_count $100;
array vars{*} var:;
input;
if _infile_ =: "OBX";
do i=1 to dim(vars);
vars{i}=scan(_infile_,i,"|^",'M');
end;
drop i;
run;
I want my output column to be 'Y' for the following scenario:
OBX|1|ST|13457-7^^LOINC^25016900^LDL-CHOLESTEROL^L||SEE NOTE|mg/dL (calc)|^99^||||F|||201901030942|||
NTE|1||
And I want my output column to be 'N' for any other scenarios
Please help me enhancing the code. Thanks is advance. 🙂
Thanks everyone for your guidance. I kind of figured it out by myself.
As you are all aware of the hierarchical structure of HL7 files, it seems one NTE segments or a group of NTE segments will always follow an OBX segment but not all OBX segments are followed by NTE segment(s). In my file the first NTE comes at 18th line and 17th line is an OBX segment.
So what I did was:
1. Get observation numbers of each NTE segments and OBX segments by using ID=_N_ and create 2 separate datasets (OBX and NTE)
NTE dataset:
Partial OBX dataset:
2. Subtract 1 from the ID column for NTE segments. I saw a series of incremental ID from which the lowest number would represent the observation numbers (OBX_ID) of the corresponding OBX segments.
3.I assigned NTE_SK to each NTE records using the following code so that I can group them together later.
data want;
set have;
if set_id_NTE=1 then NTE_SK+1;
run;
dataset with NTE_SK:
4.Then I transposed the NTE dataset and flattened it. This gave me 2 types of records. One holds the whole comment for the OBX segment and the other holds a bunch of numbers from which the lowest number would represent the observation numbers of the corresponding OBX segments.
proc transpose data=WORK.NTE_WITH_ID out=WORK.TRANSPOSED_NTE prefix=x;
by NTE_SK;
var COMMENT previous_OBX_ID;
run;
data WORK.FLATTENED_NTE;
set WORK.TRANSPOSED_NTE;
LENGTH FULL_COMMENT $3000;
FULL_COMMENT = CATX(' ', OF X:);
keep NTE_SK _NAME_ FULL_COMMENT previous_OBX_ID;
DROP X:;
run;
Transposed NTE dataset:
Flattened NTE dataset:
5. Then I split the dataset into 2 datasets using the type of records. One would have just the comments and other would have just the incremental numbers
/* define which libname.member table, and by which column */
%let TABLE=WORK.FLATTENED_NTE;
%let COLUMN=_NAME_;
proc sql NOPRINT;
/* build a mini program for each value */
/* create a table with valid chars from data value */
select distinct
cat("DATA NTE_",compress(&COLUMN.,,'kad'),
"; set &TABLE.(where=(&COLUMN.='", &COLUMN.,
"')); run;") length=500 into :allsteps separated by ';'
from &TABLE.;
quit;
/* macro that includes the program we just generated */
%macro runSteps;
&allsteps.;
%mend;
/* and...run the macro when ready */
%runSteps;
Just comments:
Just numbers:
6. I used a SCAN function on the FULL_COMMENT in the numbers' dataset to get the first number which would be the OBX_ID.
7. Then I joined this dataset with comment dataset using NTE_SK and now I have an OBX_ID with every comments to join OBX table with.
Sorry for this huge post. I know I have done a lot of post processing to get where I am now. I would really appreciate if someone shares a faster and more efficient approach to do this. Take care!
Hi! I would suggest merging the dataset with itself to simulate a "lead" function:
http://support.sas.com/kb/24/666.html
Register today and join us virtually on June 16!
sasglobalforum.com | #SASGF
View now: on-demand content for SAS users
There are some very good reasons to code Yes/No or True/False as numeric 1 and 0 instead of character values.
Report procedures like Proc Report or Tabulate you can use the N statistic to get the number of responses, the SUM statistic to get the number of Yes, the MEAN statistic to get a percentage of yes, Range to tell if all of the responses in a group are the same (either all 0 or all 1 when the Range=0) , MAX to find if any of the records were coded yes (when the MAX=1), MIN to find if any were coded No (when the Min=0).
Another is you can use: if code then <whatever>; as SAS treats 1 as true .
There are even some interesting calculations that can be done with code bits like: code*(some calculations) which return 0 for that when code is no.
data sk / view=sk;
sk+1;
set sashelp.class;
run;
proc sql;
create table lead as
select
a.name
,b.name as next_name
from
sk a
left join
sk b
on
a.sk=b.sk-1
;
quit;
Thanks everyone for your guidance. I kind of figured it out by myself.
As you are all aware of the hierarchical structure of HL7 files, it seems one NTE segments or a group of NTE segments will always follow an OBX segment but not all OBX segments are followed by NTE segment(s). In my file the first NTE comes at 18th line and 17th line is an OBX segment.
So what I did was:
1. Get observation numbers of each NTE segments and OBX segments by using ID=_N_ and create 2 separate datasets (OBX and NTE)
NTE dataset:
Partial OBX dataset:
2. Subtract 1 from the ID column for NTE segments. I saw a series of incremental ID from which the lowest number would represent the observation numbers (OBX_ID) of the corresponding OBX segments.
3.I assigned NTE_SK to each NTE records using the following code so that I can group them together later.
data want;
set have;
if set_id_NTE=1 then NTE_SK+1;
run;
dataset with NTE_SK:
4.Then I transposed the NTE dataset and flattened it. This gave me 2 types of records. One holds the whole comment for the OBX segment and the other holds a bunch of numbers from which the lowest number would represent the observation numbers of the corresponding OBX segments.
proc transpose data=WORK.NTE_WITH_ID out=WORK.TRANSPOSED_NTE prefix=x;
by NTE_SK;
var COMMENT previous_OBX_ID;
run;
data WORK.FLATTENED_NTE;
set WORK.TRANSPOSED_NTE;
LENGTH FULL_COMMENT $3000;
FULL_COMMENT = CATX(' ', OF X:);
keep NTE_SK _NAME_ FULL_COMMENT previous_OBX_ID;
DROP X:;
run;
Transposed NTE dataset:
Flattened NTE dataset:
5. Then I split the dataset into 2 datasets using the type of records. One would have just the comments and other would have just the incremental numbers
/* define which libname.member table, and by which column */
%let TABLE=WORK.FLATTENED_NTE;
%let COLUMN=_NAME_;
proc sql NOPRINT;
/* build a mini program for each value */
/* create a table with valid chars from data value */
select distinct
cat("DATA NTE_",compress(&COLUMN.,,'kad'),
"; set &TABLE.(where=(&COLUMN.='", &COLUMN.,
"')); run;") length=500 into :allsteps separated by ';'
from &TABLE.;
quit;
/* macro that includes the program we just generated */
%macro runSteps;
&allsteps.;
%mend;
/* and...run the macro when ready */
%runSteps;
Just comments:
Just numbers:
6. I used a SCAN function on the FULL_COMMENT in the numbers' dataset to get the first number which would be the OBX_ID.
7. Then I joined this dataset with comment dataset using NTE_SK and now I have an OBX_ID with every comments to join OBX table with.
Sorry for this huge post. I know I have done a lot of post processing to get where I am now. I would really appreciate if someone shares a faster and more efficient approach to do this. Take care!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.