BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mosabbirfardin
Fluorite | Level 6

Hello,

I am trying to think of marking a record with 'Y' if the next line in the file starts with 'NTE' and 'N' f the next line in the file does not start with 'NTE'.

I am using the following program to create a dataset of OBX records and I want to add a column named 'has_NTE'  which would basically say whether a specific OBX record has an NTE a trailing segment and/or record.  

 

* get the max column count of non-missing columns;
data _null_;
   retain column_count;
   count=0;
   infile obx end=eof;
   input;
   if _infile_ =: "OBX" then
   do i=1 to countc(_infile_,"|^")+1;
   count+1;
      column_count=max(column_count,count);
   end;
   if eof then call symputx("column_count",column_count);
run;
%put &=column_count;

data OBX;
   infile obx lrecl=32767;
   ID=_N_;
   length var1-var&column_count $100;
   array vars{*} var:;
   input;
   if _infile_ =: "OBX";
   do i=1 to dim(vars);
      vars{i}=scan(_infile_,i,"|^",'M');
   end;
   drop i;
run;

I want my output column to be 'Y' for the following scenario:

 

OBX|1|ST|13457-7^^LOINC^25016900^LDL-CHOLESTEROL^L||SEE NOTE|mg/dL (calc)|^99^||||F|||201901030942|||
NTE|1||     

 

And I want my output column to be 'N' for any other scenarios

 

Please help me enhancing the code. Thanks is advance. 🙂

1 ACCEPTED SOLUTION

Accepted Solutions
mosabbirfardin
Fluorite | Level 6

Thanks everyone for your guidance. I kind of figured it out by myself.

As you are all aware of the hierarchical structure of HL7 files, it seems one NTE segments or a group of NTE segments will always follow an OBX segment but not all OBX segments are followed by NTE segment(s). In my file the first NTE comes at 18th line and 17th line is an OBX segment.

 

So what I did was:

1. Get observation numbers of each NTE segments and OBX segments by using ID=_N_ and create 2 separate datasets (OBX and NTE)

 

 

NTE dataset:

 

sas1.PNG

 

Partial OBX dataset:

sas3.PNG

 

 

2. Subtract 1 from the ID column for NTE segments. I saw a series of incremental ID from which the lowest number would represent the observation numbers (OBX_ID) of the corresponding OBX segments.

 

 

sas2.PNG

 

 

3.I assigned NTE_SK to each NTE  records using the following code so that I can group them together later.

 

 

 

data want;
set have;
if set_id_NTE=1 then NTE_SK+1;
run;

 

 

dataset with NTE_SK:

 

 

sas10.PNG

 

 

4.Then I transposed the NTE dataset and flattened it. This gave me 2 types of records. One holds the whole comment for the OBX segment and the other holds a bunch of numbers from which the lowest number would represent the observation numbers of the corresponding OBX segments.

 

 

proc transpose data=WORK.NTE_WITH_ID out=WORK.TRANSPOSED_NTE prefix=x;

   by NTE_SK;

   var COMMENT previous_OBX_ID;

run;

data WORK.FLATTENED_NTE;

   set WORK.TRANSPOSED_NTE;
   LENGTH FULL_COMMENT $3000;
   FULL_COMMENT = CATX(' ', OF X:);
   keep NTE_SK _NAME_ FULL_COMMENT previous_OBX_ID;
   DROP X:;
run;

 

 

Transposed NTE dataset:

 

sas4.PNG

 

Flattened NTE dataset:

 

sas5.PNG

 

 

5. Then I split the dataset into 2 datasets using the type of records. One would have just the comments and other would have just the incremental numbers

 

 

 

/* define which libname.member table, and by which column */
%let TABLE=WORK.FLATTENED_NTE;
%let COLUMN=_NAME_;
 
proc sql NOPRINT;
/* build a mini program for each value */
/* create a table with valid chars from data value */
select distinct 
   cat("DATA NTE_",compress(&COLUMN.,,'kad'),
   "; set &TABLE.(where=(&COLUMN.='", &COLUMN.,
   "')); run;") length=500 into :allsteps separated by ';' 
  from &TABLE.;
quit;
 
/* macro that includes the program we just generated */
%macro runSteps;
 &allsteps.;
%mend;
 
/* and...run the macro when ready */
%runSteps;

 

 

Just comments:

sas6.PNG

 

Just numbers:

 

sas7.PNG

 

6. I used a SCAN function on the FULL_COMMENT in the numbers' dataset to get the first number which would be the OBX_ID.

 

sas8.PNG

 

7. Then I joined this dataset with comment dataset using NTE_SK and now I have an OBX_ID with every comments to join OBX table with.

 

Sorry for this huge post. I know I have done a lot of post processing to get where I am now. I would really appreciate if someone shares a faster and more efficient approach to do this. Take care!

 

 

 

 

 

 

View solution in original post

4 REPLIES 4
KDG
SAS Employee KDG
SAS Employee

Hi! I would suggest merging the dataset with itself to simulate a "lead" function: 

 

http://support.sas.com/kb/24/666.html

 

 


Register today and join us virtually on June 16!
sasglobalforum.com | #SASGF

View now: on-demand content for SAS users

ballardw
Super User

There are some very good reasons to code Yes/No or True/False as numeric 1 and 0 instead of character values.

Report procedures like Proc Report or Tabulate you can use the N statistic to get the number of responses, the SUM statistic to get the number of Yes, the MEAN statistic to get a percentage of yes, Range to tell if all of the responses in a group are the same (either all 0 or all 1 when the Range=0) , MAX to find if any of the records were coded yes (when the MAX=1), MIN to find if any were coded No (when the Min=0).

 

Another is you can use: if code then <whatever>; as SAS treats 1 as true .

 

There are even some interesting calculations that can be done with code bits like: code*(some calculations) which return 0 for that when code is no.

 

 

ScottBass
Rhodochrosite | Level 12
data sk / view=sk;
   sk+1;
   set sashelp.class;
run;

proc sql;
   create table lead as
   select
      a.name
     ,b.name as next_name
   from
      sk a
   left join 
      sk b
   on 
      a.sk=b.sk-1
   ;
quit;

Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
mosabbirfardin
Fluorite | Level 6

Thanks everyone for your guidance. I kind of figured it out by myself.

As you are all aware of the hierarchical structure of HL7 files, it seems one NTE segments or a group of NTE segments will always follow an OBX segment but not all OBX segments are followed by NTE segment(s). In my file the first NTE comes at 18th line and 17th line is an OBX segment.

 

So what I did was:

1. Get observation numbers of each NTE segments and OBX segments by using ID=_N_ and create 2 separate datasets (OBX and NTE)

 

 

NTE dataset:

 

sas1.PNG

 

Partial OBX dataset:

sas3.PNG

 

 

2. Subtract 1 from the ID column for NTE segments. I saw a series of incremental ID from which the lowest number would represent the observation numbers (OBX_ID) of the corresponding OBX segments.

 

 

sas2.PNG

 

 

3.I assigned NTE_SK to each NTE  records using the following code so that I can group them together later.

 

 

 

data want;
set have;
if set_id_NTE=1 then NTE_SK+1;
run;

 

 

dataset with NTE_SK:

 

 

sas10.PNG

 

 

4.Then I transposed the NTE dataset and flattened it. This gave me 2 types of records. One holds the whole comment for the OBX segment and the other holds a bunch of numbers from which the lowest number would represent the observation numbers of the corresponding OBX segments.

 

 

proc transpose data=WORK.NTE_WITH_ID out=WORK.TRANSPOSED_NTE prefix=x;

   by NTE_SK;

   var COMMENT previous_OBX_ID;

run;

data WORK.FLATTENED_NTE;

   set WORK.TRANSPOSED_NTE;
   LENGTH FULL_COMMENT $3000;
   FULL_COMMENT = CATX(' ', OF X:);
   keep NTE_SK _NAME_ FULL_COMMENT previous_OBX_ID;
   DROP X:;
run;

 

 

Transposed NTE dataset:

 

sas4.PNG

 

Flattened NTE dataset:

 

sas5.PNG

 

 

5. Then I split the dataset into 2 datasets using the type of records. One would have just the comments and other would have just the incremental numbers

 

 

 

/* define which libname.member table, and by which column */
%let TABLE=WORK.FLATTENED_NTE;
%let COLUMN=_NAME_;
 
proc sql NOPRINT;
/* build a mini program for each value */
/* create a table with valid chars from data value */
select distinct 
   cat("DATA NTE_",compress(&COLUMN.,,'kad'),
   "; set &TABLE.(where=(&COLUMN.='", &COLUMN.,
   "')); run;") length=500 into :allsteps separated by ';' 
  from &TABLE.;
quit;
 
/* macro that includes the program we just generated */
%macro runSteps;
 &allsteps.;
%mend;
 
/* and...run the macro when ready */
%runSteps;

 

 

Just comments:

sas6.PNG

 

Just numbers:

 

sas7.PNG

 

6. I used a SCAN function on the FULL_COMMENT in the numbers' dataset to get the first number which would be the OBX_ID.

 

sas8.PNG

 

7. Then I joined this dataset with comment dataset using NTE_SK and now I have an OBX_ID with every comments to join OBX table with.

 

Sorry for this huge post. I know I have done a lot of post processing to get where I am now. I would really appreciate if someone shares a faster and more efficient approach to do this. Take care!

 

 

 

 

 

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1499 views
  • 0 likes
  • 4 in conversation