Re: How to check if a column has same value in first and last observat...

AmitChop7391 · Posted 05-16-2019 11:27 AM

Hi,

I have a data set which contains the following columns: PTID, NDC, DRUG_CLASS1_FLAG, DRUG_CLASS2_FLAG, DRUG_CLASS3_FLAG, DRUG_CLASS4FLAG.

I want to find out if these DRUG_CLASS FLAGS have same value in first and last observation of a data set.

For example, in first observation, DRUG_CLASS1_FLAG =1 and in last observation, DRUG_CLASS1_FLAG=1, then I want to find out these type of records

Any help would be appreciated.

Thanks in advance.

PaigeMiller · Posted 05-16-2019 11:31 AM

data new;
    set have end=eof;
    if _n_=1 or eof then output;
run;

proc summary data=new;
     var whatevervariablesyouwant;
     output out=want range= /autoname;
run;

If the output of PROC SUMMARY shows a zero for the RANGEs, then you have the same value in the first and last observation.

--
Paige Miller

AmitChop7391 · Posted 05-16-2019 11:44 AM

To be more clear. I have the following data set

PT_ID	DRUG_CODE	DRUG_1	DRUG_2
1	A	1
1	A	1
1	B		1
1	B		1
1	A	1
1	A	1

Similarly, I have multiple patients data. I want to find out those patients which have same value of flag DRUG_1 or DRUG_2 for their first and last observation.

For example, if for PTID =1, first observations has DRUG_1=1 and last observation has DRUG_1 =1 then this patient should be reported.

PaigeMiller · Posted 05-16-2019 11:48 AM

So we have to perform this analysis for each specific PT_ID, and not over the entire data set, as was my impression from the original post?

And your variables are character variables and not numeric as was my impression from the original post?

In your example, under DRUG_2, the first and last record shows a missing character value, does that count as having the same value?

--
Paige Miller

Kurt_Bremser · Posted 05-16-2019 11:55 AM

Once again, Maxim 42 in action.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

AmitChop7391 · Posted 05-16-2019 11:56 AM

Yes, we have to perform the analysis for each PT_ID. Some variables are character and some are numeric.

DRUG_1 and DRUG_2 are numeric variables and I want only those patients where DRUG_1 or DRUG_2 = 1 for first and last observation. We won't consider missing values.

Thanks

ballardw · Posted 05-16-2019 11:59 AM

@AmitChop7391 wrote:

To be more clear. I have the following data set

PT_ID DRUG_CODE DRUG_1 DRUG_2

1 A 1

1 A 1

1 B 1

1 B 1

1 A 1

1 A 1

Similarly, I have multiple patients data. I want to find out those patients which have same value of flag DRUG_1 or DRUG_2 for their first and last observation.

For example, if for PTID =1, first observations has DRUG_1=1 and last observation has DRUG_1 =1 then this patient should be reported.

Is there any chance you have an actual DATE or some sort of sequence indicator? I have a sneaking suspicion this problem may be much easier to define first and last if you have one of those variables in your data.

AmitChop7391 · Posted 05-16-2019 12:08 PM

Yes, I also have a date field which is in ascending order. First record of each patient has min date and last record has max date.

ballardw · Posted 05-16-2019 12:36 PM

@AmitChop7391 wrote:

Yes, I also have a date field which is in ascending order. First record of each patient has min date and last record has max date.

Something like this will select the first an last records for each patient.

data temp;
   set have;
   by patient date;
   if first.patient or last.patient;
run;

Note that if there is only one record for the patient there will only be one so "first" and "last" are the same for all variables...

For numeric variables a modification of @PaigeMiller's solution

proc summary data=temp nway;
    class patient;
     var whatevervariablesyouwant;
     output out=want range= /autoname;
run;

would have 0 for any of the numeric variables with the same value.

At some point you should provide an example of what you want the final output to look like as well.

AmitChop7391 · Posted 05-16-2019 12:59 PM

PT_ID	DRUG_NAME	DRUG_1	DRUG_2	Date
1	A	1		1/1/2017
1	A	1		5/2/2017
1	B		1	8/2/2017
1	B		1	9/2/2017
1	A	1		12/2/2017
1	A	1		12/31/2017
2	A	1		5/3/2017
2	B		1	6/3/2017
2	A	1		9/3/2017
3	A	1		5/6/2017
3	B		1	7/9/2017
3	B		1	9/9/2017

So, I only want PT_ID 1 and 2 to be reported in my final data set as their DRUG_1 flags have same values in first and last observation.

Thanks

Kurt_Bremser · Posted 05-16-2019 11:37 AM

data test;
set have (obs=1);
output;
if nobs > 1;
set have nobs=nobs point=nobs;
output;
run;

If dataset have has more than one observation, you'll get the first and last observation. Otherwise, it will be one or no observation.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

AmitChop7391 · Posted 05-16-2019 11:45 AM

To be more clear. I have the following data set

PT_ID	DRUG_CODE	DRUG_1	DRUG_2
1	A	1
1	A	1
1	B		1
1	B		1
1	A	1
1	A	1

Similarly, I have multiple patients data. I want to find out those patients which have same value of flag DRUG_1 or DRUG_2 for their first and last observation.

For example, if for PT_ID =1, first observations has DRUG_1=1 and last observation has DRUG_1 =1 then this patient should be reported.

Reeza · Posted 05-16-2019 12:30 PM

Could you please provide a more detailed sample data set. Please include at least two or three different ID's, the date variable and what you would expect as the output. Do you want the list of ID's where the first matches the last? Or all records? Or something else entirely?

@AmitChop7391 wrote:

To be more clear. I have the following data set

PT_ID DRUG_CODE DRUG_1 DRUG_2

1 A 1

1 A 1

1 B 1

1 B 1

1 A 1

1 A 1

Similarly, I have multiple patients data. I want to find out those patients which have same value of flag DRUG_1 or DRUG_2 for their first and last observation.

For example, if for PT_ID =1, first observations has DRUG_1=1 and last observation has DRUG_1 =1 then this patient should be reported.

AmitChop7391 · Posted 05-16-2019 12:49 PM

PT_ID	DRUG_NAME	DRUG_1	DRUG_2	Date
1	A	1		1/1/2017
1	A	1		5/2/2017
1	B		1	8/2/2017
1	B		1	9/2/2017
1	A	1		12/2/2017
1	A	1		12/31/2017
2	A	1		5/3/2017
2	B		1	6/3/2017
2	A	1		9/3/2017
3	A	1		5/6/2017
3	B		1	7/9/2017
3	B		1	9/9/2017

So, I only want PT_ID 1 and 2 to be reported in my final data set as their DRUG_1 flags have same values in first and last observation.

ballardw · Posted 05-16-2019 01:22 PM

To use the RANGE function to check equality we can't have missing values. So here is a modified data set

and code:

data have;
   input PT_ID DRUG_NAME $ DRUG_1 DRUG_2 Date :mmddyy10.; 
   format date mmddyy10.;
   drug_1 = (drug_1=1);
   drug_2 = (drug_2=1);
datalines;
1 A 1 . 1/1/2017 
1 A 1 . 5/2/2017 
1 B . 1 8/2/2017 
1 B . 1 9/2/2017 
1 A 1 . 12/2/2017 
1 A 1 . 12/31/2017 
2 A 1 . 5/3/2017 
2 B . 1 6/3/2017 
2 A 1 . 9/3/2017 
3 A 1 . 5/6/2017 
3 B . 1 7/9/2017 
3 B . 1 9/9/2017 
;

data temp;
   set have;
   by  PT_ID;
   if first.pt_id or last.Pt_id;
run;
proc summary data=temp nway;
   class pt_id;
   var drug_1;
   output out=work.drug_1 (drop=_type_ _freq_ where=(drug_1=0))
       range = ;
run;

The range=0 ones have the same values.

I strongly suggest that you consider using flags of 1/0 coding instead of 1/. . You get some very useful statistics such as percentages by taking the mean of a 1/0 variable. Sum gives you a count of 1's either way, MODE would report whether you have more 1 or 0 values but would not work with missing values,

Depending on specific needs the 0 often comes in handy.

How to check if a column has same value in first and last observation?

Re: How to check if a column has same value in first and last observation?

Re: How to check if a column has same value in first and last observation?

Re: How to check if a column has same value in first and last observation?

Re: How to check if a column has same value in first and last observation?

Re: How to check if a column has same value in first and last observation?

Re: How to check if a column has same value in first and last observation?

Re: How to check if a column has same value in first and last observation?

Re: How to check if a column has same value in first and last observation?

Re: How to check if a column has same value in first and last observation?

Re: How to check if a column has same value in first and last observation?

Re: How to check if a column has same value in first and last observation?

Re: How to check if a column has same value in first and last observation?

Re: How to check if a column has same value in first and last observation?

Re: How to check if a column has same value in first and last observation?