BookmarkSubscribeRSS Feed
christinagting0
Quartz | Level 8

Hi Everyone, 

 

I am trying to flag first and last observations based on baseline date (baseline_doc) for all same groupings of client_id.

 

For example, I want my final dataset to look like this:

 

Client_id       Baseline_Date          First.baseline_date       Last.baseline_date
     1               April 1 2016             0                           0
     1              April 20 2016             0                           1
     1              January 1 2016            1                           0
     2              January 1 2016            1                           0
     2              January 20 2016           0                           1
     3              February 20 2016          1                           1
   

This is the code I am using:

 

data FA_CA_DADOS_CLEAN_DELETED;
	set fa_ca_dados_clean2;
	by client_id baseline_doc;
	firstclient_id = first.client_id;
	lastclient_id = last.client_id;
	firstBaseline_doc = first.baseline_doc;
	lastBaseline_doc = last.baseline_doc;
run;

However, my code doesn't give me the output like I want. What is happening is that every client_id is getting tagged with a 1 for last.baseline_doc but there definitely are some that should be 0 b/c I have duplicate client_ids with different dates.

 

What am I doing wrong?

 

I copied this code directly from this resource:

https://onlinecourses.science.psu.edu/stat481/node/9

 

and it has the output I want so I'm not sure what is going on.

 

I proc sorted by client_id and baseline_date before I ran the code above!

7 REPLIES 7
LinusH
Tourmaline | Level 20

First, confusing when you are mixing variable names (doc vs date).

Second, nice to show the actual output.

Third, I think you are confusing the levels of BY variable where first. and last. operate.

First. will tag any observation that is first within it's value of the specified  BY-group.

Since you have unique (my guess, I don't see the actual source data) values of baseline_doc/date, every row will tagged.

So my guess is that you only want tag set on first./last.client_id...? Perhaps that wasn't a coincidence that you omitted those variables in your desired output.

Data never sleeps
christinagting0
Quartz | Level 8

Yes that is right I want to tag the first and last client_ids based on baseline date

 

please see my example desired output above. 

 

Anyone else have any ideas?

Astounding
PROC Star

The underlying issue here is that your BASELINE_DATE variable is character.  That's not how SAS is built to handle dates.

 

Try converting it to numeric:

 

baseline_sasdate = input(baseline_date, worddate18.);

 

Then use the numeric version in your programming.

christinagting0
Quartz | Level 8

Hi Astounding..

 

I double checked hoping you were right, but my baseline_doc variable is definitely number data type

 

Any other ideas?

 

Here is my full code so far:

 

proc sort data=FA_CA_DADOS_CLEAN out=fa_ca_dados_clean2;
by mrn baseline_doc;
run;


data FA_CA_DADOS_CLEAN_DELETED;
	set fa_ca_dados_clean2;
	by mrn baseline_doc;
	firstMRN = first.mrn;
	lastMRN = last.mrn;
	firstBaseline_doc = first.baseline_doc;
	lastBaseline_doc = last.baseline_doc;
run;

 MRN = client ID

Basedline_doc = baseline date

 

 

Astounding
PROC Star

Well, if your baseline date is already numeric, then the output you are showing doesn't match the program you are running.  There is no way for April 1 2016 to sort earlier than January 1 2016, when the variable is on SAS's date scale.  That's the order you would get for a character variable, not numeric.  I can't see what's in your data, but I know that there's a mismatch somewhere.

 

 

 

christinagting0
Quartz | Level 8

Thanks everyone!

 

I couldn't figure it out so I just ended up doing it quickly in excel...

 

if anyone else knows the reason please post a solution. this will forever bug me:(

Ksharp
Super User

Did I miss something ?

Just SORT and use FIRST LAST keyword ?

 

 

proc sort data=have;

by Client_id       Baseline_Date ;

run;

 

data want;

 set have;

 by Client_id ;

 first=first.Client_id      ;

 last=last.Client_id      ;

run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 2500 views
  • 2 likes
  • 4 in conversation