BookmarkSubscribeRSS Feed
catkat96
Obsidian | Level 7

I'm trying to create a table from information on another table. So, on the first one, let's call it Data1, I have panel data (many observations/individuals for many days).

On the second one, which I called "DataOutcome", I am trying to get one row per observation, and summarize something on the other table: if the last date of an observation is later than some day, then a variable will take value 0, or 1 in any other case. 

This is my code so far, it just gets the id's from another table and creates the date that would define this new variable's outcome.  

 

Apologies if this is a very bad question, I'm quite new at SAS/SQL. 

Here's a data snippet:

 

data OriginalData;
   infile datalines delimiter=',';
   input date $ agreement_number $ outcome;
   datalines;
05/11/2020,1000,Answered
06/11/2020,1001,Answered
08/11/2020,1002,Not Answered
08/11/2020,1003,Answered
;
data Data1;
   infile datalines delimiter=',';
   input agreement_number $ date;
   datalines;
1000, 06/11/2020
1000, 07/11/2020
1001, 06/11/2020
1001, 07/11/2020
1001, 08/11/2020
1001, 09/11/2020
1002, 08/11/2020
1003, 08/11/2020
1003, 08/11/2020
;

 

 

proc SQL;
create table DataOutcome as
select Agreement_Number,
	Date,
Outcome, intnx("day",Date,2) as new_date format=ddmmyy11. from work.OriginalData where outcome = "Answered";
/* for each Agreement_Number
if the last date in Data1 is between Date and new_date
New_var = 1 */ Group by 1; Quit;

 

6 REPLIES 6
maguiremq
SAS Super FREQ

Please do provide a data snippet using DATALINES. This seems like it would be done more efficiently in a DATA step, but I can't really tell. I don't see a join condition anywhere in your code, either.

catkat96
Obsidian | Level 7

Included a datalines snippet on the original post. 

I'm not using joins because I don't really need to join anything (I believe), I'm creating a completely new table... but perhaps there is a use for JOIN here that I'm not aware of.

utrocketeng
Quartz | Level 8

if i understand your objective correctly, i believe this code should do the trick:

data OriginalData;
   infile datalines delimiter=',';
   input date :ddmmyy10. agreement_number $ outcome $ ;
   format date ddmmyy10.;
   datalines;
05/11/2020,1000,Answered
06/11/2020,1001,Answered
08/11/2020,1002,Not Answered
08/11/2020,1003,Answered
;

data Data1;
   infile datalines delimiter=',';
   input agreement_number $ date :ddmmyy10.;
   format date ddmmyy10.;
   datalines;
1000, 06/11/2020
1000, 07/11/2020
1001, 06/11/2020
1001, 07/11/2020
1001, 08/11/2020
1001, 09/11/2020
1002, 08/11/2020
1003, 08/11/2020
1003, 08/11/2020
;

PROC SQL;
CREATE TABLE work.dataMaxDate AS SELECT agreement_number, Max(Date) as MostRecentDate FROM work.data1 GROUP BY agreement_number;
CREATE TABLE work.final AS
SELECT 
	a.*
	,CASE when b.MostRecentDate is not null then 1 else 0 end as New_Var
FROM 
	originaldata a
	LEFT JOIN work.dataMaxDate b on a.agreement_number = b.agreement_number and b.MostRecentDate >= a.date and b.MostRecentDate < a.date+2
;
QUIT;

note that you want to look at the work.final table.

 

good luck

catkat96
Obsidian | Level 7

Hi, thanks for your reply! 

I think it's kinda doing what I want sans the filtering by "Answered". I tried adding that but I can't get it to work, can you tell me where am I mistaken here?

 

PROC SQL;
CREATE TABLE work.dataMaxDate AS SELECT agreement_number, Max(Date) as MostRecentDate FROM work.data1 GROUP BY agreement_number;
CREATE TABLE work.final AS
SELECT 
	a.*
	,CASE when b.MostRecentDate is not null then 1 else 0 end as New_Var
FROM 
	originaldata a
	LEFT JOIN work.dataMaxDate (WHERE = (outcome = "Answered")) b on a.agreement_number = b.agreement_number and b.MostRecentDate >= a.date and b.MostRecentDate < a.date+2
;
QUIT;

 

maguiremq
SAS Super FREQ

I still think we're missing something. This code doesn't capture exactly what you put in your "OriginalData", but I don't know if you intend to count duplicate dates as "Answered". Some clarification there might help. You could also use PROC TRANSPOSE, I think. Nonetheless, PROC SQL:

 

proc sql;
	create table 	want as
		select		
					agreement_number,
					min(date) as min_date
						format = mmddyy10.,
					case
						when calculated min_date = max(date) then "Not Answered"
						else "Answered"
					end as outcome
		from		
					data1
		group by
					agreement_number
		having
					agreement_number = min(agreement_number);
quit;

I highly recommend that you format your code. It helps spot errors more easily, like your semicolon after your WHERE clause in your query. Some people probably think my formatting is a little overkill, but consistency is key. Edit: it looks as though the "OriginalData" and "Data1" don't have the same dates. I hope I'm not missing something obvious.

 

Thank you providing some example data.

catkat96
Obsidian | Level 7

Hi, thanks for the reply!

I think I might have not explained my problem properly. I have both OriginalData and Data1 already, I want to create a new table mixing info from both: I want to

-check OriginalData to identify those agreements that are "Answered"

-identify the date when that happened

-identify what is two days after that date (date + 2)

-check on the Data1 if those agreements' last date is larger than date + 2 (then assign value 0) or is between date and date + 2 (then assign value 1).

 

Is there anything I'm forgetting to make it clear?

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 951 views
  • 1 like
  • 3 in conversation