Hi SAS,
I have two datasets, and I want to do the following steps in SAS, but didnot get the result.
1. For the table 1, there are two date variables, have the FORMAT: DATETIME26.6 and INFORMAT: DATATIME26.6. In the table, their value likes this: 29FEB2008:00:00:00.000000. I want those date values show like 29FEB2008.
This is part of my dataset:
ID | VALID_START_DTTM | VALID_END_DTTM |
1 | 31JAN2008:00:00:00.000000 | 28FEB2008:23:59:59.000000 |
2 | 29FEB2008:00:00:00.000000 | 30MAR2008:23:59:59.000000 |
3 | 31MAR2008:00:00:00.000000 | 30OCT2008:23:59:59.000000 |
4 | 31OCT2008:00:00:00.000000 | 27NOV2008:23:59:59.000000 |
5 | 28NOV2008:00:00:00.000000 | 29NOV2009:23:59:59.000000 |
6 | 30NOV2009:00:00:00.000000 | 29APR2012:23:59:59.000000 |
7 | 30APR2012:00:00:00.000000 | 29APR2013:23:59:59.000000 |
8 | 30APR2013:00:00:00.000000 | 30MAY2013:23:59:59.000000 |
9 | 31MAY2013:00:00:00.000000 | 30OCT2013:23:59:59.000000 |
10 | 31OCT2013:00:00:00.000000 | 30JAN2014:23:59:59.000000 |
11 | 31JAN2014:00:00:00.000000 | 28AUG2014:23:59:59.000000 |
My coding is:
DATA NEWFORMAT101;
SET TL.BRIGE_101;
FORMAT VALID_START_DTTM VALID_END_DTTM DDMMYY8.;
RUN;
But, I only got ******** for values in those two variabels.
Could you help me?
2, Once I get the result from step 1, I want to change those values to numeric values in both VALID_START_DTTM and VALID_END_DTTM variables, according to the following table:
PERIOD_ID | TIME_ID |
29 February 2016 | 98 |
31 March 2016 | 99 |
30 April 2016 | 100 |
31 May 2016 | 101 |
30 June 2016 | 102 |
31 August 2016 | 104 |
3, after get the result from step2, I want to select observations which the time 101 is included within the VALID_START_DTTM and the VALID_END_DTTM period.
Thank you.
Ok, first in the smaller dataset:
data dataset2; set dataset2; dt="31MAY2016"d + (101-time); format dt date9.; run;
The above sets dt to an actual date value which is that date + 0 if 101, 1 if 102 etc.
This can then be merged to the main dataset by using:
proc sql; create table WANT as select A.*, B.TIME from DATASET1 A left join DATASET2 B on A.ID=B.ID and datepart(A.VALID_START_DTTM) <= B.DT <= datepart(A.VALID_END_DTTM); quit;
So what I am saying here is merge TIME from dataset 2 onto the main dataset where the date I calculate in the previous datastep is between the two dttm values from dataset1, and only comparing the date part of the value.
For step 1 create a new variable, put the datepart of the datatime in, then format that as you need:
data want; set have valid_start_dt=datepart(valid_start_dttm); format valid_start_dt date9.; run;
Step 2 makes no sense to me, what does the clumn time_id mean and how do you derive it?
Step 3, again, makes no sense, what is time_id in relation to a datetime variable?
Thank you, I got the result from step1.
For step 2, I want to replace the date values to numeric values. Because I want to join two dataset, dataset 1 has ID, VALID_START_DTTM, VALID_END_DTTM and other variables . Dataset 1 has millions of records, every ID has many no overlap START and END date.
DATASET 1
ID | VALID_START_DTTM | VALID_END_DTTM | QUALITY | GRADE |
1 | 31JAN2008:00:00:00.000000 | 28FEB2008:23:59:59.000000 | A | A |
1 | 29FEB2008:00:00:00.000000 | 30MAR2008:23:59:59.000000 | B | E |
1 | 31MAR2008:00:00:00.000000 | 30OCT2015:23:59:59.000000 | C | D |
1 | 31OCT2015:00:00:00.000000 | 27NOV2016:23:59:59.000000 | D | E |
1 | 28NOV2016:00:00:00.000000 | 29NOV2017:23:59:59.000000 | E | A |
2 | 30NOV2009:00:00:00.000000 | 29APR2012:23:59:59.000000 | A | E |
2 | 30APR2012:00:00:00.000000 | 29APR2013:23:59:59.000000 | B | D |
2 | 30APR2013:00:00:00.000000 | 30MAY2013:23:59:59.000000 | C | E |
2 | 31MAY2013:00:00:00.000000 | 30OCT2015:23:59:59.000000 | D | A |
2 | 31OCT2015:00:00:00.000000 | 30JAN2016:23:59:59.000000 | E | E |
2 | 31JAN2016:00:00:00.000000 | 28AUG2017:23:59:59.000000 | A | A |
Dataset 2 likes the following, but the TIME value is specific to 101.
ID | TIME |
1 | 101 |
2 | 101 |
3 | 101 |
Time 101 is corresponding to time 31MAY2016 in my working enviroment.
I want to get a joined table, which includes ID, TIME (only has value101) and QUALITY and GRADE. The condition is, TIME 101, the same as 31MAY2016 must be in one of each ID's START and END period.
The final result should like this:
ID | VALID_START_DTTM | VALID_END_DTTM | QUALITY | GRADE | TIME |
1 | 31OCT2015:00:00:00.000000 | 27NOV2016:23:59:59.000000 | D | E | 101 |
2 | 31JAN2016:00:00:00.000000 | 28AUG2017:23:59:59.000000 | A | A | 101 |
Hope this time, I say it clearly. Thank you.
Ok, first in the smaller dataset:
data dataset2; set dataset2; dt="31MAY2016"d + (101-time); format dt date9.; run;
The above sets dt to an actual date value which is that date + 0 if 101, 1 if 102 etc.
This can then be merged to the main dataset by using:
proc sql; create table WANT as select A.*, B.TIME from DATASET1 A left join DATASET2 B on A.ID=B.ID and datepart(A.VALID_START_DTTM) <= B.DT <= datepart(A.VALID_END_DTTM); quit;
So what I am saying here is merge TIME from dataset 2 onto the main dataset where the date I calculate in the previous datastep is between the two dttm values from dataset1, and only comparing the date part of the value.
Thank you.
I followed your code, it run successfully on my SAS, but some variables' values are missing. There are no values at all in some variables, like ACCOUNT_SK, CUSTOMER_SK.
This is my code:
DATA NEWFORMAT101;
SET TL.BRIGE_101;
VALID_START_DT=DATEPART(VALID_START_DTTM);
VALID_END_DT=DATEPART(VALID_END_DTTM);
FORMAT VALID_START_DT VALID_END_DT DATE9.;
RUN;
PROC SQL;
CREATE TABLE NEW101 AS
SELECT DISTINCT t1.TARGET,
t1.ACCOUNT,
t2.ACCOUNT_SK,
t2.CUSTOMER_SK,
t1.PERIOD_DT,
t1.TIME_SK,
t2.VALID_START_DT,
t2.VALID_END_DT,
t2.PROCESSED_DTTM
FROM TL.TARGET_TIME101 t1
LEFT JOIN NEWFORMAT101 t2 ON (t1.ACCOUNT = t2.ACCOUNT) AND DATEPART(t2.VALID_START_DT) <= t1.PERIOD_DT <= DATEPART(t2.VALID_END_DT);
QUIT;
Do you know why?
Thank you.
Its doing a left join based on the criteria given, if there is no row from T2 which matches the condition - i.e. is between the two dates, then those variables from T2 will be blank in the output. Also, please avoid typing all in capitals and use the code window so your indentations show up - its the {i} above the post window.
Thank you very much.
I tried to get the merge table first and then want to do the condition filter which you used DATEPART(t2.VALID_START_DT) <= t1.PERIOD_DT <= DATEPART(t2.VALID_END_DT).
Could you tell me, which SAS process can do this selection? Can you show me the code?
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.