BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JinboZhao
Calcite | Level 5

Hi SAS,

 

I have two datasets, and I want to do the following steps in SAS, but didnot get the result.

1. For the table 1, there are two date variables, have the FORMAT: DATETIME26.6 and INFORMAT: DATATIME26.6.  In the table, their value likes this: 29FEB2008:00:00:00.000000.  I want those date values show like 29FEB2008. 

 

This is part of my dataset:

IDVALID_START_DTTMVALID_END_DTTM
131JAN2008:00:00:00.00000028FEB2008:23:59:59.000000
229FEB2008:00:00:00.00000030MAR2008:23:59:59.000000
331MAR2008:00:00:00.00000030OCT2008:23:59:59.000000
431OCT2008:00:00:00.00000027NOV2008:23:59:59.000000
528NOV2008:00:00:00.00000029NOV2009:23:59:59.000000
630NOV2009:00:00:00.00000029APR2012:23:59:59.000000
730APR2012:00:00:00.00000029APR2013:23:59:59.000000
830APR2013:00:00:00.00000030MAY2013:23:59:59.000000
931MAY2013:00:00:00.00000030OCT2013:23:59:59.000000
1031OCT2013:00:00:00.00000030JAN2014:23:59:59.000000
1131JAN2014:00:00:00.00000028AUG2014:23:59:59.000000

 

My coding is:  

DATA NEWFORMAT101;
SET TL.BRIGE_101;
FORMAT VALID_START_DTTM VALID_END_DTTM DDMMYY8.;
RUN;

 

But, I only got ******** for values in those two variabels. 

Could you help me?

 

2, Once I get the result from step 1, I want to change those values to numeric values in both VALID_START_DTTM and VALID_END_DTTM variables, according to the following table:

 

PERIOD_IDTIME_ID
29 February 201698
31 March 201699
30 April 2016100
31 May 2016101
30 June 2016102
31 August 2016104

 

3, after get the result from step2, I want to select observations which the time 101 is included within the VALID_START_DTTM and the VALID_END_DTTM period. 

 

Thank you. 

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Ok, first in the smaller dataset:

 

data dataset2;
  set dataset2;
  dt="31MAY2016"d + (101-time);
  format dt date9.;
run;

The above sets dt to an actual date value which is that date + 0 if 101, 1 if 102 etc.

 

This can then be merged to the main dataset by using:

 

proc sql;
  create table WANT as
  select  A.*,
          B.TIME
  from    DATASET1 A
  left join DATASET2 B
  on      A.ID=B.ID
  and     datepart(A.VALID_START_DTTM) <= B.DT <= datepart(A.VALID_END_DTTM);
quit;

So what I am saying here is merge TIME from dataset 2 onto the main dataset where the date I calculate in the previous datastep is between the two dttm values from dataset1, and only comparing the date part of the value.

 

View solution in original post

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

For step 1 create a new variable, put the datepart of the datatime in, then format that as you need:

data want;
  set have
  valid_start_dt=datepart(valid_start_dttm);
  format valid_start_dt date9.;
run;

Step 2 makes no sense to me, what does the clumn time_id mean and how do you derive it?

 

Step 3, again, makes no sense, what is time_id in relation to a datetime variable?

JinboZhao
Calcite | Level 5

Thank you, I got the result from step1. 

 

For step 2, I want to replace the date values to numeric values. Because I want to join two dataset, dataset 1 has ID, VALID_START_DTTM, VALID_END_DTTM and other variables .  Dataset 1 has millions of records, every ID has many no overlap START and END date. 

                                                                            DATASET 1 

IDVALID_START_DTTMVALID_END_DTTMQUALITYGRADE
131JAN2008:00:00:00.00000028FEB2008:23:59:59.000000AA
129FEB2008:00:00:00.00000030MAR2008:23:59:59.000000BE
131MAR2008:00:00:00.00000030OCT2015:23:59:59.000000CD
131OCT2015:00:00:00.00000027NOV2016:23:59:59.000000DE
128NOV2016:00:00:00.00000029NOV2017:23:59:59.000000EA
230NOV2009:00:00:00.00000029APR2012:23:59:59.000000AE
230APR2012:00:00:00.00000029APR2013:23:59:59.000000BD
230APR2013:00:00:00.00000030MAY2013:23:59:59.000000CE
231MAY2013:00:00:00.00000030OCT2015:23:59:59.000000DA
231OCT2015:00:00:00.00000030JAN2016:23:59:59.000000EE
231JAN2016:00:00:00.00000028AUG2017:23:59:59.000000AA

 

Dataset 2 likes the following, but the TIME value is specific to 101. 

ID TIME 
1101
2101
3101

 

Time 101 is corresponding to time 31MAY2016 in my working enviroment. 

 

I want to get a joined table, which includes ID, TIME (only has value101) and QUALITY and GRADE. The condition is, TIME 101, the same as 31MAY2016 must be in one of each ID's START and END period. 

 

The final result should like this:

 

IDVALID_START_DTTMVALID_END_DTTMQUALITYGRADETIME
131OCT2015:00:00:00.00000027NOV2016:23:59:59.000000DE101
231JAN2016:00:00:00.00000028AUG2017:23:59:59.000000AA101

 

Hope this time, I say it clearly. Thank you. 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Ok, first in the smaller dataset:

 

data dataset2;
  set dataset2;
  dt="31MAY2016"d + (101-time);
  format dt date9.;
run;

The above sets dt to an actual date value which is that date + 0 if 101, 1 if 102 etc.

 

This can then be merged to the main dataset by using:

 

proc sql;
  create table WANT as
  select  A.*,
          B.TIME
  from    DATASET1 A
  left join DATASET2 B
  on      A.ID=B.ID
  and     datepart(A.VALID_START_DTTM) <= B.DT <= datepart(A.VALID_END_DTTM);
quit;

So what I am saying here is merge TIME from dataset 2 onto the main dataset where the date I calculate in the previous datastep is between the two dttm values from dataset1, and only comparing the date part of the value.

 

JinboZhao
Calcite | Level 5

Thank you. 

 

I followed your code, it run successfully on my SAS, but some variables' values are missing. There are no values at all in some variables, like ACCOUNT_SK, CUSTOMER_SK. 

This is my code: 

DATA NEWFORMAT101;
SET TL.BRIGE_101;
VALID_START_DT=DATEPART(VALID_START_DTTM);
VALID_END_DT=DATEPART(VALID_END_DTTM);
FORMAT VALID_START_DT VALID_END_DT DATE9.;
RUN;


PROC SQL;
CREATE TABLE NEW101 AS
SELECT DISTINCT t1.TARGET,
t1.ACCOUNT,
t2.ACCOUNT_SK,
t2.CUSTOMER_SK,
t1.PERIOD_DT,
t1.TIME_SK,
t2.VALID_START_DT,
t2.VALID_END_DT, 
t2.PROCESSED_DTTM
FROM TL.TARGET_TIME101 t1
LEFT JOIN NEWFORMAT101 t2 ON (t1.ACCOUNT = t2.ACCOUNT) AND DATEPART(t2.VALID_START_DT) <= t1.PERIOD_DT <= DATEPART(t2.VALID_END_DT);
QUIT;

 

Do you know why? 

Thank you. 

 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Its doing a left join based on the criteria given, if there is no row from T2 which matches the condition - i.e. is between the two dates, then those variables from T2 will be blank in the output.  Also, please avoid typing all in capitals and use the code window so your indentations show up - its the {i} above the post window.

JinboZhao
Calcite | Level 5

Thank you very much.

 

I tried to get the merge table first and then want to do the condition filter which you used DATEPART(t2.VALID_START_DT) <= t1.PERIOD_DT <= DATEPART(t2.VALID_END_DT).

 

Could you tell me, which SAS process can do this selection? Can you show me the code?

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 2727 views
  • 0 likes
  • 2 in conversation