I'm working with longitudinal data containing date and time in the CSV file. I would like to know how to separate the date & time into two separate columns. Please see the data format as below: Ex: Sampling Date: 16-07-20, 04-09-20....
M00000008 16-07-20 0:00 M00000008 04-09-20 0:00 M00000008 27-11-20 0:00 M00000036 25-04-17 0:00 M00000036 24-05-17 0:00 M00000036 14-06-17 0:00 M00000036 12-07-17 0:00 M00000036 09-08-17 0:00 M00000036 06-09-17 0:00 M00000036 28-09-17 0:00 M00000036 25-10-17 0:00 M00000036 22-11-17 0:00 M00000036 20-12-17 0:00 M00000036 17-01-18 0:00 M00000036 07-02-18 0:00 M00000036 07-03-18 0:00 M00000036 04-04-18 0:00 M00000036 02-05-18 0:00
I'm also curious to know what is the best way to import this CSV data into SAS such that I import the date format correctly. Thanks in advance
Since you have text file you can read it directly with a data step. You can read the value as text and then convert the pieces separately into date and time variables. You could try using the ANYDTDTM informat, but that will make some assumptions about strange values and it looks like your values are following a well defined pattern so it would be better to get error messages if the value in the field are strange.
You can also fix the structure to remove values like '09015C_1' and '09015C_2' from the variable names and put them into values of variables instead.
To adjust this code to read from your file replace the in-line data with a RUN statement and change the INFILE statement to point to your actual data file.
data want ;
infile cards dsd truncover firstobs=2;
input ID :$20. Sampling_Date_Text :$21. @;
Sampling_Date=input(sampling_date_text,yymmdd10.);
Sampling_Time=input(scan(sampling_date_text,2,' '),time8.);
format Sampling_Date yymmdd10. Sampling_Time time8.;
length code $8 name $100 value 8 unit $20 ref $30 ;
do code='09015C_1','09015C_2' ;
input name value unit ref @;
output;
end;
cards4;
"ID","Sampling Date","09015C_1_NAME","09015C_1_VALUE","09015C_1_UNIT","09015C_1_REF","09015C_2_NAME","09015C_2_VALUE","09015C_2_UNIT","09015C_2_REF"
"M00000003","2016/11/24 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.02","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","81","mL/min/1.73^2","0.000-99999.000"
"M00000003","2016/12/15 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.06","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","78","mL/min/1.73^2","0.000-99999.000"
;;;;
Sampling_ Sampling_ Obs ID Sampling_Date_Text Date Time code 1 M00000003 2016/11/24 00:00:00 2016-11-24 0:00:00 09015C_1 2 M00000003 2016/11/24 00:00:00 2016-11-24 0:00:00 09015C_2 3 M00000003 2016/12/15 00:00:00 2016-12-15 0:00:00 09015C_1 4 M00000003 2016/12/15 00:00:00 2016-12-15 0:00:00 09015C_2 Obs name 1 Creatinine [Mass/volume] in Serum or Plasma 2 Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD) 3 Creatinine [Mass/volume] in Serum or Plasma 4 Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD) Obs value unit ref 1 1.02 mg/dl 0.60 - 1.30 2 81.00 mL/min/1.73^2 0.000-99999.000 3 1.06 mg/dl 0.60 - 1.30 4 78.00 mL/min/1.73^2 0.000-99999.000
Hi @mantubiradar19,
Thank you for using SAS Communities!
Please check if this the solution on this thread will help: https://communities.sas.com/t5/New-SAS-User/separate-datetime-using-Datepart-and-Timepart/td-p/54265...
The date format looks like this in my sas file -
I used the following code:
data WANT;
Set HAVE;
DATE = datepart(Sampling_Date);
TIME = timepart(Sampling_Date);
format DATE yymmdd10. TIME time.;
run;
I got the following error message:
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
430:21 431:21
NOTE: Invalid numeric data, Sampling_Date='2016/11/24 00:00:00' , at line 430 column 21.
NOTE: Invalid numeric data, Sampling_Date='2016/11/24 00:00:00' , at line 431 column 21.
ID=M00000003 Sampling_Date=2016/11/24 00:00:00 _09015C_1_NAME=Creatinine [Mass/volume] in Serum or Plasma
_09015C_1_VALUE=1.02 _09015C_1_UNIT=mg/dl _09015C_1_REF=0.60 - 1.30
_09015C_2_NAME=Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)
_09015C_2_VALUE=81 _09015C_2_UNIT=mL/min/1.73^2 _09015C_2_REF=0.000-99999.000 DATE=. TIME=. _ERROR_=1 _N_=1
NOTE: Invalid numeric data, Sampling_Date='2016/12/15 00:00:00' , at line 430 column 21.
NOTE: Invalid numeric data, Sampling_Date='2016/12/15 00:00:00' , at line 431 column 21.
So your datetimes have been read as character, and look completely different from what you initially posted (4-digit years, YMD order instead of (as I guess) DMY).
Please open your csv file with a text editor(!) and copy/paste the first few lines directly from there into the code box here.
Hi Kurt, here you go
"ID","Sampling Date","09015C_1_NAME","09015C_1_VALUE","09015C_1_UNIT","09015C_1_REF","09015C_2_NAME","09015C_2_VALUE","09015C_2_UNIT","09015C_2_REF" "M00000003","2016/11/24 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.02","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","81","mL/min/1.73^2","0.000-99999.000" "M00000003","2016/12/15 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.06","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","78","mL/min/1.73^2","0.000-99999.000" "M00000003","2016/12/29 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.05","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","79","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/01/13 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.09","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","75","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/02/02 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.05","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","79","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/02/16 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.11","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","73","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/03/03 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.13","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","72","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/03/16 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.11","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","73","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/03/30 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.10","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","74","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/04/13 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.05","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","78","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/04/27 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.03","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","80","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/05/11 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.17","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","69","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/05/29 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.12","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","73","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/07/26 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.22","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","66","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/10/26 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.14","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","71","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/11/21
Since you have text file you can read it directly with a data step. You can read the value as text and then convert the pieces separately into date and time variables. You could try using the ANYDTDTM informat, but that will make some assumptions about strange values and it looks like your values are following a well defined pattern so it would be better to get error messages if the value in the field are strange.
You can also fix the structure to remove values like '09015C_1' and '09015C_2' from the variable names and put them into values of variables instead.
To adjust this code to read from your file replace the in-line data with a RUN statement and change the INFILE statement to point to your actual data file.
data want ;
infile cards dsd truncover firstobs=2;
input ID :$20. Sampling_Date_Text :$21. @;
Sampling_Date=input(sampling_date_text,yymmdd10.);
Sampling_Time=input(scan(sampling_date_text,2,' '),time8.);
format Sampling_Date yymmdd10. Sampling_Time time8.;
length code $8 name $100 value 8 unit $20 ref $30 ;
do code='09015C_1','09015C_2' ;
input name value unit ref @;
output;
end;
cards4;
"ID","Sampling Date","09015C_1_NAME","09015C_1_VALUE","09015C_1_UNIT","09015C_1_REF","09015C_2_NAME","09015C_2_VALUE","09015C_2_UNIT","09015C_2_REF"
"M00000003","2016/11/24 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.02","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","81","mL/min/1.73^2","0.000-99999.000"
"M00000003","2016/12/15 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.06","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","78","mL/min/1.73^2","0.000-99999.000"
;;;;
Sampling_ Sampling_ Obs ID Sampling_Date_Text Date Time code 1 M00000003 2016/11/24 00:00:00 2016-11-24 0:00:00 09015C_1 2 M00000003 2016/11/24 00:00:00 2016-11-24 0:00:00 09015C_2 3 M00000003 2016/12/15 00:00:00 2016-12-15 0:00:00 09015C_1 4 M00000003 2016/12/15 00:00:00 2016-12-15 0:00:00 09015C_2 Obs name 1 Creatinine [Mass/volume] in Serum or Plasma 2 Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD) 3 Creatinine [Mass/volume] in Serum or Plasma 4 Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD) Obs value unit ref 1 1.02 mg/dl 0.60 - 1.30 2 81.00 mL/min/1.73^2 0.000-99999.000 3 1.06 mg/dl 0.60 - 1.30 4 78.00 mL/min/1.73^2 0.000-99999.000
@mantubiradar19 wrote:
Hi Kurt, here you go
"ID","Sampling Date","09015C_1_NAME","09015C_1_VALUE","09015C_1_UNIT","09015C_1_REF","09015C_2_NAME","09015C_2_VALUE","09015C_2_UNIT","09015C_2_REF" "M00000003","2016/11/24 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.02","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","81","mL/min/1.73^2","0.000-99999.000" "M00000003","2016/12/15 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.06","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","78","mL/min/1.73^2","0.000-99999.000" "M00000003","2016/12/29 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.05","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","79","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/01/13 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.09","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","75","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/02/02 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.05","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","79","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/02/16 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.11","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","73","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/03/03 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.13","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","72","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/03/16 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.11","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","73","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/03/30 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.10","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","74","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/04/13 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.05","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","78","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/04/27 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.03","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","80","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/05/11 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.17","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","69","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/05/29 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.12","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","73","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/07/26 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.22","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","66","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/10/26 00:00:00","Creatinine [Mass/volume] in Serum or Plasma","1.14","mg/dl","0.60 - 1.30","Glomerular filtration rate/1.73 sq M predicted among non-blacks by Creatinine-based formula (MDRD)","71","mL/min/1.73^2","0.000-99999.000" "M00000003","2017/11/21
Does that data ever actually have a time component? What you are showing looks like what I call lazy programming and letting something default to a date with time when there is no actual time component. When I have no "time" values other than 00:00:00 I read this as a date, YYMMDD10. informat, and ignore the time and assign an appropriate date format.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.