I think that Tom's code will do what you want as long as you add a retain statement. i.e.:
data want ;
infile "c:\art\o_equities_20080528.tas" dlm=",";
length date 8 time 8 c1 $1 c2 $3 date2 8 n1 8 c3 $3 n2 8 n3 8 ;
informat date date2 yymmdd8. time time8.;
format date date2 yymmdd10. time time8. ;
retain date;
if _n_=1 then do ;
input / / date / / ;
end;
input time c1 c2 date2 n1 c3 n2 n3 ;
run;
Dear Art, I tried the same code, and here is what my data looks like:
You will have to post the code that you ran. You appear to be trying to input character fields as numbers which is one reason some of the variables would show up as missing. Also, for the code to work, Date would have to be on the third row, in the format shown in your example, and the actual data would have to start at row 6 (like in your example).
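One way to check that layout before running the real step is to dump the first few raw records to the log. This is only a sketch: the path is the one from my code above, so substitute your own file or fileref, and OBS=8 is just an arbitrary cutoff I picked for illustration.

```sas
/* Sketch: print the first 8 raw records to the log, unparsed. */
data _null_;
  infile "c:\art\o_equities_20080528.tas" obs=8;  /* stop after record 8 */
  input;                /* load the record into the buffer, read no variables */
  put _n_= _infile_;    /* record number followed by the raw line */
run;
```

If the date does not show up on record 3, or the comma-separated rows do not start on record 6, the line pointer controls in the INPUT statement will need to change to match.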
OK here is the full code.
FileName zipfile PIPE 'C:\Bch\7za.exe e "c:\data\o_equities_20080528.tas.zip" "o_equities_20080528.tas" -y -so';
Data dataset;
Infile zipfile dlm= ',' truncover dsd firstobs=6;
length date 8 time 8 cpf $1 cnt $3 ba $1;
informat date ED yymmdd8. time time8.;
format date ED yymmdd10. time time8.;
retain date;
if _n_=1 then do;
input //date//;
end;
input time cpf $ cnt $ ED vol ba $ price var1;
run;
Yes my data is exactly as shown in the picture.
Do you mind explaining to me the meaning of this bit:
retain date;
if _n_=1 then do;
input //date//;
end;
I would really like to understand what it means so I wouldn't just be copying and pasting!
Thank you
RETAIN marks the variable DATE1 to keep its value when a new iteration of the data step starts. Since it is only assigned a value on the first pass it will have the same value for all observations.
IF _N_=1 tests whether this is the first pass through the data step. _N_ is an automatic variable that is incremented on each pass through the data step.
Each slash in the INPUT statement tells it to move to the next line. So it skips two lines, reads a value for the DATE1 variable from the third line, and then skips two more lines. The next INPUT statement therefore starts on line 6.
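A small self-contained sketch of the same pattern may make this easier to see. The dataset name DEMO, the variables HDR and X, and the inline data are all made up for illustration; only the RETAIN / IF _N_=1 / slash logic mirrors the code in this thread.

```sas
/* Sketch: five header lines (value of interest on line 3), then data. */
data demo;
  infile cards truncover;
  retain hdr;                     /* keep HDR across iterations */
  if _n_ = 1 then do;
    input / / hdr / /;            /* skip lines 1-2, read line 3, move to line 5 */
  end;
  input x;                        /* first pass reads line 6, then 7, then 8 */
  put _n_= hdr= x=;               /* trace each iteration in the log */
cards;
line1
line2
42
line4
line5
1
2
3
;
run;
```

This should give three observations, each with HDR=42 and with X equal to 1, 2 and 3. The text on lines 1, 2, 4 and 5 is never parsed, because the slashes move past those lines without reading any variables from them.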
Oh OK!
Thank you very much, Tom, for the explanation! It is surely better than copying and pasting commands without really understanding what they are doing.
Thanks.
Works fine with the data you posted (with the RETAIN statement as Art reminded us).
data want ;
infile cards dsd dlm=',' truncover ;
length date1 8 time 8 c1 $1 c2 $3 date2 8 n1 8 c3 $3 n2 8 n3 8 ;
informat date1 date2 yymmdd8. time time8.;
format date1 date2 yymmdd10. time time8. ;
retain date1;
if _n_=1 then do ;
input / / date1 / / ;
end;
input time c1 c2 date2 n1 c3 n2 n3 ;
cards;
05
20080528 223000
20080528
999999
11169660
07:18:42,F,AC,20080601,,A,48.736,10
07:18:42,F,AC,20080601,,B,48.608,10
07:18:42,F,AC,20080701,,A,48.928,10
07:18:42,F,AC,20080701,,B,48.786,10
07:18:42,F,AC,20080801,,A,49.15,10
07:18:42,F,AC,20080801,,B,49.002,10
07:18:53,F,AC,20080601,,A,48.737,10
07:19:13,F,AC,20080701,,A,48.989,10
07:19:13,F,AC,20080601,,A,48.797,10
07:19:13,F,AC,20080801,,A,49.21,10
07:19:13,F,AC,20080801,,B,49.033,10
07:19:13,F,AC,20080701,,B,48.816,10
07:19:13,F,AC,20080601,,B,48.638,10
07:19:14,F,AC,20080801,,B,49.012,10
07:19:14,F,AC,20080701,,B,48.796,10
07:19:14,F,AC,20080601,,B,48.618,10
07:19:14,F,AC,20080801,,B,49.033,10
07:19:14,F,AC,20080701,,B,48.816,10
07:19:14,F,AC,20080601,,B,48.638,10
07:19:14,F,AC,20080801,,B,49.012,10
07:19:14,F,AC,20080701,,B,48.796,10
07:19:14,F,AC,20080601,,B,48.618,10
07:19:55,F,AC,20080701,,A,,
07:19:55,F,AC,20080701,,B,,
07:19:55,F,AC,20080601,,A,,
07:19:55,F,AC,20080601,,B,,
07:19:55,F,AC,20080801,,A,,
07:19:55,F,AC,20080801,,B,,
07:21:55,F,AC,20080801,,A,49.119,10
07:21:55,F,AC,20080801,,B,48.972,10
07:21:56,F,AC,20080601,,A,48.706,10
07:21:56,F,AC,20080601,,B,48.578,10
07:21:57,F,AC,20080701,,A,48.898,10
07:21:57,F,AC,20080701,,B,48.756,10
07:25:43,F,AC,20080601,,A,48.757,10
07:25:43,F,AC,20080701,,A,48.948,10
07:25:43,F,AC,20080801,,A,49.17,10
07:25:43,F,AC,20080701,,A,48.969,10
07:25:43,F,AC,20080601,,A,48.777,10
07:25:43,F,AC,20080801,,A,49.19,10
07:25:43,F,AC,20080701,,B,48.796,10
07:25:43,F,AC,20080601,,B,48.618,10
07:25:43,F,AC,20080801,,B,49.012,10
07:26:23,F,AC,20080701,,B,48.786,10
07:26:23,F,AC,20080601,,B,48.608,10
07:26:23,F,AC,20080801,,B,49.002,10
07:26:27,F,AC,20080701,,B,48.766,10
07:26:27,F,AC,20080601,,B,48.588,10
07:26:27,F,AC,20080801,,B,48.982,10
07:26:28,F,AC,20080601,,A,48.706,10
07:26:28,F,AC,20080701,,A,48.898,10
07:26:28,F,AC,20080801,,A,49.119,10
07:26:46,F,AC,20080601,,A,48.767,10
07:26:46,F,AC,20080701,,A,48.958,10
07:26:46,F,AC,20080801,,A,49.18,10
07:26:48,F,AC,20080701,,A,,
07:26:48,F,AC,20080701,,B,,
07:26:48,F,AC,20080601,,A,,
07:26:48,F,AC,20080601,,B,,
07:26:48,F,AC,20080801,,A,,
07:26:48,F,AC,20080801,,B,,
07:34:52,F,AC,20080701,,A,48.858,10
07:34:52,F,AC,20080701,,B,48.705,10
07:34:54,F,AC,20080801,,A,49.079,10
07:34:54,F,AC,20080801,,B,48.921,10
07:34:57,F,AC,20080601,,A,48.666,10
07:34:57,F,AC,20080601,,B,48.528,10
07:35:26,F,AC,20080801,,B,48.942,10
07:35:26,F,AC,20080601,,B,48.548,10
07:35:26,F,AC,20080701,,B,48.725,10
07:36:37,F,AC,20080601,,A,48.676,10
07:36:37,F,AC,20080701,,A,48.868,10
07:36:37,F,AC,20080801,,A,49.089,10
07:36:38,F,AC,20080701,,A,48.878,10
07:36:38,F,AC,20080601,,A,48.686,10
07:36:38,F,AC,20080801,,A,49.099,10
07:36:42,F,AC,20080701,,A,48.908,10
07:36:42,F,AC,20080601,,A,48.716,10
07:36:42,F,AC,20080801,,A,49.129,10
07:36:42,F,AC,20080701,,A,48.968,10
07:36:42,F,AC,20080601,,A,48.776,10
07:36:42,F,AC,20080801,,A,49.19,10
07:36:43,F,AC,20080701,,A,48.858,10
07:36:43,F,AC,20080601,,A,48.666,10
07:36:43,F,AC,20080801,,A,49.079,10
07:37:50,F,AC,20080601,,A,48.716,10
07:37:50,F,AC,20080701,,A,48.908,10
07:37:50,F,AC,20080801,,A,49.129,10
07:37:50,F,AC,20080701,,A,48.958,10
07:37:50,F,AC,20080601,,A,48.766,10
07:37:50,F,AC,20080801,,A,49.179,10
07:37:54,F,AC,20080701,,B,48.766,10
07:37:54,F,AC,20080601,,B,48.588,10
07:37:54,F,AC,20080801,,B,48.982,10
07:37:54,F,AC,20080601,,A,48.746,10
07:37:54,F,AC,20080701,,A,48.938,10
07:38:33,F,AC,20080801,,A,49.15,10
07:38:50,F,AC,20080701,,B,48.786,10
07:38:50,F,AC,20080601,,B,48.608,10
07:38:50,F,AC,20080801,,B,49.002,10
07:39:14,F,AC,20080701,,A,48.918,10
07:39:14,F,AC,20080601,,A,48.726,10
07:39:23,F,AC,20080801,,A,49.13,10
07:40:37,F,AC,20080801,,A,49.14,10
07:40:37,F,AC,20080701,,A,48.959,10
07:40:37,F,AC,20080601,,A,48.767,10
07:40:37,F,AC,20080801,,A,49.18,10
07:40:37,F,AC,20080701,,B,48.806,10
07:40:37,F,AC,20080601,,B,48.628,10
07:40:37,F,AC,20080801,,B,49.022,10
07:40:38,F,AC,20080701,,B,48.786,10
07:40:38,F,AC,20080601,,B,48.608,10
07:40:38,F,AC,20080801,,B,49.002,10
07:40:49,F,AC,20080701,,B,48.806,10
07:40:49,F,AC,20080601,,B,48.628,10
07:40:49,F,AC,20080801,,B,49.022,10
07:42:04,F,AC,20080801,,B,49.053,10
07:42:04,F,AC,20080601,,B,48.658,10
07:42:04,F,AC,20080701,,B,48.836,10
07:42:09,F,AC,20080801,,B,49.022,10
07:42:09,F,AC,20080601,,B,48.628,10
07:42:09,F,AC,20080701,,B,48.806,10
07:42:49,F,AC,20080701,,A,48.939,10
07:42:49,F,AC,20080601,,A,48.747,10
07:42:49,F,AC,20080801,,A,49.16,10
07:43:21,F,AC,20080701,,A,48.949,10
07:43:21,F,AC,20080601,,A,48.757,10
07:43:21,F,AC,20080801,,A,49.17,10
07:43:24,F,AC,20080701,,A,48.928,10
07:43:24,F,AC,20080601,,A,48.737,10
07:43:24,F,AC,20080801,,A,49.15,10
07:43:27,F,AC,20080701,,A,48.949,10
07:43:27,F,AC,20080601,,A,48.757,10
07:43:27,F,AC,20080801,,A,49.17,10
07:43:27,F,AC,20080701,,B,48.826,10
07:43:27,F,AC,20080601,,B,48.648,10
07:43:27,F,AC,20080801,,B,49.042,10
07:43:56,F,AC,20080601,,A,48.767,10
07:43:56,F,AC,20080701,,A,48.959,10
07:43:56,F,AC,20080801,,A,49.18,10
07:44:05,F,AC,20080601,,A,48.817,10
07:44:05,F,AC,20080701,,A,49.009,10
07:44:05,F,AC,20080801,,A,49.23,10
07:44:13,F,AC,20080601,,A,48.767,10
07:44:13,F,AC,20080701,,A,48.959,10
07:44:13,F,AC,20080801,,A,49.18,10
07:44:17,F,AC,20080601,,A,48.807,10
07:44:17,F,AC,20080701,,A,48.999,10
07:44:17,F,AC,20080801,,A,49.22,10
run;
proc print; run;
Obs  date1       time     c1  c2  date2       n1  c3  n2      n3
1    2008-05-28  7:18:42  F   AC  2008-06-01  .   A   48.736  10
2    2008-05-28  7:18:42  F   AC  2008-06-01  .   B   48.608  10
3    2008-05-28  7:18:42  F   AC  2008-07-01  .   A   48.928  10
4    2008-05-28  7:18:42  F   AC  2008-07-01  .   B   48.786  10
...
Oh, I realized that I added firstobs=6 when I shouldn't have!!
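For anyone who hits the same thing, here is a rough sketch of why FIRSTOBS=6 breaks this approach. FIRSTOBS= makes the step start reading at that record, so the header lines are never seen and the slashes step through data lines instead. The dataset name SHIFTED, the variables HDR and X, and the inline data are made up for illustration.

```sas
/* Sketch: same header-skipping logic, but with FIRSTOBS=6 added. */
data shifted;
  infile cards truncover firstobs=6;  /* reading starts at record 6 */
  retain hdr;
  if _n_ = 1 then do;
    input / / hdr / /;   /* now consumes records 6-10; HDR comes from record 8 */
  end;
  input x;               /* first pass reads record 11 */
  put _n_= hdr= x=;
cards;
line1
line2
42
line4
line5
1
2
3
4
5
6
7
;
run;
```

Here HDR ends up as 3 (the value on record 8) rather than 42, and the first five data rows are lost. With the real file, the informat for the date would be applied to a time value, which is one way to end up with a missing date.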
Excellent! Thank you all!
I would love to understand what this command actually does, if that is possible:
retain date;
if _n_=1 then do;
input //date//;
end;
Thank you Arthur !!
Couple of things before I answer your question.
First, I found that I could only get the correct result if I added the truncover option as Tom had originally suggested. Also, I like to use informats rather than length statements. The code I used was:
data want ;
infile "c:\art\o_equities_20080528.tas" dlm="," truncover;
informat cpf ba vol $1.;
informat cnt $2.;
informat date ED yymmdd8. time time8.;
format date ED yymmdd10. time time8.;
retain date;
if _n_=1 then do;
input //date//;
end;
input time cpf cnt ED vol ba price var1;
run;
As for the code you asked about:
retain date;
if _n_=1 then do;
input //date//;
end;
Retaining date, in this case, holds its value across all of the observations, since it is read only once, from record 3.
When _n_ eq 1, which will only be the case at the beginning of your run, Tom's code skips the first two records, then reads date, and then skips the next two records.
Those instructions will never be used again, as _n_ will always be greater than one for all of the other records.
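To see what the RETAIN statement buys you, here is a sketch of the same kind of step without it. The dataset name NO_RETAIN, the variables HDR and X, and the inline data are made up for illustration.

```sas
/* Sketch: header value read once, but NOT retained. */
data no_retain;
  infile cards truncover;
  if _n_ = 1 then do;
    input / / hdr / /;   /* HDR is assigned only on the first iteration */
  end;
  input x;
  put _n_= hdr= x=;      /* trace each iteration in the log */
cards;
line1
line2
42
line4
line5
1
2
3
;
run;
```

Variables read with INPUT are reset to missing at the top of each iteration unless they are retained, so HDR should be 42 on the first observation and missing on the other two. Adding retain hdr; is what carries the value forward.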
Art
p.s. You really ought to give Tom the credit for the correct answer, as all I added was the retain statement. The real answer came from him.
Thank you to both of you guys, Arthur and Tom, for your excellent answers.
Your answers both contributed in solving the issue.
I just thought that your post had the full code for future readers who find themselves in the same situation. I just noticed that Tom's post also has the full code, so I will go ahead and mark his as the right answer.
But really, thank you to both of you.