hi,
I have a dataset with 1 million records. There is a date field with values in the format 20130917, but some records seem to have alphabetic characters (names) in the date field. How can I get the 10 records preceding the point where the alphabetic characters appear in the date field?
Are you requesting the 10 records before each "date" containing non-digit characters or before the first place this occurs?
If the names should normally occur after the dates in the file, then likely causes involve missing or incorrect use of options such as FLOWOVER or TRUNCOVER in an INFILE statement. The "date" is then being treated as part of the previous line.
If the name occurs before the date, there may be an issue with tabs in the name, or possibly in another field, so that the column alignments are off.
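A minimal sketch of how that could be checked, assuming the raw file is the 'c:\temp\t.txt' mentioned later in the thread. The column positions and the variable NAME are hypothetical and would need to match the real layout. TRUNCOVER keeps a short line from pulling values out of the next line, and the frequency table shows how many records still have a non-digit "date":

```sas
/* Hypothetical layout: date in columns 1-8, a name in columns 10-40. */
data check;
    /* TRUNCOVER: a short line yields missing values instead of */
    /* continuing the read on the next line (the FLOWOVER default). */
    infile 'c:\temp\t.txt' truncover;
    input date $ 1-8 name $ 10-40;
    /* 1 when DATE holds anything other than digits, blanks included */
    bad_date = (notdigit(date) > 0);
run;

/* Count good versus bad dates after the re-read */
proc freq data=check;
    tables bad_date;
run;
```

If the count of bad dates drops to zero with TRUNCOVER, the names were most likely coming from line wrap-around rather than from the data itself.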
Hi,
Treat the variable as a character variable initially, then use character functions to identify dates containing alphabetic characters, then create a flag variable to get the top 10 or bottom 10 records for each such value.
You need two synchronized readings of the data, one of them 10 records behind the other (the "if _n_>10" below). If the "lead" reading finds a non-digit character in DATE, it sets a counter to 10. The second ("lag") reading tests the counter and, if it is greater than or equal to 0, outputs the record. It also decrements the counter. The result should be every offending record plus the 10 records preceding each of them.
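As a rough illustration of the flagging idea, assuming the data is a SAS dataset named HAVE with DATE stored as a character variable (the names FLAGGED, RECNO, HAS_ALPHA and NOT_ALL_DIGITS are made up for this sketch):

```sas
data flagged;
    set have;
    recno = _n_;                                  /* original record position */
    has_alpha = (anyalpha(date) > 0);             /* 1 if DATE contains a letter */
    not_all_digits = (notdigit(trim(date)) > 0);  /* 1 if a non-digit appears, trailing blanks ignored */
run;

/* List the positions of the offending records */
proc print data=flagged;
    where has_alpha = 1;
    var recno date;
run;
```

The RECNO values of the flagged rows show where the problem starts, which is what you need in order to pull the records just before them.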
Is the data in a raw data file? Let's say it's in file 'c:\temp\t.txt', and date is the first 8 characters of each record, followed by other variables of interest:
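filename in1 'c:\temp\t.txt';
filename in2 'c:\temp\t.txt';
data want (drop=prx);
  /* Start below 0 so nothing is output before an offending record is seen */
  retain prx -1;
  /* Lead reading: runs 10 records ahead of the lag reading */
  infile in1 end=end_in1;
  if end_in1=0 then do;
    input date $8.;
    if notdigit(date) then prx=min(_n_-1,10);
  end;
  /* Lag reading: starts once the lead reading is 10 records ahead */
  if _n_>10 then do;
    infile in2;
    input date $8. ... other variables .... ;
    if prx>=0 then output;
    prx=prx-1;
  end;
run;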
If the data is a SAS dataset (say HAVE), with DATE as a character variable, the logic is similar:
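data want (drop=prx);
  /* Start below 0 so nothing is output before an offending record is seen */
  retain prx -1;
  /* Lead reading: only DATE is needed to detect offending records */
  if end_of_have=0 then do;
    set have (keep=date) end=end_of_have;
    if notdigit(date) then prx=min(_n_-1,10);
  end;
  /* Lag reading: 10 records behind, brings in all variables */
  if _n_>10 then do;
    set have;
    if prx>=0 then output;
    prx=prx-1;
  end;
run;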
Thanks!