(New SAS user)
Not sure where to start. I have a dataset which I need to read a row and perform some evaluations. Then I need to read the second row and compare it to the primary key (ID, SITE) in the first row, if they are equal, read the next row, if primary key not equal perform the evaluation. For example: Read row 1, if end date less than date (specific field in dataset) then xxx, read new row, if ID, SITE equal ID, SITE in row 1, read next row, else perform evaluation. Thanks in advance for your assistance.
Sample Dataset:
ROW ID SITE END DATE
1 1100 AAA BLANK
2 1100 AAA 12JUN2020
3 1100 BBB BLANK
4 1200 AAA 13JUN2020
5 1200 CCC 13JUN2020
Yes.
Expected output would be
Row ID SITE END DATE
1 1100 AAA BLANK
3 1100 BBB BLANK
4 1200 AAA 13JUN2020
5 1200 CCC 13JUN2020
Hmm. That really didn't clear it up for me. I think I need to know:
Read row 1, if end date less than date (specific field in dataset) then xxx, read new row, if ID, SITE equal ID, SITE in row 1, read next row, else perform evaluation.
BY group processing in SAS makes this fairly easy.
data have; input ROW ID $ SITE $ ENDDATE :$9. ; datalines; 1 1100 AAA BLANK 2 1100 AAA 12JUN2020 3 1100 BBB BLANK 4 1200 AAA 13JUN2020 5 1200 CCC 13JUN2020 ; /* next step assumes the data set have is sorted by ID and Site*/ data want; set have; by id site; if first.site then result='do evaluation'; else result = 'no evaluation'; run;
When a data set is processed with BY groups then SAS creates automatic variables First. and Last. (note the dot is important!) for each variable on the By statement. These are boolean (True/False or numerically 1/0).
So in the above if the Site value is the first site it refers to within values of the ID.
Please note that the preferred way to share data is a data step. That answers lots of questions such as actual variable names, types, lengths and values. Paste code into a code box opened on the forum with either the </> or "running man" icons to preserve formatting. Otherwise the message windows will reformat your code and may introduce invisible HTML characters that mean the code may not run.
You refer to this:
@lmclaughlin10 wrote:
(New SAS user)
date (specific field in dataset)
which is not in your rough dataset.
Please supply your "have" dataset in this way:
data have;
infile datalines truncover;
input ID $ SITE $ end_date :date9.;
format end_date yymmddd10.;
datalines;
1100 AAA
1100 AAA 12JUN2020
1100 BBB
1200 AAA 13JUN2020
1200 CCC 13JUN2020
;
and include all variables that are relevant to your conditions and calculations.
The evaluation part is if date is less cut off date, then before, else after. ( case when end_date < cut_off_date then "before", else "after" end) as cutoff
The actual table i'm reading has many fields. These are only fields I need to used in my query/sql. Once I understand how to obtain the results needed, I can then add in the evaluation part. Even if it's again a temporary work file. The solution may be simple. I maybe overthinking it. It may be as simple a removing duplicates.
There is no column cut_off_date in your dataset.
Please supply your dataset as a data step with datalines, as requested, and if you need to use a value that is not part of the dataset, but derived from somewhere else (and maybe stored in a macro variable), then say so.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.