BookmarkSubscribeRSS Feed
lmclaughlin10
Calcite | Level 5

(New SAS user)

Not sure where to start.  I have a dataset which I need to read a row and perform some evaluations. Then I need to read the second row and compare it to the primary key (ID, SITE)  in the first row, if they are equal, read the next row, if primary key not equal perform the evaluation. For example: Read row 1, if end date less than date (specific field in dataset) then xxx, read new row, if ID, SITE equal ID, SITE in row 1, read next row, else perform evaluation. Thanks in advance for your assistance.

 

Sample Dataset:

ROW     ID            SITE                  END DATE

1          1100         AAA                  BLANK

2          1100         AAA                  12JUN2020

3          1100         BBB                  BLANK

4          1200         AAA                  13JUN2020

5          1200         CCC                  13JUN2020

 

8 REPLIES 8
ketpt42
Quartz | Level 8
Can you give us an idea of what you want in expected output using your sample dataset as a guide?
lmclaughlin10
Calcite | Level 5

Yes.  

Expected output would be

Row   ID       SITE    END DATE

1        1100   AAA      BLANK

3         1100  BBB      BLANK

4         1200  AAA      13JUN2020

5         1200 CCC      13JUN2020

ketpt42
Quartz | Level 8

Hmm. That really didn't clear it up for me. I think I need to know:

  • What the "date" is in your logic below?
  • What you mean by "xxx"?
  • Confirmation that you do not consider a missing end date as being less than "date".

 


Read row 1, if end date less than date (specific field in dataset) then xxx, read new row, if ID, SITE equal ID, SITE in row 1, read next row, else perform evaluation.

 

ballardw
Super User

BY group processing in SAS makes this fairly easy.

 

data have;
input ROW     ID  $         SITE  $                ENDDATE :$9. ;
datalines;
1          1100         AAA                  BLANK
2          1100         AAA                  12JUN2020
3          1100         BBB                  BLANK
4          1200         AAA                  13JUN2020
5          1200         CCC                  13JUN2020
;

/* next step assumes the data set have is sorted by ID and Site*/

data want;
   set have;
   by id site;
   if first.site then result='do evaluation';
   else result = 'no evaluation';
run;

When a data set is processed with BY groups then SAS creates automatic variables First. and Last. (note the dot is important!) for each variable on the By statement. These are boolean (True/False or numerically 1/0).

So in the above if the Site value is the first site it refers to within values of the ID.

 

Please note that the preferred way to share data is a data step. That answers lots of questions such as actual variable names, types, lengths and values. Paste code into a code box opened on the forum with either the </> or "running man" icons to preserve formatting. Otherwise the message windows will reformat your code and may introduce invisible HTML characters that mean the code may not run.

Kurt_Bremser
Super User

You refer to this:


@lmclaughlin10 wrote:

(New SAS user)

date (specific field in dataset)


which is not in your rough dataset.

Please supply your "have" dataset in this way:

data have;
infile datalines truncover;
input ID $ SITE $ end_date :date9.;
format end_date yymmddd10.;
datalines;
1100 AAA
1100 AAA 12JUN2020
1100 BBB
1200 AAA 13JUN2020
1200 CCC 13JUN2020
;

and include all variables that are relevant to your conditions and calculations.

lmclaughlin10
Calcite | Level 5

The evaluation part is if date is less cut off date, then before, else after. ( case when end_date < cut_off_date then "before", else "after" end) as cutoff

The actual table i'm reading has many fields.  These are only fields I need to used in my query/sql.  Once I understand how to obtain the results needed, I can then add in the evaluation part.  Even if it's again a temporary work file.  The solution may be simple. I maybe overthinking it.  It may be as simple a removing duplicates.

Kurt_Bremser
Super User

There is no column cut_off_date in your dataset.

Please supply your dataset as a data step with datalines, as requested, and if you need to use a value that is not part of the dataset, but derived from somewhere else (and maybe stored in a macro variable), then say so.

lmclaughlin10
Calcite | Level 5
Thank Kurt for your feed back. The cut_off_date varies by site and other data within the overall dataset. Again, that part of the evaluation I could handle. However as mentioned in my last post, I probable was overthinking it. Using the NODUPKEY function returned me the results I needed.

Thanks

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 956 views
  • 1 like
  • 4 in conversation