DATA Step, Macro, Functions and more

Compare variable against first record in dataset

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

Compare variable against first record in dataset

Hi everyone - 

 

I have a dataset where I'd like to compare a record against the first row in the dataset. For example, if I have the following:

 

ID    Height

1      40

2      35

3      40

4      58

 

I want to be able to create another variable that says which ones are greater than the first.

ID    Height    Flag

1      40          0

2      35          0

3      40          0

4      58          1

 

I don't think a lag will work since I always want to compare against record 1, not the record immediately preceding. And I can't get a first. command to work since I need to sort by ID in order to get my comparison row on the top, though I think there may be a way to do this that I can't figure out.

 

Any ideas? Thank you!

 


Accepted Solutions
Solution
‎10-12-2017 08:36 PM
Trusted Advisor
Posts: 1,284

Re: Compare variable against first record in dataset

[ Edited ]

Use two SET statements.  One SET read just the single first obs.  The other reads the entire data set, and does the compare:

 

data have;
  input id height;
datalines;
1      40
2      35
3      40
4      58
run;
data want (drop=h_ref);
  if _n_=1 then set have (keep=height rename=(height=h_ref));
  set have;
  flag=height>h_ref;
run;

 

The "if _n_=1 then set have ..." reads observation 1, renaming height to h_ref.  Just as importantly any variable read by SET  is automatically "retained" until that same SET statement is executed in a later iteration of the data set.  But because the program does not re-execute that set, variable H_REF is retained throughout all the observations.  Meanwhile the other SET statement reads all the obs, from obs 1 to the end.  I.e., the two SET statements set up separate streams of data from HAVE.

View solution in original post


All Replies
Super User
Posts: 22,823

Re: Compare variable against first record in dataset

[ Edited ]

RETAIN. 

 

RETAIN first_record_value;
if _n_=1 /*check for first record*/
     then first_record_value = height;

ATLien wrote:

Hi everyone - 

 

I have a dataset where I'd like to compare a record against the first row in the dataset. For example, if I have the following:

 

ID    Height

1      40

2      35

3      40

4      58

 

I want to be able to create another variable that says which ones are greater than the first.

ID    Height    Flag

1      40          0

2      35          0

3      40          0

4      58          1

 

I don't think a lag will work since I always want to compare against record 1, not the record immediately preceding. And I can't get a first. command to work since I need to sort by ID in order to get my comparison row on the top, though I think there may be a way to do this that I can't figure out.

 

Any ideas? Thank you!

 



 

New Contributor
Posts: 3

Re: Compare variable against first record in dataset

When I try that, it just adds in a new variable with a record only for the first row. I still only have the relevant data in that top row, which I want to compare against data in subsequent rows.

Super User
Posts: 22,823

Re: Compare variable against first record in dataset

Did you include the RETAIN statement?

 

Post your exact code please.

Solution
‎10-12-2017 08:36 PM
Trusted Advisor
Posts: 1,284

Re: Compare variable against first record in dataset

[ Edited ]

Use two SET statements.  One SET read just the single first obs.  The other reads the entire data set, and does the compare:

 

data have;
  input id height;
datalines;
1      40
2      35
3      40
4      58
run;
data want (drop=h_ref);
  if _n_=1 then set have (keep=height rename=(height=h_ref));
  set have;
  flag=height>h_ref;
run;

 

The "if _n_=1 then set have ..." reads observation 1, renaming height to h_ref.  Just as importantly any variable read by SET  is automatically "retained" until that same SET statement is executed in a later iteration of the data set.  But because the program does not re-execute that set, variable H_REF is retained throughout all the observations.  Meanwhile the other SET statement reads all the obs, from obs 1 to the end.  I.e., the two SET statements set up separate streams of data from HAVE.

New Contributor
Posts: 3

Re: Compare variable against first record in dataset

Thank you! I had never thought to use two set statements but this works perfectly. What a great new trick!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 132 views
  • 1 like
  • 3 in conversation