BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Retep1972
Calcite | Level 5

Hi,

 

I'm not new to programming but I am new to SAS and I want to perform a simple task on a table but can not find a clear answer:

 

I have a table with multiple columns which is the result of a logfile of a system. For analyses I'm only interested in a value change in one  specific column. If the value of this column changes from False to True or vice versa THEN I want to place the row, wich contains the timestamp to a specific table. The next value change is then added to the specific table.

 

Normaly I would build a loop and store the last know value of the column and compare each itration with that value and commit the data to a specific table. Now in SAS I need some help to get started.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Retep1972 

 

First a tip: If you reply to somebody then use the @ notation as shown above. I only read your reply by chance, because I never got a notification.

 

According to the log, test data is written to a data set named FILTER_FOR_ODS_SLS_DEL1. I am not a SAS EG user, but I guess the name is given by default by EG like QUERY_FOR_something.

 

The following data step will not work, because it tries to read data set named have, which is never created. So instead of have you need to specify the actual name of the data set containing your test data. 

 

Another tip: You can use the automatic variable &SYSLAST instead of a literal name to refer to the most recently created data set:

 

data work.want;
	set &syslast;
	if value ne lag(value);
run;

 

 

View solution in original post

6 REPLIES 6
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Retep1972 

 

Welcome in this universe of nerds! It is good practice here to give some additional information, like code to create example input that can be copied into SAS and executed, so anybody answering your question has something to work on and doesn't have to bother with inventing test data. Or -if the problem is getting data into SAS, - then attach a CSV or TXT file, not Excel, because most people wouldn't  dare to open Excel files from unknown sources.

 

So far there are too many initial questions, like

Do you have a SAS input or a LOG file?

What is the time stamp, a datetime numeric value, a string like '2020-01-02:12:59:03.021' or .. ?

How is the boolean value represented, Literal True/False (or true/false or TRUE/FALSE), 0/1 or .. ?

Is it any change of state from one record to the next, or changes applying to a given ID of some kind?

is the initial state in the log file considered a change?

 

You have given a pretty good explanation of how the problen should be solved, but the SAS vocabulary is different, and the syntax reflects that.

 

The following is the most simple version one could imagine. Common vocabulary is added in parentheses. Note that

 

  • the output data set is specified in the data statement.
  • the output data step includes per default all variables (columns) from the input + any variables created in the step, in this case zero.
  • the input data set (table) is specified in the set statement, and the scope is per default all observations (records)
  • the data step loops per default over the input data set, so all observations are read sequentially.
  • the if statement used without a corresponding then is called a subsetting if, and output (commit) is performed only when the given condition is met in the current observation.
  • the if statement is executed once pr. input observarion because of the implicit loop.
  • the lag function returns a missing value (NULL value) for the first observation, because there is no previous observation to read from.
  • the run statement terminates the data step and thereby also the implicit loop.

 

* Data Step with CARD input - 
  to read a log file use an infile statement instead of datalines;
data have;
	length timestamp 8 state $5;
	input timestamp datetime22.3 @24 state $5.;
	format timestamp datetime22.3;
	datalines;
25DEC2020:05:53:55.256 True
25DEC2020:11:47:49.512 True
25DEC2020:17:41:43.768 True
25DEC2020:23:35:38.024 False
26DEC2020:05:29:32.280 False
26DEC2020:11:23:26.536 False
26DEC2020:17:17:20.792 False
26DEC2020:23:11:15.048 False
27DEC2020:05:05:09.304 False
27DEC2020:10:59:03.560 False
27DEC2020:16:52:57.816 False
27DEC2020:22:46:52.072 True
28DEC2020:04:40:46.328 True
28DEC2020:10:34:40.584 True
28DEC2020:16:28:34.840 True
28DEC2020:22:22:29.096 False
29DEC2020:04:16:23.352 False
29DEC2020:10:10:17.608 False
29DEC2020:16:04:11.864 False
29DEC2020:21:58:06.120 False
;
run;

* Keep records where state is changed 
  - the shortest possible version, current state is compared to state from previous record;
data want; 
	set have;
	if state ne lag(state);
run;

output data set contains:

 

 

state.gif

 

 

Retep1972
Calcite | Level 5

I have been trying to get it to work but I got stuk with the creation of the "have" data. I'm using the code below:

 


*data set you want to create demo data for;
%let dataSetName = filter_for_data1;
*number of observations you want to keep;
%let obsKeep = 100;


******************************************************
DO NOT CHANGE ANYTHING BELOW THIS LINE
******************************************************;

%let source_path = https://gist.githubusercontent.com/statgeek/bcc55940dd825a13b9c8ca40a904cba9/raw/865d2cf18f5150b8e88...;

filename reprex url "&source_path";
%include reprex;
filename reprex;

option linesize=max;
%data2datastep(dsn=&dataSetName, file=have, obs=&obsKeep);
run;

data work.want;
set work.have;
if value ne lag(value);
run;

 

It wil not create the work.have data and I'm a bit clueless since I think I'm using the right variables. As attachement the log file

ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Retep1972 

 

First a tip: If you reply to somebody then use the @ notation as shown above. I only read your reply by chance, because I never got a notification.

 

According to the log, test data is written to a data set named FILTER_FOR_ODS_SLS_DEL1. I am not a SAS EG user, but I guess the name is given by default by EG like QUERY_FOR_something.

 

The following data step will not work, because it tries to read data set named have, which is never created. So instead of have you need to specify the actual name of the data set containing your test data. 

 

Another tip: You can use the automatic variable &SYSLAST instead of a literal name to refer to the most recently created data set:

 

data work.want;
	set &syslast;
	if value ne lag(value);
run;

 

 

Retep1972
Calcite | Level 5

@ErikLund_Jensen 

I have no use for a "have" data set but I thought I needed it as an input for the lag function but the "&syslast" was the key to solve my problem! Thank you!

Kurt_Bremser
Super User

"have" and "want" are the names we use for datasets in code examples, so it is clear what is the source and what the target. Replace those names with your actual dataset names.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1576 views
  • 2 likes
  • 3 in conversation