Hi,
I am working with sashelp.class to understand well the loops in a dataset
My idea is to find the name Alfred and check if is Robert in a previous observations. The same thing for when observations is Robert find if Ronald is a previous observations.
If I check this i will put a new variable value=1 if not value=0. The problem is in save the actual observations and it is in a infinite loop. I don't know why because de code is idented.
I don't know if the result is correct because i don't have the output.
%macro loopOverDatasets();
%local datasetCount iter inter2 inName inName2;
%let Name1 = 'Alfred';
%let Name2 = 'Robert';
%let Name3 = 'Ronald';
proc sql noprint;
select count(*)
into :datasetCount
from sashelp.class;
quit;
%let iter=1;
%do %while (&iter.<= &datasetCount.);
data _NULL_;
set sashelp.class (firstobs=&iter. obs=&iter.);
call symput('inName',strip(Name));
run;
data Group;
set sashelp.class;
iter2 = 1;
%if (&inName. EQ &Name1.)
%then %do;
%do %while (&iter2. <= &iter.);
data _NULL_;
set sashelp.class (firstobs=&iter2. obs=&iter2.);
call symput('inName2',strip(Name));
run;
%if (&inName. EQ &Name2.)
%then %do;
value=1;
%end;
%else
value=0;
%end;
%let iter2=%eval(&iter2.+1);
%end;
%end;
%if (&inName. EQ &Name2.)
%then %do;
%do %while (&iter2. <= &iter.);
data _NULL_;
set sashelp.class (firstobs=&iter2. obs=&iter2.);
call symput('inName2',strip(Name));
run;
%if (&inName. EQ &Name3.)
%then %do;
value=1;
%end;
%else
value=0;
%end;
%let iter2=%eval(&iter2.+1);
%end;
%end;
run;
%let iter=%eval(&iter.+1);
%end;
%mend;
%loopOverDatasets()
Someone can help me
Regards,
Aleixo
See this code:
data work.dados(label='Dados');
infile datalines dsd truncover;
input Time:32. Name:$8. Tp:$1. Age:$3. Type:$2.;
datalines4;
107,Joyce,B,OFF,Dj
110,Joyce,B,OFF,Dj
202,Jane,C,OFF,Dj
304,Alice,A,ON,Pr
476,Alice,A,OFF,Dj
546,Barbara,D,OFF,Dj
628,Carol,F,OFF,Dj
756,Mary,G,ON,Pr
858,Mary,F,ON,Pr
929,Mary,G,OFF,Dj
949,Thomas,I,ON,Pr
989,Thomas,I,OFF,Dj
1002,Jane,C,ON,Pr
1006,Jane,C,ON,Pr
1010,Jane,Z,ON,Pr
1100,Jane,C,OFF,Dj
1235,Thomas,A,ON,Pr
1240,Thomas,I,ON,Pr
1304,Thomas,I,OFF,Dj
;;;;
run;
proc sort data=dados;
by name tp time;
run;
data want;
set dados;
by name tp;
retain inittime;
if first.tp
then do;
inittime = .;
diff = 0;
end;
if age = 'ON' and inittime = .
then do;
inittime = time;
diff = 0;
end;
else if age = 'OFF' and inittime ne .
then do;
diff = time - inittime;
inittime = .;
end;
else diff = 0;
drop inittime;
run;
proc sort data=want;
by time;
run;
proc print data=want noobs;
run;
Result:
Time Name Tp Age Type diff 107 Joyce B OFF Dj 0 110 Joyce B OFF Dj 0 202 Jane C OFF Dj 0 304 Alice A ON Pr 0 476 Alice A OFF Dj 172 546 Barbara D OFF Dj 0 628 Carol F OFF Dj 0 756 Mary G ON Pr 0 858 Mary F ON Pr 0 929 Mary G OFF Dj 173 949 Thomas I ON Pr 0 989 Thomas I OFF Dj 40 1002 Jane C ON Pr 0 1006 Jane C ON Pr 0 1010 Jane Z ON Pr 0 1100 Jane C OFF Dj 98 1235 Thomas A ON Pr 0 1240 Thomas I ON Pr 0 1304 Thomas I OFF Dj 64
matches what you wanted.
Why so complicated?
%let actval=Ronald;
%let prevval=Alfred;
data want;
set sashelp.class;
retain flag 0;
if name = "&prevval." then flag = 1;
if name = "&actval." and flag
then newval = 1;
else newval = 0;
drop flag;
run;
As usual, a macro is not needed.
Hi,
Thanks for your answer. Worked fine.
It is not soo simple xD. What i ask is simple because i am at start but i can complicate a little bit. My background in programming is thinking that way. Now SAS is a little different.
In my idea these names will be dinamic, not always the same name. Other idea is If my list has more than one Alfreds or Ronalds i will be check the names and another variables like the age. My code try to thinks in all cases that i want to check. That's why is so complicated. The newvalue is a flat to check all obervations with the condition "name".
How i can calculate newval = Age(Alfred) - Age(Ronald)
That's why i need to save observations and search after search until the end.
Sorry about all my questions, i try to search on community, lots of topics but the codes are only for one simple thing and i have to mix them to achieve my goal.
Regards,
Aleixo
Well, the basic logic stays the same: retain a reference variable and compare current values to it.
If you need to run the test repeatedly with lots of different values, wrap the code into a macro (that's why I used two macro variables, set them as macro parameters), and then use call execute() to call the macro off a dataset of values.
It is not a good idea because you in a datastep will overwrite the newval. Datastep change all collumn(newval) every time in each loop.
The code must to be sequential from first observation to last observation. That's is the idea of two loops( 2 search ) with do while statement. One for search what i want to update (newval =1) and another intern loop to search the other observation previous to compare and for example calculate the age different.
I can't make any sense of this. Please provide some example data (in a data step) and an example for the expected output, with comments for why what is expected.
SQL is a likely candidate for joining data sets on different criteria especially if place the records into a data set.
No claim that this is the most efficient way but is one way to build a moderately complex comparison of records based on different values in a single data set:
data work.names; length firstname secondname $ 8; input firstname $ secondname $; datalines; Alfred Thomas Barbara Judy ; proc sql; create table want as select firstset.firstname, secondset.secondname, firstset.age as FirstAge, secondset.age as SecondAge, (firstset.age - secondset.age) as AgeDif from (select a.firstname,a.secondname, b.* from work.names as a left join sashelp.class as b on a.firstname=b.name ) as firstset left join (select c.firstname,c.secondname, d.* from work.names as c left join sashelp.class as d on c.secondname=d.name ) as secondset on firstset.firstname=secondset.firstname and firstset.secondname=secondset.secondname ; quit;
You are not using dataset loops there at all as far as I can tell, it is all macro code. Now based on the knowledge that macro is never needed, and you should always try to use Base SAS code, I would ask why you have written this big chunk of code. I can see no reason for it, as a simple retain statement in a datastep will do the requested logic. Provide some test data - in the form of a datastep - so that we have something to write code to, and show what the output should look like. We can then provide you with some simple code to achieve your task. If you do not know how to get a datastep of your data follow this post:
Also, do not the code you have written is invalid in several ways - starting with the obvious data _null_ steps within another datastep (data group) which is invalid.
Note this:
data Group;
set sashelp.class;
iter2 = 1;
%if (&inName. EQ &Name1.)
%then %do;
%do %while (&iter2. <= &iter.);
data _NULL_;
When SAS encounters the data _NULL_ statement, which constitutes a step boundary, the first data step will look like this:
data Group;
set sashelp.class;
iter2 = 1;
and start to execute, so you get a copy of sashelp.class with column iter2 added where all values are 1.
Ok. Thank you all answers, seriously. I think the solutions are not what i want. I will be more specific:
My data in a dataset that i have to try the code
data work.dados(label='Dados');
infile datalines dsd truncover;
input Time:32. Name:$8. Tp:$1. Age:$3. Type:$2.;
datalines4;
107,Joyce,B,OFF,Dj
110,Joyce,B,OFF,Dj
202,Jane,C,OFF,Dj
304,Alice,A,ON,Pr
476,Alice,A,OFF,Dj
546,Barbara,D,OFF,Dj
628,Carol,F,OFF,Dj
756,Mary,G,ON,Pr
858,Mary,F,ON,Pr
929,Mary,G,OFF,Dj
949,Thomas,I,ON,Pr
989,Thomas,I,OFF,Dj
1002,Jane,C,ON,Pr
1006,Jane,C,ON,Pr
1010,Jane,Z,ON,Pr
1100,Jane,C,OFF,Dj
1235,Thomas,A,ON,Pr
1240,Thomas,I,ON,Pr
1304,Thomas,I,OFF,Dj
;;;;
The idea is calculate time difference between same "Name" with same "Tp" and OFF Dj -> ON Pr (first in time(older), not second in time(newer)).
Attention Thomas has two calculations. And for example Joyce don't have correspond ON Pr in the table, so not need to calculate difference. Or for example Carol is only one observation, no need to calculate difference.
I can not use the names because could be hundreds of them xD. That why i try to do two do while statements to find all names needed. I know that It is very difficult to be efficient to do this.
Want
107,Joyce,B,OFF,Dj,0
110,Joyce,B,OFF,Dj,0
202,Jane,C,OFF,Dj,0
304,Alice,A,ON,Pr,0
476,Alice,A,OFF,Dj,172
546,Barbara,D,OFF,Dj,0
628,Carol,F,OFF,Dj,0
756,Mary,G,ON,Pr,0
858,Mary,F,ON,Pr,0
929,Mary,G,OFF,Dj,173
949,Thomas,I,ON,Pr,0
989,Thomas,I,OFF,Dj,40
1002,Jane,C,ON,Pr,0
1006,Jane,C,ON,Pr,0
1010,Jane,Z,ON,Pr,0
1100,Jane,C,OFF,Dj,98
1235,Thomas,A,ON,Pr,0
1240,Thomas,I,ON,Pr,0
1304,Thomas,I,OFF,Dj,64
If i understand how to code this work i will be happy and will improve my programming skills to a next level.
Sorry my last posts but to study programming we have to do work step by step, simple to complicated to achieve our final goal.
Regards,
Aleixo
The idea is calculate time difference between same "Name" with same "Tp" and OFF Dj -> ON Pr (first in time(older), not second in time(newer)).
I am not sure from your example data how to tell time sequence or "newer" or "older" or which order of calculation is desired:
In the previous example I showed how to match things such as "same "Name" with same "Tp" ", Join on set1.value=set2.value and set1.value2=set2.value2
A WHERE clause might be able to filter the results: where set1.variable > set2.variable (if a single variable contains the order of comparison information) if you have multiple comparisons then add them to the where clause. but since you did not actual use variables names in your requirement clearly enough I can't make a specific recommendation.
thank you ballardw.
I thought also about where statement but i had problems with repeated observations, like in Name repeated and like in ON Pr repeated with same name.
See this code:
data work.dados(label='Dados');
infile datalines dsd truncover;
input Time:32. Name:$8. Tp:$1. Age:$3. Type:$2.;
datalines4;
107,Joyce,B,OFF,Dj
110,Joyce,B,OFF,Dj
202,Jane,C,OFF,Dj
304,Alice,A,ON,Pr
476,Alice,A,OFF,Dj
546,Barbara,D,OFF,Dj
628,Carol,F,OFF,Dj
756,Mary,G,ON,Pr
858,Mary,F,ON,Pr
929,Mary,G,OFF,Dj
949,Thomas,I,ON,Pr
989,Thomas,I,OFF,Dj
1002,Jane,C,ON,Pr
1006,Jane,C,ON,Pr
1010,Jane,Z,ON,Pr
1100,Jane,C,OFF,Dj
1235,Thomas,A,ON,Pr
1240,Thomas,I,ON,Pr
1304,Thomas,I,OFF,Dj
;;;;
run;
proc sort data=dados;
by name tp time;
run;
data want;
set dados;
by name tp;
retain inittime;
if first.tp
then do;
inittime = .;
diff = 0;
end;
if age = 'ON' and inittime = .
then do;
inittime = time;
diff = 0;
end;
else if age = 'OFF' and inittime ne .
then do;
diff = time - inittime;
inittime = .;
end;
else diff = 0;
drop inittime;
run;
proc sort data=want;
by time;
run;
proc print data=want noobs;
run;
Result:
Time Name Tp Age Type diff 107 Joyce B OFF Dj 0 110 Joyce B OFF Dj 0 202 Jane C OFF Dj 0 304 Alice A ON Pr 0 476 Alice A OFF Dj 172 546 Barbara D OFF Dj 0 628 Carol F OFF Dj 0 756 Mary G ON Pr 0 858 Mary F ON Pr 0 929 Mary G OFF Dj 173 949 Thomas I ON Pr 0 989 Thomas I OFF Dj 40 1002 Jane C ON Pr 0 1006 Jane C ON Pr 0 1010 Jane Z ON Pr 0 1100 Jane C OFF Dj 98 1235 Thomas A ON Pr 0 1240 Thomas I ON Pr 0 1304 Thomas I OFF Dj 64
matches what you wanted.
WoW, It's amazing what you have done. So simple what you do, i am delighted. Worked very well.
I will try to understand how it really works the code and the loop. Thank you so much
Regards,
Aleixo
I started out with the simple retain/first./if-condition-then model. Then I ran the code, looked for differences to the desired result, and then tweaked the conditions until I got the desired result. SAS being an interpreting language (you don't have to wait for compiles before every test), it lends itself very well to this approach.
A major part of the solution is the correct sort order for the logic, and getting back to the original order. This can sometimes necessitate the creation of a variable that keeps the original _n_.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.