DATA Step, Macro, Functions and more

Loops from first to last

Accepted Solution Solved
Reply
Highlighted
Contributor
Posts: 24
Accepted Solution

Loops from first to last

Hello. Noob here.

 

I have data that looks like this:

 

Tech     SubTech

T1         AB11

T1         JDE1

T1         LEIO

T2         AJNC

T2         JUN1

T2         PLNL

T3         TNEVE

T3         NED0

 

I would like to use two loops, the outside to loop through the three technologies, and the inner one for each subtech within, to create macro variables something like this:

 

subs1 = "JDE1, LEI0"

subs2 = "AJNC, JUN1, PLNL"

subs3="TNEVE, NED0"

 

via subs&i = (correct command here)

Something like:

 

data _null_;

    set techs;

    

    if first.Tech = 1 then do until last.Tech = 1

      subs&i = ;

       <second loop over j in here>

       subs&i = subs&i.||<AppropriateSubTech>; *(trying to paste together a list here like "JDE1, LEI0")

       end;

   end;

   run;    

 

Two questions: 

 

1) How do I define the inside loop?

 

2) Also, I feel like the outside loop will terminate one too early if I use last.Tech = 1 because I want to include that term in the loop as well. How do I appropriately define the index for the outer loop?


Accepted Solutions
Solution
‎03-08-2018 04:03 PM
Super User
Posts: 23,771

Re: Loops from first to last

That isn't how you've shown your data. you've only shown values in a single column, not multiple variables.

 

select ABCD, ABCE, ABCF, DCAB, ... from table1 -> you only showed two variables, Tech and SubTech, where do variables ABCD and ABCE come from? If those are values you're not understanding what I'm trying to suggest.

 

And, Why for the first set are you missing the first entry?

 

subs1 = "JDE1, LEI0" -> where's the first one?

 

So you're trying to create subsets for each tech group based on the subtech groups? You can add a WHERE to the select statement, but I don't understand why you don't just join them and find the values you need.

 

or use an approach like this. 

 

proc sql;
create table tech1set as
select *
from dataset2
where subtech in (select subtech from table1 where tech='T1');
quit;

There's some ideas here on how to create those lists but SAS doesn't loop the way you're thinking, there's already a data step loop that you need to take advantage of, as well as the BY group processing that's supported.

 

See this example here:

https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a

 

You can run it, and I suggest you do and see how it works. If you really, really want the macro variables at the point, its a very easy addition, it would be something like the following at each of the last.group values.

 

 

if last.group then call symputx(group_name, variable_value);

View solution in original post


All Replies
Contributor
Posts: 24

Re: Loops from first to last

The point of this exercise is to subset another dataset that doesn't have the Tech column but has the SubTech column.

 

Eg, 

 

dataset2 is like:

 

SubTech Data

AB11      0.102

AB11      -1.22

LEI0       0.73

 

 

data Tech1Set;

   set dataset2;

   if SubTech IN (&subs1);

   run;

 

etc.

Super User
Posts: 23,771

Re: Loops from first to last

Change your process:

 



proc sql;
create table tech1set as
select *
from dataset2
where subtech in (select subtech from table1);
quit;
Contributor
Posts: 24

Re: Loops from first to last

[ Edited ]

Thank you for your reply.

 

I don't think that will work as I have 100's of subtechs that I need to automate their listing.

 

It would be like 

 

proc sql;
create table tech1set as
select *
from dataset2
where subtech in (select ABCD, ABCE, ABCF, DCAB, ...  from table1);
quit;

There's no way I would hand-type out those SubTechs, and the list of SubTechs is expanding, which is why I need to pull the most up to date list from the first dataset that gets updated with all active subtechs.

 

If I just literally "select subtech from tabel1" then it wouldn't subset the data between the major tech groups.

Solution
‎03-08-2018 04:03 PM
Super User
Posts: 23,771

Re: Loops from first to last

That isn't how you've shown your data. you've only shown values in a single column, not multiple variables.

 

select ABCD, ABCE, ABCF, DCAB, ... from table1 -> you only showed two variables, Tech and SubTech, where do variables ABCD and ABCE come from? If those are values you're not understanding what I'm trying to suggest.

 

And, Why for the first set are you missing the first entry?

 

subs1 = "JDE1, LEI0" -> where's the first one?

 

So you're trying to create subsets for each tech group based on the subtech groups? You can add a WHERE to the select statement, but I don't understand why you don't just join them and find the values you need.

 

or use an approach like this. 

 

proc sql;
create table tech1set as
select *
from dataset2
where subtech in (select subtech from table1 where tech='T1');
quit;

There's some ideas here on how to create those lists but SAS doesn't loop the way you're thinking, there's already a data step loop that you need to take advantage of, as well as the BY group processing that's supported.

 

See this example here:

https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a

 

You can run it, and I suggest you do and see how it works. If you really, really want the macro variables at the point, its a very easy addition, it would be something like the following at each of the last.group values.

 

 

if last.group then call symputx(group_name, variable_value);
Contributor
Posts: 24

Re: Loops from first to last

I think adding the where statement will make it work. Thank you!

 

Both data sets have two columns, one has Tech and SubTech, and the other has SubTech and Data. The data sets I showed were small examples of very large 500k+ rows of information.

 

Missing the first entry just due to missing it when providing my example. 

 

The values ABCD, ABCE...etc, are just in the dataset. Those are the fake versions of the subtech names, and are values, not separate variables, which I assume should be the top column if they were supposed to be variables.

 

I don't just join them because there is some slight complexities I know how to deal with that I'm glossing over with respect to the SubTech column, which actually contains multiple techs, and therefore the two SubTech columns are not identical. I can use the where statement along with an index indicator function to get around that though.

 

I was trying to take advantage of the internal loop structure of the data step by using a sorted data set and the first/last commands. Perhaps badly. I'll check out that link. Thank you for your help!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 161 views
  • 0 likes
  • 2 in conversation