SAS Programming

abak · Posted 03-08-2018 12:35 PM

Hello. Noob here.

I have data that looks like this:

Tech SubTech

T1 AB11

T1 JDE1

T1 LEIO

T2 AJNC

T2 JUN1

T2 PLNL

T3 TNEVE

T3 NED0

I would like to use two loops, the outside to loop through the three technologies, and the inner one for each subtech within, to create macro variables something like this:

subs1 = "JDE1, LEI0"

subs2 = "AJNC, JUN1, PLNL"

subs3="TNEVE, NED0"

via subs&i = (correct command here)

Something like:

data _null_;

set techs;

if first.Tech = 1 then do until last.Tech = 1

subs&i = ;

subs&i = subs&i.||<AppropriateSubTech>; *(trying to paste together a list here like "JDE1, LEI0")

end;

run;

Two questions:

1) How do I define the inside loop?

2) Also, I feel like the outside loop will terminate one too early if I use last.Tech = 1 because I want to include that term in the loop as well. How do I appropriately define the index for the outer loop?

Reeza · Posted 03-08-2018 01:12 PM

That isn't how you've shown your data. you've only shown values in a single column, not multiple variables.

select ABCD, ABCE, ABCF, DCAB, ... from table1 -> you only showed two variables, Tech and SubTech, where do variables ABCD and ABCE come from? If those are values you're not understanding what I'm trying to suggest.

And, Why for the first set are you missing the first entry?

subs1 = "JDE1, LEI0" -> where's the first one?

So you're trying to create subsets for each tech group based on the subtech groups? You can add a WHERE to the select statement, but I don't understand why you don't just join them and find the values you need.

or use an approach like this.

proc sql;
create table tech1set as
select *
from dataset2
where subtech in (select subtech from table1 where tech='T1');
quit;

There's some ideas here on how to create those lists but SAS doesn't loop the way you're thinking, there's already a data step loop that you need to take advantage of, as well as the BY group processing that's supported.

See this example here:

https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a

You can run it, and I suggest you do and see how it works. If you really, really want the macro variables at the point, its a very easy addition, it would be something like the following at each of the last.group values.

if last.group then call symputx(group_name, variable_value);

View solution in original post

abak · Posted 03-08-2018 12:38 PM

The point of this exercise is to subset another dataset that doesn't have the Tech column but has the SubTech column.

Eg,

dataset2 is like:

SubTech Data

AB11 0.102

AB11 -1.22

LEI0 0.73

data Tech1Set;

set dataset2;

if SubTech IN (&subs1);

run;

etc.

Reeza · Posted 03-08-2018 12:59 PM

Change your process:



proc sql;
create table tech1set as
select *
from dataset2
where subtech in (select subtech from table1);
quit;

abak · Posted 03-08-2018 01:01 PM

Thank you for your reply.

I don't think that will work as I have 100's of subtechs that I need to automate their listing.

It would be like

proc sql;
create table tech1set as
select *
from dataset2
where subtech in (select ABCD, ABCE, ABCF, DCAB, ...  from table1);
quit;

There's no way I would hand-type out those SubTechs, and the list of SubTechs is expanding, which is why I need to pull the most up to date list from the first dataset that gets updated with all active subtechs.

If I just literally "select subtech from tabel1" then it wouldn't subset the data between the major tech groups.

Reeza · Posted 03-08-2018 01:12 PM

That isn't how you've shown your data. you've only shown values in a single column, not multiple variables.

select ABCD, ABCE, ABCF, DCAB, ... from table1 -> you only showed two variables, Tech and SubTech, where do variables ABCD and ABCE come from? If those are values you're not understanding what I'm trying to suggest.

And, Why for the first set are you missing the first entry?

subs1 = "JDE1, LEI0" -> where's the first one?

So you're trying to create subsets for each tech group based on the subtech groups? You can add a WHERE to the select statement, but I don't understand why you don't just join them and find the values you need.

or use an approach like this.

proc sql;
create table tech1set as
select *
from dataset2
where subtech in (select subtech from table1 where tech='T1');
quit;

There's some ideas here on how to create those lists but SAS doesn't loop the way you're thinking, there's already a data step loop that you need to take advantage of, as well as the BY group processing that's supported.

See this example here:

https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a

You can run it, and I suggest you do and see how it works. If you really, really want the macro variables at the point, its a very easy addition, it would be something like the following at each of the last.group values.

if last.group then call symputx(group_name, variable_value);

abak · Posted 03-08-2018 01:23 PM

I think adding the where statement will make it work. Thank you!

Both data sets have two columns, one has Tech and SubTech, and the other has SubTech and Data. The data sets I showed were small examples of very large 500k+ rows of information.

Missing the first entry just due to missing it when providing my example.

The values ABCD, ABCE...etc, are just in the dataset. Those are the fake versions of the subtech names, and are values, not separate variables, which I assume should be the top column if they were supposed to be variables.

I don't just join them because there is some slight complexities I know how to deal with that I'm glossing over with respect to the SubTech column, which actually contains multiple techs, and therefore the two SubTech columns are not identical. I can use the where statement along with an index indicator function to get around that though.

I was trying to take advantage of the internal loop structure of the data step by using a sorted data set and the first/last commands. Perhaps badly. I'll check out that link. Thank you for your help!

SAS Programming

Loops from first to last

Re: Loops from first to last

Re: Loops from first to last

Re: Loops from first to last

Re: Loops from first to last

Re: Loops from first to last

Re: Loops from first to last

Iterative Loop

call nlpqn within the loop

Looping proc optmodel

Loop

Debugging loop

Follow Us

What is...

SAS Programming

Our biggest data and AI event of the year.

SAS Training: Just a Click Away

Follow Us

What is...