BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
abak
Obsidian | Level 7

Hello. Noob here.

 

I have data that looks like this:

 

Tech     SubTech

T1         AB11

T1         JDE1

T1         LEIO

T2         AJNC

T2         JUN1

T2         PLNL

T3         TNEVE

T3         NED0

 

I would like to use two loops, the outside to loop through the three technologies, and the inner one for each subtech within, to create macro variables something like this:

 

subs1 = "JDE1, LEI0"

subs2 = "AJNC, JUN1, PLNL"

subs3="TNEVE, NED0"

 

via subs&i = (correct command here)

Something like:

 

data _null_;

    set techs;

    

    if first.Tech = 1 then do until last.Tech = 1

      subs&i = ;

       <second loop over j in here>

       subs&i = subs&i.||<AppropriateSubTech>; *(trying to paste together a list here like "JDE1, LEI0")

       end;

   end;

   run;    

 

Two questions: 

 

1) How do I define the inside loop?

 

2) Also, I feel like the outside loop will terminate one too early if I use last.Tech = 1 because I want to include that term in the loop as well. How do I appropriately define the index for the outer loop?

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

That isn't how you've shown your data. you've only shown values in a single column, not multiple variables.

 

select ABCD, ABCE, ABCF, DCAB, ... from table1 -> you only showed two variables, Tech and SubTech, where do variables ABCD and ABCE come from? If those are values you're not understanding what I'm trying to suggest.

 

And, Why for the first set are you missing the first entry?

 

subs1 = "JDE1, LEI0" -> where's the first one?

 

So you're trying to create subsets for each tech group based on the subtech groups? You can add a WHERE to the select statement, but I don't understand why you don't just join them and find the values you need.

 

or use an approach like this. 

 

proc sql;
create table tech1set as
select *
from dataset2
where subtech in (select subtech from table1 where tech='T1');
quit;

There's some ideas here on how to create those lists but SAS doesn't loop the way you're thinking, there's already a data step loop that you need to take advantage of, as well as the BY group processing that's supported.

 

See this example here:

https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a

 

You can run it, and I suggest you do and see how it works. If you really, really want the macro variables at the point, its a very easy addition, it would be something like the following at each of the last.group values.

 

 

if last.group then call symputx(group_name, variable_value);

View solution in original post

5 REPLIES 5
abak
Obsidian | Level 7

The point of this exercise is to subset another dataset that doesn't have the Tech column but has the SubTech column.

 

Eg, 

 

dataset2 is like:

 

SubTech Data

AB11      0.102

AB11      -1.22

LEI0       0.73

 

 

data Tech1Set;

   set dataset2;

   if SubTech IN (&subs1);

   run;

 

etc.

Reeza
Super User

Change your process:

 



proc sql;
create table tech1set as
select *
from dataset2
where subtech in (select subtech from table1);
quit;
abak
Obsidian | Level 7

Thank you for your reply.

 

I don't think that will work as I have 100's of subtechs that I need to automate their listing.

 

It would be like 

 

proc sql;
create table tech1set as
select *
from dataset2
where subtech in (select ABCD, ABCE, ABCF, DCAB, ...  from table1);
quit;

There's no way I would hand-type out those SubTechs, and the list of SubTechs is expanding, which is why I need to pull the most up to date list from the first dataset that gets updated with all active subtechs.

 

If I just literally "select subtech from tabel1" then it wouldn't subset the data between the major tech groups.

Reeza
Super User

That isn't how you've shown your data. you've only shown values in a single column, not multiple variables.

 

select ABCD, ABCE, ABCF, DCAB, ... from table1 -> you only showed two variables, Tech and SubTech, where do variables ABCD and ABCE come from? If those are values you're not understanding what I'm trying to suggest.

 

And, Why for the first set are you missing the first entry?

 

subs1 = "JDE1, LEI0" -> where's the first one?

 

So you're trying to create subsets for each tech group based on the subtech groups? You can add a WHERE to the select statement, but I don't understand why you don't just join them and find the values you need.

 

or use an approach like this. 

 

proc sql;
create table tech1set as
select *
from dataset2
where subtech in (select subtech from table1 where tech='T1');
quit;

There's some ideas here on how to create those lists but SAS doesn't loop the way you're thinking, there's already a data step loop that you need to take advantage of, as well as the BY group processing that's supported.

 

See this example here:

https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a

 

You can run it, and I suggest you do and see how it works. If you really, really want the macro variables at the point, its a very easy addition, it would be something like the following at each of the last.group values.

 

 

if last.group then call symputx(group_name, variable_value);
abak
Obsidian | Level 7

I think adding the where statement will make it work. Thank you!

 

Both data sets have two columns, one has Tech and SubTech, and the other has SubTech and Data. The data sets I showed were small examples of very large 500k+ rows of information.

 

Missing the first entry just due to missing it when providing my example. 

 

The values ABCD, ABCE...etc, are just in the dataset. Those are the fake versions of the subtech names, and are values, not separate variables, which I assume should be the top column if they were supposed to be variables.

 

I don't just join them because there is some slight complexities I know how to deal with that I'm glossing over with respect to the SubTech column, which actually contains multiple techs, and therefore the two SubTech columns are not identical. I can use the where statement along with an index indicator function to get around that though.

 

I was trying to take advantage of the internal loop structure of the data step by using a sorted data set and the first/last commands. Perhaps badly. I'll check out that link. Thank you for your help!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1088 views
  • 0 likes
  • 2 in conversation