Solved: How to extract columns of data1 with respect to data2?

Kaushik1 · Posted 01-24-2019 03:27 AM

I have data(data1) in following format-

obs | Label |  A  |  B  |  C  |  D |...

1   | sub1  |  12 |  12 |  45 | 12 |... 

2   |   :   |        

3

:   | sub1  |           :

and data(data2) in following format-

obs | credit | score |...

1   |   1    |   ...

2   |   3    |

3   |   1    |

4     |     2   |

5     |     2   |

where,

[len(obs) of data2]=[(len(column) of data1)-2] 2 is for excluding obs and label column.

credit=K(in this case 5)

credits are w.r.t. A,B,C,D...

I want to create tables or datasets which are having similar credits. i.e.

table1-

obs  |   A   |  C  |

1    |  12  |  45 |

2    |        :

:

table2-

obs  |   D   |  E  |

1    |   12  |     |

:

would you tell me how can I able to solve above scenario in dynamically?

ballardw · Posted 01-25-2019 11:59 AM

Really not seeing the connection between your ABC1 data set an the output.

There are times when a person is so involved with a process they don't realize how little the rest of world does not know. So people like us here may need the steps described in small chunks. Such exactly how Stud13 and Stud4 variables are chosen for table T1? And why is that one T1 instead of possibly T3 or T5 or some other name? Why only one variable in T3? Why only 3 output sets?

I am going to guess that you want to group the STUDNAME by the values of CREDIT in ABC1.

Here may be one way to approach this:

data abc;
input label$ stud1 stud2 stud3 stud4 stud5;
datalines;
Sub1 10 20 30 65 12
Sub1 45 12 45 12 23
Sub1 45 12 14 75 32
Sub1 23 45 65 42 32
;

data abc1;
input credit Tscore StudName$;
datalines;
2 123 Stud1
1 99 Stud2
3 169 Stud3
3 195 Stud4
1 95 Stud5
;
run;

Proc sort data=abc1;
   by credit studname;
run;

Proc transpose data=abc1 out=abctrans
   prefix=Var
;
by credit;
var studname;
run;

data work.code;
   set abctrans;
   array v var: ;
   varlist= catx(',', of v(*));
   /* caution credit must be integer for this to work*/
   length tablename $ 41. ;
   tablename= cats('T',credit);
   length longstring $ 200;
   longstring = catx(' ','proc sql; create table',tablename,'as select',varlist,'from abc;quit;');
   call execute(longstring);

run;

CALL execute places code strings into a stack that executes AFTER the current data step generating the calls completes. In this case I save the generated code into a data set for review but frequently a DATA _null_ is used.

If the code you need to generate for each record in the input set is more complex you can have multiple Call execute statements. Each statement is just code placed into the stack so complete statements or procedure calls need not be in one call execute statement.

View solution in original post

ballardw · Posted 01-24-2019 10:58 AM

I think that you need to provide a little bit more detail about exactly what

[len(obs) of data2]=[(len(column) of data1)-2]

means. As in place some values from your example data and what operation is being performed. You are obviously referencing some other than SAS approach as Len is not a function in SAS that I am aware of and SAS functions would use lists of variable names. So "obs" and "column" are kind of undefined in SAS terms. Explicit lists of the values and the variables mentioned in you data1 or data2 would be needed.

credit=K(in this case 5)

Is sort of meaningless as there is no K shown. So how would we tell that the value should be 5??

A more completely worked example might give us enough clues.

If you have your data in SAS then Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

Kaushik1 · Posted 01-24-2019 11:44 PM

[len(obs) of data2]=[(len(column) of data1)-2]

This line is for reference purpose, which shows if we have 6 columns in data1 then there will be total 4 rows will be there in data2.

credit=K(in this case 5)

Above line shows max(credit) for given data. In this case it 5 i.e. credit will not go above 5.

Kaushik1 · Posted 01-25-2019 05:41 AM

data abc;
input label$ stud1 stud2 stud3 stud4 stud5;
datalines;
Sub1 10 20 30 65 12
Sub1 45 12 45 12 23
Sub1 45 12 14 75 32
Sub1 23 45 65 42 32
;
proc print data=abc;run;

data abc1;
input credit Tscore StudName$;
datalines;
2 123 stud1
1 89 stud2
3 154 stud3
3 194 stud4
1 99 stud5
;
proc print data=abc1;run;

proc sql;
create table t1 as select stud3, stud4 from abc;
run;

proc sql;
create table t2 as select stud2, stud5 from abc;
run;

proc sql;
create table t3 as select stud1 from abc;
run;

This is how I created tables manually, which should be dynamically created.

Also number of tables are equals to max(credit).

This will be helpful...

Tables are containing columns which are having similar credits.

Maximum credit limit in above case is 3(which is user defined) therefore total number of tables will be equals to 3.

andreas_lds · Posted 01-25-2019 05:49 AM

Thanks for posting input data, now please post and explain what you expect as result.

Kaushik1 · Posted 01-25-2019 06:05 AM

Expected output is tables which in above case I have created manually(t1,t2,t3)

Creation of tables should be in dynamic way.

andreas_lds · Posted 01-25-2019 06:15 AM

@Kaushik1 wrote:

Expected output is tables which in above case I have created manually(t1,t2,t3)

Creation of tables should be in dynamic way.

Ok, got that. And now please describe the logic that resulted in selecting Stud3 and Stud4. As @ballardw already mentioned, your explanation in the initial post is difficult - for me impossible - to understand.

Kaushik1 · Posted 01-25-2019 12:18 PM

Having similar credits.

ballardw · Posted 01-25-2019 11:59 AM

Really not seeing the connection between your ABC1 data set an the output.

There are times when a person is so involved with a process they don't realize how little the rest of world does not know. So people like us here may need the steps described in small chunks. Such exactly how Stud13 and Stud4 variables are chosen for table T1? And why is that one T1 instead of possibly T3 or T5 or some other name? Why only one variable in T3? Why only 3 output sets?

I am going to guess that you want to group the STUDNAME by the values of CREDIT in ABC1.

Here may be one way to approach this:

data abc;
input label$ stud1 stud2 stud3 stud4 stud5;
datalines;
Sub1 10 20 30 65 12
Sub1 45 12 45 12 23
Sub1 45 12 14 75 32
Sub1 23 45 65 42 32
;

data abc1;
input credit Tscore StudName$;
datalines;
2 123 Stud1
1 99 Stud2
3 169 Stud3
3 195 Stud4
1 95 Stud5
;
run;

Proc sort data=abc1;
   by credit studname;
run;

Proc transpose data=abc1 out=abctrans
   prefix=Var
;
by credit;
var studname;
run;

data work.code;
   set abctrans;
   array v var: ;
   varlist= catx(',', of v(*));
   /* caution credit must be integer for this to work*/
   length tablename $ 41. ;
   tablename= cats('T',credit);
   length longstring $ 200;
   longstring = catx(' ','proc sql; create table',tablename,'as select',varlist,'from abc;quit;');
   call execute(longstring);

run;

CALL execute places code strings into a stack that executes AFTER the current data step generating the calls completes. In this case I save the generated code into a data set for review but frequently a DATA _null_ is used.

If the code you need to generate for each record in the input set is more complex you can have multiple Call execute statements. Each statement is just code placed into the stack so complete statements or procedure calls need not be in one call execute statement.

Kaushik1 · Posted 01-25-2019 12:32 PM

Q1. Really not seeing the connection between your ABC1 data set an the output.

ans- Values of abc1(data2) are calculated on the basis of abc(data1)

Q.2- How Stud3 and Stud4 variables are chosen for table T1?

ans- Because they are having similar credits.

Q.3 Why only 3 output sets?

ans- Because maximum credits assign is 3.

Note- (Stud3 & Stud4) can put in any one of table(t1,t2,t3).

How to extract columns of data1 with respect to data2?

Re: How to extract columns of data1 with respect to data2?

Re: How to extract columns of data1 with respect to data2?

Re: How to extract columns of data1 with respect to data2?

Re: How to extract columns of data1 with respect to data2?

Re: How to extract columns of data1 with respect to data2?

Re: How to extract columns of data1 with respect to data2?

Re: How to extract columns of data1 with respect to data2?

Re: How to extract columns of data1 with respect to data2?

Re: How to extract columns of data1 with respect to data2?

Re: How to extract columns of data1 with respect to data2?

Catch up on SAS Innovate 2026

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away