BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Syntas_error
Quartz | Level 8

So I have this task where I have the variable "team number" and "tec". Unfortunately, the value of "tec" isn't always correct. The correct team numbers and corresponding tec- values are listed in a separate document,  I'm charged with listing the of the incorrect ones. 

 

I start by writing

 

 

proc sql feedback;
select a.teamno as teamno, a.tec as tec1, b.tec as tec2
from tabell_1 as a, val2 as b
where a.teamno=b.teamno and tec1 ^= tec2 ;
quit;

Which would be all well and good....if each team number was assigned to one and only one tec-number, unfortunately, this isn't the case, and the following six teams can have multiple tec numbers assigned to them:

 

 

Team Tec

120    3
120    9

 

302    5
302    8

 

502    15
502    16

 

512    14
512    15

 

520    9
520    15

 

523    11
523    16

 

I want to make an exception for these values (I don't know if that's the most elegant solution), so I write:  

 

 

proc sql feedback;
select a.teamno as teamno, a.tec as tec1, b.tec as tec2
from tabell_1 as a, val2 as b
where a.teamno=b.teamno and tec1 ^= tec2 
	and (a.teamno ^=120 and (a.tec=3 or 9 and b.tec=3 or 9))
	;
quit;

 

 

Unfortunately, only the first part of the added condition seems to be excecuted:

 

NOTE: Statement transforms to:

        select A.teamno as teamno, A.tec as tec1 label='tec', B.tec as tec2 label='tec'
          from WORK.TABELL_1 A, WORK.VAL2 B
         where (A.teamno = B.teamno) and (A.tec1 not = B.tec2) and (A.teamno not = 120);

1155  quit;

 

 

what an I doing wrong, and is there a better way of doing this?

 

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Syntas_error
Quartz | Level 8

Right, I'm trying the following atm, I "think" this is a better solution but do tell me if I'm wrong. 

 

First I perform a full join on two conditions, where team number "and" tec-value is equal for values in the two tables, I save the resulting join as a new table. 

 

proc sql feedback;
create table kanske as
select a.teamno as teamno, a.tec as tec1 label='tec1', b.tec as tec2 label='tec2'
from tabell_1 as a full join val2 as b
on a.teamno=b.teamno and a.tec=b.tec;
quit;

Then I make a query on this new table, for the observatoins where tec2 does not equal tec1

 

proc sql feedback;
select teamno, tec1, tec2
from kanske
where tec1^=tec2;
quit;

I make a manual inspection of the results, this seems legit.

 

Finally I tell SAS to count every observation for which pec1=/=pec2:

 

 

proc sql feedback;
select count(*)
from kanske
where tec1^=tec2;
quit;

View solution in original post

4 REPLIES 4
PeterClemmensen
Tourmaline | Level 20

Can you provide some representative data of what tabell_1 and val2 looks like? And also what you want the desired result to look like given this data?

 

Makes it much easier to provide a usable code answer 🙂

Syntas_error
Quartz | Level 8

Tabell_1 is enormous but I'll provide a snippet:

 

(variable)(variable)Teamnotec(variable)(variable)(variable)
710B212610301551710
1038B212607301581420
846B212608301591512
897B2130533025161551
722B2130543025181438
796B21514030216351645
721B213055302519179
1257B2130503025121122
867B213047302510658
1370B213048302513855
1369B213049302514948
898B213052302515158
919B2130513028111210
332B2130583025141945
333B213057302511855
334B213056302517184
1234B2116093036221215

 

 

The "val2"-file looks like this:

 

22919
2309
3005
3015
3025
3028
3036
3046
3056
3064
3074
3084
3094
3104
3209
3224

 

In my query I try to compare "real" and "reported" tec-numbers, that would look something like this:

 

TeamnoTec1Tec(real)
30258
30258
302165
302168
30258
30258
30258
30258
30258
30258
30285
30258
30258
30258
30794
320129

 

 

We can see that team 302 is assigned to two different areas (or "pecs"). There's no unique ID-number present in the val2-file so I can't really work through standard procedures. The only values I can list as being incorrect (with any degree of certainty) is the ones not allowed for that team number. In this section, I would only really want to include the teams with a recorded PEC-value of 16, Values of 5 and 8 could both be fine, since we don't have any unique ID-values to work with!

Syntas_error
Quartz | Level 8

Right, I'm trying the following atm, I "think" this is a better solution but do tell me if I'm wrong. 

 

First I perform a full join on two conditions, where team number "and" tec-value is equal for values in the two tables, I save the resulting join as a new table. 

 

proc sql feedback;
create table kanske as
select a.teamno as teamno, a.tec as tec1 label='tec1', b.tec as tec2 label='tec2'
from tabell_1 as a full join val2 as b
on a.teamno=b.teamno and a.tec=b.tec;
quit;

Then I make a query on this new table, for the observatoins where tec2 does not equal tec1

 

proc sql feedback;
select teamno, tec1, tec2
from kanske
where tec1^=tec2;
quit;

I make a manual inspection of the results, this seems legit.

 

Finally I tell SAS to count every observation for which pec1=/=pec2:

 

 

proc sql feedback;
select count(*)
from kanske
where tec1^=tec2;
quit;
Astounding
PROC Star

If your data sets were smaller, this would be a simple sort-and-merge problem:

 

proc sort data=table1;
   by team tec;
run;
proc sort data=val;
   by team tec;
run;

data want;
   merge table1 (in=in1) val (in=in2);
   by team tec;
   if in2=0;
run;

I'm assuming the reason you aren't doing this is the size of TABLE1 and the time it would take to sort it.  You can overcome that in a couple of ways.  Hashing is one.  Formats is another, as shown here:

 

data valid_vals;
   set val;
   start = catx(' ', team, tec);
   label='VALID';
   fmtname = '$found';
run;
proc format cntlin=valid_vals;
run;

This gives you a format that translates valid combinations of TEAM + TEC into the word "VALID".  That's the hard part.  Using the format is easier:

 

data want;
   set table1;
   where put(catx(' ', team, tec), $found.) ne 'VALID';
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 1862 views
  • 2 likes
  • 3 in conversation