BookmarkSubscribeRSS Feed
wheddingsjr
Pyrite | Level 9

Thanks art297

 

First let me apologize to everyone, I am green when it comes to SAS code so forgive my naivete, but this code is confusing to me, Usint this code means I do or do not have to seperate the "TYPE"? I am also confused by the "from have as a" statement as well as the "  group by key1,key2,key3,key4) as b on a.key1 = b.key1" statement.

mkeintz
PROC Star

@art297   I'm still under the impression that the OP wants one row with seven groups of variables for each key combination, not 7 rows per combo with one set of vars.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
art297
Opal | Level 21

@mkeintz, you could of course be correct, but that isn't what I gleaned from the OP's example. Of course, if my latest suggested code does what he/she wants, a simple proc transpose of the results would satisfy that requirement.

 

Art, CEO, AnalystFinder.com

 

p.s. Are you going to Orlando? I didn't think I was, but just got a nice surprise from SAS Canada and will, after all, be attending.

 

 

art297
Opal | Level 21

You didn't show/mention type in your example dataset. Adding it would be simple. 

 

The code simply looks for two things: (1) all instances sharing the same key1 values and (2) all instances sharing all four key values. It creates counts for each (i.e., rec_count and comp_count) and only selects those key1 values that have equal rec_count and comp_count values (i.e., don't have any records that have multiple key2, key3 and key4 values).

 

The from have as a and from have as b statements are simply referring to the two versions of your data (i.e., the first instance where it is getting the counts of all records that share the same key1 and type values AND referring to it later as file a, the other instance where it is getting the counts of all records that share the same key1, key2, key3, key4 and type values AND referring to it later as file b).

 

I'd think that the following modification of the code would do what you need regarding the type variable:

 

proc sql;
create table want (drop=rec_count comp_count) as
select *, count(*) as rec_count
from have as a
join (
select key1,key2,key3,key4,type,count(*) as comp_count
from have
where type in ('A','B','C','D','E','F','G')
group by key1,key2,key3,key4,type) as b
on a.key1 = b.key1 and a.type eq b.type
group by a.key1,a.type
having rec_count eq comp_count and
type in ('A','B','C','D','E','F','G')
;
quit;

 Art, CEO, AnalystFinder.com

 

wheddingsjr
Pyrite | Level 9

Thanks all

 

The types were used because i began the project thinking that it was best to seperate the types. The types are actually already in my dataset but thought it would be easier bumping them up against each other thats why I wanted to differentiate them but I am finding that its use is not really necessary in the grand scheme of things.

Reeza
Super User

@wheddingsjr wrote:

Thanks all

 

The types were used because i began the project thinking that it was best to seperate the types. The types are actually already in my dataset but thought it would be easier bumping them up against each other thats why I wanted to differentiate them but I am finding that its use is not really necessary in the grand scheme of things.


Again, if you want the TYPES set up side by side, this a Transpose task rather than all the data steps and sql steps. 

If you want code look at PROC TRANSPOSE. And you're correct, typically the long format is just as useful if not easier to work with than a wide format.

 

wheddingsjr
Pyrite | Level 9

hHi again Art

 

I was almost there but at the very end got this error message:

 

415 proc sql;

416 create table results (drop=rec_count comp_count) as

417 select *, count(*) as rec_count

418 from Step2 as a

419 join (select MEMNO,BEGDATE,SProv_Spec,TAXID,count(*) as comp_count

420 from Step2

421 group by MEMNO,BEGDATE,SProv_Spec,TAXID) as b

422 on a.MEMNO = b.MEMNO

423 group by MEMNO

424 having rec_count eq comp_count;

ERROR: Ambiguous reference, column MEMNO is in more than one table.

 

(tired of writing key1 etc because that juat may be the disconnect so using the real field names)

 

art297
Opal | Level 21

My first guess would be to just change (toward the bottom of your code):

 

group by MEMNO

 

to

 

group by a.MEMNO

 

Art, CEO, AnalystFinder.com

 

Shmuel
Garnet | Level 18

Do you mean that you want only those keys with all 7 types 'A' to 'G' ?

 

There are several ways to do it:

 

1) add sort of step2 datasetbefore the 7 sql selections partA to partG and continue with the  merge with sligh changes:

data want;
merge partA (where=(type='A' )) in=inA
      partB (where=(type='B')) in=inB
      partC (where=(type='C')) in=inC
         ......
      partG(where=(type='G')) in=inG
     ;
 by  key1 key2 key3 key4;     /* equvalent to sql partA.1  partA.2 partA.3 partA.4 */
     if inA and inB and inC and ... and inG;
run;

OR

2) After sorting and selectcting by type:

   

proc sort data=step2(where=(type in ('A', 'B', 'C', 'D' 'E' 'F' 'G'))) 
                 out=TEMP; 
by key1 key2 key3 key4; 
run; 

  add two steps:

   2.1) check wether all 7 types exist in 4 keys group:

          

data keys;
  set TEMP;
   by key1 key2 key3 key4;
        length types $7;
        retain types;
        if first.key4 then types = ' ';
        select (type);
          when ("A") substr(types,1,1) = 'A';
          when ("B") substr(types,2,1) = 'B';
          when ("C") substr(types,3,1) = 'C';
        .....
          when ("G") substr(types,7,1) = 'G';
       end;
       if last.key4 and types = 'ABCDEFG'  then output;
run;  
      

    2.2)  dataset KEYS holds now only those with all 7 types.

      next merge will select them only:

 

   

data WANT;
merge TEMP (in=inT)
      KEYS (in=inK);
  by key1 key2 key3 key4;
      if inK;
run;

Both methods should have same results.

You are invited to check them and decide what is your preference;   

 

 

 

wheddingsjr
Pyrite | Level 9

Hi Shmuel

 

No I do not need the type...it was my fault for including it. I did so because I was under the impression that in order to do what I wanted to do I needed to have the 7 different tables until I learned in this forum that I didnt require it....

art297
Opal | Level 21

I hadn't tested the code before, but did this time. I was correct in my last suggestion. The following worked for me:

 

proc sql;
  create table want (drop=rec_count comp_count) as
    select *, count(*) as rec_count
      from have as a
        join (
          select memno,begdate,SProv_Spec,TAXID,count(*) as comp_count
            from have
                group by memno,begdate,SProv_Spec,TAXID) as b
          on a.memno = b.memno
            group by a.memno
              having rec_count eq comp_count
  ;
quit;

Art, CEO, AnalystFinder.com

 

wheddingsjr
Pyrite | Level 9

Hi Art

 

It worked, unfortunately there still remains lines that do not fit the criteria....the line count however is significantly less than before though.

art297
Opal | Level 21

Glad to hear that it worked but, if there is still a problem, you'd have to post examples of your have and want datasets in order for any of us to figure out what might be wrong.

 

Art, CEO, AnalystFinder.com

 

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 27 replies
  • 1370 views
  • 0 likes
  • 6 in conversation