BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
WLP165
Calcite | Level 5

I have the following groupings of data (fictitious data) and need a little help.

Value1   Value2   Value3   Value4   Value5   Value6

01          123         00          2            1234       1

01          123         01          3            1235       1

01          123         02          3            1234       2

02          124         00          2            1234       1

02          124         01          3            1236       1

02          124         02          3            1235       1

03          125         00          2            1230       1

03          125         01          3            1234       1

03          125         03          5            1235       1

The Value2 variable will always be the same within each group (account), such as shown above.  Where the Value4 = 2 and Value5 = 1234, then I have to include ALL of the observations in each group (i.e., all three obs in the first two groups above) into an output SAS dataset.  The last grouping of data would NOT be included.

Any help/examples are appreciated (have not yet done something like this in SAS; I'm a novice user).

1 ACCEPTED SOLUTION

Accepted Solutions
WLP165
Calcite | Level 5

Forgot the second parenthesis in the second "sum" line (typo).  This produced what I needed:

proc sql;

  create table want as

    select * from have

      group by value1,value2 having

        (sum(catx('  ',value4,value5,value6) = '3 1234 1') or

         sum(catx('  ',value4,value5,value6) = '5 1234 1')) and

        (sum(value4 = '2' and value5 ne '1234'));

quit;

View solution in original post

26 REPLIES 26
Community_Help
SAS Employee

stat_sas
Ammonite | Level 13

proc sql;

create table want as

select * from have

where value1 in (select value1 from have where value4=2 and value5=1234);

quit;

WLP165
Calcite | Level 5

Thanks so much for the help!

WLP165
Calcite | Level 5

I expanded the data example somewhat to be more realistic.  See below.  Given the same value4 and value5 extract criteria, the first group of obs for value1 = 01 would be pulled, the first group of obs where value1=02 would be pulled, and the second group of obs where value=03 would be pulled.  Would this change your code solution any?

Value1   Value2   Value3   Value4   Value5   Value6

01          123         00          2            1234       1

01          123         01          3            1235       1

01          123         02          3            1234       2

01          127         00          2            1237       1

01          127         01          3            1234       1

01          127         02          3            1234       1

02          124         00          2            1234       1

02          124         01          3            1236       1

02          124         02          3            1235       1

02          128         00          2            1238       1

02          128         01          3            1234       1

02          128         02          3            1234       1

03          125         00          2            1230       1

03          125         01          3            1234       1

03          125         03          5            1235       1

03          129         00          2            1234       1

03          129         01          3            1234       1

03          129         02          3            1234       1

art297
Opal | Level 21

If I correctly understand what you want to do, one minor correction needed to be made for stat's proposed code to work, namely:

proc sql;

  create table want as

     select * from have

        where value2 in (select value2 from have where value4=2 and value5=1234);

quit;

Ksharp
Super User
data have;
input (Value1   Value2   Value3   Value4   Value5   Value6 ) ($);
cards;
01          123         00          2            1234       1
01          123         01          3            1235       1
01          123         02          3            1234       2
01          127         00          2            1237       1
01          127         01          3            1234       1
01          127         02          3            1234       1
02          124         00          2            1234       1
02          124         01          3            1236       1
02          124         02          3            1235       1
02          128         00          2            1238       1
02          128         01          3            1234       1
02          128         02          3            1234       1
03          125         00          2            1230       1
03          125         01          3            1234       1
03          125         03          5            1235       1
03          129         00          2            1234       1
03          129         01          3            1234       1
03          129         02          3            1234       1
;
run;
proc sql;
create table want as
 select * from have group by value1,value2 having sum(catx(' ',value4,value5)='2 1234') gt 0;
quit;


Xia Keshan

WLP165
Calcite | Level 5

I tried Xia's code.  I do get a SAS dataset created, but with no observations and six columns.  Also, the SAS Log shows a message that says "The query requires remerging summary statistics back with the original data."

WLP165
Calcite | Level 5

One additional note.  I bring in the data from an external text file via a data step using infile and input statements.  However, the suggested Proc SQL code used was from Xia.

William

art297
Opal | Level 21

Are Value4 and Value5 both character variables? Also, are both Value1 and Value2 needed to identity a unique group, or just Value2?

The remerging summary statistics note is simply a note produced when incorporating summary functions. Nothing to be concerned about. But not getting the desired data is definitely a concern.

WLP165
Calcite | Level 5

The Value4 and Value5 are both character variables.  Also, yes, both Value1 and Value2 would be needed to identify unique groups.  The Value1 is basically a fictitious code (for, let's say, a State), and Value2 is simply a unique fictitious number identifier of each group within each Value1.  In other words, there will be many unique Value2's within each Value1.

art297
Opal | Level 21

Actually, 's code will work with both character and numeric. I'll have to guess that your actual data doesn't have any records where Value4=2 and Value5=1234

WLP165
Calcite | Level 5

My actual data does have Value4 of 2s, which are also inputted as character format.  The Value5 values (also inputted as character) are actually six-digit numbers (just did them as four-digits here for illustrative purposes).  Also, there is no summing done in the mathematical sense. Just trying to extract certain observations (records) from a database based on the criteria noted, and then output the records to a separate dataset.

William

art297
Opal | Level 21

's code is simply counting the number of times that value4 and value5 contain a 2 and 1234, respectively, within the groups records. If the combination exists at least once, all of the groups records are output. You now said that value5 is actually a 6 character variable, thus obviously 1234 won't match. I think you have to provide a better example of what your data look like and what your criteria are. His code can handle any length variable.

WLP165
Calcite | Level 5

My apologies, I inadvertently used an incorrect Value5 in my extract criteria.  The Proc SQL code worked and got what I needed.  Thanks so much for your help!

William

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 26 replies
  • 1470 views
  • 0 likes
  • 5 in conversation