BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
FerGui1
Fluorite | Level 6

Hello,

From the data set below (data set A) I need to retain (or extract to a subdata set) common Condition values between the Period 1 and 2 (which would be B, C, D, and E); I also need to do the same among Period 1, 2 and 3 (C, D, E); and among Period 1, 2, 3, and 4 (D, E). Any help on this will be much appreciated.

Thanks.

 

Data set A
Period      Condition    Response1    Response2    Response3

1              A                                                               

1              B                                                               

1              C                                                               

1              D                                                              

1              E                                                               

2              B                                                               

2              C                                                               

2              D                                                              

2              E                                                               

3              C                                                               

3              D                                                              

3              E                                                               

4              D                                                              

4              E                                                               

1 ACCEPTED SOLUTION

Accepted Solutions
FerGui1
Fluorite | Level 6

Hi Mkeintz,

It works.

Thank you much for your assistance on this.

 

View solution in original post

12 REPLIES 12
ballardw
Super User

You are not expressing your need very clearly.

Show us least two different of these subsets.

 

And why do you need a different data set? If the criteria is easy enough you could likely use that for any need.

Reeza
Super User

Could you please illustrate the expected output, if this is your input data?

Are your rules stated below fixed, or do they change depending on the data somehow?

 


@FerGui1 wrote:

Hello,

From the data set below (data set A) I need to retain (or extract to a subdata set) common Condition values between the Period 1 and 2 (which would be B, C, D, and E); I also need to do the same among Period 1, 2 and 3 (C, D, E); and among Period 1, 2, 3, and 4 (D, E). Any help on this will be much appreciated.

Thanks.

 

Data set A
Period      Condition    Response1    Response2    Response3

1              A                                                               

1              B                                                               

1              C                                                               

1              D                                                              

1              E                                                               

2              B                                                               

2              C                                                               

2              D                                                              

2              E                                                               

3              C                                                               

3              D                                                              

3              E                                                               

4              D                                                              

4              E                                                               


 

FerGui1
Fluorite | Level 6

Hi Reeza,

The major data set is:

Data set A                                                                                                       

Period           Batch          Condition      Response1        Response2        Response3

1                     1                  A                                                                          

1                     1                  B                                                                          

1                     1                  C                                                                          

1                     1                  D                                                                          

1                     1                  E                                                                          

1                     2                  A                                                                          

1                     2                  B                                                                          

1                     2                  C                                                                          

1                     2                  D                                                                          

1                     2                  E                                                                          

2                     1                  B                                                                          

2                     1                  C                                                                          

2                     1                  D                                                                          

2                     1                  E                                                                          

2                     2                  B                                                                          

2                     2                  C                                                                          

2                     2                  D                                                                          

2                     2                  E                                                                          

3                     1                  C                                                                          

3                     1                  D                                                                          

3                     1                  E                                                                          

3                     2                  C                                                                          

3                     2                  D                                                                          

3                     2                  E                                                                          

4                     1                  D                                                                          

4                     1                  E                                                                          

4                     2                  D                                                                          

4                     2                  E                                                                          

The Batch variable is a block effect in the experiment (replication).

From this data set I need to create 3 sub data sets for further analyses:

The first one (sub data set A1) should only include common observations within the variable Condition for Period 1 and Period 2, s this sub data set will look like:

Sub data set A1                                                                                                              

Period                               Batch       Condition      Response1       Response2        Response3

1                                         1              B                                                                          

1                                         1              C                                                                          

1                                         1              D                                                                          

1                                         1              E                                                                          

1                                         2              B                                                                          

1                                         2              C                                                                          

1                                         2              D                                                                          

1                                         2              E                                                                          

2                                         1              B                                                                          

2                                         1              C                                                                          

2                                         1              D                                                                          

2                                         1              E                                                                          

2                                         2              B                                                                          

2                                         2              C                                                                          

2                                         2              D                                                                          

2                                         2              E      

 

The second one (sub data set A2) should only include common observations within the variable Condition for Period 1, Period 2, and Period 3, so this sub data set will look like:

Sub data set A2                                                                                           

Period                      Batch    Condition    Response1    Response2    Response3

1                              1           C                                                              

1                              1           D                                                             

1                              1           E                                                              

1                              2           C                                                              

1                              2           D                                                             

1                              2           E                                                              

2                              1           C                                                              

2                              1           D                                                             

2                              1           E                                                              

2                              2           C                                                              

2                              2           D                                                             

2                              2           E                                                              

3                              1           C                                                              

3                              1           D                                                             

3                              1           E                                                              

3                              2           C                                                              

3                              2           D                                                             

3                              2           E                                                              

And the third sub data set will include all periods, so it will look like:

Sub data set A3                                                                                           

Period                      Batch    Condition    Response1    Response2    Response3

1                              1           D                                                             

1                              1           E                                                              

1                              2           D                                                             

1                              2           E                                                              

2                              1           D                                                             

2                              1           E                                                              

2                              2           D                                                             

2                              2           E                                                              

3                              1           D                                                             

3                              1           E                                                              

3                              2           D                                                             

3                              2           E                                                              

4                              1           D                                                             

4                              1           E                                                              

4                              2           D                                                             

4                              2           E                                                              

 

tarheel13
Rhodochrosite | Level 12
Why don’t you just make a dataset for period 1 and another for period 2 and then merge by condition. You can use in operators (in=a) and (in=b) and do if a and b.
FerGui1
Fluorite | Level 6
I did that but the only output I've got was a column with the condition values that were common among periods, nothing else.
mkeintz
PROC Star

It's unclear, but I think this is what you want:

data have;
  input Period Condition :$1. ;
datalines;
1 A
1 B
1 C
1 D
1 E
2 B
2 C
2 D
2 E
3 C
3 D
3 E
4 D
4 E
run;
data want (drop=period);
  merge have (where=(period=1) in=in1)
        have (where=(period=2) in=in2)
        have (where=(period=3) in=in3)
        have (where=(period=4) in=in4);
  by condition;
  period_1_2 = min(in1,in2);
  period_1_2_3 = min(in1,in2,in3);
  period_1_2_3_4 = min(in1,in2,in3,in4);
run;


This assumes that within each period, the data are sorted by condition, although the data need not be sorted by period.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
FerGui1
Fluorite | Level 6

Hi MKeintz,

My apologies if I was not clear with my question. To be more clear, the major data set is:

Data set A                                                                                                       

Period           Batch          Condition      Response1        Response2        Response3

1                     1                  A                                                                          

1                     1                  B                                                                          

1                     1                  C                                                                          

1                     1                  D                                                                          

1                     1                  E                                                                          

1                     2                  A                                                                          

1                     2                  B                                                                          

1                     2                  C                                                                          

1                     2                  D                                                                          

1                     2                  E                                                                          

2                     1                  B                                                                          

2                     1                  C                                                                          

2                     1                  D                                                                          

2                     1                  E                                                                          

2                     2                  B                                                                          

2                     2                  C                                                                          

2                     2                  D                                                                          

2                     2                  E                                                                          

3                     1                  C                                                                          

3                     1                  D                                                                          

3                     1                  E                                                                          

3                     2                  C                                                                          

3                     2                  D                                                                          

3                     2                  E                                                                          

4                     1                  D                                                                          

4                     1                  E                                                                          

4                     2                  D                                                                          

4                     2                  E                                                                          

The Batch variable is a block effect in the experiment (replication).

From this data set I need to create 3 sub data sets for further analyses:

The first one (sub data set A1) should only include common observations within the variable Condition for Period 1 and Period 2, s this sub data set will look like:

Sub data set A1                                                                                                              

Period                               Batch       Condition      Response1       Response2        Response3

1                                         1              B                                                                          

1                                         1              C                                                                          

1                                         1              D                                                                          

1                                         1              E                                                                          

1                                         2              B                                                                          

1                                         2              C                                                                          

1                                         2              D                                                                          

1                                         2              E                                                                          

2                                         1              B                                                                          

2                                         1              C                                                                          

2                                         1              D                                                                          

2                                         1              E                                                                          

2                                         2              B                                                                          

2                                         2              C                                                                          

2                                         2              D                                                                          

2                                         2              E      

 

The second one (sub data set A2) should only include common observations within the variable Condition for Period 1, Period 2, and Period 3, so this sub data set will look like:

Sub data set A2                                                                                           

Period                      Batch    Condition    Response1    Response2    Response3

1                              1           C                                                              

1                              1           D                                                             

1                              1           E                                                              

1                              2           C                                                              

1                              2           D                                                             

1                              2           E                                                              

2                              1           C                                                              

2                              1           D                                                             

2                              1           E                                                              

2                              2           C                                                              

2                              2           D                                                             

2                              2           E                                                              

3                              1           C                                                              

3                              1           D                                                             

3                              1           E                                                              

3                              2           C                                                              

3                              2           D                                                             

3                              2           E                                                              

And the third sub data set will include all periods, so it will look like:

Sub data set A3                                                                                           

Period                      Batch    Condition    Response1    Response2    Response3

1                              1           D                                                             

1                              1           E                                                              

1                              2           D                                                             

1                              2           E                                                              

2                              1           D                                                             

2                              1           E                                                              

2                              2           D                                                             

2                              2           E                                                              

3                              1           D                                                             

3                              1           E                                                              

3                              2           D                                                             

3                              2           E                                                              

4                              1           D                                                             

4                              1           E                                                              

4                              2           D                                                             

4                              2           E                                                              

 

                                                                    

 

 

tarheel13
Rhodochrosite | Level 12
Can you please post the data as data lines? It is hard to see what columns you have like this.
mkeintz
PROC Star

Consider:

data period_12   (where=(_max_period>=2 and period<=2))
     period_123  (where=(_max_period>=3 and period<=3))
     period_1234 (where=(_max_period>=4 and period<=4));

  merge have (where=(period=1) in=in1)
        have (where=(period=2) in=in2) 
        have (where=(period=3) in=in3) 
        have (where=(period=4) in=in4) ;
  by batch condition;

  if min(in1,in2,in3,in4) then _max_period=4; else
  if min(in1,in2,in3)     then _max_period=3; else
  if min(in1,in2)         then _max_period=2;

  if _max_period^=. then do period=1 to _max_period;
    output;
  end;
run;

The OUTPUT statements write to all 3 output datasets.  But those datasets each have their own filter to ensure that only the right obs are written (i.e. so that period_1234 doesn't get any obs that are common only to 1 and 2).  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
FerGui1
Fluorite | Level 6

Hi Mkeintz,

It works.

Thank you much for your assistance on this.

 

Ksharp
Super User
data have;
  input Period Condition :$1. ;
datalines;
1 A
1 B
1 C
1 D
1 E
2 B
2 C
2 D
2 E
3 C
3 D
3 E
4 D
4 E
;
run;

proc sql;
create table period_1_2 as
select Condition from have where period=1
intersect
select Condition from have where period=2
;


create table period_1_2_3 as
select Condition from period_1_2
intersect
select Condition from have where period=3
;

quit;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 798 views
  • 0 likes
  • 7 in conversation