BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Demographer
Pyrite | Level 9

Hi,

I have a first dataset called "survieH" that looks like this:

p1p2p3p4p5p6p7p8
0,998053950,999729430,999788890,999832480,999860970,999880870,999892950,99989952

I have a second dataset called "survieF" that is similar to the first one:

p1p2p3p4p5p6p7p8
0,998294330,999768910,999826960,999865110,999890350,99990690,999915570,99992048

Finaly, I have the main database ("bd2006") that contains many observations (30000) and several variables, which "sex" (0-1).

I want to create a new dataset that merges them under the condition that if sex=1 then variable I pick the p1-p8 variables in the survieF dataset and if sex=0, I pick the other one.

I tried this, without succes:

data work.bd2007;

set work.bd2006;

if _N_ EQ 1 and sex=0 then set survieH ; 

if _N_ EQ 1 and sex=1 then set survieF ;

I know I can do it by doing this, but I want to reduce the size of the dataset, so I don't think it's the best way.

data survieH (keep ph1-ph8);

set survieH;

p1=ph1;

...

p8=ph8;

run;

data work.bd2007;

set work.bd2006;

if _N_ EQ 1 then do set survieH; set survieF ; end;

array ph ph1-ph8;

array p p1-p8;

do i=1 to 8;

if sex=0 then p(i)=ph(i);

end;

drop ph1-ph8;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Here is one way using the POINT= option to prevent SAS from stopping because you read all of the P value records.

Added logic to force P1-P8 to be missing if SEX is neither 0 or 1.  SAS will automatically drop the variables named in the POINT= options.

data work.bd2007;

  set work.bd2006;

  pointer1=1;

  pointer2=1;

  if sex=0 then set survieH point=pointer1;

  else if sex=1 then set survieF point=pointer2;

  else call missing(of p1-p8);

run;

View solution in original post

7 REPLIES 7
Tom
Super User Tom
Super User

Here is one way using the POINT= option to prevent SAS from stopping because you read all of the P value records.

Added logic to force P1-P8 to be missing if SEX is neither 0 or 1.  SAS will automatically drop the variables named in the POINT= options.

data work.bd2007;

  set work.bd2006;

  pointer1=1;

  pointer2=1;

  if sex=0 then set survieH point=pointer1;

  else if sex=1 then set survieF point=pointer2;

  else call missing(of p1-p8);

run;

Demographer
Pyrite | Level 9

Thank you, it works.

But I don't understand what is the purpose of pointer1 and pointer2.

Tom
Super User Tom
Super User

The POINT= option of the SET statement uses a data step variable to allow you to dynamically directly access observations in a dataset. That is what the two pointer variables are used for.  Because we have two SET statements we need two different variables.

In this case we are always accessing the same observation because the pointer variables are always set to one.  Why it is useful for this application is that it lets you use the extra two SET statements without worrying that SAS will move past the end of one of them and cause your data step to end prematurely. Also by re-accessing the same observations from the two P value tables over and over we can reload the P1-P8 variables without having to rename them or create arrays or code DO loops to copy from the renamed variables to the desired names.

Demographer
Pyrite | Level 9

So if I have 15 SET statements, I will need 15 variables?

Tom
Super User Tom
Super User

If you want to use POINT= option on each of them.  But if you have 15 then you probably do not want to code it this way, or you want to use a macro to generate the code for you.

You could do your combination using PROC SQL by adding the SEX variable to the data with the P values.  Either keep the SEX variable when you are generating the P values or add it back in.

proc sql ;

  create table want as

    select a.*,b.*

    from have a

       left join

        (select 1 as SEX,f.* from survieF f

         union

         select 0 as SEX,h.* from survieH h) b

    on a.sex = b.sex

  ;

quit;


Astounding
PROC Star

Tom,

Do you really need 15?  Wouldn't a single variable work just as well across all SET statements?

Tom
Super User Tom
Super User

Actually I tested and it does not require different variables.  So a single pointer variable could work.  I am just used to using different variables because of the confusion that would occur if you used the same variable for options that the SET statement modifies the values such as END= or NOBS=.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1060 views
  • 0 likes
  • 3 in conversation