Re: Cyclic reading of the datasets

joycefilm · Posted 07-10-2023 11:51 AM

The following datasets temp1 and temp2. What methods are available to get temp3?

数据集temp1：

Visit tpt VAL

C1D1 -12 1

C2D1 0 2

C3D1 1 4

C4D1 2 5

C5D1 3 1

...

数据集temp2：

VAL=3

数据集temp3：

Visit tpt VAL

C1D1 -12 1

C1D1 -12 3

C2D1 0 2

C2D1 0 3

C3D1 1 4

C3D1 1 3

C4D1 2 5

C4D1 2 3

C5D1 3 1

C5D1 3 3

FreelanceReinh · Posted 07-10-2023 12:27 PM

Hello @joycefilm,

You can read dataset TEMP2 in each iteration of a DATA step reading dataset TEMP1 by using the POINT= option (see documentation of SET options):

data temp3;
set temp1;
output;
_n_=1;
set temp2 point=_n_;
output;
run;

To generalize this to a TEMP2 dataset with two or more observations, use a DO loop and the NOBS= option:

data temp3;
set temp1;
output;
do _n_=1 to n2;
  set temp2 point=_n_ nobs=n2;
  output;
end;
run;

Tom · Posted 07-10-2023 12:31 PM

Do you know how TEMP3 is supposed to be created from TEMP1 and TEMP2, but just don't know how to program the algorithm?

If so can you explain how TEMP3 is related to TEMP1 and TEMP2?

Or are you expecting use to GUESS how TEMP3 is related to TEMP1 and TEMP2?

Reeza · Posted 07-10-2023 01:11 PM

You can hard code in the duplication with Val = 3, but if that will change and you need multiple or something it will be a different process. And is TEMP2 a single variable with the val = 3 or does it have a variable called VAL set to 3?

mkeintz · Posted 07-10-2023 11:41 PM

Let's generalize a little, and assume that TEMP2 has more than one observation, and has more than one variable, some of which may not be in TEMP1.

data temp1;
  input Visit :$4.   tpt   VAL;
datalines;
C1D1   -12    1
C2D1    0     2
C3D1    1     4
C4D1    2     5
C5D1    3     1
run;

data temp2;
  input val   newvar;
datalines;
3     1001
3.1   1002
run;

Then it might be worthwhile to make a hash of TEMP2 (with an additional _ORDER variable). For each obs in TEMP1, output to temp3 and add an extra obs for every dataitem in the hash, in sequence determined by _ORDER.

data dummy2;
  _order=_n_;
  set temp2;
run;

data temp3 (drop=_:) ;
  set temp1;
  output;

  if _n_=1 then do;
    if 0 then set dummy2 nobs=size_of_temp2;
    declare hash h (dataset:'dummy2');
      h.definekey('_order');
      h.definedata(all:'Y');
      h.definedone();
  end;

  do _order=1 to size_of_temp2;
    h.find(key:_order);
    output;
  end;
  call missing(of _all_);
run;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Cyclic reading of the datasets