BookmarkSubscribeRSS Feed
Tariq519
Fluorite | Level 6

I am running into a problem where I would like to duplicate data from one row that contains data to other rows with the same key value, but does not contain any additional data. However, ONLY conditioned if variable 3 or V3 is equal to "Cardio". For instance. I have a dataset that looks like this.

 

Tariq519_3-1717809190112.png

 

 

 

The first two row contain the same key, but the second row does not contain any other information other than the key (no value for v1,v2, v3 etc.) Because the first row has "cardio" for V3, this row would need to be duplicated to the second row that does not contain the information. Now the second image is what I want the dataset to look like.

 

Tariq519_2-1717809154360.png

 

 

How am I able to do that in SAS? Thanks in advance.

4 REPLIES 4
ballardw
Super User

It is real hard to program against pictures. Best on this forum is to provide example data in the form of data step code so we can make a data set.

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.

 

Completely untested and assumes your data is sorted by the Key variable. Assumes there is only one row that needs to be assigned.

data want;
   set have;
   by key;
   lv1 = lag(v1);
   lv2 = lag(v2);
   lv3 = lag(v3);
   lv4 = lag(v4);
   lv5 = lag(v5);
   if not first.key and lv3='Cardio' and missing(v2) then do;
      /* assumes that missing v2 means all of v1-v4 need to be replaced*/
      v1 = lv1;
      v2 = lv2;
      v3 = lv3;
      v4 = lv4;
      v5 = lv5;
   end;
   drop lv1 - lv5;
run;

The BY statement creates variables that indicate whether the current observation is the first or last of the By group. The values are 1 (true) and 0 (false) and accessed using the First.variablename and last.variablename syntax.

Lag will retrieve the previous observation values from a data set on the SET statement. LAG is a queued function and will retrieve the value the last time a condition was true so you likely in this case do not want to conditionally request the lagged values.

Tariq519
Fluorite | Level 6

Thank you so much for this! This is my first post here, however I will definitely take your advice moving forward whenever I run into any problems. Thank you!!!

mkeintz
PROC Star

@ballardw 's response is the most sensible given you only have five variables V1 through V5.  

 

But if you have, say, 50 vars, you would need to code 50 LVx=lag(Vx) statements and 50 Vx=LVx statements:  In that case, given my understanding of your task, this might be preferable:

 

data want (drop=_:) ;
  set have ;
  by id;
  retain _id . ;
  if _n_=1 then do;
    declare hash h (dataset:'have (obs=0 rename=(id=_id))');
      h.definekey('_id');
      h.definedata(all:'Y');
      h.definedone();
  end;

  if first.id then do;
    if v3='Cardio' then _rc=h.replace();
    else _rc=h.remove();
  end;
  else if missing(v2) then _rc=h.find();
run;

Remember, this checks for any ID that begins with V3='Cardio', and all subsequent records that have missing V2.  That's how I understand your description.

 

 

Editted note: corrected code from 

    if v3='Cardio' then _rc h.replace();

to

    if v3='Cardio' then _rc=h.replace();

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Tariq519
Fluorite | Level 6

this is highly helpful as well! Going to test this one out.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 339 views
  • 3 likes
  • 3 in conversation