BookmarkSubscribeRSS Feed
kalbo
Obsidian | Level 7

hi, I have a situation where ADSL variables are selectively added to a list. In this example CORE1 should be replaced by a list of variables present in ADSL that have a coregroup of 1. CORE2 should be replaced by a list of variables in ADSL with coregroup of 2 and so on. I would like to automate this addition of variables.

 

HAVE:

dataset variable label coregroup

ADAE   USUBJID  Subject ID

ADAE :CORE1

ADAE :CORE2

ADAE  TRT  Treatment

ADMH   USUBJID  Subject ID

ADMH  MHTERM  Medical History Term

ADMH  TRT  Treatment Group

ADSL  USUBJID  Subject ID

ADSL RAND  Rand flag  1

ADSL ETHNEW  New Ethnicity  2

ADSL  SITEID  Site ID

 

WANT:

dataset variable label coregroup

ADAE   USUBJID  Subject ID

ADAE RAND  Rand flag  1

ADAE ETHNEW  New Ethnicity  2

ADAE  TRT  Treatment

ADMH   USUBJID  Subject ID

ADMH  MHTERM  Medical History Term

ADMH  TRT  Treatment Group

ADSL  USUBJID  Subject ID

ADSL RAND  Rand flag  1

ADSL ETHNEW  New Ethnicity  2

ADSL  SITEID  Site ID

11 REPLIES 11
andreas_lds
Jade | Level 19

Please post data in usable form, making it easier to understand the problem and giving us something to work with.

The output statement can be used to create additional obs.

kalbo
Obsidian | Level 7
data have;
	infile cards dsd;
	input dataset $ variable : $10. label : $20. coregroup $;
	cards;
ADAE,USUBJID,Subject ID,
ADAE,:CORE1,,
ADAE,:CORE2,,
ADAE,TRT,Treatment,
ADMH,USUBJID,Subject ID,
ADMH,MHTERM,Medical History Term,
ADMH,TRT,Treatment Group,
ADSL,USUBJID,Subject ID,
ADSL,RAND,Rand flag,1,
ADSL,ETHNEW,New Ethnicity,2,
ADSL,SITEID,Site ID,1,
;
kalbo
Obsidian | Level 7

For example to process CORE1, we need to find all records in ADSL with coregroup=1. Then add these 2 records in to replace CORE1.

ADAE RAND  Rand flag  1

ADAE  SITEID  Site ID 1

 

Then move to CORE2 and add in the following records to replace CORE2:

ADAE ETHNEW  New Ethnicity  2

 

The ADSL records can be added to any data set in the list which will contain all data sets. The list isnt so big.

 

WANT:

dataset variable label coregroup

ADAE   USUBJID  Subject ID

ADAE RAND  Rand flag  1

ADAE  SITEID  Site ID 1

ADAE ETHNEW  New Ethnicity  2

ADAE  TRT  Treatment

ADMH   USUBJID  Subject ID

ADMH  MHTERM  Medical History Term

ADMH  TRT  Treatment Group

ADSL  USUBJID  Subject ID

ADSL RAND  Rand flag  1

ADSL ETHNEW  New Ethnicity  2

ADSL  SITEID  Site ID 1

PaigeMiller
Diamond | Level 26

In addition to the request from @andreas_lds , please tell us if this is the entire data set, or if this is just a small part of a much larger data set. (If it is much larger, please tell us how many rows). Are there different subjects or different IDs in this larger data set? Would the code need to be smart enough to make different replacements for different subjects/different IDs?

--
Paige Miller
kalbo
Obsidian | Level 7

The example I gave is a part of a data set (which is the metadata of adam data sets) which is about 1000 rows. I would need to replace any CORE value with observations from the ADSL. Can be up to 4 core groupings. So whenever CORE1, 2, 3 etc appears it should also work for those.

PaigeMiller
Diamond | Level 26

I asked "Are there different subjects or different IDs in this larger data set? Would the code need to be smart enough to make different replacements for different subjects/different IDs?" I don't see an answer.

--
Paige Miller
kalbo
Obsidian | Level 7

hi, yes there would be other data sets listed. The id is the coregroup of 1,2,3 or 4 (for ADSL) which corresponds to CORE1, 2, 4, 5 for all other data sets.

kalbo
Obsidian | Level 7

yes it should  be smart enough to handle that.

Patrick
Opal | Level 21

@kalbo Your descriptions are somewhat confusing. May be that's because of formulations like:

 

>CORE1 should be replaced by a list of variables
That's not a list of variables. You want to replace records with specific values for variables dataset and variable with values from some other records in the same table. What increases the confusion are the variable names "variable" and "dataset".

 

>The ADSL records can be added to any data set in the list 

What's confusing here: In the SAS world the term "data set" normally means "table" and not row/observation/record.

 

Based on your WANT data I believe what you're after is something like below:

data have;
  infile cards dsd truncover;
  input dataset $ variable : $10. label : $20. coregroup $;
  cards;
ADAE,USUBJID,Subject_ID,
ADAE,:CORE1,,
ADAE,:CORE2,,
ADAE,TRT,Treatment,
ADMH,USUBJID,Subject_ID,
ADMH,MHTERM,Medical_History_Term,
ADMH,TRT,Treatment_Group,
ADSL,USUBJID,Subject_ID,
ADSL,RAND,Rand_flag,1,
ADSL,ETHNEW,New_Ethnicity,2,
ADSL,SITEID,Site_ID,1,
;

data want;
   input dataset $ variable : $10. label : $20. coregroup $;
   datalines;
ADAE USUBJID Subject_ID .
ADAE RAND Rand_flag 1
ADAE SITEID Site_ID 1
ADAE ETHNEW New_Ethnicity 2
ADAE TRT Treatment .
ADMH USUBJID Subject_ID .
ADMH MHTERM Medical_History_Term .
ADMH TRT Treatment_Group .
ADSL USUBJID Subject_ID .
ADSL RAND Rand_flag 1
ADSL ETHNEW New_Ethnicity 2
ADSL SITEID Site_ID 1
;
run;

data want_derived;
  if _n_=1 then
    do;
      dcl hash h1(dataset:'have(where=(dataset="ADSL" and not missing(coregroup))', multidata:'y');
      h1.defineKey('coregroup');
      h1.defineData('variable','label','coregroup');
      h1.defineDone();
    end;
  set have;
  if dataset='ADAE' and find(variable,'core','i')>0 then
    do;
      do while(h1.do_over(key:strip(scan(variable,-1,,'kd'))) eq 0);
        output;
      end;
    end;
  else
    output;
run;

proc compare data=want compare=want_derived;
run;
proc print data=want;run;
proc print data=want_derived;run;
PaigeMiller
Diamond | Level 26

@kalbo wrote:

hi, yes there would be other data sets listed. The id is the coregroup of 1,2,3 or 4 (for ADSL) which corresponds to CORE1, 2, 4, 5 for all other data sets.


Not sure I understand this. It would help to see other data sets. Could you please provide sample data which covers at least two data sets?

 

Lots of words have been written, I have to scroll up and down to get a complete picture. Can you rewrite the question to address all of these concerns in one post?

--
Paige Miller
Tom
Super User Tom
Super User

You are not explaining your problem very clearly.

Let's see if we can guess what you are asking to do.

I suspect that you are trying to automate some part of the process of generating metadata about clinical trial data from your use of dataset names like ADAE and ADSL 

 

So you seem to have a dataset named HAVE which has incomplete metadata about the variables in the datasets used (generated for?) your clinical trial. 

Spoiler
data have;
	infile cards dsd truncover ;
	input dataset :$32. variable :$32. label :$256. coregroup $;
cards;
ADAE,USUBJID,Subject ID,
ADAE,:CORE1,,
ADAE,:CORE2,,
ADAE,TRT,Treatment,
ADMH,USUBJID,Subject ID,
ADMH,MHTERM,Medical History Term,
ADMH,TRT,Treatment Group,
ADSL,USUBJID,Subject ID,
ADSL,RAND,Rand flag,1,
ADSL,ETHNEW,New Ethnicity,2,
ADSL,SITEID,Site ID,1,
;
Tom_0-1740627620590.png

The source of this incompleteness is the use of special values like ':CORE1' and 'CORE2' as PLACEHOLDERS for an actual set of variable names that should be inserted into the metadata table.

 

So you should be able to do some simple joins to get what you want.

proc sql;
create table want as
select  dataset,variable,label
  from have
  where not (variable eqt ':CORE')
union
select a.dataset,b.variable,b.label
  from have a
  inner join have b
  on a.variable eqt ':CORE'
    and b.dataset='ADSL' and b.coregroup=substr(a.variable,6)
order by dataset,variable
;
quit;

Tom_1-1740628029249.png

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 1746 views
  • 0 likes
  • 5 in conversation