BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Lee_wan
Obsidian | Level 7

Hi all,

 

I'm writing a piece of code about hash. Detail please see as below.

 

data cm1;
if _n_=1 then do;
  dcl hash h();
  h.definekey('usubjid','cmseq');
  h.definedata('usubjid','cmseq','atc','atc3');
  h.definedone();
end;
  do until(last.idvarval);
    set suppcm;
    by usubjid idvarval;
    if qnam='ATC' then atc=qval;
    if qnam='ATC3' then atc3=qval;
  end;
  cmseq=input(idvarval,best.);
  h.add();

  set cm;
  if h.check()=0 then a=1;
  call missing(atc,atc3);
  rc=h.find();

  keep usubjid cmseq atc atc3 a;
run;

 

 

What makes me amazed is that cmseq=2-9 is always missing,but the data is exist when 2<=cmseq<=9.

 

as.png

 

When I use h.output() to output hash object,it real exist:

afd.png

 

 

Is there something wrong with my code?

 

The structure of SUPPCM is as follows:

sss.png

 

Thanks.

 

What I have:

 

data WORK.SUPPCM;
infile datalines dsd truncover;
input USUBJID $ 1 IDVARVAL $ 3-4 QNAM $ 6-9 QVAL :$200.;
label USUBJID="Unique Subject Identifier" IDVARVAL="Identifying Variable Value" QNAM="Qualifier Variable Name" QVAL="Data Value";
datalines;
1 1 ATC ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 1 ATC3 OTHER ANTINEOPLASTIC AGENTS
1 10 ATC ALIMENTARY TRACT AND METABOLISM
1 10 ATC3 DRUGS FOR PEPTIC ULCER AND GASTRO-OESOPHAGEAL REFLUX DISEASE (GORD)
1 11 ATC CARDIOVASCULAR SYSTEM
1 11 ATC3 SELECTIVE CALCIUM CHANNEL BLOCKERS WITH MAINLY VASCULAR EFFECTS
1 12 ATC VARIOUS
1 13 ATC ALIMENTARY TRACT AND METABOLISM
1 13 ATC3 DRUGS FOR PEPTIC ULCER AND GASTRO-OESOPHAGEAL REFLUX DISEASE (GORD)
1 14 ATC CARDIOVASCULAR SYSTEM
1 14 ATC3 SELECTIVE CALCIUM CHANNEL BLOCKERS WITH MAINLY VASCULAR EFFECTS
1 15 ATC VARIOUS
1 16 ATC VARIOUS
1 17 ATC VARIOUS
1 18 ATC VARIOUS
1 2 ATC ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 2 ATC3 ANTIMETABOLITES
1 3 ATC ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 3 ATC3 OTHER ANTINEOPLASTIC AGENTS
1 4 ATC ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 4 ATC3 ANTIMETABOLITES
1 5 ATC CARDIOVASCULAR SYSTEM
1 5 ATC3 SELECTIVE CALCIUM CHANNEL BLOCKERS WITH MAINLY VASCULAR EFFECTS
1 6 ATC ALIMENTARY TRACT AND METABOLISM
1 6 ATC3 DRUGS FOR PEPTIC ULCER AND GASTRO-OESOPHAGEAL REFLUX DISEASE (GORD)
1 7 ATC RESPIRATORY SYSTEM
1 7 ATC3 EXPECTORANTS, EXCL. COMBINATIONS WITH COUGH SUPPRESSANTS
1 8 ATC RESPIRATORY SYSTEM
1 8 ATC3 EXPECTORANTS, EXCL. COMBINATIONS WITH COUGH SUPPRESSANTS
1 9 ATC ANTIINFECTIVES FOR SYSTEMIC USE
1 9 ATC3 OTHER ANTIBACTERIALS
;;;;
run;

data WORK.CM;
infile datalines dsd truncover;
input USUBJID $ 1 CMSEQ;
label USUBJID="Unique Subject Identifier" CMSEQ="Sequence Number";
datalines;
1 1
1 2
1 3
1 4
1 5
1 6
1 7
1 8
1 9
1 10
1 11
1 12
1 13
1 14
1 15
1 16
1 17
1 18
1 19
;;;;
run;

data WORK.SUPPCM;
  infile datalines dsd truncover;
  input USUBJID:$30. IDVARVAL:$40. QNAM:$8. QVAL:$200.;
  label USUBJID="Unique Subject Identifier" IDVARVAL="Identifying Variable Value" QNAM="Qualifier Variable Name" QVAL="Data Value";
datalines;
1 1 ATC ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 1 ATC3 OTHER ANTINEOPLASTIC AGENTS
1 10 ATC ALIMENTARY TRACT AND METABOLISM
1 10 ATC3 DRUGS FOR PEPTIC ULCER AND GASTRO-OESOPHAGEAL REFLUX DISEASE (GORD)
1 11 ATC CARDIOVASCULAR SYSTEM
1 11 ATC3 SELECTIVE CALCIUM CHANNEL BLOCKERS WITH MAINLY VASCULAR EFFECTS
1 12 ATC VARIOUS
1 13 ATC ALIMENTARY TRACT AND METABOLISM
1 13 ATC3 DRUGS FOR PEPTIC ULCER AND GASTRO-OESOPHAGEAL REFLUX DISEASE (GORD)
1 14 ATC CARDIOVASCULAR SYSTEM
1 14 ATC3 SELECTIVE CALCIUM CHANNEL BLOCKERS WITH MAINLY VASCULAR EFFECTS
1 15 ATC VARIOUS
1 16 ATC VARIOUS
1 17 ATC VARIOUS
1 18 ATC VARIOUS
1 2 ATC ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 2 ATC3 ANTIMETABOLITES
1 3 ATC ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 3 ATC3 OTHER ANTINEOPLASTIC AGENTS
1 4 ATC ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 4 ATC3 ANTIMETABOLITES
1 5 ATC CARDIOVASCULAR SYSTEM
1 5 ATC3 SELECTIVE CALCIUM CHANNEL BLOCKERS WITH MAINLY VASCULAR EFFECTS
1 6 ATC ALIMENTARY TRACT AND METABOLISM
1 6 ATC3 DRUGS FOR PEPTIC ULCER AND GASTRO-OESOPHAGEAL REFLUX DISEASE (GORD)
1 7 ATC RESPIRATORY SYSTEM
1 7 ATC3 EXPECTORANTS, EXCL. COMBINATIONS WITH COUGH SUPPRESSANTS
1 8 ATC RESPIRATORY SYSTEM
1 8 ATC3 EXPECTORANTS, EXCL. COMBINATIONS WITH COUGH SUPPRESSANTS
1 9 ATC ANTIINFECTIVES FOR SYSTEMIC USE
1 9 ATC3 OTHER ANTIBACTERIALS
;;;;
run;

data WORK.CM;
  infile datalines dsd truncover;
  input USUBJID:$30. CMSEQ:BEST.;
  format CMSEQ BEST.;
  label USUBJID="Unique Subject Identifier" CMSEQ="Sequence Number";
datalines;
1 1
1 2
1 3
1 4
1 5
1 6
1 7
1 8
1 9
1 10
1 11
1 12
1 13
1 14
1 15
1 16
1 17
1 18
1 19
;;;;
run;

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Here is how I would do it.

 

data WORK.SUPPCM;
  infile datalines dsd truncover;
  input USUBJID $ 1 IDVARVAL $ 3-4 QNAM $ 6-9 QVAL :$200.;
  label USUBJID="Unique Subject Identifier" IDVARVAL="Identifying Variable Value" QNAM="Qualifier Variable Name" QVAL="Data Value";
datalines;
1 1  ATC  ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 1  ATC3 OTHER ANTINEOPLASTIC AGENTS
1 10 ATC  ALIMENTARY TRACT AND METABOLISM
1 10 ATC3 DRUGS FOR PEPTIC ULCER AND GASTRO-OESOPHAGEAL REFLUX DISEASE (GORD)
1 11 ATC  CARDIOVASCULAR SYSTEM
1 11 ATC3 SELECTIVE CALCIUM CHANNEL BLOCKERS WITH MAINLY VASCULAR EFFECTS
1 12 ATC  VARIOUS
1 13 ATC  ALIMENTARY TRACT AND METABOLISM
1 13 ATC3 DRUGS FOR PEPTIC ULCER AND GASTRO-OESOPHAGEAL REFLUX DISEASE (GORD)
1 14 ATC  CARDIOVASCULAR SYSTEM
1 14 ATC3 SELECTIVE CALCIUM CHANNEL BLOCKERS WITH MAINLY VASCULAR EFFECTS
1 15 ATC  VARIOUS
1 16 ATC  VARIOUS
1 17 ATC  VARIOUS
1 18 ATC  VARIOUS
1 2  ATC  ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 2  ATC3 ANTIMETABOLITES
1 3  ATC  ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 3  ATC3 OTHER ANTINEOPLASTIC AGENTS
1 4  ATC  ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 4  ATC3 ANTIMETABOLITES
1 5  ATC  CARDIOVASCULAR SYSTEM
1 5  ATC3 SELECTIVE CALCIUM CHANNEL BLOCKERS WITH MAINLY VASCULAR EFFECTS
1 6  ATC  ALIMENTARY TRACT AND METABOLISM
1 6  ATC3 DRUGS FOR PEPTIC ULCER AND GASTRO-OESOPHAGEAL REFLUX DISEASE (GORD)
1 7  ATC  RESPIRATORY SYSTEM
1 7  ATC3 EXPECTORANTS, EXCL. COMBINATIONS WITH COUGH SUPPRESSANTS
1 8  ATC  RESPIRATORY SYSTEM
1 8  ATC3 EXPECTORANTS, EXCL. COMBINATIONS WITH COUGH SUPPRESSANTS
1 9  ATC  ANTIINFECTIVES FOR SYSTEMIC USE
1 9  ATC3 OTHER ANTIBACTERIALS
;;;;
run;

data WORK.CM;
  infile datalines dsd truncover;
  input USUBJID $ 1 CMSEQ;
  label USUBJID="Unique Subject Identifier" CMSEQ="Sequence Number";
datalines;
1 1
1 2
1 3
1 4
1 5
1 6
1 7
1 8
1 9
1 10
1 11
1 12
1 13
1 14
1 15
1 16
1 17
1 18
1 19
;;;;
run;

proc transpose data=suppcm out=suppcm_wide(drop=_:);
    by USUBJID IDVARVAL;
    id QNAM;
    var QVAL;
run;

data cm1;
  format usubjid cmseq atc atc3 a;
  if _n_=1 then do;
    dcl hash h();
    h.definekey('usubjid','cmseq');
    h.definedata('usubjid','cmseq','atc','atc3');
    h.definedone();

    do until(lr);
      set suppcm_wide end=lr;
      cmseq=input(idvarval, best.);
      rc=h.add();
    end;
  end;

  set cm;
  call missing(atc,atc3);
  if h.find()=0 then a=1;

  keep usubjid cmseq atc atc3 a;
run;

View solution in original post

7 REPLIES 7
PeterClemmensen
Tourmaline | Level 20

This happens because your idvarval in SUPPCM is character and cmseq is numeric. 

 

This means that you read the numeric seq numbers 2, 3, 4, 5, 6, 7, 8, 9 before these exist in the hash object because you did not read them in yet. 

Lee_wan
Obsidian | Level 7

Hi draycut,

Thanks for your respond.
But seq numbers 21,22,23.... do not missing. Only SEQ=2,3,4,5,6,7,8,9 are all missing. And I have the code="cmseq=input(idvarval,best.);"

PeterClemmensen
Tourmaline | Level 20

I can't grab the data from your images, so needless to say, this is untested. but why not do something like this?

 

Do the conversion from character to numeric and read in the entire suppcm in one go (until lr). Then read cm sequentially and do the lookup. That way, you shouldn't be concerned about if it exists in your hash obejct yet.

 

Feel free to ask.

 

data cm1;
  if _n_=1 then do;
    dcl hash h(multidata:'Y');
    h.definekey('usubjid','cmseq');
    h.definedata('usubjid','cmseq','atc','atc3');
    h.definedone();

    do until(lr);
      set suppcm end=lr;
      cmseq=input(idvarval,best.);
      if qnam='ATC' then atc=qval;
      if qnam='ATC3' then atc3=qval;
      h.add();
    end;
  end;

  set cm;
  call missing(atc,atc3);
  if h.find()=0 then a=1;

  keep usubjid cmseq atc atc3 a;
run;

 

 

Lee_wan
Obsidian | Level 7
I've updated the data on it, you can copy that.
PeterClemmensen
Tourmaline | Level 20

Here is how I would do it.

 

data WORK.SUPPCM;
  infile datalines dsd truncover;
  input USUBJID $ 1 IDVARVAL $ 3-4 QNAM $ 6-9 QVAL :$200.;
  label USUBJID="Unique Subject Identifier" IDVARVAL="Identifying Variable Value" QNAM="Qualifier Variable Name" QVAL="Data Value";
datalines;
1 1  ATC  ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 1  ATC3 OTHER ANTINEOPLASTIC AGENTS
1 10 ATC  ALIMENTARY TRACT AND METABOLISM
1 10 ATC3 DRUGS FOR PEPTIC ULCER AND GASTRO-OESOPHAGEAL REFLUX DISEASE (GORD)
1 11 ATC  CARDIOVASCULAR SYSTEM
1 11 ATC3 SELECTIVE CALCIUM CHANNEL BLOCKERS WITH MAINLY VASCULAR EFFECTS
1 12 ATC  VARIOUS
1 13 ATC  ALIMENTARY TRACT AND METABOLISM
1 13 ATC3 DRUGS FOR PEPTIC ULCER AND GASTRO-OESOPHAGEAL REFLUX DISEASE (GORD)
1 14 ATC  CARDIOVASCULAR SYSTEM
1 14 ATC3 SELECTIVE CALCIUM CHANNEL BLOCKERS WITH MAINLY VASCULAR EFFECTS
1 15 ATC  VARIOUS
1 16 ATC  VARIOUS
1 17 ATC  VARIOUS
1 18 ATC  VARIOUS
1 2  ATC  ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 2  ATC3 ANTIMETABOLITES
1 3  ATC  ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 3  ATC3 OTHER ANTINEOPLASTIC AGENTS
1 4  ATC  ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS
1 4  ATC3 ANTIMETABOLITES
1 5  ATC  CARDIOVASCULAR SYSTEM
1 5  ATC3 SELECTIVE CALCIUM CHANNEL BLOCKERS WITH MAINLY VASCULAR EFFECTS
1 6  ATC  ALIMENTARY TRACT AND METABOLISM
1 6  ATC3 DRUGS FOR PEPTIC ULCER AND GASTRO-OESOPHAGEAL REFLUX DISEASE (GORD)
1 7  ATC  RESPIRATORY SYSTEM
1 7  ATC3 EXPECTORANTS, EXCL. COMBINATIONS WITH COUGH SUPPRESSANTS
1 8  ATC  RESPIRATORY SYSTEM
1 8  ATC3 EXPECTORANTS, EXCL. COMBINATIONS WITH COUGH SUPPRESSANTS
1 9  ATC  ANTIINFECTIVES FOR SYSTEMIC USE
1 9  ATC3 OTHER ANTIBACTERIALS
;;;;
run;

data WORK.CM;
  infile datalines dsd truncover;
  input USUBJID $ 1 CMSEQ;
  label USUBJID="Unique Subject Identifier" CMSEQ="Sequence Number";
datalines;
1 1
1 2
1 3
1 4
1 5
1 6
1 7
1 8
1 9
1 10
1 11
1 12
1 13
1 14
1 15
1 16
1 17
1 18
1 19
;;;;
run;

proc transpose data=suppcm out=suppcm_wide(drop=_:);
    by USUBJID IDVARVAL;
    id QNAM;
    var QVAL;
run;

data cm1;
  format usubjid cmseq atc atc3 a;
  if _n_=1 then do;
    dcl hash h();
    h.definekey('usubjid','cmseq');
    h.definedata('usubjid','cmseq','atc','atc3');
    h.definedone();

    do until(lr);
      set suppcm_wide end=lr;
      cmseq=input(idvarval, best.);
      rc=h.add();
    end;
  end;

  set cm;
  call missing(atc,atc3);
  if h.find()=0 then a=1;

  keep usubjid cmseq atc atc3 a;
run;
Lee_wan
Obsidian | Level 7

Thank you very much for your response.

 

But my mind is to put the transpose step into the data step by using "do until (last.idvarval)".

 

I want to use hash to output this by one step.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 612 views
  • 0 likes
  • 2 in conversation