BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
makset
Obsidian | Level 7

It would seem like a simple thing extracting _n_ from a table.
I used to program like this:

data A;
informat datetime  datetime21.;
input info $7. datetime; 
format datetime  datetime21.;
cards;
GC_5_0	30APR2020:00:30:00
GC_5_2	30APR2020:00:40:00
GC_5_0	30APR2020:00:39:00
GC_5_2	30APR2020:00:17:00
run;

data B;
informat datetime  datetime21.;
input datetime	high_temp low_temp end_temp; 
format datetime  datetime21.;
cards;
30APR2020:00:00:00	31.3	25.6	29.4
30APR2020:00:00:30	29.9	28.4	29.2
30APR2020:00:01:00	29.5	28.9	29.2
30APR2020:00:01:30	29.5	28.6	29
30APR2020:00:02:00	30	28.8	29.9
30APR2020:00:02:30	30.3	29.5	29.9
30APR2020:00:03:00	30.3	29.2	29.9
30APR2020:00:03:30	30.3	29	30.1
30APR2020:00:04:00	30.2	29.3	29.3
30APR2020:00:04:30	30.6	29.4	29.4
30APR2020:00:05:00	30.8	29.6	29.6
30APR2020:00:05:30	31	30	30.3
30APR2020:00:06:00	31	29.7	30.2
30APR2020:00:06:30	31	29.7	30.5
30APR2020:00:07:00	30.4	29.4	29.6
30APR2020:00:07:30	30	27.6	28.4
30APR2020:00:08:00	29.6	28.1	28.9
30APR2020:00:08:30	30.2	28.6	29.4
30APR2020:00:09:00	30	28.9	29.3
30APR2020:00:09:30	30.5	28.4	28.4
30APR2020:00:10:00	31.7	30.9	31.2
30APR2020:00:10:30	32	30.6	30.6
30APR2020:00:11:00	31.4	29.7	30.8
30APR2020:00:11:30	31.1	29.3	29.8
30APR2020:00:12:00	29.4	28.6	29
30APR2020:00:12:30	29.4	28.2	29
30APR2020:00:13:00	30.4	28.7	30.4
30APR2020:00:13:30	31.7	29.5	30.8
30APR2020:00:14:00	33.8	31.8	33
30APR2020:00:14:30	33	30.5	31.2
30APR2020:00:15:00	31.2	30.4	30.4
30APR2020:00:15:30	30.5	30	30
30APR2020:00:16:00	30.6	29	30.4
30APR2020:00:16:30	30	28.9	29.7
30APR2020:00:17:00	29.9	29.3	29.5
30APR2020:00:17:30	30.6	29.3	29.3
30APR2020:00:18:00	30.2	29.3	29.3
30APR2020:00:18:30	29.6	28.5	28.5
30APR2020:00:19:00	29.6	27.9	28.4
30APR2020:00:19:30	29.4	28.5	28.5
30APR2020:00:20:30	28.8	28	28
30APR2020:00:21:00	31.1	24.2	31
30APR2020:00:21:30	31.1	29.6	31.1
30APR2020:00:22:00	30.7	29.7	30.2
30APR2020:00:22:30	30.4	29.7	29.7
30APR2020:00:23:00	30.4	29.5	30.3
30APR2020:00:23:30	30.5	29.4	30.5
30APR2020:00:24:00	31	29.1	30.6
30APR2020:00:24:30	30.4	29.3	30.4
30APR2020:00:25:00	31.9	29.9	30.6
30APR2020:00:25:30	31.6	30.2	31.2
30APR2020:00:26:00	31	29.6	29.6
30APR2020:00:26:30	30.6	28.4	28.4
30APR2020:00:27:00	30.7	29.7	30.4
30APR2020:00:27:30	30.8	29.7	30.4
30APR2020:00:28:00	31.2	30	30.7
30APR2020:00:29:00	31	29.5	30.1
30APR2020:00:29:30	30.4	29.4	29.7
30APR2020:00:30:00	30.3	28.7	30.3
30APR2020:00:30:30	30.1	29.6	30.1
30APR2020:00:31:00	30	28.6	28.7
30APR2020:00:31:30	29.9	28.1	28.1
30APR2020:00:32:00	30.3	29.6	29.6
30APR2020:00:32:30	30.7	29.2	29.8
30APR2020:00:33:00	29.4	28.5	28.5
30APR2020:00:33:30	29.7	28.7	29.2
30APR2020:00:34:00	30.2	28.9	30.2
30APR2020:00:34:30	29.8	29	29.8
30APR2020:00:35:00	30.4	28.8	29.6
30APR2020:00:35:30	30.2	29	29.6
30APR2020:00:36:00	30.1	29.3	29.5
30APR2020:00:36:30	30.4	28.6	29.6
30APR2020:00:37:00	30.7	29.4	30
30APR2020:00:37:30	30.9	29.2	29.2
30APR2020:00:38:00	30.2	28.8	29.6
30APR2020:00:38:30	29.8	28.9	29.3
30APR2020:00:39:00	29.6	28.7	28.7
30APR2020:00:40:00	30.7	28.6	30.7
30APR2020:00:40:30	30.9	29.3	30.4
30APR2020:00:41:00	30.5	29.5	30.2
30APR2020:00:41:30	30.6	28.6	29.9
30APR2020:00:42:00	31.3	29.4	31.3
30APR2020:00:42:30	30.9	29.3	29.9
30APR2020:00:43:00	30.1	28.9	29.7
30APR2020:00:43:30	30.5	28.8	30.5
30APR2020:00:44:00	30.2	28.4	28.4
30APR2020:00:44:30	30.2	28	29.5
30APR2020:00:45:00	30.4	29.3	29.3
30APR2020:00:45:30	30.7	28.9	30.2
30APR2020:00:46:00	30.1	28.8	29.9
30APR2020:00:46:30	29.6	28.8	29.3
30APR2020:00:47:00	29	26.6	27.8
30APR2020:00:47:30	27.4	26.4	27.4
30APR2020:00:48:00	29.8	27.3	29.2
30APR2020:00:48:30	29.9	29.2	29.2
30APR2020:00:49:00	29.9	28.8	28.8
30APR2020:00:49:30	30.3	28.7	29.3
30APR2020:00:50:00	30	28.6	29.2
30APR2020:00:51:00	29.4	28.8	29.4
30APR2020:00:51:30	29.9	28.5	29.9
run;



%macro test();

data _null_;
 if 0 then set A nobs = n;
 call symputx('Obs_A',n);
 stop;
run;
%put no. of observations = &Obs_A;

%do i = 1  %to &Obs_A %by 1;

	data _null_;
	set A;
	if _n_ = &i then do;
	call symputx('datetime',datetime);
	end;
	run;

	data _null_;
	set B;
	if datetime = &datetime then do;
	call symputx('nr_n_',_n_);
	end;
	run;

	%put datetime = &datetime;
	%put nr_n_ = &nr_n_;

	data A;
	set A;
	if _n_ = &i then do;
	nr = &nr_n_;
	end;
	run;

%end;
%mend test;
%test;

Now i do it like this:

 

		data B;
		set B;
		nr = _n_;
		run;

		data A;
		set A;
		if _n_ = 1 then do;
		  declare hash H1 (dataset:"B");
		  H1.definekey('datetime');
		  H1.definedata('nr');
		  H1.definedone();
		end;
		  if H1.find() = 0 then  nr = nr;
		run;

I have a feeling this step

data B; set B; nr = _n_; run;

  can be skipped but I don't know how?

 

As in hash table extracting the row number _n_ does not use an additional variable ??. Using an additional variable significantly extends the duration of the program and increases the size of the tables per disk by some 50GB.

Best regards and thank you in advance for your help

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

This step is needed to create a new variable with the running count. There is no way to get this count "on the fly" while reading the dataset in the DECLARE HASH statement.

View solution in original post

4 REPLIES 4
Kurt_Bremser
Super User

This step is needed to create a new variable with the running count. There is no way to get this count "on the fly" while reading the dataset in the DECLARE HASH statement.

makset
Obsidian | Level 7
What a pity🙂
Thank you for your help.
Tom
Super User Tom
Super User

If you are just using the hash to be able to locate an observation by its position in the dataset then skip the has and just use the dataset.  You can use the POINT= option on a SET statement.  

data newb;
  set oldb;
  point = nr;
  if 1 <= point <= nobs then set a point=point nobs=nobs;
run;

 If you want put the table into memory to improve the access time then use SASFILE statement.

yabwon
Onyx | Level 15

Hi,

 

How about doing it with 1 data step:

data A;
  if _n_ = 1 then do;
    declare hash H1 ();
    H1.definekey('datetime');
    H1.definedata('nr');
    H1.definedone();

    do until (eof);
      set B(keep = datetime) curobs=curobs end = eof;
      nr = curobs;
      _N_ = H1.ADD();
    end;
  end;

  set A;

  if H1.find() = 0 then  nr = nr;
                   else  nr = .;
run;

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 701 views
  • 3 likes
  • 4 in conversation