BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
makset
Obsidian | Level 7

It would seem like a simple thing extracting _n_ from a table.
I used to program like this:

data A;
informat datetime  datetime21.;
input info $7. datetime; 
format datetime  datetime21.;
cards;
GC_5_0	30APR2020:00:30:00
GC_5_2	30APR2020:00:40:00
GC_5_0	30APR2020:00:39:00
GC_5_2	30APR2020:00:17:00
run;

data B;
informat datetime  datetime21.;
input datetime	high_temp low_temp end_temp; 
format datetime  datetime21.;
cards;
30APR2020:00:00:00	31.3	25.6	29.4
30APR2020:00:00:30	29.9	28.4	29.2
30APR2020:00:01:00	29.5	28.9	29.2
30APR2020:00:01:30	29.5	28.6	29
30APR2020:00:02:00	30	28.8	29.9
30APR2020:00:02:30	30.3	29.5	29.9
30APR2020:00:03:00	30.3	29.2	29.9
30APR2020:00:03:30	30.3	29	30.1
30APR2020:00:04:00	30.2	29.3	29.3
30APR2020:00:04:30	30.6	29.4	29.4
30APR2020:00:05:00	30.8	29.6	29.6
30APR2020:00:05:30	31	30	30.3
30APR2020:00:06:00	31	29.7	30.2
30APR2020:00:06:30	31	29.7	30.5
30APR2020:00:07:00	30.4	29.4	29.6
30APR2020:00:07:30	30	27.6	28.4
30APR2020:00:08:00	29.6	28.1	28.9
30APR2020:00:08:30	30.2	28.6	29.4
30APR2020:00:09:00	30	28.9	29.3
30APR2020:00:09:30	30.5	28.4	28.4
30APR2020:00:10:00	31.7	30.9	31.2
30APR2020:00:10:30	32	30.6	30.6
30APR2020:00:11:00	31.4	29.7	30.8
30APR2020:00:11:30	31.1	29.3	29.8
30APR2020:00:12:00	29.4	28.6	29
30APR2020:00:12:30	29.4	28.2	29
30APR2020:00:13:00	30.4	28.7	30.4
30APR2020:00:13:30	31.7	29.5	30.8
30APR2020:00:14:00	33.8	31.8	33
30APR2020:00:14:30	33	30.5	31.2
30APR2020:00:15:00	31.2	30.4	30.4
30APR2020:00:15:30	30.5	30	30
30APR2020:00:16:00	30.6	29	30.4
30APR2020:00:16:30	30	28.9	29.7
30APR2020:00:17:00	29.9	29.3	29.5
30APR2020:00:17:30	30.6	29.3	29.3
30APR2020:00:18:00	30.2	29.3	29.3
30APR2020:00:18:30	29.6	28.5	28.5
30APR2020:00:19:00	29.6	27.9	28.4
30APR2020:00:19:30	29.4	28.5	28.5
30APR2020:00:20:30	28.8	28	28
30APR2020:00:21:00	31.1	24.2	31
30APR2020:00:21:30	31.1	29.6	31.1
30APR2020:00:22:00	30.7	29.7	30.2
30APR2020:00:22:30	30.4	29.7	29.7
30APR2020:00:23:00	30.4	29.5	30.3
30APR2020:00:23:30	30.5	29.4	30.5
30APR2020:00:24:00	31	29.1	30.6
30APR2020:00:24:30	30.4	29.3	30.4
30APR2020:00:25:00	31.9	29.9	30.6
30APR2020:00:25:30	31.6	30.2	31.2
30APR2020:00:26:00	31	29.6	29.6
30APR2020:00:26:30	30.6	28.4	28.4
30APR2020:00:27:00	30.7	29.7	30.4
30APR2020:00:27:30	30.8	29.7	30.4
30APR2020:00:28:00	31.2	30	30.7
30APR2020:00:29:00	31	29.5	30.1
30APR2020:00:29:30	30.4	29.4	29.7
30APR2020:00:30:00	30.3	28.7	30.3
30APR2020:00:30:30	30.1	29.6	30.1
30APR2020:00:31:00	30	28.6	28.7
30APR2020:00:31:30	29.9	28.1	28.1
30APR2020:00:32:00	30.3	29.6	29.6
30APR2020:00:32:30	30.7	29.2	29.8
30APR2020:00:33:00	29.4	28.5	28.5
30APR2020:00:33:30	29.7	28.7	29.2
30APR2020:00:34:00	30.2	28.9	30.2
30APR2020:00:34:30	29.8	29	29.8
30APR2020:00:35:00	30.4	28.8	29.6
30APR2020:00:35:30	30.2	29	29.6
30APR2020:00:36:00	30.1	29.3	29.5
30APR2020:00:36:30	30.4	28.6	29.6
30APR2020:00:37:00	30.7	29.4	30
30APR2020:00:37:30	30.9	29.2	29.2
30APR2020:00:38:00	30.2	28.8	29.6
30APR2020:00:38:30	29.8	28.9	29.3
30APR2020:00:39:00	29.6	28.7	28.7
30APR2020:00:40:00	30.7	28.6	30.7
30APR2020:00:40:30	30.9	29.3	30.4
30APR2020:00:41:00	30.5	29.5	30.2
30APR2020:00:41:30	30.6	28.6	29.9
30APR2020:00:42:00	31.3	29.4	31.3
30APR2020:00:42:30	30.9	29.3	29.9
30APR2020:00:43:00	30.1	28.9	29.7
30APR2020:00:43:30	30.5	28.8	30.5
30APR2020:00:44:00	30.2	28.4	28.4
30APR2020:00:44:30	30.2	28	29.5
30APR2020:00:45:00	30.4	29.3	29.3
30APR2020:00:45:30	30.7	28.9	30.2
30APR2020:00:46:00	30.1	28.8	29.9
30APR2020:00:46:30	29.6	28.8	29.3
30APR2020:00:47:00	29	26.6	27.8
30APR2020:00:47:30	27.4	26.4	27.4
30APR2020:00:48:00	29.8	27.3	29.2
30APR2020:00:48:30	29.9	29.2	29.2
30APR2020:00:49:00	29.9	28.8	28.8
30APR2020:00:49:30	30.3	28.7	29.3
30APR2020:00:50:00	30	28.6	29.2
30APR2020:00:51:00	29.4	28.8	29.4
30APR2020:00:51:30	29.9	28.5	29.9
run;



%macro test();

data _null_;
 if 0 then set A nobs = n;
 call symputx('Obs_A',n);
 stop;
run;
%put no. of observations = &Obs_A;

%do i = 1  %to &Obs_A %by 1;

	data _null_;
	set A;
	if _n_ = &i then do;
	call symputx('datetime',datetime);
	end;
	run;

	data _null_;
	set B;
	if datetime = &datetime then do;
	call symputx('nr_n_',_n_);
	end;
	run;

	%put datetime = &datetime;
	%put nr_n_ = &nr_n_;

	data A;
	set A;
	if _n_ = &i then do;
	nr = &nr_n_;
	end;
	run;

%end;
%mend test;
%test;

Now i do it like this:

 

		data B;
		set B;
		nr = _n_;
		run;

		data A;
		set A;
		if _n_ = 1 then do;
		  declare hash H1 (dataset:"B");
		  H1.definekey('datetime');
		  H1.definedata('nr');
		  H1.definedone();
		end;
		  if H1.find() = 0 then  nr = nr;
		run;

I have a feeling this step

data B; set B; nr = _n_; run;

  can be skipped but I don't know how?

 

As in hash table extracting the row number _n_ does not use an additional variable ??. Using an additional variable significantly extends the duration of the program and increases the size of the tables per disk by some 50GB.

Best regards and thank you in advance for your help

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

This step is needed to create a new variable with the running count. There is no way to get this count "on the fly" while reading the dataset in the DECLARE HASH statement.

View solution in original post

4 REPLIES 4
Kurt_Bremser
Super User

This step is needed to create a new variable with the running count. There is no way to get this count "on the fly" while reading the dataset in the DECLARE HASH statement.

makset
Obsidian | Level 7
What a pity🙂
Thank you for your help.
Tom
Super User Tom
Super User

If you are just using the hash to be able to locate an observation by its position in the dataset then skip the has and just use the dataset.  You can use the POINT= option on a SET statement.  

data newb;
  set oldb;
  point = nr;
  if 1 <= point <= nobs then set a point=point nobs=nobs;
run;

 If you want put the table into memory to improve the access time then use SASFILE statement.

yabwon
Onyx | Level 15

Hi,

 

How about doing it with 1 data step:

data A;
  if _n_ = 1 then do;
    declare hash H1 ();
    H1.definekey('datetime');
    H1.definedata('nr');
    H1.definedone();

    do until (eof);
      set B(keep = datetime) curobs=curobs end = eof;
      nr = curobs;
      _N_ = H1.ADD();
    end;
  end;

  set A;

  if H1.find() = 0 then  nr = nr;
                   else  nr = .;
run;

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 671 views
  • 3 likes
  • 4 in conversation