cannot see the gain, maybe i'm missing something

You could still add a little compare test like eh:

if hl > 121 then hl = hl - 121 a=sqrt (121) de=a*2+1 --> routine

if hl > 256 then hl = hl - 256 a=sqrt (256) de=a*2+1 --> routine

...

and those tests for what? to gain like 15 or 16 loops when the number is high? with the cost that it will take more time when the number is low!, now...

a test like

if hl> 16384 then hl=hl-16384 :a= #pre_calculated_value: de=a*2+1 --> loopstart

if hl> 32767 .....

For the square root of 256 it now passes only one once through the routine instead of eh... sixteen times.

The square root of 289 takes two passes.

Not enough gain... never mind silly me then.

Comparing for 256 is not that slow is it, could be done within the time of one pass right?

GhostwriterP

you propose a sort of look up table to enter in the routine with specific initial values that shorten the calculation.

It can work, even with multiple thresholds, say 121,256,16384 etc, but it is a matter of trade off.

In speed, what you gain on large numbers is lost on small numbers, and the code will be by far larger and maybe the Ricardo's solution is preferable.

for 65535 it loops 255 times

for 32768 it loops 181 times

for 16384 it loops 128 times

for 8192 it loops 90 times.... and so on...

well will depends on the use of the sqr routine, if you use it in a formula that handles large numbers in the sqr input...... the improve will be a gain...

now, if you use it with little numbers like <255 ... there is no gain.

; in hl

; out a = sqrt(hl)

ld a,h

and &HC0

jr nz, large

ld de,1

xor a

1: sbc hl,de

ret c

inc de

inc e

inc a

jp 1b

large: cp &H40

jr z,

cp &H80

jr z,

' it is &HC0

......

A thing nice to consider in this "shortened" method is that the sum of the first N odd numbers, if N is power of two,

can be computed as (N*N) with shift and rotations

What one could do is compare hl with all the square powers of two, find the range that comprises hl, subtract the lower extreme and compute the refinement with the routine

This could be done without tables...

but maybe I'm falling towards the Ricardo's solution

A thing nice to consider in this "shortened" method is that the sum of the first N odd numbers, if N is power of two,

can be computed as (N*N) with shift and rotations

What one could do is compare hl with all the square powers of two, find the range that comprises hl, subtract the lower extreme and compute the refinement with the routine

This could be done without tables...

but maybe I'm falling towards the Ricardo's solution

Yes, I tought that, but then I thought nahhhh it is very much overhead..

it is like a bucle, that finds the first bit seted, scanning from the MSB to LSB. As anyway you needs a copy of the original value, and other for the scanning, you will use HL & DE, DE as copy and HL for the rotating purpose....

but you needs a counter, being B , when you finds the bit seted, it escapes the routine, then you use the value of the counter as a pointer to a little lookup table of just 16 registers of 4 byte, use the table to set the HL & DE values of the original routine, you needs to plus HL+ original value with the bit found RESETED......

what a mess......

Now, there is the gain...

if the original value is 1000000000

it will loops just ones!....

if the original value is 1111111111

it will loops like that the value where 0111111111

that is exponentialy less..... that is the gain

but for values like

0000000000011

the overhead loop will runs 14 times ...

to process the root of just 3!

but there is to see the AVERAGE!!! how many takes measuring

to process all the range from 0 to 65535

needs a counter