[libre-riscv-dev] [Bug 208] implement CORDIC in a general way sufficient to do transcendentals
bugzilla-daemon at libre-soc.org
bugzilla-daemon at libre-soc.org
Tue May 5 18:02:11 BST 2020
https://bugs.libre-soc.org/show_bug.cgi?id=208
--- Comment #53 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Michael Nolan from comment #52)
> (In reply to Luke Kenneth Casson Leighton from comment #50)
> > btw 12 stages is too many, because we need a minimum of one FunctionUnit
> > "managing" (matching up) source operands and results.
> >
> > each extra Function Unit added creates an Order N^2 increase in the overall
> > size of the Dependency Matrices.
> >
> > so - and i don't mean right now - if this can be either:
> >
> > * cut into two separate pipelines (partial data fed back as a micro-op)
> > * number of combinatorial blocks increased to 5 or 6
> >
> > we stand a chance of keeping the FU count down (8 is ok).
> >
> > have to think that through carefully, how to do micro-ops.
>
> Sure. I don't know about splitting it up into multiple uops,
neither do i at the moment! i think i might have got confused about
sharing the MUL unit with FPMUL.
> but reducing
> the number of stages is pretty easy. I added a parameter to the module
> called rounds_per_stage which governs how many cordic rounds go in each
> pipeline stage. Increasing that number will decrease the number of pipeline
> stages.
fantastic, that's exactly the kind of parameterisation _we_ need,
however as a general-purpose IEEE754 FP library, it's exactly
the kind of parameterisation that other _users_ need :)
>
> (In reply to Luke Kenneth Casson Leighton from comment #51)
> > the other option:
> >
> > keep the long pipeline lengths: we *know* that there's more stages than
> > there are FUs, and that's just "tough luck". we know that with only
> > 8 FUs and 12 pipeline stages, 4 of those at any one time will run
> > empty, and we just... live with that.
>
> I was going to ask if we could do something like this. Since I don't think
> the cordic will be used *that* often, this seems like a reasonable thing to
> do.
yehyeh.
ok what else can we throw in here? LOG1P is probably a good thing to try
this is a great explanation:
http://www.mclenegan.com/public/thesis.pdf
section 4.4.1 explains "rotation mode"
however we also need *vector* mode (4.4.2)
i know - somewhere - i've seen a python implementation of vector mode.
it also specified, as you can see, the three other modes: circular,
linear, hyperbolic.
ah! i know why we want to keep the two answers: see p39 of that
thesis: you can use them (with some post-processing) to do
tan.
also, see circular-vectoring mode, it ends up computing the
normal distance and... and the angle? ohh, that's for... darn
what was it... it's for arcs. there's a name for this... sorry
i forget what it's called right now.
https://www.ijert.org/vhdl-implementation-of-cordic-algorithm-for-wireless-lan
this version, p14, explains much clearer that "rotate" mode goes
by angle and computes x,y
"vector" mode the vector is rotated to be flat against the x-axis,
recording the angle (Z) needed to do so.
this one is a survey of CORDIC algorithms:
http://www.ee.ic.ac.uk/pcheung/teaching/ee3_DSD/crdcsrvy.pdf
it shows - *briefly* - how to use them to do different things, but
it also, by being a survey, has some really useful insights into
various optimisations.
so can i suggest, first adding the infrastructure back in to allow
cancellation and operations (ctx), then see if you can add one extra
"mode" (the m=-1, m=0, m=1 thing seems trivial which will give the
circular, linear and hyperbolic modes, respectively).
"circular" mode is covered with the output being SIN/COS.
ah ha! here's a useful suite of implementations (annoyingly being
in jupiter)
https://github.com/suyashmahar/cordic-algorithm-python/blob/master/cordic_implementation.ipynb
extractign the two modes:
circular = 1
linear = 0
hyperbolic = -1
def ROM_lookup(iteration, coordinate):
if (coordinate == circular):
return math.degrees(math.atan(2**(-1*iteration)))
elif (coordinate == linear):
return 2**(-1*iteration)
elif (coordinate == hyperbolic):
return (math.atanh(2**(-1*iteration)))
def rotation_mode(x, y, z, coordinate, iterations):
a = 0.607252935; # = 1/K
x_val_list = []
y_val_list = []
z_val_list = []
iterations_list = []
i = 0; # Keeps count on number of iterations
current_x = x # Value of X on ith iteration
current_y = y # Value of Y on ith iteration
current_z = z # Value of Z on ith iteration
di = 0
if (coordinate == hyperbolic):
i = 1
else:
i = 0
flag = 0
if (iterations > 0):
while (i < iterations):
if (current_z < 0):
di = -1
else:
di = +1
next_z = current_z - di * ROM_lookup(i, coordinate)
next_x = current_x - coordinate * di * current_y * (2**(-1*i))
next_y = current_y + di * current_x * 2**(-1*i)
current_x = next_x
current_y = next_y
current_z = next_z
x_val_list.append(current_x)
y_val_list.append(current_y)
z_val_list.append(current_z)
iterations_list.append(i)
if (coordinate == hyperbolic):
if ((i != 4) & (i != 13) & (i!=40)):
i = i+1
elif (flag == 0):
flag = 1
elif (flag == 1):
flag = 0
i = i+1
else:
i = i+1
return { 'x':x_val_list, 'y':y_val_list, 'z':z_val_list,
'iteration':iterations_list, }
ahhh, i recognise this code :) i've seen it before.
def vector_mode(x, y, z, coordinate, iterations):
a = 1.2075; # = 1/K
x_val_list = []
y_val_list = []
z_val_list = []
iterations_list = []
i = 0; # Keeps count on number of iterations
current_x = x # Value of X on ith iteration
current_y = y # Value of Y on ith iteration
current_z = z # Value of Z on ith iteration
di = 0
# This is neccesary since result for i=0 doesn't exists for hyperbolic
# co-ordinate system.
if (coordinate == hyperbolic):
i = 1
else:
i = 0
flag = 0
if (iterations > 0):
while (i < iterations):
di = -1*math.copysign(1, current_y);#*current_x);
next_x = current_x - coordinate * di * current_y * (2**(-1*i))
next_y = current_y + di * current_x * 2**(-1*i)
next_z = current_z - di * ROM_lookup(i, coordinate)
current_x = next_x
current_y = next_y
current_z = next_z
x_val_list.append(current_x)
y_val_list.append(current_y)
z_val_list.append(current_z)
iterations_list.append(i)
if (coordinate == hyperbolic):
if ((i != 4) & (i != 13) & (i!=40)):
i = i+1
elif (flag == 0):
flag = 1
elif (flag == 1):
flag = 0
i = i+1
else:
i = i+1
return { 'x':x_val_list, 'y':y_val_list, 'z':z_val_list,
'iteration':iterations_list }
if you can drop that into a repository, get it working, write a
few simple experiments and see how it operates in each of the 6
modes, we're on a roll :)
--
You are receiving this mail because:
You are on the CC list for the bug.
More information about the libre-riscv-dev
mailing list