[libre-riscv-dev] spike-sv non-default element widths

Wed Oct 10 08:54:34 BST 2018

On Wed, Oct 10, 2018 at 6:03 AM Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> On Tue, Oct 9, 2018 at 11:53 AM Luke Kenneth Casson Leighton <lkcl at lkcl.net>
> wrote:
>
> > ok adding in native support in spike for non-default element widths is
> > a massive task, where i really do not want to do a redesign of spike.
> >
> That's part of why I originally thought it would be better to write a new
> ISA simulator from scratch.

 yehh i personally prefer incremental approaches (you may have
noticed).  that stems from 4 years of reverse-engineering for
samba-tng that i did, starting around 1996, where tracking down 1 bit
difference, with zero knowledge, in dozens of packets, meant the
difference between "success" and "fail".

 so, now i have reasonably-working hardware-macro-loop-unrolling, i
have a state machine where yes, it would *now* be reasonable to
consider doing an entirely new ISA simulator.

 that having been said, it's not _that_ unreasonable to morph spike to
a c++-template-based design... i'm curious to know if it can be done
with macros, leaving the actual instructions completely alone.

> You could overload the f64_eq and family for your class.

 yes, and... i'm wondering.... hmmmm... i know, i think i have an
idea.  here's the current sv_insn_template.cc:

 reg_t FN(processor_t* p, insn_t s_insn, reg_t pc)
{
    #include "riscv/insns/NAME.h"
}

where this for example is the riscv/insns/fadd_d.h file:

require_extension('D');
require_fp;
softfloat_roundingMode = RM;
WRITE_FRD(f64_add(f64(FRS1), f64(FRS2)));
set_fp_exceptions;

if instead i make that
 reg_t processor_overload_template_class_t<SRCTYPE, ...>::FN(
     processor_t* p, insn_t s_insn, reg_t pc)

it becomes a member of a (templated) class, and it would be possible to
make f64_add actually a member of that templated class.  from _there_,
calling the (global) version can, i believe, be done by prepending "::"
to f64_add, so "::f64_add" would be called inside the templated
processor_overload_template_class_t::f64_add() and so on.
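
a minimal sketch of that name-lookup trick, under stated assumptions: the float64_t struct and the global f64_add below are simplified stand-ins for the softfloat ones (built on host doubles so the snippet compiles on its own), and the member_calls counter, the to_double/from_double helpers and the "round each source to SRCTYPE" step are purely illustrative, not anything spike or SV defines.  the point is only that an unmodified instruction body calling f64_add(...) picks up the member, and the member reaches the global with "::".

```cpp
#include <cstdint>
#include <cstring>

// Stand-ins for softfloat's float64_t and f64_add (assumption: simplified
// here, using host doubles, so the sketch compiles without softfloat.h).
struct float64_t { uint64_t v; };

static double to_double(float64_t a) {
    double d;
    std::memcpy(&d, &a.v, sizeof d);
    return d;
}

static float64_t from_double(double d) {
    float64_t a;
    std::memcpy(&a.v, &d, sizeof d);
    return a;
}

// The "global" version that unmodified instruction files call today.
float64_t f64_add(float64_t a, float64_t b) {
    return from_double(to_double(a) + to_double(b));
}

// Hypothetical templated class; SRCTYPE is the source element type.
template <typename SRCTYPE>
struct processor_overload_template_class_t {
    int member_calls = 0;  // illustrative only: counts calls to the member

    // Member with the same name as the global: inside member functions,
    // an unqualified f64_add(...) finds this one first.
    float64_t f64_add(float64_t a, float64_t b) {
        ++member_calls;
        // Illustrative width handling (an assumption, not SV semantics):
        // round each source operand to SRCTYPE before adding.
        a = from_double(static_cast<double>(static_cast<SRCTYPE>(to_double(a))));
        b = from_double(static_cast<double>(static_cast<SRCTYPE>(to_double(b))));
        return ::f64_add(a, b);  // "::" selects the global version
    }

    // Stands in for FN with the body of fadd_d.h included unchanged.
    float64_t FN(float64_t frs1, float64_t frs2) {
        return f64_add(frs1, frs2);  // resolves to the member above
    }
};
```

instantiating it with SRCTYPE=float and SRCTYPE=double and calling FN on the same inputs goes through the member in both cases, but only the float instantiation narrows the operands first.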

operator-overloaded +, -, >, etc. would do for integer operations
(requiring int32, int64, uint32 and uint64 versions), and... that
would probably do the trick.
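
a rough sketch of what the operator-overloaded integer side could look like.  the sv_int name and its to_reg() helper are hypothetical (nothing in spike is called that); the assumption is simply that each element wraps at its own width and is then sign- or zero-extended back to a 64-bit register value according to the signedness of T.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical element wrapper; T is one of int32_t, int64_t,
// uint32_t or uint64_t.
template <typename T>
struct sv_int {
    using U = typename std::make_unsigned<T>::type;
    T v;

    // Truncate a raw 64-bit register value to the element width.
    // (Narrowing to a signed T is modular on mainstream compilers and
    // guaranteed modular from C++20.)
    explicit sv_int(uint64_t raw) : v(static_cast<T>(static_cast<U>(raw))) {}

    // Arithmetic is done on the unsigned type so that wrap-around is
    // well defined, then converted back to T by the constructor.
    sv_int operator+(sv_int o) const {
        return sv_int(static_cast<U>(static_cast<U>(v) + static_cast<U>(o.v)));
    }
    sv_int operator-(sv_int o) const {
        return sv_int(static_cast<U>(static_cast<U>(v) - static_cast<U>(o.v)));
    }

    // Comparison follows the signedness of T.
    bool operator>(sv_int o) const { return v > o.v; }

    // Widen back to a 64-bit register value: sign-extends when T is
    // signed, zero-extends when T is unsigned.
    uint64_t to_reg() const { return static_cast<uint64_t>(v); }
};
```

with this, the same "a + b" or "a > b" expression gives different answers depending on which of the four types it is instantiated with, which is the behaviour the instruction bodies would need.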

> > it'll be a bit... complex.  thoughts on alternative approaches appreciated.
> >
> You could have a function that handles all register reads and another
> function that handles all register writes and have them handle all the
> actual location complexity.

 yes.  that's quite reasonable.  the issue is that float16_t adds do
not behave the same, and where dest element width is set to e.g. 16
and src1 is say 32 and src2 is say 64 bits, an add *really* would not
behave the same.  likewise, for integer adds (particularly
sign-extended integer adds), there's a huge set of combinations and
changes in behaviour...

 ... it's not just about the register writes and reads, in other
words, which is where i'm kinda stuck for alternative options.
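
to make the mixed-width point concrete, here is a small sketch.  the rule used (sign-extend each source from its own width, add at 64 bits, truncate to the destination width) is an assumption chosen for illustration, not taken from the SV spec, and sext, trunc_to and sv_add are hypothetical names.

```cpp
#include <cstdint>

// Sign-extend the low `bits` bits of raw (bits must be in 1..64).
inline int64_t sext(uint64_t raw, unsigned bits) {
    if (bits >= 64) return static_cast<int64_t>(raw);
    const uint64_t sign = 1ull << (bits - 1);
    const uint64_t low = raw & ((1ull << bits) - 1);
    return static_cast<int64_t>((low ^ sign) - sign);
}

// Keep only the low `bits` bits of v (bits must be in 1..64).
inline uint64_t trunc_to(uint64_t v, unsigned bits) {
    return bits >= 64 ? v : v & ((1ull << bits) - 1);
}

// Hypothetical mixed-width integer add: w1 and w2 are the source
// element widths, wd is the destination element width.
inline uint64_t sv_add(uint64_t rs1, unsigned w1,
                       uint64_t rs2, unsigned w2, unsigned wd) {
    const uint64_t a = static_cast<uint64_t>(sext(rs1, w1));
    const uint64_t b = static_cast<uint64_t>(sext(rs2, w2));
    return trunc_to(a + b, wd);  // unsigned add, so wrap-around is defined
}
```

the same register bits 0xFFFFFFFF read as a 32-bit source are -1, but read as a 64-bit source they are 4294967295, so the sum changes even though the read and write functions themselves are trivial: the width has to be known inside the operation, not just at the register file.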

l.