[libre-riscv-dev] register requirements of SimpleV
lkcl
lkcl at libre-riscv.org
Wed Oct 10 21:44:17 BST 2018
On Wed, Oct 10, 2018 at 8:16 AM lkcl <lkcl at libre-riscv.org> wrote:
>
> On Wed, Oct 10, 2018 at 5:56 AM Jacob Lifshay <programmerjake at gmail.com> wrote:
> > One critical point is that VL is the number of element groups, not the
> > total number of elements as that makes it easier to use as VL can be set
> > once or a few times rather than needing to be changed every time we need a
> > different vector.
>
> yyeah this is why i seriously considered having a separate VL per CSR
> entry. however that would mean 7 bits times 16, or redoing the
> 16-bit x 2 entries per register (8 32-bit CSRs) to 23-bit x1 entries
> (16 23-bit regs, 9 bits unused).
ok i might have a solution, here, it involves being able to sub-set
the CSRs and switch the STATE information, at the same time.
STATE is presently defined as:
    struct {
        mvl: 6
        vl: 6
        offs_src: 6
        offs_dest: 6
    }
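
a minimal sketch of how those four 6-bit fields could be packed into one
32-bit word. the bit positions (mvl lowest, offs_dest highest) and the
names state_t, state_pack and state_unpack are assumptions made for
illustration, they are not defined anywhere in the proposal:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout (an assumption, not part of the proposal):
 * mvl in bits 0-5, vl in bits 6-11, offs_src in bits 12-17,
 * offs_dest in bits 18-23. Bits 24-31 stay free. */
typedef struct {
    uint8_t mvl, vl, offs_src, offs_dest;  /* each holds a 6-bit value */
} state_t;

/* Pack the four fields into the low 24 bits of a 32-bit word. */
static uint32_t state_pack(state_t s)
{
    assert(s.mvl < 64 && s.vl < 64 && s.offs_src < 64 && s.offs_dest < 64);
    return (uint32_t)s.mvl
         | ((uint32_t)s.vl << 6)
         | ((uint32_t)s.offs_src << 12)
         | ((uint32_t)s.offs_dest << 18);
}

/* Recover the four fields, ignoring whatever sits in bits 24-31. */
static state_t state_unpack(uint32_t w)
{
    state_t s;
    s.mvl       = w & 0x3f;
    s.vl        = (w >> 6) & 0x3f;
    s.offs_src  = (w >> 12) & 0x3f;
    s.offs_dest = (w >> 18) & 0x3f;
    return s;
}
```

with this layout the top byte of the word is never touched by the four
fields, which is where any extra bits would have to go.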
which is 24 bits, leaving 8 bits spare out of 32. i wanted to use 5 of
them to subset the 16 CAM entries, which are stored as 16 bits each, so
CAM entries 0+1 are in one CSR, entries 2+3 in the next and so on, for a
total of 8 CSRs.
therefore, the 5 bits can be divided 3:2: 3 bits to indicate the start
point of which CSR CAM (address) is active, and 2 bits, n, as a
1<<(n*2) indicator of how many are active:
    start = ((subset>>2)&0x7)
    len = 1<<((subset & 0x3)*2)
    for (i = start; i < start+len && i < 16; i++) {
        CSRregcam[i].active = true;
        CSRpredcam[i].active = true;
    }
something like that.
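
to make the decode concrete, here is a small sketch that turns the 5-bit
subset value into a bitmask of active CAM entries. it assumes that start
indexes CAM entries directly and that the range is clamped at the 16
entries that exist. the function name subset_active_mask and the use of
a bitmask in place of per-entry active flags are illustrative choices
only:

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CAM_ENTRIES 16

/* Returns a mask with bit i set when CAM entry i would be active.
 * Assumption: "start" is a CAM entry index, and the range is clamped
 * to the 16 entries that exist. */
static uint16_t subset_active_mask(unsigned subset)
{
    assert(subset < 32);                           /* 5-bit field */
    unsigned start = (subset >> 2) & 0x7;          /* upper 3 bits */
    unsigned len   = 1u << ((subset & 0x3) * 2);   /* 1, 4, 16 or 64 */
    uint16_t mask  = 0;

    for (unsigned i = start; i < start + len && i < NUM_CAM_ENTRIES; i++)
        mask |= (uint16_t)(1u << i);
    return mask;
}
```

for example, subset = (2<<2)|1 gives start 2 and len 4, so entries 2 to 5
are active. with a 3-bit start only entries 0 to 7 can be a starting
point, and the largest len (64) simply saturates at the end of the table.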
however, i then figured it would be possible to have *four* STATE CSRs,
plus (another) 5-bit CSR that:
* sets which of the 4 STATE CSRs is active (2 bits), thus instantly
  changing the length, MVL, and so on
* sets a secondary "max table size" (3 bits) so that context-switching
  doesn't have to save the entire table. changing this would zero out
  CSRregcam and CSRpredcam entries above the limit.
the secondary "max table size" would also set how many of the 4 STATE
CSRs were actually valid, so, again, saving on how much
context-switching is needed.
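
a rough sketch of what a write to that 5-bit CSR might do. everything
about the encoding here is an assumption: the low 2 bits pick the STATE
CSR, the upper 3 bits are the "max table size", and a size value of n is
taken to mean (n+1)*2 CAM entries (one 32-bit CAM CSR per step). the
names sv_ctx_t, write_selector and current_state are hypothetical, and
the "how many STATE CSRs are valid" rule is left out because its mapping
is not specified:

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CAM_ENTRIES 16
#define NUM_STATES      4

typedef struct {
    uint32_t state[NUM_STATES];         /* the four STATE CSRs */
    uint16_t regcam[NUM_CAM_ENTRIES];   /* 16-bit register CAM entries */
    uint16_t predcam[NUM_CAM_ENTRIES];  /* 16-bit predicate CAM entries */
    unsigned active_state;              /* which STATE CSR is in use */
    unsigned table_limit;               /* CAM entries a context switch saves */
} sv_ctx_t;

/* Model a write of the 5-bit selector value. */
static void write_selector(sv_ctx_t *c, unsigned sel)
{
    assert(sel < 32);                   /* fits a 5-bit immediate */
    c->active_state = sel & 0x3;
    /* assumed mapping: size field n means (n+1)*2 entries, i.e. 2..16 */
    c->table_limit = (((sel >> 2) & 0x7) + 1) * 2;

    /* zero out every CAM entry above the new limit */
    for (unsigned i = c->table_limit; i < NUM_CAM_ENTRIES; i++) {
        c->regcam[i] = 0;
        c->predcam[i] = 0;
    }
}

/* The STATE word that vector operations would currently use. */
static uint32_t current_state(const sv_ctx_t *c)
{
    return c->state[c->active_state];
}
```

the point of the sketch is only that one small write both flips the
active STATE and bounds how much of the table a context switch has to
save.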
in this way you would kiiiinda be able to do a "stack" depth of maybe
up to 3 or 4... sort-of... for nested vector loops, or where there
were tight interactions between registers, and flip between them using
a single CSRRWI instruction (hence the 5-bit limit).
note that this would *not* have anything to do with the loop variable in
which the result of SETVL was stored: it would be necessary to use e.g.
t0 for one of the loop variables and t1 for another.
thoughts?
l.