Re: [Libre-silicon-devel] SoC for GPU project

27 Jun 2018


      ---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68
On Wed, Jun 27, 2018 at 10:03 PM, Mohammad Amin Nili
manili.devteam@gmail.com wrote:
...
> Would you mind telling me what you mean by “state” DS from main core and why
> you think it is a problem?
ok so you can't just execute GPU instructions on the main core,
right?  can you?  because they're assembly code designed for.... the
GPU, right?  not the CPU, yes?
so you have some OpenGL data structures - "state" - which is on the
main CPU, right?  and, obviously, it's f***-all use sitting there on
the CPU, you have to "package" that up, and get it to the GPU, yes?
but you can't just "throw it at the GPU and hope like hell it'll
magically get there and do something", can you?
so you have to:
(a) "package" the OpenGL data structures - "state" - up into a format
THAT THE GPU UNDERSTANDS.
(b) *TELL* the GPU "here's your data, do something".
(c) *** STOP *** the CPU from executing (or context-switch - do
something else) whilst the GPU is working on it
(d) have the GPU *** RETURN *** or otherwise communicate with the CPU
to tell you when the job is done.
that's damn complicated, isn't it?  and how many data structures are
there, and how much data to send over?  you now need to design a
hardware-software API, to deal with all that state, yes?
basically a CPU-GPU interface needs an IPC (Inter-Process
Communication) mechanism that has to take into account TWO TOTALLY
DIFFERENT ARCHITECTURES, doesn't it?
even if the only hardware was that ChiselGPU code, it would *still*
be necessary to write some IPC system, packaging up the data on the
CPU, telling the ChiselGPU engine "go", waiting for it to say "done",
yes?
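to make steps (a) to (d) concrete, here is a minimal sketch in python. everything in it is an assumption for illustration: the command layout, the function names, and the "GPU" itself, which is just a second thread standing in for separate hardware. it is not how ChiselGPU or any real driver does it.

```python
import queue
import struct
import threading

# hypothetical wire format: one uint32 opcode followed by four float32
# values (an RGBA clear colour), little-endian. invented for this sketch.
CMD_FORMAT = "<I4f"
OP_CLEAR = 1


def package_state(state):
    """(a) flatten CPU-side state into bytes in a layout the 'GPU' understands."""
    return struct.pack(CMD_FORMAT, OP_CLEAR, *state["clear_colour"])


def fake_gpu(commands, done, results):
    """stand-in for the GPU: decode each command, record it, signal completion."""
    while True:
        packet = commands.get()
        if packet is None:  # shutdown request from the CPU side
            break
        opcode, r, g, b, a = struct.unpack(CMD_FORMAT, packet)
        results.append((opcode, (r, g, b, a)))
        done.set()  # (d) tell the CPU the job is finished


def submit_and_wait(state, commands, done):
    done.clear()
    commands.put(package_state(state))  # (b) "here's your data, do something"
    return done.wait(timeout=5.0)       # (c) CPU blocks until the job is done


def run_demo():
    commands, done, results = queue.Queue(), threading.Event(), []
    gpu = threading.Thread(target=fake_gpu, args=(commands, done, results))
    gpu.start()
    ok = submit_and_wait({"clear_colour": (0.0, 0.5, 1.0, 1.0)}, commands, done)
    commands.put(None)
    gpu.join()
    return ok, results
```

even this toy needs a wire format, a queue, a completion signal and a shutdown path, and it carries exactly one piece of state.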
now compare that to just... taking the MesaGL source code and hitting
the "compile" button, and taking the gallium3d-llvm source code and
hitting the "compile" button.  does that sound a lot easier?
in the case where you keep all of the code, the state, and the data
structures in *ONE* processor, the entire development process becomes
drastically, drastically simpler, yes?
one thing: it may be possible to begin profiling the gallium3d-llvm
code *right now* (even on x86) to assess the inner loops and see where
most of the time is spent, in each of the different areas associated
with 3D rendering.  take a look at jeff's nyuzi2016 paper to see what
i mean.  it's *really* important to know how many cycles are spent (on
average, per pixel) transferring data from memory into registers (and
back).  it's really important to know how many cycles per pixel are
spent on rasterisation, and so on.
whilst gallium3d-llvm on x86 will be heavily-optimised for SSE, it
will at least give a good indication.
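as a rough sketch of the bookkeeping involved: once per-stage timings exist (from whatever profiler is used), turning them into average cycles per pixel is simple arithmetic. the function names and stage names below are hypothetical, and counting pixels as width x height x frames is an assumption that ignores overdraw.

```python
def cycles_per_pixel(stage_seconds, clock_hz, width, height, frames=1):
    """convert per-stage wall-clock time into average cycles per pixel.

    stage_seconds: dict mapping a stage name to the seconds spent in it
    clock_hz:      CPU clock frequency the timings were taken at
    """
    pixels = width * height * frames
    if pixels <= 0:
        raise ValueError("need at least one pixel")
    return {stage: secs * clock_hz / pixels
            for stage, secs in stage_seconds.items()}


def rank_stages(per_pixel):
    """stages sorted from most to least expensive per pixel."""
    return sorted(per_pixel.items(), key=lambda kv: kv[1], reverse=True)
```

usage, with made-up placeholder numbers (not measurements): `cycles_per_pixel({"load_store": 0.5, "rasterise": 0.25}, 1_000_000, 100, 100)` gives `{"load_store": 50.0, "rasterise": 25.0}`, and `rank_stages` on that puts `load_store` first.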
...
> So if I understood right you are talking about a Larrabee-like RISC-V
> architecture?
yeah pretty much.  with a focus on finding out *where* time is spent,
then investigating if fixed-functions can be designed to speed that up
(and reduce power consumption at the same time).
also see what difference low-level data types (FP16, FP12) and
different SIMD / Vector widths would make.
also what i would *really* like to know is: if RISC-V were extended to
64 registers (it's currently 32), could the extra 32 registers (on a
64-bit system) be effectively used as a substitute for a tiling
architecture's scratch-RAM area?  4x4 x 32bpp is basically 16 32-bit
registers which is only 8 64-bit SIMD registers.  which really is not
a lot.
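the register arithmetic above can be checked with a tiny helper (the function name is made up for this sketch):

```python
def registers_for_tile(tile_w, tile_h, bits_per_pixel, register_bits):
    """number of registers needed to hold one tile, rounded up."""
    total_bits = tile_w * tile_h * bits_per_pixel
    return -(-total_bits // register_bits)  # ceiling division
```

`registers_for_tile(4, 4, 32, 32)` is 16 and `registers_for_tile(4, 4, 32, 64)` is 8, matching the figures above. by the same arithmetic an 8x8 tile at 32bpp would need 32 64-bit registers, i.e. every one of the extra 32. whether that is practical is exactly the open question.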
l.
