LLVM/C gen

lerno:
Having two backends makes it a bit hard to keep feature parity between them. LLVM is far behind, but what is the strategy?

A nice thing about the C backend is that we can start bootstrapping early if we'd like to(!). We can build parts of the server in C2, compile them to C, and then automatically copy that code into the main source!

On the other hand, keeping the same behaviour between LLVM and C isn't easy. I've looked at Clang's LLVM gen, and it produces a lot of optimized code by leveraging intrinsics for certain "known" functions. So for example, if Clang sees sqrt, it can swap the normal library version for an LLVM intrinsic. To complicate things further, those are target dependent :( So there is a *massive* amount of work to do on the LLVM gen.
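To make the intrinsic point concrete, here is a minimal sketch in C. The helper name distance_1d is hypothetical and only there for illustration. The post mentions sqrt; this sketch uses fabs instead because, as far as I know, whether Clang turns sqrt into the llvm.sqrt intrinsic depends on errno-related flags and on the target, while fabs is typically lowered to llvm.fabs.f64 regardless. Compiling the file with `clang -S -emit-llvm` should show which form a given Clang build actually picks.

```c
#include <math.h>

/* Hypothetical helper for illustration; not taken from C2 or Clang. */
double distance_1d(double a, double b)
{
    /* fabs is one of the "known" library functions: in the emitted IR,
     * Clang typically replaces this call with the llvm.fabs.f64 intrinsic
     * instead of a plain call to the C library symbol. */
    return fabs(a - b);
}
```

A C backend gets this for free, since the C compiler downstream does the swap. An IR backend would have to decide for itself when to emit the intrinsic, which is the extra work described above.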

Obviously if we get more people behind the project then that might be an easier thing to do. Without a lot of people spending time on C2 it will have a hard time being anywhere near optimized.

So what would the plan be?

bas:
The 2 back-ends are indeed worlds apart.

When starting with the back-ends, I needed one that was easy to write/debug to test with. That is the C back-end.
The generated C code is readable and can be easily checked.

The LLVM/IR back-end is only in the initial phase. Generating code for this is more complex than for C, and the result
is also harder to check. When C2 started, LLVM was at 3.2 (I think). Since then, the API has changed a lot, so the 2nd
goal was to keep the contact layer between C2 and LLVM/Clang to a minimum, to be able to rebase easily. That has
proven to be very nice: most rebases take one to two hours.

In the end, the IR back-end will be the main one and we might even drop the C back-end if it becomes very hard to
map C2 functions to that one. It might be good fun to start integrating the IR back-end more and to really generate
an executable with that (for a small subset of the language at the start). C2C currently just generates IR. What is missing
is calling LLVM for the optimization passes, then generating binary code, linking, etc.

Language-wise, the back-end is not important, so fleshing out the language itself is the highest priority. But it might be
fun to start doing something (like generating an executable).

lerno:
I started to work on IR generation, but found it extremely painful to even implement something as simple as a while-loop DESPITE WORKING DIRECTLY FROM THE IMPLEMENTATION IN THE CLANG SOURCE!

I think it's amazing that people manage to work with LLVM given how extremely poor the documentation is for actually finding what "function X" does. That the LLVM docs and examples are constantly outdated does not help very much.

The best advice I found was someone who wrote "I write some C code and look at what the LLVM IR output is". One webservice to do that is http://ellcc.org/demo/index.cgi
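As a sketch of that workflow for the while-loop case: put a small function like the one below in a file (the file name loop.c and the function sum_below are hypothetical) and run `clang -S -emit-llvm -O0 loop.c -o loop.ll`. The resulting loop.ll should show the loop split into a condition block, a body block that branches back to the condition, and an exit block. The exact block names and instructions vary with the Clang version and build.

```c
/* Hypothetical example function; name and body are illustrative only. */
unsigned sum_below(unsigned n)
{
    unsigned total = 0;
    unsigned i = 0;

    /* At -O0 the condition, body and exit each get their own basic
     * block in the IR, connected by branch instructions. */
    while (i < n) {
        total += i;
        i++;
    }
    return total;
}
```

Compiling at -O0 keeps the IR close to the source, which makes it easier to see what a front-end would need to emit before any optimization passes run.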

We should also review the LLVM IR output to make sure we use the diagnostics that LLVM provides.

bas:
Yes, I concur, I ran into the same issue. Since C2 and C are close relatives (and the Clang code served as an example on many
occasions), look at the Clang source code. Usually that's more complex than C2 needs, but it should contain the basics.

lerno:
Should we try to integrate the diagnostics or not?
