Messages - lerno

Pages: 1 ... 10 11 [12] 13 14 ... 17
166
Implementation Details / Parsing numbers
« on: October 28, 2018, 11:42:56 AM »
Currently we're using C-style number parsing. If we add a new lexer, it would be possible to tweak that a little.

My idea would be:

1. Allow underscore: 0x3333_4322_ABCD, 100_000_000 etc
2. Support dec, hex, oct, bin
3. Support hex and dec floating points, e.g. 10.0e-2, 0x23A.A2p+2
4. Use o and b for oct and bin, so 0o12231, 0b100110010011
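
The four points above can be sketched as a small literal parser. This is only an illustration in Python, not C2's lexer: the name parse_number is made up, and a few details are my own assumptions rather than part of the proposal (underscores only between two digits, a leading 0 without a prefix stays decimal, hex floats need a lowercase p exponent).

```python
import re


def _digits(charset):
    # One or more digits; single underscores are allowed only between digits
    # (an assumption, the proposal does not say where underscores may go).
    return rf"[{charset}]+(?:_[{charset}]+)*"


_DEC = _digits("0-9")
_HEX = _digits("0-9A-Fa-f")

_PATTERNS = [
    ("hex_float", re.compile(rf"0x{_HEX}\.{_HEX}p[+-]?{_DEC}")),
    ("dec_float", re.compile(rf"{_DEC}\.{_DEC}(?:[eE][+-]?{_DEC})?")),
    ("hex", re.compile(rf"0x({_HEX})")),
    ("oct", re.compile(rf"0o({_digits('0-7')})")),
    ("bin", re.compile(rf"0b({_digits('01')})")),
    ("dec", re.compile(rf"({_DEC})")),
]
_BASE = {"hex": 16, "oct": 8, "bin": 2, "dec": 10}


def parse_number(text):
    """Return an int or float for one complete literal, or raise ValueError."""
    for kind, pattern in _PATTERNS:
        match = pattern.fullmatch(text)
        if match is None:
            continue
        clean = text.replace("_", "")
        if kind == "hex_float":
            return float.fromhex(clean)
        if kind == "dec_float":
            return float(clean)
        # Integer kinds capture only the digits, without the 0x/0o/0b prefix.
        return int(match.group(1).replace("_", ""), _BASE[kind])
    raise ValueError(f"not a valid number literal: {text!r}")
```

With these rules parse_number("0x3333_4322_ABCD") and parse_number("100_000_000") give integers, parse_number("0x23A.A2p+2") gives a float, and something like "0b102" is rejected because 2 is not a binary digit.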



167
Implementation Details / Keywords keywords keywords
« on: October 28, 2018, 11:36:12 AM »
First a quick note: I think the documentation is missing keywords. At least "func" isn't there and there might be more.

Anyway, I note that the types ixx and uxx are keywords in the language. Is there any reason why they are tokens instead of identifiers? Or is this just a consequence of borrowing from Clang (C), where they are indeed tokens?

Furthermore, there are today keywords for things that are function lookalikes but are actually more like macros:

- elemsof
- enum_min
- enum_max
- sizeof

If we adopt the suggestion to use @ to indicate that the compiler does something special at this point, we could swap them for

- @elemsof
- @enum_min
- @enum_max
- @sizeof

This makes them reserved macro names rather than function names (I personally always found the sizeof operator a bit weird, since it looked like a function but was a keyword).

The current list has 46 keywords. Eliminating the types and moving the above mentioned things to the "@ namespace", we get that down to 30.

168
General Discussion / Re: Overview of syntax
« on: October 28, 2018, 10:53:27 AM »
For me, begin / end is very much Pascal.

However, there is something to be said for "some start"-end.

Consider this:

K&R
Code:
func int test() {
   if (foo == 45) {
      do_a();
   } else {
      do_b();
   }
}


Allman
Code:
func int test()
{
   if (foo == 45)
   {
      do_a();
   }
   else
   {
      do_b();
   }
}

Ruby-like
Code:
func int test()
   if (foo == 45) do
      do_a();
   else
      do_b();
   end
end

Here I'd actually say that the Ruby-like syntax retains the positive terseness of K&R (no special lines just for the start of a block), with the readability of Allman (blocks are visually extremely easy to make out). The only thing I don't like about this style is that "end" seems such a long keyword.

169
General Discussion / Re: State of progress?
« on: October 28, 2018, 10:07:38 AM »
I was thinking that maybe we could write it in C2, then convert that C2 code to C and then move the generated C into the normal source dir.

I worked on a (C style) lexer last night that could be turned into C2 code.

170
General Discussion / Re: Contribute / get into the code
« on: October 28, 2018, 10:02:08 AM »
brew install cmake

171
Ideas / Re: Macros again
« on: October 27, 2018, 06:09:15 PM »
I'm going to backpedal a bit on this proposal. :D

Consider my proposed definitions:

macro int @foo2(int v, int w)
macro @foo(int v)
macro int @foo3(int a, {} body)

First the parameters. I think it's reasonable that we could mix both actual real values / variables, and placeholders. So for this, let's use the & character to indicate a variable that is taken from the parent scope.

This means macro @foo(int v) should be macro @foo(int &v).

Secondly, in order to make things easier, I suggest dropping the type unless it is passed by value. This further modifies it to macro @foo(&v). If a type is required for readability (and perhaps strict matching), then we can use "auto": macro @foo(auto &v). That would also allow strict typing, like macro @foo(i32 &v), but this type would be strict, so it would fail to compile if the variable is not i32 (i64, i16, u32 would all be disallowed in this case).

With this, we can add pass-by-value: macro @foo(i32 v). Here the macro would work identically to an inlined function. Not very useful. The usefulness comes from combining it with body and variable references: macro @foo(i32 v, auto &b, {} body), used as @foo(32 * a, c) { print("."); }. Here 32 * a is guaranteed to be evaluated once.
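
To make the "evaluated once" guarantee concrete, here is a rough model in Python (not C2, and not a proposed implementation). A zero-argument callable stands in for the unevaluated argument text, and all function names are made up for the illustration.

```python
def expand_text_style(arg, body, uses):
    """Model a C-style text macro: the argument text is pasted at every use."""
    for _ in range(uses):
        body(arg())  # re-evaluated each time the parameter appears


def expand_by_value(arg, body, uses):
    """Model the proposed `i32 v` parameter: evaluate first, then expand."""
    v = arg()  # evaluated exactly once, like a function argument
    for _ in range(uses):
        body(v)


def count_evaluations(expand):
    """Return how many times the argument expression runs under `expand`."""
    evaluations = 0
    a = 2

    def arg():  # stands in for the expression `32 * a`
        nonlocal evaluations
        evaluations += 1
        return 32 * a

    seen = []
    expand(arg, seen.append, 3)
    assert seen == [64, 64, 64]
    return evaluations
```

Both expansions hand the same values to the body, but count_evaluations reports three evaluations for the text-style version and one for the by-value version, which matters as soon as the argument has side effects.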

We should also consider nested macros, that is, we can give a macro and expand it similar to the "body" here. In fact, we might be able to consider the {} as a macro that's passed in to the macro and expanded as the macro renders.

To summarize:

Code:
macro @foo1(i32 a, i32 &b) { ... }  // Works like you would expect from C++ if this was an inlined function. b must be i32.
macro @foo2(i32 a, auto &b) { ... }  // Like above but the b type can be anything (unlike for a function)
macro @foo3(auto a, auto &b) { ... }  // Both a and b have wildcard types
macro i32 @foo4(auto a, auto &b) { ... }  // Same as above, but is guaranteed to return an i32
macro auto @foo5(auto a, auto &b) { ... }  // Return type determined during macro expansion

(The advantage of the "@" in front of the name is that we don't need to declare @foo1 / @foo2 / @foo3 as void to make parsing simple)



Another thing is the amount of analysis. I wonder if it is a good idea to have much typing on the macros. The more statically typed they are, the more complex it is to express the type restrictions, and that complexity makes macros both harder to use and harder to read. Let's use the strength of C as a weakly typed language and not go too far. I'm thinking that typed macros (ones that do not use auto in the examples above) should be few rather than many. Semantic analysis on the macros can be done by assuming wildcard types and just checking that "there could be a type that satisfies this", trying to hit the sweet spot between C's text macros and more complex macro systems.

It should be noted that LISP, hailed for its powerful and useful macro system, is completely untyped when it expresses a macro. One of the things that really stands out when reading the examples on C2 is that the code is very clean and simple compared to many other recent languages out there, and it would be nice to keep that for the macro system as well if possible.

172
Ideas / Nested comments alt second level of comments.
« on: October 27, 2018, 02:54:12 PM »
As far as I know C2 supports // and /* */ style comments.

I propose a third style of comment, different from the above. I suggest /+ +/, which is what D uses. Alternatively, this comment style could simply complement the current ones.

So a few different possible proposals:

1. Allow /* */ to nest

This path is taken by quite a few languages.

2. Introduce a new multi-line comment that is different from /* */ and will ignore any /* */ in the comments.

So:

Code:
/+
  all is commented */ /* */ /* out here
+/

/*
  all is commented +/ /+ +/ /+ out here
*/

/+ /+  +/ +/  <- parse error.

3. Same as (2) but allow the new comment to nest.

That is, this *will* parse:

Code:
/+ /+  +/ +/

Any of those solutions would be an improvement.
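
The difference between (2) and (3) is small in a lexer. Below is a minimal sketch in Python, not C2's lexer; the function skip_comment and its nesting flag are invented for the illustration. /* */ never nests and ignores /+ +/ inside it, while /+ +/ ignores /* */ and keeps a depth counter only when nesting is switched on.

```python
def skip_comment(source, start, nesting=True):
    """Return the index just past the comment that begins at source[start].

    Raises ValueError if no comment starts there or it is unterminated.
    """
    opener = source[start:start + 2]
    if opener == "/*":
        # Classic block comment: ends at the first */, whatever is inside.
        end = source.find("*/", start + 2)
        if end < 0:
            raise ValueError("unterminated /* comment")
        return end + 2
    if opener != "/+":
        raise ValueError("no comment at this position")

    depth = 1
    pos = start + 2
    while pos < len(source):
        pair = source[pos:pos + 2]
        if pair == "+/":
            depth -= 1
            pos += 2
            if depth == 0:
                return pos
        elif pair == "/+" and nesting:
            depth += 1  # proposal (3): an inner /+ opens another level
            pos += 2
        else:
            pos += 1
    raise ValueError("unterminated /+ comment")
```

With nesting=False the scan stops at the first +/, so for "/+ /+  +/ +/" the trailing +/ is left over as ordinary tokens and the parser rejects it, as in proposal (2). With nesting=True the whole text is consumed as one comment.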

173
General Discussion / Overview of syntax
« on: October 27, 2018, 02:41:05 PM »
I recommend the following two sites for overview of different types of syntax:

https://rosettacode.org
http://rigaux.org/language-study/syntax-across-languages.html

174
Ideas / Re: Cast syntax
« on: October 27, 2018, 01:47:44 PM »
Personally I think the C style casts are hard to visually parse, so for me 18/19/22/23 feel like very little progress. I never remember the correct precedence rules :(

"foo as i32" is getting some traction due to some new languages using it, but you almost always want to wrap that in ( ) anyway if you do a . invocation: (foo as some_struct).a, even if foo as some_struct.a isn't ambiguous. Anything that can be chained easily is better, which would exclude 5 and 6. The "type(variable)" cast is OK for built-in types, but starts to look weird for things like structs, not to mention pointers and functions. I'd eliminate 16 as well for that reason. 20 and 21 use [ ], which at this point feels like the wrong syntax direction, just like < >.

We can eliminate 1 and 2 as too close to the current syntax. If I also remove the dupes that have "@" as a prefix (those could be introduced later if we wanted to), I see the following alternatives as possible:

a. cast(foo as i32)
b. cast(foo, i32)
c. foo->i32
d. (foo:i32) alt. cast(foo:i32)
e. (foo::i32) alt. cast(foo::i32)
f. foo.as(i32)
g. foo :> i32

175
Ideas / Cast syntax
« on: October 27, 2018, 01:31:13 PM »
Currently the cast syntax is:

cast<new_type>(variable)

As Bas already mentioned, this looks out of place with the rest of the syntax.

I'm therefore going to list a few alternatives (including the current). For these examples I'm casting "foo" to an i32.
  1. cast<i32>(foo)
  2. @cast<i32>(foo)
  3. cast(foo as i32)
  4. @cast(foo as i32)
  5. foo as i32
  6. foo to i32
  7. cast(foo, i32)
  8. @cast(foo, i32)
  9. foo->i32
  10. (foo:i32)
  11. cast(foo:i32)
  12. @cast(foo:i32)
  13. (foo::i32)
  14. cast(foo::i32)
  15. @cast(foo::i32)
  16. i32(foo)
  17. foo.as(i32)
  18. cast(i32)foo
  19. @cast(i32)foo
  20. cast[i32](foo)
  21. @cast[i32](foo)
  22. [i32]foo
  23. (i32)(foo)
  24. foo :> i32


Those are the ones I've found (and I've done the @ variation for the keyword-based ones).


176
Ideas / Re: Basic list of improvement points
« on: October 27, 2018, 01:00:44 PM »
A problem with working on the syntax though: any keyword must first be updated in the clang fork :( It makes dev harder since you have to know the workings of [part of] clang as well as C2's source. Some of C2Parser seems to be doing C/C++ parsing (commented out at steps). Cleaning out those remnants would make it easier for further refactorings.

I've made a stab at moving DiagnosticsEngine to be wrapped by "our own" class as a step towards refactoring things. I had to cheat in some places, and obviously the diag:: messages had to be left as is. Maybe there could be some specification of how C2 uses the Preprocessor, Lexer, DiagnosticsEngine and SourceManager, so that it's easier to know what can be stripped out and in what order.

Obviously, since you wrote the whole thing in the first place, you would probably have an easier time than me ripping out the Clang code, but for me (and any other contributor) it has to be done piecemeal so nothing gets broken by accident.

Maybe write up some specs? That would be useful for later anyway.

177
General Discussion / Re: State of progress?
« on: October 27, 2018, 12:51:13 PM »
I've looked at the lexer. I would be interested in writing a C-style lexer that more easily could be turned into C2 later on.

Do we need multiple active lexers?

178
General Discussion / Broken link to the forum
« on: October 27, 2018, 11:26:11 AM »
The link from the main c2lang.org to this forum is broken. You get an "unauthorized access" error when you use it. You can still edit the URL to enter the forum if you know it’s working, but most new visitors might think the server is down.

179
Ideas / Re: A group of function pointers & a struct
« on: October 27, 2018, 01:26:58 AM »
A quick addition. I think the challenge here is:

1. Prevent the feature from ballooning into a huge part of the language. Look at C++ for how things can grow from fairly simple beginnings. People want more, and it's often much easier to add a little something than to keep it lean. We see this over and over again. The joy of C is that it is still a very simple and lean language. There isn't much fat on it. Compare to other languages targeting the same space: those languages are really, really big and complex.

2. Do not amputate the feature beyond usefulness – people will want you to fix it and then you have problem (1) all over again.

3. Make it usable without providing libraries to go with it. It's an add-on for certain use cases, not a core feature.

4. Above all, keep the language lean.


180
Ideas / Re: A group of function pointers & a struct
« on: October 27, 2018, 01:10:47 AM »
Maybe we should start listing the options:

(1) Some sort of C++/Java style classes – possibly simplified so only struct + interface. Basically an opaque struct + pointer to a vtable. Very simple.

Advantages: fast, well known
Disadvantages: does not solve the generic issue. Grows like a cancer.

(2) ObjC style classes - message passing.

Advantages: no need for generics at all, classes are used for large-scale integration and don't replace normal structs, very simple runtime.
Disadvantages: Slow message passing, requires a rudimentary standard lib to be useful, often misunderstood and misapplied: people think it's supposed to be used like C++.

(3) The Go approach: tagged unions.

Advantages: Similar to (1), but does not invite class creation.
Disadvantages: Does not solve the generics issue, the "interface{}" problem.

(4) Virtual wrapper (see above). Basically an explicit version of the Go approach, but less inviting to "program to interfaces"

(5) The manual C way

Advantages: Simple, clear
Disadvantages: Lots of manual labour, poor clarity

Please add more to the list!

Something else?
