Messages - kyle

1
General Discussion / clangd compilation server
« on: February 13, 2017, 07:22:40 PM »
https://reviews.llvm.org/rL294291

This is a concept that I think was discussed a couple of years ago.  It provides the compiler as a service rather than a library.   I thought it was an interesting idea back when it was first mentioned, but now someone has done it for Clang.

2
Implementation Details / incremental arrays
« on: January 20, 2017, 04:35:51 AM »
I was going through the language reference and finally read the section on incremental arrays.  Very nice!  That should eliminate a lot of ugly boilerplate early set up.

What are the limitations on the array additions?  Can they be in any compilation unit where the array is visible or do they have to be in the same compilation unit (i.e. file)?

3
Ideas / Re: Vectors in combination with semi-automatic memory management
« on: January 16, 2017, 07:41:07 AM »
You are covering a lot of ground here.

I'll try to address each part.

First, I'll note that if you require the compiler to have a hook for when control leaves a function to delete a vector, you might as well make it general.  There are a lot of things that can be "freed" when control leaves a function, not just memory.  You can close files, sockets, shared memory, etc.  C++'s RAII idiom is based on this, and Go uses a more explicit method with its defer construct.  I am more of a fan of RAII than defer simply because a defer can be forgotten, whereas RAII is baked into the code when you declare your variable.

Data lifetime and the function call hierarchy are not necessarily the same.  Data tends to fall into three main buckets: ephemeral within one function, data only passed down to sub-functions that do operations on it, or data that is built in a function and intended to be returned upward.   For the latter, think of a function that builds a tree node.  As far as I can tell, your vectors only cover the first two of these cases.  How do you turn it off so that functions can construct data and return it to the caller?

Vectors and Blocks

If I understand correctly, vectors are resizable and blocks are not.   Other than that, they are intended to be used more or less the same way?  If so, why the difference?  The memory in question is allocated out of the heap in both cases.  If you allow resizing to reallocate, you can merge the two back into one construct.  The fewer things the better.

The for..range loop is not possible without either significant other infrastructure and conventions (C++) or limited application (just arrays).  From what you are doing here, it seems like it might be easier just to drop C's pointers-are-arrays (except when they are not) approach.  Either disallow pointer arithmetic or minimize it and make arrays carry their sizes with them.  This may be another area for simplification:  If you use an array, that means you want bounds checking.  If you use a pointer, you do not.

You bring some interesting ideas to the table, but I think it might make sense to strip them down into the more basic parts possible and see how those can be combined to build what you want.

  • Automatic freeing: generalize this to some sort of on-function-exit hook.  That is very powerful and you can do a lot with it.  Cf. C++'s RAII and Go's defer.
  • Vectors/blocks: simplify to arrays having bounds.  The for..range construct only works with arrays.  Arrays could be resizable and assignable independent of this.  I would make a much clearer separation between pointers and arrays.
  • Slices: these are generally good and can be implemented as arrays with shared data parts (though you need to watch the lifetimes!)
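
As a rough illustration of the last two bullets, here is a minimal sketch in plain C of an array view that carries its length and shares its data with a parent array. The names int_slice, slice_of, slice_get and slice_sum are made up for this example; none of this is C2 syntax or anything proposed in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": a view that carries its length with it. */
typedef struct {
    int   *data;  /* shared with the parent array, not owned */
    size_t len;
} int_slice;

/* View elements [start, end) of an existing slice; the data is shared. */
static int_slice slice_of(int_slice parent, size_t start, size_t end)
{
    assert(start <= end && end <= parent.len);
    int_slice s = { parent.data + start, end - start };
    return s;
}

/* Bounds-checked read: returns false instead of reading out of range. */
static bool slice_get(int_slice s, size_t i, int *out)
{
    if (i >= s.len) return false;
    *out = s.data[i];
    return true;
}

/* Sums a slice without needing a separate length argument. */
static int slice_sum(int_slice s)
{
    int total = 0;
    for (size_t i = 0; i < s.len; i++) total += s.data[i];
    return total;
}

static int slice_demo(void)
{
    int backing[5] = {1, 2, 3, 4, 5};
    int_slice all = { backing, 5 };
    int_slice mid = slice_of(all, 1, 4);   /* views 2, 3, 4 */
    backing[2] = 30;                       /* visible through the slice */
    return slice_sum(mid);                 /* 2 + 30 + 4 */
}
```

The lifetime caveat shows up directly: a slice of backing is only valid while backing itself is alive, so returning mid from slice_demo would leave it dangling.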

Side Effects

Removing globals does not make functions pure.   Think about passing an address around (a pointer).  I pass the value of the pointer around, but each function that gets it can dereference the pointer to look at the value in memory.   So, I just made it a little bit harder to have globals, but I did not really remove them.
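
A small C sketch of that point (bump and purity_demo are hypothetical names): there is no global anywhere, yet the function is not pure, because it writes through the pointer it was handed.

```c
#include <assert.h>

/* No globals anywhere, yet bump() is not pure: it writes through the
   pointer it was handed, so the caller's state changes. */
static int bump(int *counter)
{
    *counter += 1;
    return *counter;
}

/* The same argument value is passed both times, but the results differ. */
static int purity_demo(void)
{
    int state = 0;             /* local, but reachable via its address */
    int first  = bump(&state);
    int second = bump(&state);
    return second - first;     /* non-zero: the two calls disagreed */
}
```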

I think that memory safety is definitely something interesting.  There are so many security exploits that rely on buffer overflows, bad pointer arithmetic etc. 

4
General Discussion / Re: Pthreads
« on: January 11, 2017, 05:15:40 AM »
I ran across this blog post:

https://thefeedbackloop.xyz/thoughts-on-dependency-hell-is-np-complete/

I thought it was particularly interesting since the thread has drifted to package management.  Since C2 does its own management, it might be informative to see what others are doing.

5
Ideas / Re: Vectors in combination with semi-automatic memory management
« on: January 11, 2017, 04:55:04 AM »
So for each level in the call hierarchy, you have data elements that are "local" in the sense that they can only be freed when the function in which they are allocated goes out of scope.  You can pass slices down to called functions, but not 'up' or 'out' to any higher level data structure.  Higher level here means higher in the call hierarchy.

So, this does sound like C99's variable-length arrays, only they are allocated more like malloc than on the stack.  You could implement something like this with GCC-specific additions now and prototype it.

I am not sure why structs would be a problem since you could just allocate them on the stack and the final return and increase of the stack pointer would wipe them out.   In the case of your arrays, you would need to trigger a clean up function when leaving the function scope.

It does sound like this is very much like Rust.

6
Ideas / Re: Vectors in combination with semi-automatic memory management
« on: January 07, 2017, 07:45:49 PM »

I'm sorry, I must be really missing something key that holds this together :-(

The part about preventing other functions from deallocating the new array sounds almost like Rust's borrowing mechanism.  I can see how implementing a semi-transparent version of C++'s unique_ptr and shared_ptr could be really powerful without having to implement too much of a run time.  GCC already lets you implement a form of Go's defer() which would give you part of what you are looking for...  I think?
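
For reference, the GCC feature in question is presumably the cleanup variable attribute, which Clang also supports. A minimal sketch, assuming that extension is available; close_file, auto_close and defer_demo are names invented for the example, and closed_count exists only so the effect can be observed.

```c
#include <assert.h>
#include <stdio.h>

static int closed_count = 0;  /* only here so the demo can observe cleanup */

/* Cleanup handler: receives a pointer to the variable going out of scope. */
static void close_file(FILE **fp)
{
    if (*fp) {
        fclose(*fp);
        closed_count++;
    }
}

/* GCC/Clang extension, not standard C. */
#define auto_close __attribute__((cleanup(close_file)))

/* Writes three bytes to a temporary file and counts them back. */
static int defer_demo(void)
{
    auto_close FILE *f = tmpfile();
    if (!f) return -1;            /* handler runs, sees NULL, does nothing */

    fputs("abc", f);
    rewind(f);

    int n = 0;
    while (fgetc(f) != EOF) n++;
    return n;                     /* handler runs here too and closes f */
}
```

Unlike Go's defer, the handler is attached at the declaration, which makes it closer in spirit to RAII: there is no separate statement to forget.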

Can you give some concrete examples using pseudo-code?   I must have had extra stupid pills this morning :-(

7
Ideas / Re: Vectors in combination with semi-automatic memory management
« on: January 04, 2017, 05:46:09 AM »
I'm trying to follow along here.   

So how do you handle the case that the function mallocs something and then puts it somewhere it can be reached from outside the scope of the function?   This is a common pattern where a function initializes an entry in a tree or something similar.  In that case, you would need to be able to zero out (or otherwise cause free to be skipped) the local pointer to the malloc'ed data.   
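
To make the question concrete, here is a sketch of that pattern in C using the GCC/Clang cleanup attribute as a stand-in for the proposed automatic free. The release helper is hypothetical: it hands the pointer out and clears the local so the automatic free is skipped.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int value;
    struct node *next;
} node;

/* Cleanup handler (GCC/Clang extension): frees whatever the local still holds. */
static void free_node(node **p)
{
    free(*p);   /* free(NULL) is a no-op */
}

/* Hypothetical helper: hand the pointer to the caller and clear the local,
   so the automatic free is skipped. */
static node *release(node **p)
{
    node *out = *p;
    *p = NULL;
    return out;
}

/* Pushes a new node onto the list; returns 0 on success, -1 on failure. */
static int push(node **head, int value)
{
    __attribute__((cleanup(free_node))) node *n = malloc(sizeof *n);
    if (!n) return -1;
    if (value < 0) return -1;     /* early exit: n is freed automatically */

    n->value = value;
    n->next  = *head;
    *head    = release(&n);       /* n now outlives this function */
    return 0;
}
```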

Is this just for the case that there is a local, growable, array in a function?

Clearly you have something in mind here, but I am getting confused by what seems to be a collection of not-completely related things like fat pointers, deferred (a la Go) deallocation etc.

8
Ideas / Re: Struct Functions - part 2
« on: December 22, 2016, 02:32:41 AM »
Is there a "Struct Functions - part 1"?

9
General Discussion / Re: Light-weight classes
« on: December 22, 2016, 01:17:53 AM »
(and still catching up...)

Interesting points.

I tend to like having a limited form of classes.   I programmed in "C with classes" briefly back in my college days.  It was simple and clear and easy to understand.

Go seems to hit a lot of the benefits of "OO"-style programming without really implementing anything special.  One of the things I like about it is that a function and a method are NOT shown in the same syntax. 

I like this.

10
General Discussion / Re: Pthreads
« on: December 22, 2016, 12:50:07 AM »
(still catching up...)

For what it is worth, I just ported a small project from a Makefile-based build system to CMake.  The initial stages were rough.   However, it is interesting how CMake did manage to make most of the differences between building executables and libraries under Linux and Windows either disappear or at least become more minimal.

It seems like a sort of "platform shim" would be necessary as part of the port of C2 to a platform.   That would need to encapsulate the peculiarities of the platform.

As you note, other projects are just including everything they need.   While this is an easy approach to take, I have had a lot of problems with the results when a code base gets large and old.   Most of the problems I've seen with this are from Java, but I think the lessons apply here too.   The things we run into often are that there will be two, three or even four versions of Java.  Which one gets used for what task is not always obvious and sometimes not even deterministic.  Add the effort of testing for specific fixes (i.e. Java fixes for crypto problems) and it gets really fun.

One way that seems to be gaining traction is that used by Red Hat and Canonical (similar systems, but of course different implementations and syntaxes and tools.  NIH?  Never heard of it!).   This copies a bit from the Mac in that every app has its own copy of the binaries.   They did better than that by having the ability to share some of the libraries.   NixOS uses the Nix packaging system and has a very interesting method of handling dependencies.

In Nix, each package is stored in a directory with the project name and the hash (cryptographic) of the arguments used to make the project.   Thus, you get different builds with different directories when you make a library with different configurations.  Symbolic or hard links are used to mash it all together again later. 
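
A toy sketch of the idea in C, not of Nix itself: the directory name is derived from the project name plus a hash of the build arguments, so different configurations land in different directories. The "<name>-<hash>" layout and the store_dir function are assumptions for illustration, and FNV-1a is used only as a short stand-in for a cryptographic hash.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* FNV-1a, 64-bit. NOT cryptographic; a stand-in for the real hash. */
static uint64_t fnv1a(const char *s)
{
    uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 1099511628211ULL;              /* FNV prime */
    }
    return h;
}

/* Writes "<name>-<hash of build arguments>" into out (hypothetical layout). */
static void store_dir(char *out, size_t n, const char *name,
                      const char *build_args)
{
    snprintf(out, n, "%s-%016" PRIx64, name, fnv1a(build_args));
}
```

Building libfoo with "--enable-ssl" and again with "--disable-ssl" would give two separate directories, while repeating the first build maps back to the same one.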

I hope that made sense...

11
Ideas / Re: Enum classes
« on: December 22, 2016, 12:36:18 AM »
Sorry for the late reply.  I lost my links to C2 during a particularly busy period and it dropped off my mental horizon (which is admittedly rather small).

The type features of C++'s new enums (enum class) are really nice.  I would strongly suggest something that keeps that kind of distinction.

As to modules.   C has (sort of) two levels of function: one for static (translation unit) and one for extern (whole program).   This has been sufficient for a long, long time.   I'd keep just one level of module.

I haven't caught up yet with all the other work on the forum so please excuse any duplication!

12
Ideas / Re: one big design item: Macros
« on: May 18, 2015, 08:59:07 PM »
It seems like there are only a few cases where existing macro use is not directly replaceable with something easier/better:

  • X-Macros.
  • Code snippets designed to be inlined into existing code.

Now, let's look at why you need those.

The first is really for using the compiler for code generation (in this case to avoid repetition).  The second is more often used for adding a little syntactic sugar.
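
For readers who have not met the first case, a minimal X-macro sketch in C: one list is expanded twice, once into an enum and once into a matching name table, so the two cannot drift apart. The color names are purely illustrative.

```c
#include <assert.h>
#include <string.h>

/* The single list everything else is generated from. */
#define COLOR_LIST \
    X(RED)         \
    X(GREEN)       \
    X(BLUE)

/* Expansion 1: enum constants COLOR_RED, COLOR_GREEN, COLOR_BLUE. */
#define X(name) COLOR_##name,
typedef enum { COLOR_LIST COLOR_COUNT } color;
#undef X

/* Expansion 2: a table of printable names in the same order. */
#define X(name) #name,
static const char *const color_names[] = { COLOR_LIST };
#undef X

/* Looks up the printable name, with a fallback for out-of-range values. */
static const char *color_name(color c)
{
    return ((unsigned)c < COLOR_COUNT) ? color_names[c] : "?";
}
```

Adding a new entry to COLOR_LIST updates both the enum and the table, which is exactly the repetition-avoidance that is hard to replace without some form of compile-time code generation.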

For code generation, I'll take an example I currently have pending.  I am trying to write a mini-RPC library that is a single .h file in C.  Don't ask why, there is not a good answer :-)

But, to do that, I need to write macros that generate function stubs to marshal/unmarshal arguments.  And, I would like those macros to look a lot like functions themselves to avoid mixing in any extra syntax. 

So, RTTI would allow me to do that.  But RTTI is heavy and requires quite a bit of overhead.  If I can generate code at compile time, I can possibly work around that.

Now, maybe there is a way to do something about this.  Looking around at other languages, I note that Java is now using annotations more and more and more.  Perhaps (the following is not fully fleshed out) there is a way to hook into the parser with something like that.  Suppose we allow you to do annotations:

Code:
@RPC
func remote_add_nums(a:int, b:int): int

And somewhere else you define a compile-time function @RPC that takes some sort of arguments.  In the case of Java, you get a fair amount of information, but it is all handled by run-time libraries.  In C2, this is not desirable. 

I can see that you could, theoretically, have @RPC be compiled and then run at compile time.  I am not sure what the arguments should be.  In the above example, there should probably be some additional arguments to @RPC for the server etc.  Passing the function information as strings is probably not ideal since we are trying to get away from uncontrolled string handling.

Is there a way to do this usefully?  Exposing the AST seems like a problem since it would force the AST to become a fixed, external API.

Now, take the second of my major cases, syntactic sugar.  I have some macros I use to help me visually see things like mutex-protected blocks

Code:
synchronized_block(my_mutex) {
      ... do some protected things...
}

That is not natively in C.  It is just a couple of for loops and some C99 inline variable declaration magic in a macro.  If you return from the middle of the block, you lose.  But, it is much, much easier for me to see what is in the block, whether I remembered to close the block etc.  I got rid of a lot of bugs in a couple of programs when I did this.  Very handy.
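
One possible shape for such a macro, sketched with POSIX mutexes. This is a guess at the general technique and not the macro from the post, which uses two for loops; this version gets by with one. The counter and increment names are made up for the example.

```c
#include <assert.h>
#include <pthread.h>

/* Lock in the for-init, run the body once, unlock in the for-step. */
#define synchronized_block(m)                                   \
    for (int sync_once_ = (pthread_mutex_lock(&(m)), 1);        \
         sync_once_;                                            \
         sync_once_ = (pthread_mutex_unlock(&(m)), 0))

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

/* Increments the shared counter under the lock and returns the new value. */
static int increment(void)
{
    int seen = 0;
    synchronized_block(counter_lock) {
        counter++;
        seen = counter;
        /* A return or break here would skip the unlock. */
    }
    return seen;
}
```

The comment inside the block is the "you lose" case: leaving early bypasses the for-step, so the mutex stays locked.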

I would really like to be able to introduce some things like this.  Sure, this example may be bad because it should be built into the language anyway, but hopefully you get the idea.

Perhaps we can use something like this:

Code:
sugar synchronized_block(m:*mutex)=for(...) for(...)

I'm not very happy with that, but there are some possibilities of combining annotation/compile time functions and this.

Sorry this is not well thought out.   I had a few ideas and wanted to throw them out there.

Best,
Kyle

13
Ideas / Re: one big design item: Macros
« on: April 17, 2015, 09:38:05 PM »
Macros are a very interesting topic to me.

Your list of uses for macros is good, but you are missing one extra thing that arrived with C11: generic functions/behavior using the new _Generic keyword.
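
As a reminder of what that looks like, a minimal C11 sketch: _Generic selects a function based on the static type of its controlling expression, and a macro wraps the selection. The helper names max_int and max_double are invented for the example.

```c
#include <assert.h>

static int    max_int(int a, int b)          { return a < b ? b : a; }
static double max_double(double a, double b) { return a < b ? b : a; }

/* Dispatch on the type of the first argument. The controlling expression
   of _Generic is not evaluated, so (a) has no side effects there. */
#define max(a, b) _Generic((a), \
    int:    max_int,            \
    double: max_double)((a), (b))
```

A call such as max(2, 3) resolves to max_int and max(2.5, 1.5) to max_double; any other argument type is a compile-time error because there is no default association.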

My thought was to try to do something with limited, but similar functionality to C++ templates.  I don't like the C++ syntax though.  In order to do that, you would need to find a replacement for all the existing uses of macros.

I'll take each of these in turn.

Feature Selection

Though you can use #ifdef etc. to include and exclude things, code that uses it is now generally considered bad code.    From what I have seen in the past few years, most of this has moved to the build system. 

This is sometimes used for debugging that can be compiled out:

Code:
if(DEBUG) printf("Foutje!  Bedankt.\n");

The assert macros work in a similar way.

As far as I know, that is about the only remaining case where feature selection is really used any more.  I don't have a good idea here other than using the build system to use different source files.  That can work and I use it for platform-specific code.  It would be a bit strange for something like assert though.  I think more thought needs to go into this before it could work.

One way to do the assert idea would be to use one .c file for debug (that defined assert() with a body that did something) and another for release (that defined assert() as a function with nothing in the body).  The build system would figure out which file to use based on the build target.

Constants

As you mentioned, just use const instead.  The compiler is smart enough to figure it out.

Code Expansion

This really should be something more like a template so that it can be typesafe.

Code:
generic T max`T(T a, T b) { return (a<b ? b : a); }

Note that this will work with calls like:

Code:
int a=2;
int b=3;
int c = max`int(a++,b++);

It won't evaluate the ++ multiple times.
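
For comparison, this is the hazard in today's C that a typesafe template would avoid. The sketch below contrasts a classic text-replacement macro with an ordinary function; MAX_MACRO, max_fn and the two counting functions are names made up for the example.

```c
#include <assert.h>

/* Classic text-replacement macro: an argument can be expanded twice. */
#define MAX_MACRO(a, b) ((a) < (b) ? (b) : (a))

/* A real function evaluates each argument exactly once. */
static inline int max_fn(int a, int b) { return a < b ? b : a; }

/* Returns how many times b was incremented by the macro call. */
static int macro_increments(void)
{
    int a = 2, b = 3;
    int c = MAX_MACRO(a++, b++);   /* b++ runs in the test and in the result */
    (void)c;
    return b - 3;
}

/* Returns how many times b was incremented by the function call. */
static int function_increments(void)
{
    int a = 2, b = 3;
    int c = max_fn(a++, b++);
    (void)c;
    return b - 3;
}
```

With the macro, b is incremented twice; with the function, once.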

Modern compilers can figure out that max() should be inlined by themselves, so this is not any extra overhead, but it is much safer and cleaner.  I just threw in some random syntax for variable types. 

Some code expansion is a lot more difficult to emulate.  For instance X Macros.  I have used things like that in the past.  I would like to get away from all of the pure text replacement types of macros.  While they are powerful, they are also dangerous and fragile.  Often there are better ways to do things.

I am not sure I understand where you are going with local and non-local macros...

Another alternate syntax I thought of would be like this:

Code:
macro pdebug(const char *msg, ...args) = if(debug) vaprintf(msg, args);

Visually, it makes a macro look like a declaration/definition.  It also puts types on the arguments. 

14
Ideas / Re: other data type extensions
« on: April 04, 2015, 06:13:33 PM »
There is a very long thread on the Rust RFC GitHub issues tracker that has some interesting discussion (as well as some heated discussion) about overflow checks etc.

https://github.com/rust-lang/rfcs/pull/560

It is very long, but I found a lot of interesting information hidden in it.

15
Ideas / Re: Array types
« on: March 17, 2015, 10:24:12 PM »
Well, it depends  :)

If you want to do things exactly the same way as C (conversion from array to pointer), then there isn't much you can do.   

However, if you are willing to extend things a bit, then you could pass array bounds in either fat pointers, or via metadata that can be retrieved via the pointer.

Personally, I would like to be able to pass

Code:
int a[14] = {0,};

foo(a);

and have foo() be defined:

Code:
void foo(int an_array[])

And have the passed array get copied.  If I want to share it, I'll explicitly pass a pointer.

Of course, if you did that, then to call C libraries, you would need to make sure that you explicitly passed a pointer to the first element of the array.
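
Plain C can already approximate this by wrapping the array in a struct, since structs are passed by value. A minimal sketch, with int_array14 and both functions invented for the example:

```c
#include <assert.h>

/* Wrapping the array in a struct gives it value semantics in plain C. */
typedef struct {
    int items[14];
} int_array14;

/* Receives its own copy; the caller's array is untouched. */
static int zero_first_copy(int_array14 arr)
{
    arr.items[0] = 0;
    return arr.items[0];
}

/* Explicit sharing: the caller passes a pointer and sees the change. */
static void zero_first_shared(int_array14 *arr)
{
    arr->items[0] = 0;
}
```

To call an existing C library with such a value, you would pass a.items, which decays to a pointer to the first element.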

I'll think about this more...  It isn't that clear.
