
one big design item: Macros


bas:
One of the big remaining design issues (next to countless smaller ones  :) ) is the design
of a new macro system that is not based on textual substitution by the preprocessor.

Let's start by looking at the goal of macros:

* Feature selection
--- Code: ---#ifdef HAVE_FOO_FEATURE
..
#endif

--- End code ---

* Constants
--- Code: ---#define MAX_ITEMS 10
--- End code ---

* Code expansion
--- Code: ---#define print_if_positive(x) \
   if (x > 0) printf("value is %d\n", x);

--- End code ---

I think each of these is a valid goal, so the new macro system should provide
a solution for each (in some way or another). For Feature selection, C2 can
use the same approach as C: the C2 compiler also has a preprocessor, so it works
identically.

The Constants goal is met in C2 by using const declarations of numeric types:

--- Code: ---const int32 MAX_ITEMS = 10;
--- End code ---
This will just 'replace' all references with 10.

The Code-expansion is the hardest. Since there is no textual replacement, the macro system
has to be language aware. This means that when parsing the macro definition, the parser must
understand what's happening. This results in 2 types of macros: local and non-local. Local macros
can be used inside functions, while non-local macros can only be used outside function
bodies. So local macros are parsed as a series of Statements, while non-local macros are parsed
as a list of global declarations.

The syntax I currently think of is:

--- Code: ---local macro(x) {
   io.printf("value of "$$x" = %d", x);
}
macro(x) {
   func gen_$x() {
   }
}

--- End code ---

Open issues:

* Q: are public macros in module X allowed to access non-public Decls in X?
* Q: what to allow as macro arguments?
* Q: what syntax to use for argument replacement, concatenation and stringify?

kyle:
Macros are a very interesting topic to me.

Your list of uses for macros is good, but it misses one extra use that arrived with C11: generic functions/behavior using the new _Generic keyword.
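
For reference, a minimal C11 sketch of the kind of type-based dispatch that _Generic allows. The names max_int, max_double and generic_max are made up for this illustration and do not come from any post in this thread.

```c
/* Type-specific implementations that the generic macro dispatches to. */
static int    max_int(int a, int b)          { return a < b ? b : a; }
static double max_double(double a, double b) { return a < b ? b : a; }

/* C11 _Generic picks one function based on the type of the first argument.
   generic_max is a hypothetical name chosen for this sketch. */
#define generic_max(a, b) _Generic((a), \
        int:    max_int,                \
        double: max_double)((a), (b))
```

A call such as generic_max(2, 3) resolves to max_int, while generic_max(1.5, 0.5) resolves to max_double. Only the first argument drives the selection here, so mixed-type calls would need extra care.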

My thought was to try to do something with limited functionality, but similar to C++ templates. I don't like the C++ syntax though. In order to do that, you would need to find a replacement for all the existing uses of macros.

I'll take each of these in turn.

Feature Selection

Though you can use #ifdef etc. to include and exclude things, code that relies on it is now often considered poor style. From what I have seen in the past few years, most of this has moved to the build system.

This is sometimes used for debugging that can be compiled out:


--- Code: ---if(DEBUG) printf("Foutje!  Bedankt.\n");

--- End code ---

The assert macros work in a similar way.

As far as I know, that is about the only remaining case where feature selection is really used any more. I don't have a good idea here other than using the build system to select different source files. That can work and I use it for platform-specific code. It would be a bit strange for something like assert though. I think more thought needs to go into this before it could work.

One way to do the assert idea would be to use one .c file for debug (that defined assert() with a body that did something) and another for release (that defined assert() as a function with nothing in the body). The build system would figure out which file to use based on the build target.

Constants

As you mentioned, just use const instead.  The compiler is smart enough to figure it out.

Code Expansion

This really should be something more like a template so that it can be typesafe.


--- Code: ---generic T max`T(T a, T b) { return (a<b ? b : a); }

--- End code ---

Note that this will work with calls like:


--- Code: ---int a=2;
int b=3;
int c = max`int(a++,b++);

--- End code ---

It won't evaluate the ++ multiple times.
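
To make the contrast concrete, here is a small plain-C sketch (not C2, and not the proposed generic syntax) comparing a textual macro with a function. MAX_MACRO, max_int and the two helper functions are names assumed for this illustration.

```c
/* Classic textual macro: each argument is pasted in wherever it appears. */
#define MAX_MACRO(a, b) ((a) < (b) ? (b) : (a))

/* Function version: each argument is evaluated exactly once, before the call. */
static inline int max_int(int a, int b) { return a < b ? b : a; }

/* Returns the final value of b after using the macro with a = 2, b = 3. */
static int b_after_macro(void) {
    int a = 2, b = 3;
    /* Expands to ((a++) < (b++) ? (b++) : (a++)), so b++ runs twice. */
    int c = MAX_MACRO(a++, b++);
    (void)c;
    return b;
}

/* Returns the final value of b after using the function with a = 2, b = 3. */
static int b_after_function(void) {
    int a = 2, b = 3;
    int c = max_int(a++, b++);   /* b++ runs once */
    (void)c;
    return b;
}
```

With the macro, b ends up incremented twice; with the function it is incremented once, which is the behaviour the generic max above is meant to guarantee.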

Modern compilers can figure out that max() should be inlined by themselves, so this is not any extra overhead, but it is much safer and cleaner.  I just threw in some random syntax for variable types. 

Some code expansion is a lot more difficult to emulate.  For instance X Macros.  I have used things like that in the past.  I would like to get away from all of the pure text replacement types of macros.  While they are powerful, they are also dangerous and fragile.  Often there are better ways to do things.

I am not sure I understand where you are going with local and non-local macros...

Another alternate syntax I thought of would be like this:


--- Code: ---macro pdebug(const char *msg, ...args) = if(debug) vaprintf(msg, args);

--- End code ---

Visually, it makes a macro look like a declaration/definition.  It also puts types on the arguments. 

bas:
In C2, at a Module level there are 3 types of Decl's: Types, (global) vars and functions. Macros would be
a fourth type of Decl. Just like the other 3 they can be public or not.

The part about local/non-local is pretty important, so maybe I can explain better. ANSI C macros are just
plain text expansion, so whether you type

--- Code: ---if (x > 10) { .. }
--- End code ---
or

--- Code: ---typedef struct { .. } my_##x;
--- End code ---
It's all the same for the preprocessor.
The first example is only valid inside a function body, the second only outside the function body.
So when parsing macros in C2, the parser needs to know what to parse: global syntax or function body syntax.
Global syntax is a bag of Declarations (types, function, etc). Inside functions it's basically a list of Stmts.
The local keyword tells this to the parser. If local the parser will do ParseStmt() iteratively, while otherwise
it will do ParseDecl() iteratively. This also means we cannot use the same macro inside+outside a function. Not
a real problem I think.
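
As an illustration in plain C (not C2), the sketch below shows one macro that only makes sense at file scope and one that only makes sense inside a function. DEFINE_GETTER, CLAMP_TO_TEN, get_limit and clamp are hypothetical names invented for this example.

```c
/* File scope only: expands to a function definition, which standard C
   does not allow inside another function body. */
#define DEFINE_GETTER(name, value) \
    static int get_##name(void) { return (value); }

/* Function scope only: expands to an if statement, which is not
   allowed at file scope. */
#define CLAMP_TO_TEN(x) if ((x) > 10) { (x) = 10; }

DEFINE_GETTER(limit, 10)   /* defines get_limit() at file scope */

static int clamp(int v) {
    CLAMP_TO_TEN(v)        /* statement context */
    return v;
}
```

The C preprocessor accepts either macro anywhere; it is the compiler that later rejects a misplaced expansion. A language-aware macro system would have to know the intended context up front, which is the role the local keyword plays in the proposal.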

In your max(..) example, it's easy to see that inlined functions are generally superior to macros, since
you get the type checking etc. In C2 max() could almost always be done with an inline function, since the compile
unit is the whole program.

Maybe we can get a more concrete design by just creating a set of use-cases for macros and validating the design
on them. A few I can think of are: (only code-expansion macros, since the features/constants are covered)

* debug:
--- Code: ---debug("I'm in this function over here");
assert(ptr != 0);

--- End code ---


* enum list (keeping enum in sync with other list)
--- Code: ---addValue(STATE1, 10, "begin", func_begin);

--- End code ---
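
For the enum-list use case, this is roughly how it is done today in C with the X Macro technique. It is a sketch only: STATE_LIST, the state names and state_name() are assumptions made for the example, and the entries carry three fields rather than the four shown above.

```c
/* Single source of truth: every entry is listed once. */
#define STATE_LIST(X)           \
    X(STATE_BEGIN, 10, "begin") \
    X(STATE_RUN,   20, "run")   \
    X(STATE_END,   30, "end")

/* First expansion: generate the enum constants. */
#define AS_ENUM(id, value, name) id = value,
enum state { STATE_LIST(AS_ENUM) };

/* Second expansion: generate the matching name lookup. */
#define AS_CASE(id, value, name) case id: return name;
static const char *state_name(enum state s) {
    switch (s) {
        STATE_LIST(AS_CASE)
    }
    return "unknown";
}
```

Adding a state means touching only STATE_LIST, so the enum and the lookup cannot drift apart. Any C2 replacement for textual macros would need to cover this pattern in some form.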



bas:
For generics, let me go back to one of the design principles:
C2 tries to be an evolution of C, not a completely new language. Therefore it should
not stray too far from C. Generics is one of those cases I think that might be a really
good option, but simply doesn't fit the C domain. I don't think it could be added as
a simple extra, because it influences a lot of design decisions.

On the macro-part, I think you have some really nice ideas:

* any - I have to think about it more, but it would be nice to always 'type' the argument
* only code expansion - yes, constants/feature selection should not use these macros
* no nesting - check
* macros called like function - yes, one idea is to show some difference at the call site by using mymacro!(..) like Rust
* public macros - whether public macros can access non-public decls is something to think about. A macro is part of the interface, but different compilation units could cause problems indeed, well spotted!
* same indentation level - yes this would be required for parsing it well
One other issue to solve, described previously in this thread, is whether a macro is meant to be
used at file-scope or at function-scope, because parsing it would be different. For example,
at file-scope, if-statements are invalid.

bas:
The issue about macros for use in function and for use outside functions remains tricky to understand.
Maybe this post can clear this up.

In C, parsing a macro is easy, just treat it as text and only look for some symbols like arguments etc. Replace
those and you're done.

In C2, parsing macros is a completely different story. The main difference is that the parser is used to
parse them, not the preprocessor. This offers advantages like better checking etc. I think most people agree
on the semantic part.

But a Parser cannot simply treat the macro body as text; it needs to really understand the syntax, just like
any other part of the program. As a (pseudo) example, when the parser starts on a function definition, it is in a
state where it expects globals, e.g. in its own function parseFile().
The call stack might look like this:

--- Code: ---parseGlobal();
  parseFileDecl();
    parseType();   // the return type
    parseFunctionName();
    parseFunctionArguments();
    parseFunctionBody() {
       while (..) {
         parseStatement();
       }
    }

--- End code ---

So to parse a C2 macro, the Parser needs to know what to expect: can it expect a sequence of Statements (like if, while,
calls, etc.), or should it expect top-level Declarations (like stuff at file scope)? It could try to detect this, but that would make
the Parser complicated and error-prone again. So a solution would be that the programmer tells the Parser what to expect, so it doesn't have to guess. In Rust this is less of an issue, since most things are Expressions.
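
A very rough sketch, in C, of what "the programmer tells the Parser" could mean in practice: the parse mode is chosen from the macro header alone, before the body is read. The type body_kind and the function macro_body_kind are hypothetical and are not taken from the actual C2 compiler sources.

```c
#include <string.h>

/* What the parser should expect inside a macro body. */
typedef enum { BODY_STATEMENTS, BODY_DECLARATIONS } body_kind;

/* Pick the parse mode from the macro header: the proposed 'local'
   keyword selects statement parsing, anything else selects
   declaration parsing. */
static body_kind macro_body_kind(const char *header) {
    static const char keyword[] = "local";
    size_t n = sizeof keyword - 1;
    /* Require a space after the keyword so that an identifier such as
       "localfoo" is not mistaken for it. */
    if (strncmp(header, keyword, n) == 0 && header[n] == ' ')
        return BODY_STATEMENTS;
    return BODY_DECLARATIONS;
}
```

A real parser would work on tokens rather than raw strings, but the decision is the same: one keyword check up front, then either a ParseStmt() loop or a ParseDecl() loop, with no guessing based on the body contents.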

I don't think adding this requirement would make the language less simple to use. In C, most macros can only be used either at function scope or at file scope anyway. But since the preprocessor doesn't care, it's up to the Parser to come up with good error messages.
