Reserved words: what are they good for? (Absolutely nothing?)

Random musing for the day: I was thinking about reserved words in programming languages and whether they’re really necessary at a lexical level. As you know, most programming languages define in their lexical grammar a set of words that cannot be used anywhere in the language except when explicitly specified in the grammar. For example, VB reserves the word “Object”. So you can’t just say:

    ' Error: Keyword is not valid as an identifier.
    Sub Object()
    End Sub

Many languages (such as VB) allow you to work around this by providing some sort of lexical escape that suppresses the reserved nature of the word. So you can say in VB:

    Sub [Object]()
    End Sub

Confusingly, many of the keywords we’ve been adding to VB lately aren’t reserved words at all, which reduces the need to modify people’s code when they upgrade. Instead, they’re contextual keywords; that is to say, they only act as keywords in certain syntactic contexts. For example, From is not a reserved word in VB’s lexical grammar, but if you start an expression with From and then follow it with an identifier, we say “Oh, yes, you’re starting a query…” For example:

    Dim From As Integer = 10
    ' OK: Unambiguously the local variable
    Dim x = From + 10
    ' OK: Unambiguously a query
    Dim y = From a In New Integer() {1, 2, 3, 4}
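
To make the “From followed by an identifier” rule concrete, here is a minimal sketch of that kind of one-token lookahead. It is not the VB compiler’s actual logic: the module name and both helper functions are hypothetical, and “identifier” is simplified to “starts with a letter or underscore.”

```vb
Imports System
Imports System.Collections.Generic

Module ContextualFromSketch
    ' Hypothetical helper: True when the token could serve as an identifier
    ' (simplified to "starts with a letter or an underscore").
    Function LooksLikeIdentifier(token As String) As Boolean
        Return token.Length > 0 AndAlso
               (Char.IsLetter(token(0)) OrElse token(0) = "_"c)
    End Function

    ' Hypothetical rule: "From" at the start of an expression begins a query
    ' only when the very next token is an identifier.
    Function StartsQuery(tokens As IList(Of String)) As Boolean
        If tokens.Count < 2 Then Return False
        If Not String.Equals(tokens(0), "From", StringComparison.OrdinalIgnoreCase) Then
            Return False
        End If
        Return LooksLikeIdentifier(tokens(1))
    End Function

    Sub Main()
        ' Dim x = From + 10: "From" is followed by an operator, so it is a variable.
        Console.WriteLine(StartsQuery(New String() {"From", "+", "10"}))
        ' Dim y = From a In ...: "From" is followed by an identifier, so it is a query.
        Console.WriteLine(StartsQuery(New String() {"From", "a", "In", "xs"}))
    End Sub
End Module
```

The first call prints False and the second prints True. The point is only that the decision needs context (the next token), which a purely lexical keyword table never has to look at.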

Which leads me to wonder: why bother with lexically reserved words at all? Why not just make all of your keywords contextual? When I started on VB, I guess I just accepted the practice since that’s what the language did before I showed up, but now I’m not so sure. Maybe there’s some blindingly obvious reason that I’m not seeing (probably there is). I can think of some historical reasons why keywords weren’t all contextual:

  1. Maybe it simplified writing a parser in “the old days,” or it simplified building a parser generator.
  2. Maybe it was because people were writing code in editors that didn’t have syntax coloring. “int int = 5; int = (int)int * (int)int / (int)int” looks pretty nonsensical if you don’t have nice coloring to tell you which are the keywords and which aren’t.
  3. Maybe there were grammatical problems with doing it? The previous example makes me wonder about whether C could handle it; I’m not an expert on how the C grammar handles the cast operator.
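
As a rough illustration of the first point, this is about all a lexer needs when every keyword is lexically reserved. It is a sketch under assumptions: the module name, the Classify function, and the (deliberately tiny) word list are made up for the example and are not VB’s real reserved-word list.

```vb
Imports System
Imports System.Collections.Generic

Module ReservedLexerSketch
    ' Hypothetical, deliberately incomplete list of reserved words.
    ' VB is case-insensitive, so the lookup ignores case.
    Private ReadOnly Reserved As New HashSet(Of String)(StringComparer.OrdinalIgnoreCase) From {
        "If", "Then", "Sub", "End", "Object"
    }

    ' With reserved words, classification is a plain table lookup:
    ' no lookahead and no knowledge of where the word appears.
    Function Classify(word As String) As String
        Return If(Reserved.Contains(word), "keyword", "identifier")
    End Function

    Sub Main()
        For Each word In New String() {"Sub", "Object", "From", "total"}
            Console.WriteLine(word & " -> " & Classify(word))
        Next
    End Sub
End Module
```

Here From comes back as an identifier, because it is not in the table; deciding that it sometimes starts a query has to happen later, in the parser.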

Anyway, it’s not extremely relevant at the moment (we’re not going to just start unreserving all the keywords in VB), but it’s something interesting to think about…

22 thoughts on “Reserved words: what are they good for? (Absolutely nothing?)”

  1. We need to keep keywords reserved because the parser in the human brain is a lot less state-aware than the parser in the VB language 🙂

  2. VB has always had reserved words. I imagine that it would be a major undertaking and require massive refactoring to convert all of the old reserved words into contextually reserved words. The same would hold true for other languages. The best thing is to leave the legacy reserved words alone and introduce new reserved words as contextually reserved.

  3. I think you’re on to something Paul.

    I enjoy all the expressiveness that you and the VB team added to VB 9:

    XML literals

    XML axis properties

    COMPLETE query expressions

    Smart query expressions (ordering by, defining a group, and selecting with multiple items automatically makes an anonymous type)

    I began programming with VB 7. I questioned whether I chose the correct language; now I know that I did.


  4. Speaking as a compiler developer, ease of parsing matters, both for the vendor’s compiler and for third-party tools. Most industry compilers use hand-written recursive descent. Things like error recovery and IntelliSense are a lot easier to implement using recursive descent, and recursive descent generally works a lot better when each construct has a uniquely identifiable FIRST token.

  5. PL/1 let you use keywords as variables, because, darnit, someone figured you ought to be able to parse it correctly from context. I don’t remember any of the syntax, but you could do things like "if if = if then then = then else then = else". Hilarity ensued.

  6. "Those who forget history are doomed to repeat it."

    Let’s go back in the wayback machine about fifty years, to one of the first popular programming languages: Fortran IV. It didn’t have reserved words. (It also didn’t have syntactically-meaningful whitespace, but that’s another matter.) Because of that choice, and desire for backwards compatibility, Fortran still doesn’t have reserved words, even though it’s changed even more than Basic has between DOS’s BASICA and today’s VB.

    So, yeah, that experiment’s been run. And the conclusion is that it was entirely possible to do even with 1950s parser technology and no whitespace, and it’s still a Bad Idea. Why? Because parsers are not just called on to parse correct programs. They also have to parse incorrect programs, and give reasonable error messages to the programmer. In Fortran, there are many cases where the lack of reserved words makes it much harder for the compiler to recognize what the programmer intended to be doing — is this IF really meant to be the keyword, or is it meant to be a variable, for example? And, in the worst cases, the incorrect program may still be completely valid, and just do unexpected weird things.

    Admittedly, some of the other choices made when Fortran was first defined — the implicit declaration of variables, and the thing with whitespace being entirely meaningless — made the problem a fair bit worse. (Luckily, both of those have been fixed in subsequent versions.) But, even without those, I think it’s pretty clear that the difficulty it adds to producing clear error messages and reliably diagnosing problems is not worth the minor gain in simplicity for the user.

  7. I tried to implement a feature like this, but I wanted to use a parser generator instead of building the parser myself (because the grammar was frequently undergoing massive changes). I ended up modifying the source of a parser generator because I couldn’t find a good one that would support a feature like this.

    I eventually decided to concentrate more on the semantics and let the syntax be normal and boring.

  8. Sorry, but why bother?

    The VB development team’s precious time would be better spent implementing cool new features (or missing ones like yield 😉) than changing a behavior which doesn’t annoy anyone…

    Why risk new bugs? For what gain?

    The brackets stuff is OK for me.

  9. The resulting language would not be context-free, and there’s a lot more knowledge and tooling available for parsing context-free languages.

    Also, as pointed out, you’d have to make sure there are no ambiguities, that is, cases where the context is not clear enough to tell whether we’re talking about a keyword or a variable, or at least decide which one to prefer in such cases.

  10. I know you said it was just a musing and you have no intention of doing it, but I would ask that you do it for Enum. It’s annoying having to write [Enum].Parse instead of Enum.Parse (not to mention that a lot of people don’t even know about [] escaping, or about Enum.Parse et al.). There may be a few others that could do with the same treatment that I don’t recall at the moment.

  11. "Quick example of resulting ambiguity in C:

    double int = 4.0;

    What’s the value of sizeof(int)?"

    C++ "solution" to this problem:

    sizeof(int) == sizeof(double), because int is a double.

    To get the size of a type, use sizeof(typename int)

  13. If you want a language that is easy to learn you should keep reserved words.

    And since you mentioned square brackets, how about using them as indexers? That would be a much clearer syntax, in my opinion, including:

    dim myArray as Integer[] ' good

    dim myArray[] as Integer ' bad

    Besides making the syntax more uniform, it would be more coherent, because the type ‘Integer’ should mean only the type ‘Integer’ and not an array. On the other hand, suffixing an identifier is more Perl than VB.


  14. To me this falls close to the mixed-case question in programming languages. It’s considered poor practice to distinguish variables by case alone. VB prevents this by being case-insensitive. When C# was released, the lack of IntelliSense support made case sensitivity something that invariably reduced productivity: you had invariably missed or misused case somewhere on the first compile.

    I think keywords are similar. They may not be required, but by screening out several names whose use would be considered poor style or worse, both the compiler and the developer are more productive in the long run. I don’t know that keyword reservations are required, and certainly in cases like ‘from’ I could see avoiding them (I expect there is a lot of email-related code that uses ‘from’ as a variable name).

    Thus I think the keyword reservation list should be looked on more as a productivity enhancement than a language requirement. For example, if tomorrow ‘If’ wasn’t a reserved word it wouldn’t break anything, but I’d still smack any developer using it as a variable…

  15. I think it’s safe to say that your team’s experience in writing parsers beats that of most other people. Yet even you obviously can’t get the handling of contextual keywords right. At least, the current IDE is pretty buggy that way (try using ‘where’ as an identifier in a LINQ expression).

    To me, this seems to be a pretty solid reason against contextual keywords: It makes writing a parser much, much harder. Off the top of my head, one of the main obstacles I see is the recovery after an error state, when the user has entered a syntactically wrong element. Until now, this didn’t pose such a big problem but if you really intend to make line endings more lenient then this could become more serious.

    However, all this is only true for control structures. I’ve never seen much sense in making builtin types and constants reserved words. In VB there is almost no difference between builtin types and own types anyway, so I would prefer displaying all data types identically. The same holds for constants such as ‘True’, ‘False’ and ‘Nothing’, however fundamental these might be.

  16. IMO, making all of the keywords contextual is a bit silly, and there’s no real reason for it. I’d go for contextualizing more keywords (especially the long ones, since they are more likely to be used as identifiers), but not going over the top with it. Shortening the reserved word list is one thing (and a good thing, IMO), but eliminating it altogether will just introduce a new possibility for obfuscation and nothing more.

    I can imagine calling a variable "WriteOnly", but naming it "If"?

  18. If it compiles my code, it sounds interesting, Paul. Compiler theory would surely move forward with a problem like that solved. Even if the result is not an easy language, or is not practical, the experience could be put to good use in other contexts.

    Computer science moves forward in many fields, and new improvements to the lexer and parser stages must be significant and valuable. Or do you think compiler design will still rely on the same papers and methods in 2070?

    Disagreeing with your fellows and putting forward new ideas is the healthiest thing you can do in your life 😉



  19. Hello,

    I need your assistance with my research. My topic is "list any reserved words in BASIC and their meaning".
