Stephen Cross

Software Engineer / Computer Scientist

Loci - Update

So I’ve been working on Loci for the last couple of months, and this post should serve as a useful update. I’m going to go through the aspects of other languages that influenced Loci, and I’ll explain the details of Loci itself in subsequent articles. I’ll also be providing an update on OpenP2P in a later post, along with releasing its first version.

I’ve made significant progress in taking my ideas/specs for Loci and turning them into a workable compiler. The ultimate target for this project is to build a fully functional self-hosting (i.e. written in Loci itself) compiler. I went through the process as follows:

  1. I built a simple interpreter for Loci in which the types given were basically ignored (so code was executed as if Loci were a dynamically typed language). This was originally going to be the basis for developing the self-hosting compiler. However, once I was fairly well into it, I discovered that writing an effective runtime library would be a slow process (most of the code would be boilerplate if statements that check which method is being called and validate the parameters), and that building a compiler would make things easier.
  2. I wrote a very simple compiler that converted Loci code directly to C++ code without doing any analysis, so that any errors would only appear when the C++ compiler processed the resulting code (a great way to obfuscate errors). This was again not sufficient, because it couldn’t correctly support interfaces and the more advanced features of the language. It consumed only a few hours of my time, so it wasn’t a big loss.
  3. I took on the task of writing a compiler that performs semantic analysis itself and generates the resulting code in C, so I can just tell gcc to finish the work off for me. I used the interpreter I had written to save some time, and two weeks later the compiler is complete for the more basic features of Loci. I’ll now be spending some time developing a fairly basic runtime, which should be as easy as writing a few C++ classes and getting them included with each compile.
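
To give a feel for the boilerplate mentioned in step 1, here is a minimal C++ sketch of what method dispatch can look like in a runtime for a dynamically typed interpreter. It is not taken from the Loci interpreter: the Value type, the callMethod function and the "add" and "length" methods are all hypothetical names chosen for illustration.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical runtime value: every value carries a tag that has to be
// inspected at run time, because the declared types are ignored.
struct Value {
    enum class Kind { Int, String };
    Kind kind = Kind::Int;
    long intValue = 0;
    std::string stringValue;

    static Value makeInt(long v) {
        Value r;
        r.kind = Kind::Int;
        r.intValue = v;
        return r;
    }

    static Value makeString(const std::string& s) {
        Value r;
        r.kind = Kind::String;
        r.stringValue = s;
        return r;
    }
};

// Hypothetical dispatcher: each supported method needs its own chain of
// checks on the receiver kind, the method name and the arguments.
Value callMethod(const Value& self, const std::string& method,
                 const std::vector<Value>& args) {
    if (self.kind == Value::Kind::Int) {
        if (method == "add") {
            if (args.size() != 1 || args[0].kind != Value::Kind::Int) {
                throw std::runtime_error("Int.add expects one Int argument");
            }
            return Value::makeInt(self.intValue + args[0].intValue);
        }
    } else if (self.kind == Value::Kind::String) {
        if (method == "length") {
            if (!args.empty()) {
                throw std::runtime_error("String.length expects no arguments");
            }
            return Value::makeInt(static_cast<long>(self.stringValue.size()));
        }
    }
    throw std::runtime_error("unknown method: " + method);
}
```

Calling callMethod(Value::makeInt(2), "add", {Value::makeInt(3)}) yields an Int value holding 5, while an unsupported method name throws std::runtime_error. Every new built-in method adds another branch of the same shape, which is the repetition a compiler with real type checking avoids.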

I also plan to make a parser generator for Loci, which will effectively be a wrapper around the LALR Lemon parser generator (I see no reason to write a completely new parser generator when a well-tested C generator already exists).

Before Loci

For now I’ve written the compiler in C++, because that is the language I’m most familiar with and it provides a good set of tools for compiler development. While writing the compiler, OpenP2P and many other projects that may or may not be on this website, I’ve hit both the great and the painful aspects of C++ on multiple occasions. I’ve also discovered many new languages in the last few years, each with its own set of good and bad features. The purpose of Loci is to take all the good parts and fit them into a simple, coherent model. I’m sorry if this list seems obvious, or if you strongly disagree with any of what I’m about to say.

  • Having to provide a header and a source file for each class is very annoying – C++ compilers require that types are already known when you use them, and a header file is therefore required to resolve this chicken-and-egg problem. So, if two classes rely on each other, at least one must have a separate class declaration and implementation. Java (along with many, many other languages) uses multiple passes, so as long as the class definition appears somewhere, it’s OK.
  • Static typing can be your friend – On the whole, I find it more useful to discover errors at compile time than at run time, even if it does require a bit of extra typing. Furthermore, types help to describe variables. However, the compiler should do some basic type inference to avoid repetition in statements such as List list = new List();.
  • Manual memory management is annoying and error-prone – While it does have the advantage of generally using less memory and potentially being faster, it consumes a very valuable resource: the time of the developer. Trying to work out who should deallocate what memory is an irritating process and integrating this with multiple threads in OpenP2P has been a frustrating task.
  • However, RAII is great – The ability to deterministically reclaim resources other than memory is invaluable, and RAII gives you a simple mechanism that stops you from forgetting to call the destructor/finalizer yourself. This is particularly useful when exceptions are thrown. The try…finally clause in Java isn’t at all great for this, while the using statement in C# is a fair improvement. However, C++ still has by far the best solution in this arena.
  • Concurrency is harder than it needs to be – Programming with threads can be very difficult, especially when you have to manage shared access to a particular resource. However, a more fundamental solution, such as the message passing between isolated processes found in Erlang, makes it much easier. Considering the current trend towards increasing concurrency, this problem needs to be solved.
  • Immutable data is useful – When dealing with value types such as numbers, strings, vectors etc. and basic data structures, it’s much easier to make them immutable (the numerous advantages can be found by a simple web search). However, most developers find it easier if a language allows side effects and mutable data as well (although every computer scientist should know and appreciate pure functional languages).
  • Objects can be good – While there seems to be much discussion over whether OOP is a good thing, I have personally found that it improves the quality of my code. However, each language implements OOP in its own way, from interface-only inheritance to single inheritance to multiple inheritance. I strongly favour composition over inheritance and interfaces over abstract classes. I also support the use of structural typing (used in Google’s Go language), so you don’t have to declare in advance which interfaces a type implements. I think static methods and static class data are a bad idea, and I favour allowing programmers to define functions outside classes instead (rather than Java’s anti-function stance, which led to static methods). Finally, I believe access specifiers add unnecessary complexity, and I therefore support making all object methods public and all object data private.
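
As a concrete illustration of the RAII point above, here is a minimal C++ sketch showing that destructors run deterministically, in reverse order of construction, even when an exception is thrown. The ScopedResource class and the runDemo function are hypothetical names made up for this example, and the "resources" are just entries in a log so the order of events can be observed.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource wrapper: acquisition happens in the constructor
// and release happens in the destructor, so cleanup cannot be forgotten.
class ScopedResource {
public:
    ScopedResource(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {
        log_.push_back("acquire " + name_);
    }

    ~ScopedResource() { log_.push_back("release " + name_); }

    // Copying would release the same resource twice, so it is disabled.
    ScopedResource(const ScopedResource&) = delete;
    ScopedResource& operator=(const ScopedResource&) = delete;

private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Acquires two resources, then fails. Stack unwinding destroys both
// objects before the catch block runs, with no explicit cleanup code.
std::vector<std::string> runDemo() {
    std::vector<std::string> log;
    try {
        ScopedResource file("file", log);
        ScopedResource lock("lock", log);
        throw std::runtime_error("something failed");
    } catch (const std::runtime_error& e) {
        log.push_back(std::string("caught ") + e.what());
    }
    return log;
}
```

Calling runDemo() returns the events "acquire file", "acquire lock", "release lock", "release file" and "caught something failed", in that order. The equivalent Java code would need a finally block (or nested ones) to guarantee the same releases.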

I’ll probably be adding more points to this article as I think of them. As I mentioned at the start, I’ll be explaining how Loci works in later posts and how I think it solves existing problems.

Sep 2, 02:36 PM