(This question is a copy of a question asked elsewhere. I am most curious about what people on this "machine learning" community website think.)

In the past week I've been following a discussion where Ross Ihaka wrote:

I’ve been worried for some time that R isn’t going to provide the base that we’re going to need for statistical computation in the future. (It may well be that the future is already upon us.) There are certainly efficiency problems (speed and memory use), but there are more fundamental issues too. Some of these were inherited from S and some are peculiar to R.

He then continued explaining. This discussion started from Xi'an's Og, and was then followed by comments at reddit, statalgo, DecisionStats, columbia.edu, Hacker News, r-help mailing list, and maybe other places.

As someone who isn't a computer scientist, I am trying to understand what to make of this.

  • Is R so flawed that it is better to rewrite it than to fix it? Searching on stackoverflow, I came across When to rewrite a code base from scratch and Under what situation should code be rewritten from scratch? (based on Joel's article Things You Should Never Do). Both threads argue that a very(!) extreme case is needed to justify a rewrite of the code. But is this the case with R?
  • Can R be patched in a way that fixes these problems, so that it becomes "the stat language of the future"?
  • What about the social aspect of this? R already has a large user base. If R were to "die", is it possible to imagine all the users willing to move to a new language?

I think this question is not subjective, but since it has so many uncertainties, I decided to mark it as a community wiki.

Lastly, I just published my response to the topic in a post titled "Open source and money – why R developers shouldn’t be paid" (in case any of you were wondering).

This question is marked "community wiki".

asked Sep 16 '10 at 18:17

Tal Galili

(posting as a comment since I'm not really that familiar with R, having never used it for a major project)

It seems to me that R's situation is similar to that of emacs: it has a lot of quirks, it's not as fast as it could be, and it has odd language features (including dynamic scoping), but it also has man-decades of accumulated knowledge, code, and experience baked in. Maybe some things can be fixed in an unconventional way (such as porting R to the JVM, which could make it easier to optimize and easier to use R's great libraries from other, more "modern" languages and environments), and other things can be dealt with by convention (naming conventions go a long way towards minimizing scoping issues, as most emacs code demonstrates).

(Sep 16 '10 at 19:14) Alexandre Passos ♦

2 Answers:

Disclaimer: I don't use R and I don't know about the original debate. I'll talk about Python.

What are the "more fundamental issues"? I will respond to the question of speed.

There is a difference between Python the language and Python the interpreter that runs Python code. The question is whether a) the language is inherently slow, or b) the current interpreter is what's slow.

If the language is inherently slow, a complete overhaul is necessary: a new language with new syntax and semantics. That means your existing R code would become obsolete. I'm not sure when this would be the case. Perhaps if the language is too low-level to be optimized well.

If it's just the interpreter, writing another interpreter is a viable option. Consider that Python has the official interpreter, as well as alternative interpreters that are being developed. A new interpreter means that existing R code will keep working.
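
To make the language/interpreter distinction concrete, here is a minimal Python sketch. The function names are my own, made up for illustration. The same source file is "Python the language"; which interpreter runs it is a separate fact that the standard library's platform module can report.

```python
import platform


def describe_interpreter():
    """Return (implementation name, language version) of the running interpreter."""
    # platform.python_implementation() reports which interpreter is executing
    # this code, for example 'CPython' or 'PyPy'.
    return platform.python_implementation(), platform.python_version()


def sum_of_squares(n):
    # Ordinary Python-the-language code: nothing here depends on the
    # interpreter, so any conforming implementation gives the same result.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    impl, version = describe_interpreter()
    print(f"Running on {impl}, implementing Python {version}")
    print(sum_of_squares(10))  # 285 on any conforming interpreter
```

Running this file under two different interpreters should change only the first printed line. How fast the second line is computed is a property of the interpreter, not of the language.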

One could also try to find a middle ground by deprecating only the slow features of the R language. Consider RPython, a restricted subset of Python that is far easier to compile and optimize than the full language.

answered Sep 16 '10 at 20:36

Joseph Turian ♦♦

edited Sep 16 '10 at 20:45

Yes. Yes. Yes.

This answer is marked "community wiki".

answered Sep 16 '10 at 22:40

dogy

edited Sep 16 '10 at 23:05

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.