|
(This question is a copy of a question asked elsewhere. I am most curious as to the thoughts of people in this "machine learning" community website) In the past week I've been following a discussion where Ross Ihaka wrote:
He then continued explaining. This discussion started from Xi'an's Og, and was then followed by comments at reddit, statalgo, DecisionStats, columbia.edu, Hacker News, r-help mailing list, and maybe other places. As someone who isn't a computer scientist, I am trying to understand what to make of this.
I think this question is not subjective, but since it involves so many uncertainties, I decided to mark it as a community wiki. Lastly, I just published my response to the topic in a post titled "Open source and money – why R developers shouldn’t be paid" (in case any of you were wondering)
|
|
Disclaimer: I don't use R and I don't know about the original debate, so I'll talk about Python. What are the "more fundamental issues"? I will respond to the question of speed. There is a difference between Python the language and python the interpreter that runs Python code. The question is whether a) the language itself is inherently slow, or b) the current interpreter is what's slow. If the language is inherently slow, a complete overhaul is necessary: a new language with new syntax and semantics. That would make your existing R code obsolete. I'm not sure when this would be the case; perhaps when the language's semantics make it too hard to optimize well. If it's just the interpreter, writing another interpreter is a viable option. Consider that Python has the official interpreter as well as alternative interpreters that are being developed. A new interpreter means that existing R code will still work. One could also look for a middle ground by deprecating only the slow features of the R language. Consider RPython, a restricted subset of Python for which it is far easier to write an interpreter. |
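To make the language-versus-interpreter distinction concrete, here is a minimal sketch (my own illustration, not something from the original debate). The function name `sum_of_squares` and the workload size are arbitrary assumptions. The same source file can be run unchanged under different Python interpreters, for example CPython or PyPy, and only the timing should differ.

```python
import platform
import timeit


def sum_of_squares(n):
    """Plain-Python loop; its meaning is fixed by the language, not the interpreter."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def describe_run(n=100_000, repeats=10):
    """Time the loop and report which interpreter executed it (hypothetical helper)."""
    implementation = platform.python_implementation()  # e.g. "CPython" or "PyPy"
    elapsed = timeit.timeit(lambda: sum_of_squares(n), number=repeats)
    return f"{implementation}: {elapsed:.3f}s for {repeats} runs of n={n}"


if __name__ == "__main__":
    print(describe_run())
```

If the result of `sum_of_squares` is the same under every interpreter but the reported time varies widely, that points at option b) above: the interpreter, rather than the language, is the bottleneck for that piece of code.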
|
Yes. Yes. Yes
This answer is marked "community wiki".
|
(posting as a comment since I'm not really that familiar with R, having never used it for a major project)
It seems to me that R's situation is similar to that of emacs: it's got a lot of quirks, it's not as fast as it could be, it has odd language features (including dynamic scoping), etc., but it also has man-decades of accumulated knowledge, code, and experience baked in. Maybe some things can be fixed in an unconventional way (such as a JVM port of R, which could be optimized more easily and would make it easy to use R's great libraries from other, more "modern" languages and environments), and some other things can simply be worked around (naming conventions go a long way towards minimizing the scope of scoping issues, as demonstrated by most emacs code).