Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Java plugins. Experiences, pros and cons, when to use them?
From: Phil Schumm <[email protected]>
To: Statalist Statalist <[email protected]>
Date: Thu, 20 Feb 2014 13:43:29 -0600
On Feb 20, 2014, at 10:08 AM, Christophe Kolodziejczyk <[email protected]> wrote:
> I would like to have some feedback from anyone on the list who could share his or her experience with Java plugins. It is a new feature of Stata 13 and I haven't tried it yet. But I was wondering in which cases they might be useful and whether there is an advantage in terms of execution time compared to mata. I don't really see why Stata Corp has actually introduced them, but I would be happy to learn more about them :-).
Perhaps the biggest reason to use Java is to make use of the many available libraries. These days, much of programming is knowing the libraries available for the language you're working in, and using these effectively instead of attempting to recreate things yourself. Recreating functionality that already exists in a standard library is not only inefficient; what you come up with is also unlikely to be as good (since the libraries have evolved over time, and are typically maintained and contributed to by multiple professional programmers).
For example, last Spring I started a Stata application (written almost entirely in Mata) that needed to be able to read/write arbitrary files in both XML and JSON formats. Writing your own XML or JSON parser is exactly the type of thing you never want to do if you can avoid it, since doing it well is hard, and there are excellent, highly-tested libraries out there for most programming languages. But there I was, and not knowing that in ~3 months Stata 13 was going to be released with Java plugin capability, I bit the bullet. The job was made more doable by the fact that for this particular application, I only needed to support part of each spec. However, if I had known that Stata 13 was going to introduce Java plugins, I would definitely have waited.
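To give a sense of how little code is involved when a library does the parsing, here is a minimal sketch that reads values out of an XML string using only the parser that ships with the JDK (javax.xml.parsers). The class name XmlValues, the helper extract, and the sample document are hypothetical and chosen purely for illustration; this is not the application described above, and the wiring needed to call such a class from Stata as a plugin is deliberately left out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XmlValues {

    // Hypothetical helper: returns the text content of every element
    // named `tag`, in document order. Parsing is delegated entirely to
    // the JDK's built-in DOM parser.
    public static List<String> extract(String xml, String tag) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            NodeList nodes = doc.getElementsByTagName(tag);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            // Malformed XML and parser configuration errors end up here.
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document, for illustration only.
        String xml = "<obs><income>41000</income><income>52500</income></obs>";
        System.out.println(extract(xml, "income")); // prints [41000, 52500]
    }
}
```

The point is not this particular helper but that escaping, entities, encodings and malformed input are all handled by code that has been tested far more heavily than anything written for a single project.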
Understand that in introducing Java plugins, Stata took a large step toward eliminating what had traditionally been an advantage of R over Stata. The fact that it is possible to link to code written in other popular languages (e.g., Python) from within R (and vice versa) makes possible projects like R/GRASS, which combines the statistical functionality of R with the GIS functionality of GRASS to facilitate geostatistical and spatial data analysis. By making it possible to extend Stata with Java code, Stata has made a much larger pool of programmers and libraries available for developing new areas of application and new functionality. Ultimately, all Stata users will benefit (even if they don't write Java code themselves).
Now, as to the decision about whether to write a particular module in Mata or in Java (assuming it does not already exist in an available Java library), this is a decision that has both objective and subjective/personal elements. Personally, my Java programming skills are not that strong, and so I would tend to lean toward Mata unless there were something that was obviously easier to do in Java (if Stata had chosen to add Python plugins instead of Java, my decision might be different). It is of course possible that certain tasks may be faster in one language than the other, and I'll be interested to read about such performance comparisons as they emerge. However, it is often the case that poor program design/implementation has greater implications for performance than properties of the languages themselves (assuming you are comparing among byte-compiled languages), and since the work I do generally doesn't require optimizing for performance, my decision about which language to use will typically depend primarily on which is easier for a specific task. Of course, if you are working on a performance-critical application, this calculation would be different.
-- Phil