Has anyone used Bradford's Crane library to launch elastic map-reduce jobs from S3?

I've figured out most of what is needed to launch a cluster, but it is unclear to me what the natural process is for getting input and output data to and from S3. Is that handled natively by the Hadoop infrastructure, or do I need to write support for file-push myself?

Basically, what I need is a simple example of how to use the Crane library to replicate what I'm already doing with the default hadoop-ec2 and a clojure-hadoop jar. I'm interested in the additional ability to leave the cluster running so I can run Cascalog or other interactive operations on the output data, as well as to customize how I use Clojure + Hadoop on EC2 over time.

asked Sep 26 '10 at 19:49

Ian Eslick
edited Sep 27 '10 at 14:46

Joseph Turian ♦♦

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.