Author Archives: Joseph Turian

Joseph Turian has been working on artificial intelligence research since 1996. His focus is on using sophisticated machine learning techniques to approach large-scale problems in natural language. He is currently a post-doctoral research fellow at the Université de Montréal, studying deep learning methods with Professor Yoshua Bengio, Canada Research Chair in Statistical Learning Algorithms. Dr. Turian defended his dissertation, “Constituent Parsing by Classification” at New York University. He received his AB from Harvard University in Computer Science (cum laude).

Discussion 2.0: Personalization

[The following post is my submission to the Knight-Mozilla “Beyond Comment Threads” challenge.]
The following are the core problems with current discussion systems:

Trolls, acrimonious people, and low-quality commentary can drown out thoughtful discussion and destroy a good community.
Bias towards seniority: Deep insight is penalized if it comes from a new, unknown, or anonymous voice. For example, on

Fat Free CRM in five minutes on a fresh Amazon EC2 micro instance

Would you like to get Fat Free CRM up-and-running, but spend only five minutes on deployment?
I am not a Rails hacker, so getting Fat Free CRM installed and running is non-trivial for me.
fatfreecrm-ec2 will automatically deploy Fat Free CRM on a fresh Amazon EC2 micro instance. I have also tested it on a fresh Ubuntu Linode slice.
Caveat: The five minutes will

NLP Challenge: Find semantically related terms over a large vocabulary (>1M)?

Summary
In the spirit of shared tasks and NLP “bake offs”, I hereby announce the first MetaOptimize Challenge. It’s an open problem, and I am interested in involving practitioners who want to demo their style, as well as people who want to learn some large-scale IR/NLP. Hopefully, we’ll all learn something about various real-world approaches.
Join the announcement list

Information Organization: A case study in music recommendations

I introduce “information organization”, an approach which I have been exploring for several years. As a case study, music recommendations should be organized, but existing applications organize them poorly. I discuss the issues with current applications and the features that would address them.

Free consultation on data strategy (NLP, ML, business intelligence, etc.)

Summary
Email me your pitch and how you need help monetizing data.
If I like your pitch, I’ll give you a free consultation on data strategy (NLP, ML, business intelligence, etc.)
Afterwards, if we both think that I can add value to your business, we can talk about a longer-term relationship.
You should forward this blog post to any friend who could use

KEA Keyphrase Extraction as an XML-RPC service (code release)

Summary
We release code written by Ali Afshar, which turns the KEA keyphrase extractor into an XML-RPC service. This allows you to use KEA as a service, calling it from a variety of different programming languages. The code is released under the New BSD License.
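
To illustrate the general idea of calling a keyphrase extractor over XML-RPC, here is a minimal sketch using only the Python standard library. It is not the released code and does not use KEA: the method name `extract_keyphrases`, its `(text, limit)` signature, and the word-frequency stand-in are all assumptions made for illustration. The real service's method names and arguments may differ.

```python
import threading
from collections import Counter
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def extract_keyphrases(text, limit):
    """Hypothetical stand-in for KEA: most frequent words longer than three letters."""
    words = [w.strip(".,;:!?()").lower() for w in text.split()]
    counts = Counter(w for w in words if len(w) > 3)
    return [word for word, _ in counts.most_common(limit)]


def start_server():
    """Serve extract_keyphrases over XML-RPC on a free local port."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    # The exposed method name is an assumption, not the released service's API.
    server.register_function(extract_keyphrases, "extract_keyphrases")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_service(text, limit=3):
    """Start the local stand-in service, call it as a client, then shut it down."""
    server = start_server()
    host, port = server.server_address
    try:
        with ServerProxy(f"http://{host}:{port}/") as proxy:
            return proxy.extract_keyphrases(text, limit)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(call_service("Keyphrase extraction finds each keyphrase in a document."))
```

Because XML-RPC is language-neutral, the client half (the `ServerProxy` call) could be replaced by an XML-RPC client in any other language without changing the server.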

Background
Keyphrase extraction (AKA terminology mining, term extraction, term recognition, or glossary extraction) is the

PyLucene 3.0 in 60 seconds — Tutorial sample code for the 3.0 API

Until there is better documentation for Lucene 3.0, I recommend you use Lucene 2.4 or 2.9. Nonetheless, I provide basic indexing and retrieval code using the PyLucene 3.0 API, perhaps the first such example code on the web.

Perhaps job hopping is a good thing?

Summary
I speculate that job hopping, if it becomes a widespread phenomenon, might actually lead to improved business efficiency. In this way, the “Gen Y” job hopping phenomenon could ultimately prove beneficial.

Background

Mark Suster begins the debate by writing: “[Job Hoppers] Make Terrible Employees”.
Paul Dix responds that job hopping is not correlated with employee quality and there are

Code maintainability, and the joy of outsourcing

Summary
According to common wisdom, the best code is developed in-house. I am beginning to believe this is only true when the code must be tightly coupled, or there are realistic security concerns. These scenarios are less common than managers like to believe.
For run-of-the-mill development projects, outsourcing might have advantages above-and-beyond cost savings. If your code effort

Lean Startup, and The Stooges

Okay, I’m ready.
After reading a handful of articles making tenuous connections between entrepreneurship and music, including:

The Notorious CEO: Ten Startup Commandments from Biggie Smalls
Being like The Sex Pistols can help your startup?

I’ve decided to come out and share my favorite startup music.
Dirt, by The Stooges, is a proto-punk cut that sprawls for seven minutes, brooding and smoldering. It