Tag Archives: Python

PyLucene 3.0 in 60 seconds — Tutorial sample code for the 3.0 API

Until there is better documentation for Lucene 3.0, I recommend you use Lucene 2.4 or 2.9. Nonetheless, I provide basic indexing and retrieval code using the PyLucene 3.0 API, perhaps the first such example code on the web.

Why can’t you pickle generators in Python? A pattern for saving training state

Summary

A pattern for persisting generators is to turn them into pickle-able class objects. This is useful when you use generators for streaming training examples.
I would also try generator_tools, which might be a more convenient alternative to the pattern I describe. I haven’t used it yet.
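
As a rough illustration of the pattern, here is a minimal sketch in Python 3 (the original post predates it). The class name `TrainingExampleStream`, its `epochs` parameter, and the helper `example_generator` are hypothetical names made up for this example, not code from the post. The idea is to keep the iteration position in ordinary instance attributes, which pickle can save, instead of in a generator's frame, which it cannot.

```python
import pickle


def example_generator(examples):
    """Plain generator: convenient, but its paused frame cannot be pickled."""
    for example in examples:
        yield example


class TrainingExampleStream:
    """Hypothetical iterator that keeps its position in picklable attributes."""

    def __init__(self, examples, epochs=1):
        self.examples = list(examples)
        self.epochs = epochs
        self.epoch = 0   # how many full passes have been completed
        self.index = 0   # position within the current pass

    def __iter__(self):
        return self

    def __next__(self):
        # Roll over to the next pass once the current one is exhausted.
        if self.index >= len(self.examples):
            self.epoch += 1
            self.index = 0
        if self.epoch >= self.epochs or not self.examples:
            raise StopIteration
        example = self.examples[self.index]
        self.index += 1
        return example


# A generator object is rejected by pickle.
try:
    pickle.dumps(example_generator(["a", "b", "c"]))
    generator_pickled = True
except (TypeError, pickle.PicklingError):
    generator_pickled = False

# The class-based stream can be saved partway through and resumed later.
stream = TrainingExampleStream(["a", "b", "c"], epochs=2)
first = [next(stream) for _ in range(4)]

saved = pickle.dumps(stream)      # checkpoint the training state
restored = pickle.loads(saved)    # e.g. after a restart
rest = list(restored)
```

After the round trip, `restored` continues from the fifth example rather than starting over. The trade-off is that the loop state a generator tracks for you has to be written out by hand as attributes.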

Generators for streaming training examples
For machine learning, Python generators are a simple idiom that makes it

Fast deserialization in Python

All standard YMMV disclaimers apply.
Update (20090324−2): According to John Millikin, the author of jsonlib, cjson is buggy and unmaintained. I will evaluate further and post a follow-up blog entry. My discussion with Dan Pascu, the author of cjson, corroborates these claims. I urge readers to read John Millikin’s comment.
Summary:
For quickly deserializing data in Python, use cjson.
simplejson is mysteriously