Summary
I propose a default “Constitution for Governance of Open-Source Projects”.
Background
I recently got involved in the OSQA project, which is a fork of CNPROG, which in turn is a clone of the StackExchange Q&A forum software.
Note that the OSQA project has no formal “homepage”, or instructions on how to get involved. I only discovered by chance that there is a mailing-list …
Summary
A pattern for persisting generators is to turn them into pickle-able class objects. This is useful when you use generators for streaming training examples.
I would also try generator_tools, which might be a more convenient alternative to the pattern I describe. I haven’t used it yet.
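The pattern can be sketched roughly as follows. This is a minimal illustration, not the code from the full post: the class name ExampleStream and its attributes are hypothetical, and it assumes the examples fit in a list. A plain generator function cannot be pickled in Python 3, so the iteration state is kept in ordinary instance attributes instead.

```python
import pickle


class ExampleStream:
    """Iterator over training examples whose state can be pickled mid-stream.

    Hypothetical stand-in for a generator function: everything needed to
    resume iteration is stored in plain attributes.
    """

    def __init__(self, lines):
        self.lines = list(lines)
        self.position = 0  # index of the next example to yield

    def __iter__(self):
        return self

    def __next__(self):
        if self.position >= len(self.lines):
            raise StopIteration
        item = (self.position, self.lines[self.position])
        self.position += 1
        return item


stream = ExampleStream(["a", "b", "c", "d"])
first = next(stream)
second = next(stream)

# Snapshot the partially consumed stream, then restore it (e.g. after a restart).
restored = pickle.loads(pickle.dumps(stream))
remaining = list(restored)
```

The restored object continues from the third example rather than starting over, which is the point of persisting the stream during a long training run.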
Generators for streaming training examples
For machine learning, Python generators are a simple idiom that makes it …
Summary:
If you have text data (like a web scrape) stored in a MySQL database, and you want to share the data, dump it to XML using mysqldump with the --xml flag.
When fields are unlikely to contain tabs, an even simpler format is a tab-separated file, created using the --tab=path flag to mysqldump. path must be owned by the MySQL database user.
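As a rough sketch of the two invocations, the helper below only builds the mysqldump argument list and does not run it. The helper name mysqldump_command, the database name scrape_db, the table name pages, and the directory /tmp/dump are all placeholders assumed for illustration; credentials and other options are omitted.

```python
import shlex


def mysqldump_command(database, table, xml=False, tab_dir=None):
    """Build an argv list for mysqldump (hypothetical helper, placeholder names)."""
    argv = ["mysqldump"]
    if xml:
        # XML output is written to stdout; redirect it to a file when running.
        argv.append("--xml")
    if tab_dir is not None:
        # Tab-separated output is written as files inside tab_dir.
        argv.append("--tab=" + tab_dir)
    argv += [database, table]
    return argv


xml_cmd = mysqldump_command("scrape_db", "pages", xml=True)
tab_cmd = mysqldump_command("scrape_db", "pages", tab_dir="/tmp/dump")

# Print shell-ready command lines for inspection.
print(shlex.join(xml_cmd))
print(shlex.join(tab_cmd))
```

Either list could be passed to subprocess.run once the real database name, table, and credentials are filled in.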
The Problem …
A script for automatically sorting graph curves, e.g. for gnuplot.
All standard YMMV disclaimers apply.
Update (20090324−2): According to John Millikin, the author of jsonlib, cjson is buggy and unmaintained. I will evaluate further and post a followup blog entry. My discussion with Dan Pascu, the author of cjson, corroborates these claims. I urge readers to read John Millikin’s comment.
Summary:
For quickly deserializing data in Python, use cjson.
simplejson is mysteriously …