JOSHUA HOLBROOK flickr : github : twitter

SciPy2010, Day 4 (posted 01 Jul 2010)


ONCE AGAIN I ACCIDENTALLY SLEPT IN

So I missed a few talks this morning. As an aside, I’m the only one here that indulges in energy drinks.

…but I didn’t miss the 0MQ talk!

Another excellent talk by Brian Granger. The hilights:

  • Py0MQ is BLAZING FAST. XMLRPC and JSONRPC are orders of magnitude slower in terms of message throughput and latency.
  • 0MQ is FAIRLY SIMPLE. Brian likes to say it has an “afternoon-sized” API.
  • Py0MQ doesn’t block the GIL. For scientific applications, this is a BIG DEAL.

Also, here’s a dead simple example showing it off:

import zmq

c = zmq.Context()
s = c.socket(zmq.REP)
s.bind(‘tcp://127.0.0.1:10001’)
s.bind(‘inproc://myendpoint’)

While True:
msg = s.recv()
s.send(msg)

I think this really shows off what 0MQ does well, which (in my unrefined opinion) distills message passing to a fundamental core of awesome and then wraps it up in some sexy, sexy abstractions.

The most powerful part of the presentation, though, was the live demo. Fernando Perez wrote a simple 1.5-page example, and Brian had one server broadcasting messages to three different clients, all really really quickly. Hearing “fast and simple” was great, but actually seeing it like that was really powerful.

For you node.js hackers out there: Do you think zeromq.js would be useful?

Numerical Pyromaniacs: The use of Python in Fire Research.

‘nuff said. Well okay, not really:

  • http://iscaliforniaonfire.com/

  • Plotting data on top of video. While I think some might find it too busy, the idea of pairing quantitative, derived data along with video of What You Would Actually See. I want to try this now.

  • BlenderFDS is a pretty niche tool, but the idea of being able to export blender models for the purpose of engineering and analysis is PRETTY RAD

ForecastWatch

An interesting quote (paraphrased!):

“I would say python is more object-oriented than java, right up there with Smalltalk…Java is class-oriented, but, I mean, is an int in Java a class? No.”

*rubs chin*

Something more useful to you and me is Forecast Advisor, since it tells you who to pay attention to regarding weather forecasts.

Something interesting you can see: Check out the accuracies of

Fairbanks vs. Austin

Some locations are easier to predict, weather-wise, than others. Perhaps interesting is that Fairbanks seems to be roughly 5% less predicatable than Austin. I wonder what accuracy data would look like if plotted on a map? Might be a good excuse to learn BeautifulSoup

Something else that may be interesting is Suds, a fairly slick-looking SOAP library. My impression of What’s Cool is not using SOAP yourself, but other people do sometimes, and Suds makes it relatively easy.

Madagascar

I once took an interest in literate programming, but I never actually did it. The idea of being able to mash discussion/explanation and software implementation is definitely compelling—however, literate programming works on the assumption that the code itself is what people are truly interested in, and not merely the results. In the case of scientists running numerical simulations, this second case is usually what we’re dealing with.

Madagascar tries to do this mashing without incorporating the particular philosophy of literate programming. I found that the project seems a little light on the way of tutorials (I’m a sucker for those “hello world!” tutorials that certain projects are into), and I don’t know how I feel about scons (certain other people hate on most “modern” build systems, except maybe waf), but I think the idea is pretty sweet. I think I’ll have to try it out when I get home.

Statsmodels

The presenter, Skipper Seabold, is a GSOC participant.

The thing that popped out for me is SimpleTable, which is a component that outputs tabular data as multiple formats including LaTeX. This sounds pretty awesome. I’d like to see it mashed up with, perhaps, Elaine’s tabular if it doesn’t already have this functionality. Skipper signed up for the interpolation sprint, and I can kinda see why. I wasn’t particularly optimistic about this sprint, but after seeing that there has been some work out there on it, I’m starting to think that trying my hand at this tomorrow could be pretty rad!

Aside from this, I’m mostly learning just how unfamiliar I am with stats. Sad! If I stay in school until next Spring, I’m totally taking stats.

pandas

Pandas gives you some labeled datatypes, similar to R’s dataframes.

That settles it. There is definitely demand for some kind of awesome tabular datatype. In fact, there is a “Birds of a Feather” meeting on it tonight that I’m going to attend. Shit could be sweet. We’ll see.

A Digression

SciPy is a total sausagefest. I’m really not here for the ladies since I totally have a sweetie back home, but even I’m gonna comment when I notice that there are maybe 4 women in a group of roughly 150 men.

josh@pidgey posts]$ python -c’print 4./150.’
0.0266666666667

awesome. ;)

Hubble Space Telescope exposure time calculator

Calculating exposure time is important, because use of the HST is so expensive that astronomers really want to make sure that their shot’s gonna turn out. The original ETC was written in java and somehow became a web application on accident, and eventually collapsed under its own weight. So, STSI is rewriting it, right this time, in python using django for the web interface and a collection of other tools (some of them custom). The take-home lesson here is supposed to be that astronomers, and many scientists, suck at writing code, and structure (sufficiently generalized code, robust code, version control, extensive testing, docs and relatively extensive refactoring) needs to be imposed.

The most interesting thing here to me, here, is actually their extensive testing. The presenter quoted EIGHT THOUSAND tests. No wonder they designed a testing framework in the process.

Keeping the Chandra Satellite Cool with Python

(Lisa, this one’s for you.)

What they actually do mean is temperature-cool, not attit2de-cool. Apparently, the harsh radiation environment of outer space totally pwned their insulation.

So, what happens is this:

  • The satellite needs to move around so that one side doesn’t get too hot. Been to a bonfire? Yeah, like that. Oh, and you lost your jacket so it’s even worse.

  • Meanwhile, people still need to use the thing, so they need to move it around such that people can take their x-rays along the way.

Clearly, this becomes a problem of effort.

This talk focused on modeling the thermal environment of Chandra, and not actually solving the “traveling salesman”-esque problem introduced by bullet 2 given that information.

A good part of the issue was accessing metric ass-tons of data with which to work with. Apparently PyTables is actually pretty bangin’, and allows for pulling in 10 years and 300 gigabytes (and that’s compressed!) of datapoints at reasonable speeds. Past this, they basically implemented something reminiscent of a finite difference method with the physical positions of external thermistors as the points and lots of free variables (80 or so). From here, they used Sherpa/CIAO to curve-fit their model to the previously-mentioned ten years+ of data.

SpacePy

Because it was developed at Los Alamos, bureaucratic gears have to grind for a while before their website gets put up.

Some choice quotes:

“If you haven’t used IDL, I’d like to not recommend it.

“NASA cdf is like HDF5 or netCDF, except not written as well.”

“Perl…is perl ever user-friendly?”

While SpacePy itself looks pretty cool, and handy for astronomers, I have to admit I was distracted by this guy’s sense of humor, and most of what I got from it was that IDL and NASA cdf both kinda suck.

The last two talks didn’t really click with me, unfortunately.

Lightning Talks!

  • “Let’s make Python debugging on the console not suck!” Enter puDB. Available via pypi, it looks pretty frickin’ sweet and I’ll definitely have to give it a try. “PuDB: Debug like it’s 1992!”

  • Dill allows pickling of more than mere objects. I just watched this guy serialize the entirety of numpy, a class, and the history of an ipython session. Insanity.

  • Life technologies uses pyMC. The presenter is an “enthusiastic user.”

  • Brent does some sorta genome analysis to find methylation in plant genes. I don’t know what that means, but it sounds really cool!

  • scikits.image is pretty rad. It uses openCV on the back-end for sexiness, uses cython to make better bindings than the stock ones, and is (they say) way better for science than PIL. Also seems to have a nice image viewer.

  • Fwrap makes using fortran code from python less painful.

  • This guy used the Apple Push Notification Service to send notifications to his iphone, via python.

  • TEAL does something similar to Traits’ automagic GUI action, but is lightweight and looks like it’s from 1992–just how astronomers like it.

  • STSCI is looking to get a python-specific python distro up and running. They’re considering modifying Sage or SPD.

  • Fernando Perez suggested that there is interest in putting something like his datarray into Numpy. There’s gonna be a quick session on it tonight.

  • Guys, start using http://ask.scipy.org for your scientific computing questions, particularly with python! Enthought and the scipy people really want this to reach critical mass. The reason for this and not StackOverflow is to avoid the noise that is generated by the more mainstream python community: for example, searching “optimization” would come out actually showing something on finding minimums instead of the more common code optimization.

  • Goddard is pulling wildfire smoke data from satellite images using PIL.

  • This dude is using python and fortran to study storm surges using CFD techniques with shallow water assumptions and an adaptive mesh. There was a pretty sweet video showing a model hurricane shed vortices in the water.

  • Josh Hemann wants to collect publication-quality, full-featured examples of matplotlib that show more than just, “look you can totally make a pie graph! Foo, bar.” Basically, he wants to best R’s examples. If you have anything, you can email him at josh.hemannATroguewave.com for now.

  • Creative Digital Systems made a pretty sweet radar simulator. Unfortunately, the projector was too jacked up for us to demo the thing.

  • Peter Wang gave a talk called “Python Evangelism 101.” …you’ll just want to watch this yourself when it comes up on video. It was exquisite.

BoF: datarray

I’m still enthusiastic about what can potentially happen with this, though I’m a little unsure about things now. :/ I think, on the other hand, that we have agreed on a few things, mostly:

  • The current way of indexing has promblems, so the ticks functionality will most likely be moved from getitem() to a standard method, with getitem* working numerically as if it were a ndarray.

  • Having to call .axis.a.axis.b is teh lame.

  • Man there are lots of things we’d love to be able to do with these, but including them in the numpy core would probably be a bit much.

  • someone needs to work on this! For real.

I don’t know if anyone will take my efforts seriously here, but I’m gonna try. I just forked the project on github, and I’ll try to start following the scipy wiki.

We’ll see how it goes.

Friends:

ALL THE BLOGPOSTS STANDING IN THE LINE FOR THE BATHROOM: