Open Mind is an open source effort to create a database of common sense concepts. I asked Catherine Havasi, Rob Speer and Ken Arnold about the project and their use of Launchpad.
Matthew: Open Mind is almost the stuff of 1950s science fiction. You’re building a database of the ambient knowledge that we all take for granted, is that right?
Catherine: Yup. We’re looking to capture that intuition about the world that people have but computers and devices don’t. When you’re walking down the street you don’t need to worry about whether you will fall through the sidewalk. When you tell a friend you bought some groceries at the store, you know not to explain to them how money works. This kind of knowledge, knowledge about the relationships between everyday objects and knowledge about how people feel about situations and goals, is critical for pretty much any AI task: from knowledge management for companies to robots interacting with the world around them.
Rob: And while we do our day-to-day work on the short-term applications of our system, we do keep the idealistic long term of AI in mind sometimes. It’s interesting that you should mention science fiction — one of the things that indirectly led to me joining the group was the goal of creating a computer system that could hold a conversation, perhaps like Mycroft in “The Moon Is a Harsh Mistress” or at least one of his stupider siblings. Using what I knew about natural language processing, I tried making a silly little discourse system or two, but I could see a clear barrier to them ever saying anything useful, and that was their lack of access to common sense.
So from there I changed my goal to the lower-level goal of getting this common sense knowledge into a computer in the first place, and joined the Open Mind Common Sense project.
Catherine: I’ve actually never been much of a science fictiony person — I tend to read mysteries. However, I’ve been a part of collaborative projects on the internet since the ’90s — before Wikipedia — and I always wondered how to harness the power of this sort of project for something like artificial intelligence.
Matthew: What uses do you envisage for this data, or is that not yet important?
Rob: Oh, it’s important. If we didn’t have applications we wouldn’t have funding.
Catherine: Recently we have worked on a lot of external applications. A few years ago, we developed an inference algorithm called AnalogySpace, which uses linear algebra techniques to find patterns in our common sense data. We then take those patterns and extend them to our entire data set — it’s a great noise-resistant way to do inference. We’ve also got a new technique called “blending” which lets us reason over other datasets with common sense added in. This is a good first step to the dream of using Open Mind as a sort of semantic glue in AI and in human-computer interaction.
Rob: We use these techniques to develop tools for our sponsors that can deal with natural language text according to its meaning (and not just what words it contains), and experimental user interfaces that adapt to what they think the user’s goals are.
Catherine: Our machine-learning toolkit, Divisi, is open source, so other people can build applications that use these techniques as well. We’re trying to make our community more open for others to use our stuff in a wide variety of applications — that’s one of the reasons we’re using Launchpad.
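To make the "linear algebra techniques" Catherine mentions a little more concrete, here is a minimal sketch. It assumes the pattern-finding step is a truncated singular value decomposition of a concept-by-feature matrix, and it keeps only the single strongest component, computed by power iteration. The tiny matrix, the feature labels and the function names are all invented for illustration; this is not Divisi's API, and a real system would keep many components rather than one.

```python
from math import sqrt


def rank_one_approximation(matrix, iterations=100):
    """Best rank-1 approximation of `matrix`: its strongest SVD component.

    Uses power iteration on A^T A to find the top right singular vector v,
    then returns (A v) v^T. Assumes the matrix is not all zeros.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0] * n_cols
    for _ in range(iterations):
        # u = A v
        u = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        # w = A^T u, then normalise to get the next estimate of v
        w = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # A v equals sigma times the top left singular vector
    u = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    return [[u[i] * v[j] for j in range(n_cols)] for i in range(n_rows)]


# Hypothetical toy data: rows are concepts, columns are features.
# 1 means a contributor asserted it; 0 means nothing is known.
concepts = ["dog", "cat", "pen"]
features = ["IsA pet", "HasA fur", "CapableOf run", "UsedFor write"]
known = [
    [1, 1, 1, 0],  # dog
    [1, 1, 0, 0],  # cat: nobody has said a cat can run
    [0, 0, 0, 1],  # pen
]

smoothed = rank_one_approximation(known)


def score(concept, feature):
    """Look up the smoothed score for one concept/feature pair."""
    return smoothed[concepts.index(concept)][features.index(feature)]


print("cat / CapableOf run:", score("cat", "CapableOf run"))
print("pen / CapableOf run:", score("pen", "CapableOf run"))
```

Because "cat" shares most of its features with "dog", the low-rank reconstruction gives "cat / CapableOf run" a clearly positive score even though nobody asserted it, while "pen" stays near zero. That is the sense in which a pattern found in part of the data gets extended to the rest, and a stray wrong entry carries little weight in the dominant component.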
Matthew: How are you populating the database?
Rob: Currently we populate the database through our public website. Anybody who logs into that site can contribute new knowledge or vote on existing knowledge. We’re also working on improving that site (which is the openmind-commons project in Launchpad).
Catherine: In addition, we’re working on games you can play online which would help populate and refine the database.
Matthew: How do you programmatically show links between concepts?
Catherine: We have a semantic network called ConceptNet, which is constructed automatically from common sentence patterns in our contributed data.
Rob: When someone has told Open Mind a statement of the form “You can use an X to Y”, for example, this lets us add the semantic link “UsedFor(X, Y)”. ConceptNet has hundreds of thousands of these links. The combination of Divisi and ConceptNet can be quite powerful for representing semantics in a programmatic way.
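The step Rob describes, turning a sentence of a known shape into a link, can be sketched roughly as follows. This is only an illustration under assumptions: the regular expression, the `Assertion` type and the `extract` and `render` helpers are hypothetical and are not taken from the ConceptNet code base, which works from a larger set of patterns.

```python
import re
from typing import NamedTuple, Optional


class Assertion(NamedTuple):
    """One semantic link, e.g. UsedFor(pen, write)."""
    relation: str
    left: str
    right: str


# Hypothetical pattern table: (compiled sentence pattern, relation name).
# Only the "You can use an X to Y" form comes from the interview.
PATTERNS = [
    (
        re.compile(r"^you can use an? (?P<x>.+?) to (?P<y>.+?)\.?$", re.IGNORECASE),
        "UsedFor",
    ),
]


def extract(sentence: str) -> Optional[Assertion]:
    """Return the link expressed by `sentence`, or None if no pattern fits."""
    text = sentence.strip()
    for pattern, relation in PATTERNS:
        match = pattern.match(text)
        if match:
            return Assertion(relation, match.group("x").lower(), match.group("y").lower())
    return None


def render(assertion: Assertion) -> str:
    """Format a link in the Relation(X, Y) notation used in the interview."""
    return f"{assertion.relation}({assertion.left}, {assertion.right})"


link = extract("You can use a pen to write.")
if link is not None:
    print(render(link))
```

Keeping the patterns in a table means a new sentence form can be supported by adding one entry rather than changing the extraction logic.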
Matthew: Where does Launchpad come in?
Rob: At this point, doing research on Open Mind requires working with a rather complex set of code. Our group’s Subversion repository wasn’t cutting it any more. It was also becoming very difficult to introduce the code to new people (whether they are new members of the group or collaborators). Launchpad is what we’re using to organize and distribute our code, and to build community around it.
Catherine: We’re especially interested in making what we do open and accessible to everyone. We can’t do this alone.
Matthew: Why did you choose Launchpad?
Rob: The first decision was which revision control system we should use. We needed something that would enable a distributed workflow.
Ken: …so that one person trying out their crazy idea for a new representation wouldn’t have to choose between breaking everyone else’s code and leaving it uncommitted for weeks.
Rob: It was important to make branching not painful, so we could release packages such as ConceptNet more often than the coincidental times that every part of the system was in a stable state.
Ken: We also wanted to open up our processes to the community; some external observers have mistakenly thought that the project was “dead” because they couldn’t see our internal work.
Rob: We first experimented with Git a bit. Not everyone could get their head around it. It felt like black magic. When you don’t completely understand the system, you can’t be sure it’s doing what you want. (I recognize the irony of this statement coming from a bunch of AI researchers.) After looking at other options, we settled on Bazaar, which fit into our workflow well and let us continue to use the Subversion idioms we were familiar with.
From there, it became a natural choice to set up a Launchpad account. Now we have a reliable place to look for all our code — as well as a lot of other tools for organizing our development, and a way of keeping our downstream users in the loop.
We intend to use Launchpad to distribute releases of ConceptNet and Divisi, track bugs, answer questions from users, and possibly even keep tabs on our documentation. (Currently, it seems a bit inconvenient to manage documentation using Launchpad. I’ve filed bug #334688 about a minor fix to Blueprints that would help, but it would really be nice to have a “Documentation” tab or something.)
Other aspects of Launchpad seem promising also — for example, we could make our translation project more high-profile if we figure out how to bring it from our own Rosetta installation over to Launchpad, and the “Mentoring” system seems like a good idea for giving projects to undergraduates, who often take a long time getting their bearings and figuring out what they want to do. (I should know.)
About Catherine, Rob and Ken
Catherine Havasi: I started as one of the co-founders of this project back in 1999 as an undergrad at MIT. I’ve been with it on and off for nearly ten years now. I took breaks to pursue other projects, but kept coming back to Open Mind because of the need for common sense. I’m currently finishing a PhD at Brandeis and will soon be starting a postdoc at MIT working full-time on Open Mind.
Rob Speer: I joined this project as an undergrad in 2005. I then made it the subject of my graduate research starting in 2006. After a couple of research ideas that didn’t really go anywhere, I worked on updating our knowledge collection site to make it start doing the things our papers said it should do, including asking relevant questions to expand its knowledge. I gave the site the working title “Open Mind Commons”, which stuck. I’m working toward a Ph.D. now, with a goal of improving knowledge collection across multiple languages.
Ken Arnold: I’m a second-year Master’s student at the Media Lab. My first task in my first year here was to rewrite the Open Mind Commons site in Python/Django. I’ve been maintaining it ever since (enough to get fed up with it and pine for a rewrite, again!). I have also been helping with Divisi. I’m mainly working on my master’s thesis now, but somehow I keep finding myself working on this stuff.