Swivel: Web 2.0 and Data


The latest craze on the internet is the migration of applications from the desktop to the Web. The newest example is "Swivel": an internet archive for data, something I have written about before. While there is not much to be seen at the site yet, TechCrunch has some intriguing snapshots.

I guess that one can upload data, access data that others have posted, and perform some simple types of analysis. It might not sound like much, but having a shared database of data will remove the need for people to provide summaries of it: anyone interested in a problem can compute the summaries for themselves. This will make data analysis much more approachable than before. It could also become a competitor to existing spreadsheet and statistical software, and a platform for deploying recent research: researchers in statistical methodology are often frustrated by how difficult it is to get the most recent advances into the hands of users.

4 Comments

Wow, as the TechCrunch article explains, anybody can try to correlate anything with anything! This is the dream of data mining finally realized. In the past, the number of haphazard 'does X cause Y' studies was limited by the work of gathering data and doing the math. But now, we'll have tens of thousands of people randomly linking any two columns of numbers!

Your applications make a lot of sense, and I can't wait to get fun data sets for educational purposes. But 1 in 20 statistically significant correlations are pure noise, so having the power to run millions of correlations between disparate data sets will produce tens of thousands of faux significant results.

Hi Samantha,

Thanks for the post on Swivel.

Most importantly, Go BLUE!!

Our philosophy about data analysis is 'I'll know it when I see it.' So, our approach with Swivel is to take a simple spreadsheet, pivot it and fold it into as many combinations as possible, and then let folks cruise through it as if it were a photo album.

Computers are good and fast at grinding through loops like that and that's what Swivel does.

Brian Mulloy
CEO & Cofounder
http://www.swivel.com
UofM '97

B,

If your data are pure noise, one in twenty correlations will be statistically significant. This is not the same as your claim that "1 in 20 statistically significant correlations are pure noise." It's Pr(A|B) as compared to Pr(B|A).
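To make the distinction concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the sample size of 30, the number of simulated pairs, and the function names are all made up for this example, and the 50/50 mix of noise and real effects and the 80% power in the second function are hypothetical numbers, not estimates of anything on Swivel. The first function simulates pairs of unrelated columns and counts how often their correlation passes the usual 5% test. The second applies Bayes' rule to show that the share of significant results that are noise depends on how many of the tested pairs were noise to begin with.

```python
import math
import random

N = 30          # observations per column (assumed for illustration)
T_CRIT = 2.048  # two-sided 5% critical value of Student's t with N - 2 = 28 df
# Equivalent cutoff on |r|: a correlation is "significant" if it exceeds this.
R_CRIT = T_CRIT / math.sqrt(T_CRIT ** 2 + (N - 2))


def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fraction_significant_under_noise(num_pairs=5000, seed=0):
    """Pr(significant | noise): share of independent column pairs that pass the test."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_pairs):
        xs = [rng.gauss(0, 1) for _ in range(N)]
        ys = [rng.gauss(0, 1) for _ in range(N)]  # generated independently of xs
        if abs(pearson_r(xs, ys)) > R_CRIT:
            hits += 1
    return hits / num_pairs


def share_of_significant_that_is_noise(p_noise, alpha=0.05, power=0.8):
    """Pr(noise | significant) by Bayes' rule; p_noise and power are hypothetical inputs."""
    false_hits = alpha * p_noise
    true_hits = power * (1 - p_noise)
    return false_hits / (false_hits + true_hits)


if __name__ == "__main__":
    print("Pr(significant | noise) is roughly", fraction_significant_under_noise())
    print("Pr(noise | significant), all pairs noise:", share_of_significant_that_is_noise(1.0))
    print("Pr(noise | significant), half noise:", share_of_significant_that_is_noise(0.5))
```

The simulated fraction should land near 0.05, which is the one-in-twenty statement. The second quantity ranges from 1 when every tested pair is noise down to about 0.06 under the hypothetical 50/50 mix, so it is not pinned at one in twenty.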

Brian,

A few months ago I suggested this idea to some people I know at Google (I called it "Google data") but they had some reason why they didn't think it was a good idea. Apparently they'd thought of the idea also but decided it couldn't work out. I can't remember why, perhaps something to do with copyright.

This is a great idea. Most people underanalyze their data. A good example is law enforcement: traffic enforcement alone has warehouses of data that have never been touched, except for simple summaries.
dh
