WSJ article: When Combined Data Reveal the Flaw of Averages

An example of Simpson's paradox. [Read more]
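For readers who have not met Simpson's paradox before, here is a minimal sketch in Python. The counts are illustrative, patterned on the often-cited kidney-stone treatment example rather than taken from the article, and the helper names are made up for this post. Treatment A has the higher success rate within each subgroup, yet treatment B looks better once the subgroups are pooled.

```python
from fractions import Fraction

# (successes, trials) for each treatment and subgroup.
# Illustrative counts only; they are not taken from the WSJ article.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, trials):
    """Exact success rate as a fraction."""
    return Fraction(successes, trials)


def pooled_rate(groups):
    """Success rate after adding the subgroups together."""
    total_successes = sum(s for s, _ in groups.values())
    total_trials = sum(n for _, n in groups.values())
    return rate(total_successes, total_trials)


# A beats B inside every subgroup...
a_wins_each_subgroup = all(
    rate(*data["A"][g]) > rate(*data["B"][g]) for g in data["A"]
)
# ...but B beats A when the subgroups are combined.
b_wins_pooled = pooled_rate(data["B"]) > pooled_rate(data["A"])

for name, groups in data.items():
    per_group = {g: float(rate(*sn)) for g, sn in groups.items()}
    print(name, per_group, "pooled:", float(pooled_rate(groups)))
```

The reversal happens because the two treatments were applied to very different mixes of easy and hard cases, so the pooled average mostly reflects that mix.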

This article was emailed to me by Jacob Weaver, an alumnus of our department.

Investor sets eyes on "data-as-a-service" start-ups. Or should it be "Stat-as-a-service"?

Entrepreneurs in the social and real-time data sectors would do well to study up on Venrock’s Brian Ascher. At the sprightly age of 43 with several big tech exits under his belt (Adify, DatAllegro, Unicru), the investor is bound to be a force in the global Silicon Valley for a while to come.

He’s interested in startups that follow what he calls the "data-as-a-service" model: gather tons of data by offering some service or tapping into a data pipeline, create an algorithm that you refine over time, and sell intelligence to clients on an ongoing basis. He sees big opportunities in social search and services that help consumers make fuzzy, subjective kinds of decisions--what movies to watch and where to go on vacation--areas where social graphs are particularly valuable.

Ascher’s driven to boost personal empowerment, wring out inefficiencies and help smaller independent entities take back Main Street from the Big Boys. Below is Part 2 of our interview with him. Click here for Part 1, in which he gives advice for entrepreneurs and investors navigating the tech markets in 2010.

[This article was sent to me by a Columbia alumnus, David Park, Ph.D., founder of a venture where you can explore enhanced children's books and personalize them just the way you want.]

NY Times: "For Today’s Graduate, Just One Word: Statistics"

Here is an article from the NY Times that outlines the up-and-coming status of statisticians in today's world. Enjoy!

A make-over for your correlation matrix

This morning I received a pleasant surprise in my inbox. Taiyun Wei, a student from Hunan, China, emailed me about his/her R package, corrplot. This package can produce a number of pretty, "stylish" visualizations of your correlation matrices. Next time I need to present such a matrix of mine, I may give it a make-over from the "wardrobe" of Taiyun's functions. More of Taiyun's art-like statistical graphs can be found here.

Election-fraud detection

An article, "Rise and Flaw of Internet's Election-Fraud Hunters: Benford's Law, Which Tests Numbers for Authenticity, Might Detect Vote-Rigging but Can't Prove It," describes how one can use the limited information in election count data to possibly detect fraud. Because the data are so limited, some "model" has to be assumed, and that model can be wrong in the first place.
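To make the idea concrete, here is a rough Python sketch of a first-digit Benford check. It assumes the first-digit form of the law, with P(d) = log10(1 + 1/d), and compares observed leading digits against it with a chi-square statistic. The function names and the sample data are made up for illustration; real vote counts are not used here, and a large statistic would only flag a mismatch with the assumed model, not prove fraud.

```python
import math
from collections import Counter


def benford_expected():
    """Benford's first-digit probabilities for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(n):
    """Leading decimal digit of a nonzero integer."""
    return int(str(abs(n))[0])


def chi_square_vs_benford(counts):
    """Chi-square distance between observed leading digits and Benford's law."""
    digits = [first_digit(c) for c in counts if c != 0]
    observed = Counter(digits)
    total = len(digits)
    return sum(
        (observed.get(d, 0) - total * p) ** 2 / (total * p)
        for d, p in benford_expected().items()
    )


# Synthetic stand-in data: powers of 2 are known to follow Benford's law closely.
benford_like = [2 ** k for k in range(1, 200)]
# A deliberately non-Benford sample: every count starts with 9.
suspicious = [9] * 100

print("Benford-like sample:", round(chi_square_vs_benford(benford_like), 2))
print("All-nines sample:   ", round(chi_square_vs_benford(suspicious), 2))
```

With 8 degrees of freedom, the usual 5% critical value is about 15.51, so the first sample should fall well below it and the second far above. Whether Benford's law is the right reference model for a given election is exactly the assumption that can be wrong.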