Monday, September 13, 2010

Neato paper for stats workshop

Latent Variable Graphical Model Selection via Convex Optimization

Wednesday, July 28, 2010

Which type of programmer are you?

A great blog post recommended by Paul Litvak:

http://googletesting.blogspot.com/2010/07/code-coverage-goal-80-and-no-less.html

Thursday, July 15, 2010

Behavioral economics not a panacea

George being honest about the importance of BE.

Saturday, June 26, 2010

Bayesian Statistics and Philosophy of Science

I've been waiting for this one for a while:
Gelman and Shalizi: Bayesian Statistics and Philosophy of Science

Wednesday, June 23, 2010

Memory and Walking

Memory span and walking:
Memory and Walking

Monday, June 21, 2010

In my department, instead of doing research, we spend most of our time wooing each other with song. From Russ to Paul:

http://www.youtube.com/watch?v=Pwap79uy1G8

PhD student
PhD student (PhD student)
Dear Advisor, will you read my thesis?
It took years to write, even under your aegis.
It's based on research in decision science
And I need a job, so I want to be a Doctor of Philosophy,
Doctor of Philosophy.

It's an ample sample of buffet-goers
And manipulations on loaded dice throwers.
They fall victim to the sunk cost effect,
It's five years now, but I want to be a Doctor of Philosophy,
Doctor of Philosophy.

PhD student (PhD student)

It's a thousand references, give or take a few,
I'll be citing more in a week or two.
I can provide an abstract if you wish to inspect,
I can change it round and I want to be a Doctor of Philosophy,
Doctor of Philosophy.

If you really like it we can publish soon,
we could schedule my defense for this coming June.
Then submit a chapter to Management Science.
Hey, I need a break and I want to be a Doctor of Philosophy,
Doctor of Philosophy.

PhD student (PhD student)

PhD student - PhD student
PhD student - PhD student

Friday, June 18, 2010

Litvakian Gold

My friend Paul Litvak likes to write songs about people. Today I was fortunate enough to be the recipient of one:

"he was a fast machine / he kept his data clean / hes the best researcher i've ever seen / he had no guise / he told no lies / knockin me out with his effect size / discovers more than his share / never in err / i had some ideas, but already hes there

cause the lab was shaking / findings' no faking / my mind was aching / and data was raking because you -- got the data all night long / yeah you got data all night long"

Sheer brilliance. And it's all true too!

Wednesday, June 16, 2010

American intellectuals are overvalued

Outsourcing of grad students

Monday, June 14, 2010

Poem by Michael J. Mahoney

Pilgrim in Process

It's a season of transition
and you're on the move again
on a path toward something you cannot disown;
Searching for your being in the labyrinths of heart
and sensing all the while you're not alone.

Yes, you seem to keep on changing
for the better and the worse
and you dream about
the shrines you've yet to find;
And you recognize your longing
as a blessing and a curse
while you puzzle at the prisons of your mind.

For as much as you seek freedom
from your agonies and fears
and as often as you've tried to see the light,
There is still a trembling terror
that your liberation nears
as you struggle with the edges of your night.

For your Reason is a skeptic
and rejects what it desires,
playing hard to get with miracles and signs;
Till a Witness gains momentum
and emerges from within
To disclose the patterns well above the lines.

Then a window has been opened
and you've let yourself observe
how the fabric of your Being lies in wait;
And you want to scream in anger
and you want to cry for joy
And you worry that it still may be too late.

For the roller coaster plummets
with a force that drives you sane
as you tightly grasp for truths that will abide;
Never fully understanding
that your need to feel secure
Is the very thing that keeps you on the ride.

You survive the oscillations
and begin to sense their role
In a process whose direction is more clear
And you marvel as your balance point
becomes a frequent home,
and your lifelong destination feels like "here."

So with gentleness and wonder,
with questions and with quests
You continue on the path that is your way;
Knowing now that you have touched upon
the shores of inner life,
and excursions deeper can't be far away.

There will be so many moments
when an old view seems so strong
and you question whether you can really change;
And yet, from deep within you,
there's a sense of more to come
and your old view is the one
that now seems strange.

Take good care, my friend, and listen
to the whispers of your heart
as it beats its precious rhythm through your days;
My warm thoughts and hopes are with you
on your journeys
through it all . . .
and the paths of life in process find their ways.

Do be gentle, Process Pilgrim;
learn to trust that trust is dear,
and the same is true of laughter and of rest;
Please remember
that the living is a loving in itself,
And the secret is to ever be in quest . . .

Mahoney, M. J. (2003). Pilgrim in process. In M. J. Mahoney (Ed.),
Pilgrim in process (pp. 61-63). Plainfield, IL: Kinder Path Press.

Saturday, June 12, 2010

Feminist Hulk

Feminist Hulk

This is quite entertaining.

Wednesday, June 9, 2010

MLOSS

Machine Learning Open Source Software

JMLR: Massive Online Analysis

Massive Online Analysis

Monday, May 31, 2010

Israel attacks Gaza aid fleet - Middle East - Al Jazeera English

Saturday, May 22, 2010

Mathiness

"I was introduced to the expression by a mathematician who was an expert in the many hierarchies of mathematical logic, typically infinite sequences of types of sets definable by some class of formulas. Each step is defined by some critical bump in complexity or definitional power which can't be achieved at a lower level, and then one looks at what it takes to get beyond the whole sequence. One of the prof's PhD students, working in this area, punned on the Latin by titling his thesis Ad Astra Per Aspera.

Once you go high enough into one of these hierarchies, called the projective sets (of real numbers, or subsets of higher dimensional R^n), there are all kinds of interactions with the highest infinities. Assume bigger infinities and you get more structure and organization "down below". There are a bunch of mathematicians for whom this is the holy grail, to figure out how far out to go into the infinite, based on these more "concrete" consequences.

Others think this is pure moonshine, kind of a mathematical ideology. The originator of this line of thought, though, was Kurt Godel, who was a kook and believer in the reality of the mathematical infinite (he also starved himself to death in Princeton after his wife died, he was paranoid about people poisoning him I think). So the research programme has the sanctification of genius, and that goes a long way in math."

Thursday, May 20, 2010

Reinventing the Wheel

This has been exactly my sentiment for some time. Unless you build a system from the ground up, you don't understand it and cannot improve upon it:

Reinventing the Wheel

Wednesday, May 19, 2010

Computer Composers

Apparently the livelihood of musicians is at stake; fear the robots! They are also trying to take over the jobs of scientists with Machine Learning.

Computer Composer

Tuesday, May 18, 2010

Bayesian vs. Frequentist analysis

Reasons besides the use of subjective priors why Bayesian and Frequentist approaches are different:

"There is a popular myth that states that Bayesian methods differ from orthodox (also known as “frequentist” or “sampling theory”) statistical methods only by the inclusion of subjective priors that are arbitrary and difficult to assign, and usually do not make much difference to the conclusions. It is true that at the first level of
inference, a Bayesian’s results will often differ little from the outcome of an orthodox attack. What is not widely appreciated is how Bayes performs the second level of inference. It is here that Bayesian methods are totally different from orthodox methods. Indeed, when regression and density estimation are discussed in most statistics texts, the task of model comparison is virtually ignored; no general orthodox method exists for solving this problem.

Model comparison is a difficult task because it is not possible simply to choose the model that fits the data best: more complex models can always fit the data better, so the maximum likelihood model choice would lead us inevitably to implausible overparameterized models that generalize poorly. “Occam’s razor” is the principle that states that unnecessarily complex models should not be preferred to simpler ones. Bayesian methods automatically and quantitatively embody Occam’s razor (Gull 1988; Jeffreys 1939), without the introduction of ad hoc penalty terms. Complex models are automatically self-penalizing under Bayes’ rule."

MacKay-Bayesian Interpolation
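
The claim that complex models penalize themselves can be checked on a toy case. This is a minimal sketch of my own, not an example from MacKay's paper: it compares a "fair coin" model (no free parameter) with an "unknown bias" model (one free parameter with an assumed uniform prior) on a particular sequence of coin flips. The function names are hypothetical.

```python
from math import comb


def evidence_fair_coin(n: int) -> float:
    """Marginal likelihood of one specific sequence of n flips under a fair coin."""
    return 0.5 ** n


def evidence_unknown_bias(heads: int, n: int) -> float:
    """Marginal likelihood of the same sequence when the bias theta is unknown.

    Integrating theta**heads * (1 - theta)**(n - heads) over a uniform prior
    on [0, 1] gives the Beta function B(heads + 1, n - heads + 1),
    which equals 1 / ((n + 1) * C(n, heads)).
    """
    return 1.0 / ((n + 1) * comb(n, heads))


def max_likelihood_unknown_bias(heads: int, n: int) -> float:
    """Best-fit likelihood of the flexible model (theta set to heads / n)."""
    theta = heads / n
    return theta ** heads * (1.0 - theta) ** (n - heads)


if __name__ == "__main__":
    n = 10
    for heads in (5, 9):
        simple = evidence_fair_coin(n)
        flexible = evidence_unknown_bias(heads, n)
        best_fit = max_likelihood_unknown_bias(heads, n)
        print(f"heads={heads}: fair={simple:.6f} "
              f"flexible evidence={flexible:.6f} flexible best fit={best_fit:.6f}")
```

With balanced data the flexible model never fits worse at its best-fit parameter, yet its evidence is lower, because it spreads prior probability over biases the data do not support. With lopsided data its evidence overtakes the fair coin's. No explicit penalty term appears anywhere.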

Thursday, May 6, 2010

Gelman comments on pointless and unethical decision research

Even if we knew this theory was true with certainty, how would it help us at all?

Gelman comments on pointless and unethical decision research

Wednesday, April 28, 2010

Difficulty in obtaining data for secondary analyses

A whopping 73% of people won't or can't share their data.

The poor availability of research data for reanalysis has just become much poorer

Tuesday, April 27, 2010

This about sums it up

“I'll swallow a lie when I have to; I've swallowed a few big ones lately. But the stat games? That lie? It's what ruined this department. Shining up shit and calling it gold so majors become colonels and mayors become governors. Pretending to do policework while one generation fucking trains the next how not to do the job.”

Thursday, April 22, 2010

Green Ray T-Shirts

GreenRay T-Shirts

Very cool designs, some of which were done by one of my friends (seen modeling the "Matrix Ray").

Wednesday, April 21, 2010

Genie and Smile

Thank you UPitt for free decision analysis software:
Genie and Smile

Zombie Students

If I teach, I'm gonna open up the course with a comment on this.

Kaggle - a platform for forecasting and data mining competitions

Kaggle - a platform for forecasting and data mining competitions
This is where we need to go.

Wednesday, April 14, 2010

Beyond Awesome

Machine learning at its finest:

MLComp.org

Saturday, April 10, 2010

Inkjet Skin Printer

Skin Printer

Friday, April 9, 2010

Video game developers are plotting against us

http://www.cracked.com/article_18461_5-creepy-ways-video-games-are-trying-to-get-you-addicted_p1.html

Hat Tip: Cosma Shalizi

Monday, April 5, 2010

Wikileaks of Reuters reporters murder

http://www.youtube.com/watch?v=UaqY12VHFv4&feature=player_embedded

Sunday, April 4, 2010

Seeing meaning in random market fluctuations

I don't know the exact details, but almost every day in the business headlines I read either that the economy is headed toward disaster because some large corporation fell below its quarterly profit expectation, or that we are headed toward prosperity because a company exceeded it.

For example, from Bloomberg, "Best Buy, the largest U.S. electronics retailer, is among companies seeing demand pick up. The Richfield, Minnesota-based merchant last month reported fourth-quarter profit that exceeded analysts’ estimates as discounts helped boost sales.

Carnival, the biggest cruise-line operator, last month raised its full-year profit forecast as ticket prices rebounded from 2009’s lows amid more bookings."

Are there confidence intervals around these forecasts? Merely exceeding analysts' estimates doesn't seem like news. Furthermore, they provide an explanation--the treatment effect of discounts--that can explain this effect independently of economic recovery.

I guess I don't see how this is news. Does anyone know more about profit forecasting? Are they point estimates or confidence intervals? The same thing with the Dow Jones Index: people seem to interpret random fluctuations as meaningful with absolutely no accounting for margin of error in estimates.

Thursday, April 1, 2010

Spindle Cell wonder

http://neurocritic.blogspot.com/2006/07/spindle-neurons-next-new-thing.html

Laundry folding robot

http://www.eecs.berkeley.edu/~pabbeel/personal_robotics.html

Monday, March 29, 2010

Feynman Cargo Cult Science:

http://www.lhup.edu/~DSIMANEK/cargocul.htm
"All experiments in psychology are not of this type, however. For example, there have been many experiments running rats through all kinds of mazes, and so on--with little clear result. But in 1937 a man named Young did a very interesting one. He had a long corridor with doors all along one side where the rats came in, and doors along the other side where the food was. He wanted to see if he could train the rats to go in at the third door down from wherever he started them off. No. The rats went immediately to the door where the food had been the time before.

The question was, how did the rats know, because the corridor was so beautifully built and so uniform, that this was the same door as before? Obviously there was something about the door that was different from the other doors. So he painted the doors very carefully, arranging the textures on the faces of the doors exactly the same. Still the rats could tell. Then he thought maybe the rats were smelling the food, so he used chemicals to change the smell after each run. Still the rats could tell. Then he realized the rats might be able to tell by seeing the lights and the arrangement in the laboratory like any commonsense person. So he covered the corridor, and still the rats could tell.

He finally found that they could tell by the way the floor sounded when they ran over it. And he could only fix that by putting his corridor in sand. So he covered one after another of all possible clues and finally was able to fool the rats so that they had to learn to go in the third door. If he relaxed any of his conditions, the rats could tell.

Now, from a scientific standpoint, that is an A-number-one experiment. That is the experiment that makes rat-running experiments sensible, because it uncovers the clues that the rat is really using--not what you think it's using. And that is the experiment that tells exactly what conditions you have to use in order to be careful and control everything in an experiment with rat-running.

I looked into the subsequent history of this research. The next experiment, and the one after that, never referred to Mr. Young. They never used any of his criteria of putting the corridor on sand, or being very careful. They just went right on running rats in the same old way, and paid no attention to the great discoveries of Mr. Young, and his papers are not referred to, because he didn't discover anything about the rats. In fact, he discovered all the things you have to do to discover something about rats. But not paying attention to experiments like that is a characteristic of cargo cult science."

Why it is important to record and report every auxiliary and decision made

Others' ability to interpret and replicate an experiment depends on their being able to reproduce the exact decisions that were made, and hence the exact auxiliary hypotheses. If these minute details matter and are not reported, replication will fail and resources will be wasted searching for the design that reproduces the original experiment. Furthermore, as a research field matures, a standard set of decisions and auxiliaries accumulates, and the decision-making process becomes easier and easier.
If we want to know the probability of our result replicating, we can use Prep (Killeen, 2008; in Goode, 2008). Prep is P(d2|d1); however, it requires that study 2 be exactly identical to study 1. Thus, no one can expect to replicate a study with probability equal to Prep if they do not know the exact details of the study. For example, “Many of the current conventions for sample size may be credited to Jacob Cohen’s (1988, 1992) tireless efforts to improve the quality of social science research. His voice is joined by a large group of methodologists deploring the failure of researchers to fully report the details needed to put the findings in context with the rest of the field. Therefore, an empirical study based on a limited sample is further weakened by superficial reporting that will not allow fellow scientists to replicate the complete study or even duplicate the calculations. This is a pointed cautionary note for novice researchers who have not yet grasped the nuances of the decisions to be made.” (137) (Peterson, 2008; in Goode, 2008).
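
For reference, Prep can be computed from a p-value. This is a minimal sketch, assuming the normal approximation from Killeen's original proposal: Prep = Φ(z/√2), where z is the standard normal quantile of a one-tailed p-value. The function name is my own. As noted above, the resulting number only means something if the replication is identical to the original study.

```python
from math import sqrt
from statistics import NormalDist


def p_rep(p_one_tailed: float) -> float:
    """Approximate probability that an identical replication finds an
    effect in the same direction, given the original one-tailed p-value."""
    if not 0.0 < p_one_tailed < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z = std_normal.inv_cdf(1.0 - p_one_tailed)
    # The difference between two studies has twice the sampling variance
    # of one study, hence the division by sqrt(2).
    return std_normal.cdf(z / sqrt(2.0))


if __name__ == "__main__":
    for p in (0.05, 0.01, 0.001):
        print(f"one-tailed p = {p}: p_rep = {p_rep(p):.3f}")
```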

Thursday, March 18, 2010

Proofs and refutations

"Delta, I am flabbergasted. You say nothing? Can't you define this new counterexample out of existence? I thought there was no hypothesis in the world which you could not save from falsification with a suitable linguistic trick."

Al Jazeera English - Europe - Electric torture show shocks France

Tuesday, March 16, 2010

An algorithm for existential abduction

The first part of existential abduction involves generating ill-defined elements of a space of possible explanations of a phenomenon. One should proceed as follows: Generate the theories that seem most plausible first. Keep generating theories until the marginal plausibility of the next theory and the marginal cost of generating it (in units of time), put on a common scale, are equal. This assumes that plausibility can be monetized (as time easily can be) and that people are able to think of theories in descending order of likelihood.
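
The stopping rule above can be sketched in a few lines. This is only an illustration under the two assumptions just stated: candidates arrive in descending order of plausibility, and plausibility can be converted to the same units as time. The conversion rate `value_per_unit_plausibility`, the fixed `cost_per_theory`, and the function name are all hypothetical.

```python
from typing import Iterable, List, Tuple


def generate_until_unprofitable(
    candidates: Iterable[Tuple[str, float]],
    value_per_unit_plausibility: float,
    cost_per_theory: float,
) -> List[str]:
    """Keep theories until the next one is worth less than it costs to generate.

    `candidates` yields (theory name, plausibility) pairs, assumed to be in
    descending order of plausibility.
    """
    kept: List[str] = []
    for name, plausibility in candidates:
        marginal_benefit = value_per_unit_plausibility * plausibility
        if marginal_benefit < cost_per_theory:
            break  # marginal benefit has dropped below marginal cost
        kept.append(name)
    return kept


if __name__ == "__main__":
    theories = [("A", 0.5), ("B", 0.3), ("C", 0.1), ("D", 0.05)]
    print(generate_until_unprofitable(theories, 10.0, 2.0))
```

If the ordering assumption fails, stopping at the first unprofitable candidate can discard a more plausible theory that would have come later.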

Existential abduction

My thesis will cover two of the most fundamental points of science: 1) How we should and do develop structural causal models of the world (existential abduction; Haig, 2005) and 2) how we should and do test these models and evaluate the uncertainty associated with them (induction). Search algorithms (also Teddy’s mention of when a model should have an additional parameter from Jevons?) can help elucidate the first question, and Bayesian meta-analysis, the second.
Exploratory factor analysis is one of the only tools for existential abduction (Haig, 2005). I want to propose new tools and procedures. As we abduct and enumerate possible causal explanations for an effect (phenomenon), we assign probabilities to each one; these are our priors. How we create a well-defined, mutually exclusive and exhaustive set of abductive inferences (hypotheses) about a phenomenon is currently unknown, and this dissertation hopes to illuminate it. Each element (well-definedness, exclusivity, and exhaustiveness) is extremely difficult to achieve on its own. Thus existential abduction is difficult.

Tuesday, January 12, 2010

Reporting Misconduct

I'm going over the newly required course on research conduct from the NSF (https://www.citiprogram.org/Default.asp), and I love the video on reporting misconduct. The upshot is that if you are a grad student or junior faculty and report misconduct, your career is over.

Monday, January 11, 2010

How are my ideas different from a traditional meta-analysis?

Traditional meta-analysis aggregates the results of several studies, weighting each by its precision (the inverse of its variance). If there is between-study heterogeneity (random effects model), the between-study variance is added to each study's own variance before the weights are computed.
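
To make the weighting concrete, here is a minimal sketch of inverse-variance pooling. For the random effects case I assume the DerSimonian-Laird moment estimator of the between-study variance, which is one common choice among several; the function names are my own.

```python
from typing import List, Tuple


def fixed_effect(effects: List[float], variances: List[float]) -> Tuple[float, float]:
    """Inverse-variance weighted mean and its variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, 1.0 / total


def dersimonian_laird_tau2(effects: List[float], variances: List[float]) -> float:
    """Moment estimate of the between-study variance (truncated at zero)."""
    weights = [1.0 / v for v in variances]
    pooled, _ = fixed_effect(effects, variances)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    return max(0.0, (q - df) / c)


def random_effects(effects: List[float], variances: List[float]) -> Tuple[float, float, float]:
    """Pooled estimate, its variance, and tau^2 under a random effects model."""
    tau2 = dersimonian_laird_tau2(effects, variances)
    # Every study receives the same extra variance tau^2, which flattens the weights.
    pooled, var = fixed_effect(effects, [v + tau2 for v in variances])
    return pooled, var, tau2
```

Because tau^2 is added to every study alike, more heterogeneity pulls the weights toward equality no matter which studies are responsible for it.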

However, each study may also be a biased estimate of the desired outcome to be measured. RE meta-analysis treats all studies the same in this way: the greater heterogeneity among studies, the more each study is uniformly penalized. My interest, and recent statistical developments, point instead to estimating the bias of each study and correcting for it in two ways: 1) assigning the weight to each study according to its accuracy, and 2) attempting to correct for bias.
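
One possible way to formalize these two steps, offered as a sketch and not as a settled method: give each study its own bias distribution with a mean and a variance, subtract the bias mean from the reported effect, and add the bias variance to the sampling variance before weighting. The per-study bias inputs are hypothetical; they might come from elicited judgments or from quality scores.

```python
from typing import List, Tuple


def bias_adjusted_pool(
    effects: List[float],
    variances: List[float],
    bias_means: List[float],
    bias_variances: List[float],
) -> Tuple[float, float]:
    """Pool studies after correcting each one for its own estimated bias.

    Step 2 in the text: shift each effect by its expected bias.
    Step 1 in the text: weight by total uncertainty, so studies whose
    bias is poorly known count for less.
    """
    adjusted = [y - b for y, b in zip(effects, bias_means)]
    total_var = [v + bv for v, bv in zip(variances, bias_variances)]
    weights = [1.0 / tv for tv in total_var]
    weight_sum = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, adjusted)) / weight_sum
    return pooled, 1.0 / weight_sum


if __name__ == "__main__":
    # Hypothetical numbers: the second study is judged to overstate the
    # effect by 1.0, with substantial uncertainty about that bias.
    print(bias_adjusted_pool([1.0, 2.0], [1.0, 1.0], [0.0, 1.0], [0.0, 1.0]))
```

Unlike the random effects penalty, the down-weighting here is specific to each study.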

Now that this framework is clear, the question is: How do we estimate bias and correct for it? I am particularly interested in people's subjective judgments of bias. Can they tell whether a certain experiment or piece of evidence will be biased, and if so, how do they integrate this information into judgments and decisions? There are also empirical methods of estimating bias, such as quality scores (e.g., proper randomization) and meta-epidemiology.