Question Everything: models

Friday, July 6, 2012

Friday Links: DNA + dark matter, E.O. Wilson, skeleton racing

More Friday Links!

http://www.technologyreview.com/view/428391/revolutionary-dna-tracking-chamber-could-detect/

A really cool research venture: using DNA to detect dark matter. Deep sequencing technologies require you to compactly array DNA molecules (all of different sequences) on a solid surface, and biologists have standard techniques (PCR + sequencing) to uniquely identify a DNA molecule from any given spot in the array. Guess what? That's a great setup for detecting dark matter. The hypothesis is that the Earth should be plowing through dark matter as it revolves and/or rotates, assuming that dark matter is diffusely distributed.

Essentially, here's how the DNA dark matter detector works:

1) Earth rotates, brushing through dark matter in a predictable rhythm that varies directionally throughout the day

2) DNA molecules are arrayed on a gold sheet. Dark matter can knock gold nuclei out of the sheet and into the array of DNA molecules.

3) The gold nucleus cuts a swath through the forest of DNA molecules, severing them

4) Severed DNA molecules fall from the array and are collected, then amplified by PCR and sequenced so the biologists can figure out exactly which DNA molecules were severed

5) Since they know where each DNA molecule was anchored, they can put together the path that the gold nucleus took

6) Match the path with the direction you'd expect the dark matter to be coming from at the given time of day.

One word: awesome.
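
For the curious, here's a toy sketch of step 5 above: recovering the orientation of the gold nucleus's path from the anchor positions of the severed DNA molecules. Everything in it is an assumption for illustration only. The real detector is described in the linked article, not here, and I'm pretending the array is flat (2D), the track is a straight line, and the anchor coordinates are already known. The function name `track_angle_degrees` and the sample coordinates are made up.

```python
import math

def track_angle_degrees(severed_sites):
    """Estimate the orientation of a straight track through (x, y) anchor sites.

    Uses the principal axis of the point cloud (a standard line-fitting trick).
    Returns an angle in degrees between -90 and 90, measured from the x-axis.
    """
    n = len(severed_sites)
    if n < 2:
        raise ValueError("need at least two severed sites to define a path")

    # Center of the severed sites.
    mean_x = sum(x for x, _ in severed_sites) / n
    mean_y = sum(y for _, y in severed_sites) / n

    # Spread of the sites along x, along y, and jointly.
    sxx = sum((x - mean_x) ** 2 for x, _ in severed_sites)
    syy = sum((y - mean_y) ** 2 for _, y in severed_sites)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in severed_sites)

    # Orientation of the axis along which the points are most spread out.
    return math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))

# Hypothetical anchor coordinates (arbitrary units) of severed molecules.
sites = [(0, 0), (1, 1), (2, 2), (3, 3)]
print(track_angle_degrees(sites))
```

Note that a fit like this only gives the orientation of the line, not which way along it the nucleus was traveling. Comparing against the expected dark matter direction (step 6) would need that extra information.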

http://www.ted.com/talks/e_o_wilson_advice_to_young_scientists.html

E.O. Wilson, the famed evolutionary biologist who studies eusocial organisms, advocates the kind of cross-pollination exemplified by the above DNA-dark matter example. His message to scientists-in-training: learn broadly and collaborate broadly. Too many PhDs spend all their time doing experiments in a narrow field and never venture into other areas. But they are missing out: new discoveries are found in non-intuitive connections between different fields.

One idea I had relevant to MD/PhDs was inspired by E.O. Wilson's two strategies for doing good science:

1) Medicine shows the problems. Seek to learn all the problems (literally, ALL) then look for scientific phenomena that can explain the problems and provide a means to intervene

2) Science observes phenomena without necessarily knowing if they have consequences relevant to human well-being. Seek to learn all (literally, ALL) the phenomena then look for problems that might be linked to the phenomena and apply your knowledge to the problem

http://www.freakonomics.com/2012/06/29/how-to-make-a-better-athlete/

Every sport is a unique combination of agents (players + equipment) and rules that those agents follow (official rules + rules of physics). Fortunately, that's all you need to create a model of something, with the purpose of identifying the most important factors, and to predict and explain emergent properties.

Here, Freakonomics spotlights the Australian team in the skeleton, an Olympic sledding sport. They don't have career skeleton athletes (except one) and they don't have a chance to practice because, well, they're in Australia. You might think that the skill and practice time of the athlete are the two most important factors. In fact, that might be true in most sports. But if you carefully examine the rules of the game, you'll realize that there are two components to a race: the actual sledding, and the 30-meter sprint beforehand to get the sled going. Guess what? Australia has plenty of good sprinters. And as it turns out, using a little bit of modeling, you can show that the actual sledding is of minimal importance in terms of time. Skill might prevent you from wiping out, but as long as you stay on course you're not going to shave much time off with good sledding. So instead, they focused their efforts on finding really good sprinters and training them.

The result? An Australian qualified for the Olympics within 18 months, getting in about 1/10 as much practice time as a "career skeleton athlete." Very often, working smarter is 10x better than working harder. Science knows best, and conventional wisdom fails miserably.

http://www.npr.org/blogs/thetwo-way/2012/07/05/156280450/kaboom-san-diegos-entire-fireworks-show-ignites-at-once

Haha, all of San Diego's July 4 fireworks go off at once.

Saturday, May 5, 2012

Blog launch; people aren't that racist

Hello! I heard it's a good idea to have a (flexible) purpose in mind when starting a new venture, so here are my reasons for starting a blog. I suspect these are common to many bloggers.

1) Share things I find interesting. Usually these involve things that defy conventional wisdom (e.g. segregation, see below)

2) Share things I'm working on to improve myself. Writing it out helps me think about it, keeps me honest, invites feedback, and maybe gives others some ideas. I have so many general goals, such as reading more, keeping a diary, getting comfortable talking to strangers, pre-defining my goals for the day/week/month, etc. But I hope that having a defined task such as a blog entry will help motivate me.

3) Improve my writing skills and stimulate my mind. Pushing myself to generate interesting blog posts will force me to look into things I'm unfamiliar with.

4) Eliminate any residual fear of having my ideas be judged by others.

So, now see below for my first real blog post!

__________________________________________________________________

Intro

In my free time I've been looking into the recent explosion of startups and non-profits offering online courses, including Coursera, Udacity, and edX. These offer full-semester-long courses, given by full professors at top colleges, complete with lectures, quizzes, problem sets, and final exams, and available to everyone for free. I'd argue these are going to be a lot more effective than traditional classroom learning, where half the students are on the Internet anyways NOT learning.

So right now I'm taking Model Thinking on Coursera, and I'd have to say the interface is a lot more engaging than the majority of in-person teachers I've had. The course is a personal project of Scott Page, a professor at the University of Michigan. And it's not just a video of the courses he normally teaches: you can see him talking directly to you, his prepared slides, and the stuff he writes on the slides in real time as he talks. And yes, you can speed it up (2X max).

I hope that Model Thinking will help me think as a scientist and an intellectual, rather than just as a worm geneticist. I also like that models can lead us to unexpected conclusions, so I will share one of the first models presented in the class: Schelling's Model of Segregation.

The model

The question behind Schelling's Model of Segregation is: We all know that many (most) cities in the US are highly segregated, along lines of race, income, etc. Blacks might on average have 80% black neighbors, while whites might on average have 80% white neighbors. Why? Is it just because they are racist and like their own kind? You might think that if people want 80% of the people near them to look like them, and people are freely moving, then on average people will have 80% of their neighbors look like them. But let's do some modeling.

We have X number of people in a hypothetical city, and a grid of X homes that they can occupy. Each person is given the choice to move or stay, based on their neighborhood percentage of people who look like them, call that P. Let's say they will move to a new home if P is lower than a threshold T. In the real world, this means a person looks at his/her own neighbors, and gets a little spooked by the number of people who don't look like them, and moves. This is applied iteratively, since as one person moves that might cause others to move as well => this is a simulation.

Let's say that T is 30%. In a city of half whites and half blacks, that is very much UN-racist: people would tolerate being in the MINORITY in their neighborhood. They only need 30% of their neighbors to look like them to stay. If you run the simulation (use a computer program), what do you get as the end result? >75% segregation. In other words, on average each person has more than 75% same-race neighbors.
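
If you want to try this yourself, here's a minimal sketch of the simulation in Python. It is not the program used in the class. It's my own toy version, and it makes several assumptions that the description above doesn't spell out: the city is a square grid, about 10% of homes are empty so that people have somewhere to move, a "neighborhood" is the (up to) 8 surrounding homes, and an unhappy person moves to a randomly chosen empty home. The names (`run_schelling`, `same_fraction`, etc.) and default values are made up for illustration.

```python
import random

def neighbors(grid, r, c):
    """Occupants of the up-to-8 homes around (r, c), skipping empty homes."""
    n = len(grid)
    found = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            rr, cc = r + dr, c + dc
            if 0 <= rr < n and 0 <= cc < n and grid[rr][cc] is not None:
                found.append(grid[rr][cc])
    return found

def same_fraction(grid, r, c):
    """P: share of the occupied neighboring homes that match the person at (r, c)."""
    around = neighbors(grid, r, c)
    if not around:
        return 1.0  # assumption: with no neighbors at all, the person stays put
    return sum(1 for g in around if g == grid[r][c]) / len(around)

def average_similarity(grid):
    """Average of P over everyone in the city."""
    n = len(grid)
    values = [same_fraction(grid, r, c)
              for r in range(n) for c in range(n) if grid[r][c] is not None]
    return sum(values) / len(values)

def run_schelling(size=20, empty_share=0.1, threshold=0.3, max_rounds=200, seed=0):
    """Run the toy model; return average similarity before and after."""
    rng = random.Random(seed)
    cells = size * size
    n_each = (cells - int(cells * empty_share)) // 2
    flat = ["A"] * n_each + ["B"] * n_each + [None] * (cells - 2 * n_each)
    rng.shuffle(flat)  # start from a randomly mixed city
    grid = [flat[i * size:(i + 1) * size] for i in range(size)]
    before = average_similarity(grid)

    for _ in range(max_rounds):
        unhappy = [(r, c) for r in range(size) for c in range(size)
                   if grid[r][c] is not None and same_fraction(grid, r, c) < threshold]
        if not unhappy:
            break  # everyone has P >= T, so nobody moves anymore
        rng.shuffle(unhappy)
        for r, c in unhappy:
            # Earlier moves this round may have already fixed this person's P.
            if same_fraction(grid, r, c) >= threshold:
                continue
            empties = [(rr, cc) for rr in range(size) for cc in range(size)
                       if grid[rr][cc] is None]
            rr, cc = rng.choice(empties)
            grid[rr][cc], grid[r][c] = grid[r][c], None

    return before, average_similarity(grid)

before, after = run_schelling(threshold=0.3)
print(f"average same-group neighbors: {before:.0%} before, {after:.0%} after")
```

The exact numbers will depend on the grid size, the share of empty homes, and the random seed, so don't expect this sketch to reproduce the class's figures exactly. Try changing `threshold` to see how the end result shifts.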

How can this be? We already said that people don't mind being in the minority, and they are at most minimally-racist. How did we end up with major segregation without any other factors at play?

If we think carefully about the model, there are two tipping points that bias in favor of segregation. Essentially, the effect of any one person moving gets amplified.

Exodus Tip: A person moves out of a neighborhood. For their original same-race neighbors, that may decrease their percentage below the threshold T. For example, 3/8 (37.5%) > 30% becomes 2/7 (about 29%) < 30%.
Genesis Tip: That person moves into a new neighborhood. For their new different-race neighbors, that may decrease their percentage below the threshold T. For example, 2/6 (about 33%) > 30% becomes 2/7 (about 29%) < 30%.
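
The arithmetic in the two tips is easy to check. Here's a quick sketch using exact fractions, with T = 30% as in the example. The helper name `stays` is made up, and it just encodes the rule from the model: a person stays if their same-race share P is at least T.

```python
from fractions import Fraction

T = Fraction(30, 100)  # threshold from the example above

def stays(same, total, threshold=T):
    """True if a person with `same` same-race neighbors out of `total` stays put."""
    return Fraction(same, total) >= threshold

# Exodus Tip: a same-race neighbor leaves, so 3 of 8 becomes 2 of 7.
exodus_before, exodus_after = stays(3, 8), stays(2, 7)

# Genesis Tip: a different-race neighbor arrives, so 2 of 6 becomes 2 of 7.
genesis_before, genesis_after = stays(2, 6), stays(2, 7)

print("Exodus: ", exodus_before, "->", exodus_after)
print("Genesis:", genesis_before, "->", genesis_after)
```

In both cases a single move flips a neighbor from "stays" to "moves", which is what sets off the chain reaction.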

These are kind of common sense, but it's hard to fully appreciate the domino effect this can have.

Once one person moves, another moves, then another moves. The end result is segregation. In fact, if we only have the requirement that people don't want to be in the minority, i.e. people only want at least 50% of their neighbors to be of the same race, what happens? The end result is that 90% of a person's neighbors end up being of the same race.

Finally, the kicker. What happens if people ARE really racist? What if they have a requirement that 95% of the people near them have to be of the same race? Here's another thing that completely defies conventional wisdom: you DON'T GET SEGREGATION!!

This blog post is getting long, so I'll let my readers figure out why HIGH LEVELS OF RACISM lead to situations where there is NO SEGREGATION.

Anticipated rebuttal

I'm sure you have a rebuttal to all this: that there are many other factors at play that might allow racism to be the primary explanation for segregation, e.g. racist laws, rent, gentrification, etc. Fair points: it is POSSIBLE for racism to explain segregation. But the model's lesson still holds. Just because segregation arises from individual behavior, we shouldn't assume that the individuals are racist.

Furthermore, this highlights the importance of models. This model left things like rent and gentrification out of the equation. It made some assumptions that may not be entirely true. But all that is beside the point. The act of laying out assumptions and taking them to their logical conclusion helps us think about a problem. In the end, we may end up keeping or throwing out some of our assumptions, and deciding that we need more data on the things we left out of the model. And in the end, we're all the better for it. We made progress.

___________________________________________________________

Conclusion

I hope that the models I learn in the Model Thinking class will be fertile, i.e. they will be applicable to biology and medicine even though many of them were developed for economics and social science. A PhD is a fantastic time to explore all sorts of things that interest me and develop a variety of skills, because I have control over my own time. This will be an adventure, and I hope you'll join me.