This isn’t p-hacking

*Disclaimer: Bryan is not a weather vane.

For the past half a year or so, Bryan has been living only partially in Seattle, and is often gone more than he is here. When he is in town visiting or working, he often complains that it’s gray and rainy, even though we are now deep into summer and the oft-promised bone-dry weather has materialized in the form of 90+°F heat waves. I mostly dismissed his complaints as biased, or anecdotal, since I had been largely experiencing summer bliss. But when I woke up this morning to a sniffly nose, hazy skies, and a negative attitude, I thought, let’s see what the data has to say about this.

The question I am trying to answer is whether it rains more when Bryan is in town.

Variable x: Bryan is in town
Variable y: It rains in Seattle

H0: The two variables are independent.
H1: The two variables are not independent.

Statistical testing

I will compute the Phi coefficient. At α = 0.05 and df = 1, the Chi square critical value is 3.84, above which I will reject the null hypothesis.
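For the curious, the 3.84 cutoff does not have to come from a lookup table. Here is a minimal sketch using only the Python standard library; it relies on the fact that a Chi square variable with df = 1 is the square of a standard normal variable. The variable names are my own.

```python
from statistics import NormalDist

alpha = 0.05

# With df = 1, a chi-square variable is the square of a standard normal,
# so the critical value is the squared two-sided z cutoff (about 1.96).
z_cutoff = NormalDist().inv_cdf(1 - alpha / 2)
critical_value = z_cutoff ** 2

print(round(critical_value, 2))  # 3.84
```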

Data collection

The time window of interest was chosen as 2016.3.1 — 2016.8.29 (today), a span of 182 days. This corresponds to the time period in which Bryan began traveling away from Seattle.

Weather data was obtained from the NWS Seattle forecast office’s handy-dandy website. The value of interest is precipitation amount. I sampled this as a binary variable: whether or not it rained at all that day. The value “T” (meaning trace amount) was interpreted as rain.


Contingency Table

              Rain   No rain   Total
Bryan here      32        31      63
Bryan gone      37        82     119
Total           69       113     182

Mosaic plot of contingency table

Probability of rain
When Bryan’s here: 51%
When Bryan’s gone: 31%

Amount of rain per rainy day
When Bryan’s here: 0.24 in/day
When Bryan’s gone: 0.15 in/day

Upon initial inspection, it certainly seems like it rains a lot more when Bryan’s in town! But is it statistically significant???

N: 182
Phi coefficient: 0.19
Chi square value: 6.79

The Phi coefficient is less than 0.3, indicating low to no association between the two variables. However, this interpretation is incomplete because neither variable is very evenly distributed. The Chi square value of 6.79 is much higher than the critical value of 3.84, so it looks like we can comfortably reject the null hypothesis at the current alpha.
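As a sanity check, here is a rough sketch of how the numbers above can be reproduced from the contingency table with plain Python. It assumes the uncorrected Pearson Chi square (no Yates continuity correction), which is what matches the 6.79 reported here; the function and variable names are mine, and the p-value shortcut via `math.erfc` only holds for df = 1.

```python
import math

# Observed day counts from the contingency table.
here_rain, here_dry = 32, 31
gone_rain, gone_dry = 37, 82


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


n = here_rain + here_dry + gone_rain + gone_dry
chi2 = chi_square_2x2(here_rain, here_dry, gone_rain, gone_dry)

# Phi is the chi-square statistic rescaled by the sample size.
phi = math.sqrt(chi2 / n)

# For df = 1 only: P(chi-square > x) equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

# Conditional probabilities of rain, as percentages.
p_rain_here = 100 * here_rain / (here_rain + here_dry)
p_rain_gone = 100 * gone_rain / (gone_rain + gone_dry)

print(n, round(chi2, 2), round(phi, 2))            # 182 6.79 0.19
print(round(p_rain_here), round(p_rain_gone))      # 51 31
print(p_value < 0.05)                              # True
```

The p-value comes out a little under 0.01, so the result would survive a stricter alpha of 0.01 as well, though only barely.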

Woohoo! It looks like it does statistically rain more when Bryan is in Seattle!

Notice that I did not equate correlation with causation; if I did, I would say something like: “Looks like having Bryan around really helps predict the weather in Seattle! Let’s keep him away so we can experience a 40% increase in sunny days!” Unfortunately, this kind of interpretation is far too common.

I’m not so eager to throw in the towel yet. I will keep fishing until I have enough to hint at causation 😉

Cognitive insight?

Using the Watson Personality Insights service demo, I analyzed all of my writing output from classes I took last quarter. Specifically, BIME 530: Introduction to Biomedical and Health Informatics, and BIME 535: the Clinical Care and Informatics class taught by Dave.

Now, these classes, although overlapping on quite a bit of material, were existentially different. BIME 530 was content heavy, and we strove to talk about what we learned and our thoughts about certain technologies. BIME 535, on the other hand, really tried to dig at our emotions and motivations. In fact, the majority of my writing samples for 535 were from a weekly journal, very loosely formatted entries about not only what we discussed in class, but the act of learning itself. These entries were “reflective,” or a consequence of meta-cognition. So I sort of think of 530 as an expression of my external self, the one I want to project in work and life, while 535 represents my internal state, perhaps an amalgam of things that are important to me deep down.

Parsing all this text (about 10,000 words each) through Watson provided the following “insights”:

BIME 530
BIME 535

The output for BIME 530 is on top and 535 on bottom. Some of the results are similar between the two: I’m right on the cusp between introversion and extroversion, I like being challenged and challenging authority, and I’m a bit self-important. And some of the differences are also expected, i.e. I express far more emotional range in 535 when I am asked to journal free form and talk about my feelings. But one of the surprising things to me is the difference in Needs. In 530, I emphasize my ideals. While in 535, I emphasize harmony and structure. I suppose I’ve always had a belief that in life, one should present a strong, united, progressive front, demanding more than one is perhaps willing to accept. Perhaps in doing so, I neglect my strong underlying need for everyone to just stop fighting and get along.

Of course, this is conditional on my belief in this type of analysis. It may place too much emphasis on the words I use rather than the context in which I use them. But then again, a person’s vocabulary probably says quite a bit about who they actually are.

Production or Expert

It seems that the assumptions we make in our attempts to model decision-making may be necessary but problematic. The notion of conditional independence, for example, the idea of there being many properties and entities that do not affect the scope of our decision, is important not only for computational tractability, but for model modularity as well. Without it, none of the current techniques have much hope. However, the assumption is neither apparent in reality, nor particularly representative of the way we humans make decisions.

Chaos theory suggests that even tiny changes or relationships between antecedent variables can lead to vastly different consequences. Therefore, it would seem that even small differences between our models and reality would lead to irreconcilable errors in decision output. On the other hand, decision networks don’t quite function like human cognition either. We seem to have come to a conclusion that heuristics cause biases and biases are bad. But I want to disagree. Heuristic-caused biases have developed in human cognition because they are computationally cheaper and quite successful for the most part. Why do we have availability biases and risk aversion? Because it makes sense to. In the mid-Atlantic states for example, when you see joint pain, especially in the summer, it’s good to think Lyme disease. There is a high likelihood of it! Even if the probability is overestimated by a physician, it’s better to overestimate the likelihood of a dangerous disease than underestimate, fail to treat and deal with negative consequences.

Getting back to decision making and expert systems: with the simplifications we make, whether by minimizing the scope of a model as much as possible or by collapsing the relationships between nodes into discrete types with no room for interpretation, we are perhaps sacrificing too much. Yes, I do think there needs to be a limit to the size and scope of a problem, and it would make no sense to attempt to replicate reality. Not only would such a thing be prohibitively expensive, it also wouldn’t necessarily give you the right results. After all, reality isn’t deterministic! So perhaps the harder question is how to correctly define the domain of all possible future states, in order to create a system that would be able to assign a probability to each of them.

Also, problem solving isn’t inherently rational. There is neither a correct solution, nor a correct way to obtain such a solution. I still find this to be a foundational issue. We are applying a rational method to an often irrational task. We ask, what is the right thing to do at this moment in time, with all of the information I currently have? And maybe that just isn’t the right question to ask. Maybe there is something in human cognition that is quite unique, this way of collapsing the past, present and future down into an understandable frame. And perhaps we can do something similar with machines, a way to code decision-making not only based on facts, logic and known probabilities, but a small degree of chance and intuition as well.

Note: Taken from my BIME 550 discussion posting

How many x-rays do I need?

Since working in a company that specializes in radiation therapy, I find myself more aware of radiation exposure through medical devices. I actively try to shield myself from exposure in the GammaKnife lab; I duck behind others in the CT room; and I avoid unnecessary x-rays and other forms of imaging using ionizing radiation. Since imaging procedures are a source of substantive exposure to ionizing radiation,1 I prefer to reduce my cumulative exposure as much as possible.

Radiation from all of the imaging plus what I’ve received from the transatlantic flights I’ve taken probably still doesn’t add up to too much. Yet I have a hard time being comfortable around these machines. There is very little accessible information. How many procedures do we need to get enough information for diagnosis or treatment? Are images necessary to prescribe a course of action? I find that I am now unable to trust most of my medical providers or their administrative staff. Every recommendation for a diagnostic procedure or treatment sounds to me like a sales pitch, one leaving me always in the uncomfortable situation of having to refuse the offer.

I am just here for my routine procedure, I want to say. And no, just because my insurance company will pay the full cost of the procedure does not mean I’m any more interested in doing it. That’s a horrible way to think about it, that providers should do all they can to squeeze money out of an insurance company, that patients should do all they can to use their insurance benefits, and the insurance companies must fight the abuse from both sides tooth and nail by looking for mistakes and shirking on coverage. I do not want to be caught in the middle of this complicated financial war, where the benefits may not outweigh the cost.

I went to the dentist this morning at a new clinic (first time since moving to Seattle). Upon entering the exam room, I was told that I would be starting with a panoramic radiograph, essentially a CT for my mouth. Since I had never had one done before, I asked about relative radiation exposure. That was when I learned that the dentist also does a 7-image bitewing set in addition to the panorama, and that no one in the office really knew how much exposure I would have from the set of procedures. I decided to refuse all images until I had a chance to investigate further, and suggested that I could have my previous set of bitewings (from 6 months ago) sent over if the dentist wanted the information.

Most dentists recommend bitewings for their patients once a year. Coincidentally, this is how often dental images are reimbursed by most health insurers. Do dentists prescribe images because the data supports that yearly exams are key, or because that is how often they are paid for them? I am not arguing against dental radiography; I simply think there is so little information out there to help me, the patient, decide which imaging procedures to get and which ones to refuse. After all, imaging does carry an innate risk, and dental x-rays in particular may contribute to increased incidence of brain cancer.2 I want to make sure the benefits I get from these procedures outweigh those risks.

So what are the risks? Here are the ranges of effective doses calculated by Ludlow et al in 2008.3 These values were significantly higher than previously suggested doses, which used lower ICRP (International Commission on Radiological Protection) proposed weighting factors for tissues. These days, with digital x-rays, the effective dose will be even lower for bitewings, somewhere around 10-20% of the values below. This reduction does not apply to the panoramas, which are already digital.

                                    Effective dose* (mSv)             Cost ($USD)
Single bitewing x-ray               0.005                             15-25
Four bitewings                      0.020                             50-100
Full mouth series (18 bitewings)    (0.388 for slower film speeds)
Panoramic                           0.014-0.024                       100-200

Not so bad. These doses are fairly reasonable. To put things in perspective, a single bitewing gives you excess radiation equivalent to what you would receive from a week in Denver. The full mouth series is approximately equivalent to a chest x-ray, although of course, the tissues exposed are different. Whether or not these values are precise or correct remains to be seen. It seems that each study produces some variation on these doses.

Ultimately, it is up to the patient to determine whether or not these are reasonable risks. Radiation exposure is cumulative, and one should be aware that the effects are seen over a lifetime. It may be prudent to assess the benefits of each imaging procedure, and realize that not every one may be necessary to proper treatment and care.

*What is effective dose? It is the average dose weighted by the types of tissues receiving that dose. Different tissues are more or less sensitive to radiation. For example, the gonads, thyroid, bone marrow, and other glandular structures are more sensitive and will be weighted more, and hard bone surfaces which are less sensitive are weighted less. The effective dose gives a better sense of how much increased risk accompanies the particular radiation exposure.
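For reference, the weighting described above is usually written as a weighted sum over tissues. This is the standard ICRP form as I understand it, not something taken from the papers cited here, and the symbols are the conventional ones rather than anything defined in this post:

```latex
E = \sum_{T} w_T \, H_T , \qquad \sum_{T} w_T = 1
```

Here E is the effective dose (in Sv), H_T is the equivalent dose received by tissue T, and w_T is that tissue's weighting factor. A more radiosensitive tissue has a larger w_T, so the same dose delivered to it contributes more to E.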

1Fazel, R. et al. “Exposure to Low-Dose Ionizing Radiation from Medical Imaging Procedures.” The New England Journal of Medicine. Vol. 361, pp. 849-857, 2009.

2Claus, E. B. et al. “Dental x-rays and risk of meningioma.” Cancer. Vol. 118, Iss. 18, pp. 4530-4537, 2012.

3Ludlow, J. B. et al. “Patient Risk Related to Common Dental Radiographic Examinations.” The Journal of the American Dental Association. Vol. 139, pp. 1237-1243, 2008.

Whose job is it?

I have always operated on the principle that it is part of the physician’s job to educate the patient. Before applying any tests and treatments, my doctor needs to make sure that I understand what it is that I am about to experience and any possible consequences (including negative ones). However, is this asking too much?

Is it the physician’s duty to educate the patient, or just to provide the resources to help the patient educate him/herself?

We talk a lot about bedside manner, yet with all the things competing for a physician’s attention, does making the patient feel safe and supported really fall under the physician’s jurisdiction? As a patient, do I really want a physician who can talk the talk and make me feel like he/she’s best friend material? Or would I rather go with the quiet, slightly-autistic introvert who’s aced all their exams, knows their procedures head to tail, and has the numbers to back up their cred?

Communication is important. But maybe it’s not the physician’s job to provide it.

Disagreements between AIs

Consensus formation between humans is achieved by making arguments supported by facts, observations, and in some cases, emotions (although one could say these are poor arguments). The consequence of disagreement may be:

  1. no party prevails: neither individual changes his/her belief state; no one is convinced
  2. one party prevails: one individual changes his/her opinion and acquires new beliefs
  3. both parties prevail: both individuals abandon their beliefs in the creation of a new idea

How do AIs move past these types of disagreements?

In hierarchical groups, this is relatively easy. The machine (or human) in the position of authority holds sway.

In parallel, non-hierarchical groups of three or more machines, consensus can be reached by voting.
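To make the voting case concrete, here is a toy sketch of a strict-majority vote among peers. Everything in it is hypothetical: the `majority_vote` name, and the simplification that each machine's belief can be represented as a single hashable value such as a string. Real belief states would be far richer than this.

```python
from collections import Counter


def majority_vote(beliefs):
    """Return the belief held by a strict majority of voters, or None if none exists."""
    if not beliefs:
        return None
    winner, count = Counter(beliefs).most_common(1)[0]
    # Require more than half of the votes; a tie or a plurality is not consensus.
    return winner if count > len(beliefs) / 2 else None


print(majority_vote(["rain", "rain", "no rain"]))   # rain
print(majority_vote(["rain", "no rain"]))           # None
```

Note that the two-machine deadlock returns None: voting alone gives no way forward when no side has a majority.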

However, consequences 1 and 3 above do not seem to be well represented. The example I can think of is two eyewitnesses disagreeing in court over a course of events. Both witnesses are convinced of what they saw or heard, which due to the imperfection of senses, may well be the case. Since the consequence of this disagreement matters (that is, it affects the sentence of a third party, the defendant), there must be a way to decide whose version of the story is closer to the truth. We employ an “impartial” jury for this task. So perhaps a similar adjudication system is necessary to merge competing AI belief states.

Busted venn diagrams

“Is medicine an art or a science?”

This was a subject that came up during class discussion last week. Instead of debating which side medicine falls on, I would argue that much of science is in fact an art.

Art — n.
1. The expression or application of human creative skill and imagination… (works produced by human creative skill and imagination)
2. A skill at doing a specified thing, typically one acquired through practice

Science — n.
The intellectual and practical activity encompassing the systematic study of the structure and behavior of the physical and natural world through observation and experiment

The above are abridged excerpts from the New Oxford American Dictionary, selected to bolster my argument. The second part of the definition of art draws its own bridge to science. Science is a skill, and is certainly helped by practice.

The more difficult comparison comes in the use of “creative skill and imagination.” I believe the formation of scientific theory involves a lot of grappling in the dark, in an attempt to formulate explanations for seemingly contradictory observations and natural behaviors. This guesswork, or hypothesizing, calls upon human ingenuity and creativity, and cannot be supported by rationality alone.

I am not suggesting that science is a subset of art, just that the theoretical backdrop may in fact fit inside art’s bubble. The practical whole of science has always felt much more universal, capable of being understood by alien beings in a way that art seems less likely to be. I venture that a significantly larger proportion of the intelligent life out there in the universe practices something like science than something like art.

Healthcare expenditure in the US and Co.

Sometime last week, I spent about one to two hours trying to answer a few questions. I wanted to know how much the US was spending on healthcare (HC) and what percentage of that HC spending was going into health IT (HIT) infrastructure. This was mostly for my own curiosity, an attempt to answer the question of whether we were spending a proportionate amount of our HC funds on IT as compared to other nations, or whether we had outrageously disparate statistics. My intention was not to do a cost analysis, but just to see what was out there, reported by various agencies. I assumed that a lot of the numbers would be public information, especially those pertaining to government investments in HC.

The entire process was a bit of a failure. This first table is straightforward enough; it shows that healthcare expenditure has been increasing steadily over the last decade.1 Since 2009, the percent of GDP spent on healthcare has stabilized around 17.3%. However, this is expected to increase this year and further in 2015 as more terms of the ACA go into effect. Some projections predict the per capita spending to increase to upwards of $13,000 by 2020.2

Year   US population   HC expenditure      % of GDP   Per capita
       (millions)      (trillions $USD)               ($USD)
2020                   ~4.4                ~19.2      ~13,000
2017                   ~3.7                ~18.4      ~11,000
2013   316             2.9                 17.3       9,177
2012   313             2.8                 17.2       8,915
2011   311             2.7                 17.3       8,658
2010   309             2.6                 17.4       8,411
2009   306             2.5                 17.4       8,170
2008   304             2.4                 16.4       7,936
2007   301             2.3                 15.9       7,649
2006   298             2.2                 15.6       7,264
2005   296             2.0                 15.5       6,889
2004   293             1.9                 15.5       6,508

But this was not the more interesting question for me; that was how much we spend on HIT out of all these trillions. These numbers are much harder to find. The US does not provide socialist healthcare, so HIT spending can only be tracked through hospital budgets and third-party provider revenues. I was not about to get into this, but I looked around and found a few reports generated by governmental entities and private consulting firms. Since there is no established way of calculating these numbers, a lot of them, especially the projections, vary greatly between sources. The $19.3 billion and $16.6 billion for 2017 and 2012, for example, come from a Deltek report generated in 2012, “Health Care and Social Services Market, 2012-2017.”3 An RNCOS report generated a year earlier states that HIT spending in 2011 could amount to $40 billion and was growing at the rate of 24% a year.4 The HIT spending for 2010 and 2011 in the table below also comes from the RNCOS report.

The numbers are so disparate that it is hard to make the case for one over the other. But it does seem to confirm that HC expenditure is rising and at least some of that is an increase in HIT investment, due in part to the ACA and HITECH Act.

Year   HC expenditure      HIT spending       HIT per capita   Percent of HC
       (trillions $USD)    (billions $USD)    ($USD)           spending on HIT
2017   ~3.7                19.3 (40.0+)       58.8 (120.0+)    0.6 (1.2+)
2012   2.8                 16.6               53.0             0.6
2011   2.7                 8.2                26.4             0.3
2010   2.6                 6.8                22.0             0.3
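The two derived columns are simple ratios, so they are easy to check. Below is a small sketch of that arithmetic for the three years where all the inputs appear in this post. It assumes the population figures from the first table are in millions; the `hit_metrics` helper and the `rows` layout are my own naming, not anything from the reports.

```python
# year: (US population in millions, HC expenditure in trillions USD, HIT spending in billions USD)
rows = {
    2012: (313, 2.8, 16.6),
    2011: (311, 2.7, 8.2),
    2010: (309, 2.6, 6.8),
}


def hit_metrics(pop_millions, hc_trillions, hit_billions):
    """Return (HIT dollars per capita, HIT as a percent of HC spending), rounded to 1 decimal."""
    per_capita = hit_billions * 1e9 / (pop_millions * 1e6)
    percent_of_hc = 100 * hit_billions * 1e9 / (hc_trillions * 1e12)
    return round(per_capita, 1), round(percent_of_hc, 1)


for year, inputs in rows.items():
    print(year, hit_metrics(*inputs))
# 2012 (53.0, 0.6)
# 2011 (26.4, 0.3)
# 2010 (22.0, 0.3)
```

The 2017 row is left out because the first table has no population projection for that year.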

So how much do people around the world spend on healthcare and HIT in particular? Health Affairs published an article detailing health care spending (focused on HIT) in OECD (Organization for Economic Cooperation and Development) countries.5 I pulled out the numbers for the US, Canada, Germany, and the UK, countries with rather comparable standards of living. The numbers are for 2003.

                                            US       Canada   Germany   UK
Total HIT investment (billions $USD)        0.1      1.0      1.8       11.5
Per capita HIT investment ($USD)            0.43     31.90    21.20     192.80
Per capita healthcare expenditure ($USD)    5,635    3,003    2,996     2,231
% expended on HIT                           0.008    1.0      0.7       8.6

When you look at this, it’s completely deceptive. That $0.1 billion (or $125 million) the US spends on HIT looks trivial compared to the other three nations! But this circles back to the fact that we are not providing socialist healthcare, so much of the true spending on HIT is occurring privately, on budgets that are less available for the public’s prying eyes. Those studies by Deltek and RNCOS probably hit closer to the mark, and show percentages of healthcare budgets that are more comparable to those of the three other countries above. We are still spending a lot more than them overall per capita, but a similar percentage of that spending seems to be going towards HIT in recent years.

Does this promise betterment? That’s not the question I’m trying to answer. I suppose I can say now that at least we’re spending the right amount of resources, but even that’s a bit of a stretch. Nonetheless, there is definitely a burst of activity waiting on the horizon come the next few years and outcomes should be interesting.

1 Centers for Medicare and Medicaid Services: “National Health Expenditure Tables” (2012)

2 Centers for Medicare and Medicaid Services: “National Health Expenditure Projections 2012-2022” (2012)

3 Deltek: “Health Care and Social Services Market, 2012-2017” (2012)

4 RNCOS Report: “US Healthcare IT Market Analysis” (2011)

5 GF Anderson et al. “Health Care Spending and Use of Information Technology in OECD Countries.” Health Affairs. Vol. 25, No.3, pp. 819-831, 2006.

Readings Week of 9/28

D Spivak, R Kent: “Ologs: A Categorical Framework for Knowledge Representation.” PLoS One, Vol 7, Iss 1, 2012.

This paper describes the olog (ontology log), a model for knowledge representation based on category theory. The olog offers a more “reusable, transferable, and comparable” data container which could potentially be shared more easily across platforms. The strict logical structure of an olog implies that the difficulty of its use lies primarily in its creation, but that once written, it offers a very precise representation of information and can be easily expanded or referenced.

A Tversky, D Kahneman: “Judgment Under Uncertainty: Heuristics and Biases.” Science, Vol 185, No 4157, 1974.

There is no place to hide from our biases. Even being aware that biases exist is often not enough to combat them. Availability bias, for example, says that people tend to assess the likelihood of an event by how easily instances of similar events come to mind. This means that events happening in recent history and those of particular intensity influence our beliefs more than the unbiased spectrum of our full experiences.

In my case, my recent experience after moving to Seattle illustrates this. Two or three days after my arrival, my car window was smashed. A few days later during orientation, I received a quick 15-minute presentation from a UW campus police officer instructing us to be ever vigilant if we don’t want to be the victim of theft or violence. Now, I am instilled with the belief that Seattle, or at least the U-district, is a dangerous place full of people out to get me. I just moved here from Baltimore. It’s ridiculous that I now find myself more cautious here than I have ever been walking late night down Charles St. Even though rationally I can tell myself that my views have been skewed towards the conservative because of these two incidents, there is little I can do in the short term to actually change the way I feel and my behaviors.

How can I fight these same biases while doing science? Knowledge of their existence doesn’t seem to be enough. There must be constant vigilance during the scientific process to ensure that both the process and results are a reflection of the true state of the world.

Much of the rest of this week was spent reading and reflecting about medical errors and how to reduce them. Even though I am less intrigued by clinical informatics, I do think this country in particular will have a difficult time standardizing our electronic health record (EHR) systems compared to European and other developed nations. Most of these other nations have invested in government-funded single-payer healthcare, where the state can more easily control the structure of new technology. Here in the US, many versions of EHRs exist, operated by many different companies, many of which are very restrictive about sharing the ways in which their systems operate. How can patients be mobile in such a system? It seems the individual would have to take on a much greater administrative role in managing their own healthcare.

Readings Week of 9/21

Autumn quarter 2014 commenced on 9/24. Highlights of my short week:

JC Letelier et al. “Organizational invariance and metabolic closure: analysis in terms of (M,R) systems.” Journal of Theoretical Biology, Vol 238, Iss 4, 2006: p. 949-61.

A review of Robert Rosen’s somewhat controversial work on metabolic closure, the question of how metabolism bootstraps itself. In living organisms, the enzymes responsible for most physiological activity must be generated and regenerated (following enzyme degradation) by the organism itself. “How must a system be organized if it is to continue in operation indefinitely?” Especially when even a small part of it can be as complicated as this.

N Benson, M Whipple, I Kalet. “A Markov Model Approach to Predicting Regional Tumor Spread in the Lymphatic System of the Head and Neck.” AMIA Symposium Proceedings, 2006.

Connecting back to the treatment planning work I was doing at Xcision last year, this paper describes a way to estimate the spread of subclinical disease into lymphatic chains in patients with head and neck cancer. It derives lymphatic anatomy from the Foundational Model of Anatomy (FMA) and models the probability of metastasis into particular regions based on the primary tumor site and its T-stage. This was primarily a proof of concept and no optimization training was done on clinical data.