This is an analysis of patient data from watsi.org. The data set is linked as "Watsi data", and the Rmd file used to generate this document is available on GitHub.

watsi.org brings the Kickstarter model to the healthcare needs of individuals in developing and underdeveloped countries. Users can fund a patient in any denomination; once the funding target is reached, the funds are transferred through a medical partner in the field to treat the patient. 100% of the proceeds from users go directly to the patient’s medical needs.

We look at the publicly available patient data set and explore the following:

  1. Medical costs of the patients, and medical costs grouped by country.
  2. Number of days it took to fund patients. This is an important factor: the shorter it is, the better for watsi and for the patient’s health outcome. We also explore it by seasonality (month of the year).
  3. Number of patients funded by month/year (seasonality)
  4. Patients by gender
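As a rough illustration of item 2, the number of days to fund a patient can be taken as the difference between “Date.Funded” and “Date.Posted”, then grouped by the month of posting. The sketch below is in Python and assumes ISO-formatted dates (YYYY-MM-DD); the actual date format in the data set is not shown here, and the records and function names are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical (Date.Posted, Date.Funded) pairs; ISO format is an assumption.
records = [
    ("2014-01-05", "2014-01-12"),
    ("2014-01-20", "2014-01-23"),
    ("2014-03-10", "2014-03-11"),
]

def days_to_fund(posted, funded):
    """Whole days between posting a patient and reaching the funding target."""
    return (date.fromisoformat(funded) - date.fromisoformat(posted)).days

def median_days_by_month(pairs):
    """Median days-to-fund, keyed by the calendar month the patient was posted."""
    by_month = defaultdict(list)
    for posted, funded in pairs:
        by_month[date.fromisoformat(posted).month].append(days_to_fund(posted, funded))
    return {month: median(days) for month, days in sorted(by_month.items())}

seasonal = median_days_by_month(records)
print(seasonal)
```

The median is used here because a few slow-to-fund patients could dominate a mean; that choice is an assumption, not necessarily what the original analysis did.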

Load the data set. The following are the columns in the data set:

##  [1] "Patient.Name"         "Case.ID"              "Profile.Url"         
##  [4] "Country"              "Medical.Partner"      "Cost"                
##  [7] "Date.Posted"          "Date.Funded"          "Funds.Transferred.At"
## [10] "Transfer.Receipt"     "Gender"
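A minimal Python sketch of the loading step follows. It assumes the file is a CSV whose header matches the column names listed above; the two sample rows, the example URLs, and the `load_patients` helper are hypothetical stand-ins for the real export.

```python
import csv
import io

# Hypothetical two-row sample standing in for the real Watsi export.
SAMPLE_CSV = """Patient.Name,Case.ID,Profile.Url,Country,Medical.Partner,Cost,Date.Posted,Date.Funded,Funds.Transferred.At,Transfer.Receipt,Gender
Patient A,1,https://example.org/a,Kenya,Partner X,512,2014-01-05,2014-01-12,2014-02-01,receipt-a,F
Patient B,2,https://example.org/b,Cambodia,Partner Y,225,2014-03-10,2014-03-11,2014-04-01,receipt-b,M
"""

def load_patients(handle):
    """Read patient rows into a list of dicts, converting Cost to a float."""
    rows = list(csv.DictReader(handle))
    for row in rows:
        row["Cost"] = float(row["Cost"])
    return rows

patients = load_patients(io.StringIO(SAMPLE_CSV))
columns = list(patients[0].keys())
print(columns)
```

For a file on disk, the same helper could be called with `open(path, newline="")` in place of the `io.StringIO` object.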

As we see, some variables in the data set, such as “Case.ID”, might not be useful to us, while others, such as “Profile.Url”, lead to much more information. For example, from the profile we can extract the gender of the patient, or whether the patient is an adult. We will use this information later in the exploration.

We take a look at the cost distribution. The following are two histograms of cost, with bin widths of 500 and 50 respectively. With bin width 500, the 0-500 bin holds more “Cost” values than any other, i.e., the most common funding need is below USD 500.

With bin width 50, we see that costs most often fall between USD 200 and 250, with a surprising second peak toward the upper end of the distribution at USD 1500-1550.
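The effect of bin width can be sketched in Python by counting values per bin at both widths. The cost values below are made up for illustration, and the left-closed bin convention `[k*width, (k+1)*width)` is an assumption that may differ from the plotting tool used for the original histograms.

```python
from collections import Counter

def bin_counts(values, width):
    """Count values per bin; the bin labeled b covers [b, b + width)."""
    counts = Counter(int(v // width) * width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical costs in USD, not taken from the Watsi data.
costs = [70, 225, 230, 240, 512, 980, 1500, 1520, 3000]

coarse = bin_counts(costs, 500)  # one bar per USD 500
fine = bin_counts(costs, 50)     # one bar per USD 50
print(coarse)
print(fine)
```

The coarse binning only shows where most values sit, while the fine binning can reveal narrow peaks that a wide bin averages away.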

As we see below, the median is USD 512 and the mean is USD 640.5.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    70.0   225.0   512.0   640.5   980.0  3000.0
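A summary of this shape (minimum, quartiles, median, mean, maximum) can be sketched in Python with the standard `statistics` module. The five cost values are made up, and the `"inclusive"` quartile method is an assumption; other tools may interpolate quartiles differently.

```python
from statistics import mean, quantiles

def cost_summary(values):
    """Return min, quartiles, mean, and max for a list of costs."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": q2,
        "mean": mean(values),
        "q3": q3,
        "max": max(values),
    }

# Hypothetical costs in USD, not taken from the Watsi data.
costs = [70, 225, 512, 980, 3000]
summary = cost_summary(costs)
print(summary)
```

A mean that sits well above the median, as in the summary above, is what a long right tail of expensive cases produces.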

Now we take a look at the patients grouped by country. As the figure below shows, Cambodia has the highest number of funded patients at 1695, followed by Kenya and Tanzania at 1183 and 640 patients respectively.
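Counting patients per country amounts to a frequency count over the “Country” column. A minimal Python sketch, using a made-up list of country values rather than the real column:

```python
from collections import Counter

# Hypothetical values of the Country column, one entry per patient.
countries = ["Cambodia", "Kenya", "Cambodia", "Tanzania", "Kenya", "Cambodia"]

# most_common() returns (country, patient count) pairs, largest count first.
ranked = Counter(countries).most_common()
print(ranked)
```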

Now we take a look at the costs grouped by country. Cost is the total amount transferred at the end of a patient's successful funding. The total cost per country is the sum of all funds transferred, grouped by country.

The graph of total costs funded per country closely resembles the previous graph, but Cambodia no longer has the highest funds transferred. Kenya tops the beneficiary list, receiving USD 814,505, followed by Tanzania and Guatemala at USD 543,612 and USD 452,301 respectively.
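The grouping itself is a sum of “Cost” per “Country”. The Python sketch below uses made-up rows and a hypothetical helper name; it also shows how a country with more patients can still rank lower on total funds when its individual costs are small.

```python
from collections import defaultdict

def total_cost_by_country(rows):
    """Sum cost per country and return (country, total) pairs, largest first."""
    totals = defaultdict(float)
    for country, cost in rows:
        totals[country] += cost
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Hypothetical (Country, Cost) rows: three cheap cases versus two costly ones.
rows = [
    ("Cambodia", 200), ("Cambodia", 250), ("Cambodia", 225),
    ("Kenya", 700), ("Kenya", 650),
]
ranked_totals = total_cost_by_country(rows)
print(ranked_totals)
```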

The mean cost by country is the mean amount transferred to a patient in a given country at the end of a successful funding campaign.

A look at mean costs by country shows Nigeria as the country with the highest mean cost in this data set. Cambodia, by contrast, has the lowest mean cost and, as we saw earlier, the largest number of patients of any country in this data set.
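As a back-of-the-envelope check, the mean cost per patient is the country total divided by the country's patient count. The Python sketch below uses only the totals and counts quoted earlier for Kenya and Tanzania, and assumes both figures were computed over the same set of patients.

```python
# Totals (USD) and patient counts as quoted in the text for these two countries.
totals = {"Kenya": 814_505, "Tanzania": 543_612}
patient_counts = {"Kenya": 1_183, "Tanzania": 640}

# Mean cost per patient = total funds transferred / number of patients.
mean_cost = {country: totals[country] / patient_counts[country] for country in totals}

for country, value in mean_cost.items():
    print(f"{country}: USD {value:.2f} per patient")
```

Under that assumption, Tanzania's mean comes out higher than Kenya's even though Kenya's total is larger, which is why the mean and total rankings can differ.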