This is an analysis of patient data from watsi.org. The data set is available here: Watsi data. The Rmd file used to generate this document is on GitHub.

watsi.org brings the Kickstarter model to the healthcare needs of individuals in developing and underdeveloped countries. Users can fund a patient with any amount. Once the funding target is reached, the funds are transferred through a medical partner in the field to treat the patient. 100% of the proceeds from users go directly to the patient’s medical needs.

We look at the publicly available patient data set and explore the following:

  1. Medical costs of the patients, medical costs grouped by country.
  2. Number of days it took to fund patients. This is an important factor: the shorter the funding period, the better for Watsi and for the patient’s health outcome. We also explore it by seasonality (month of the year).
  3. Number of patients funded by month/year (seasonality)
  4. Patients by gender

Load the data set. The following are the columns in the data set:

##  [1] "Patient.Name"         "Case.ID"              "Profile.Url"         
##  [4] "Country"              "Medical.Partner"      "Cost"                
##  [7] "Date.Posted"          "Date.Funded"          "Funds.Transferred.At"
## [10] "Transfer.Receipt"     "Gender"
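
As a rough illustration of the loading step, here is a minimal Python sketch using only the standard library. It is not the R code behind this document. The two sample rows and the `load_patients` helper are hypothetical and only mimic the column layout listed above; the header of the real export may be spelled differently.

```python
import csv
import io

# Hypothetical two-row sample that only mimics the column layout shown above.
SAMPLE_CSV = (
    "Patient.Name,Case.ID,Profile.Url,Country,Medical.Partner,Cost,"
    "Date.Posted,Date.Funded,Funds.Transferred.At,Transfer.Receipt,Gender\n"
    "Patient A,1,https://example.org/a,Kenya,Partner X,500,"
    "2015-01-01,2015-01-03,2015-01-10,receipt-a,Female\n"
    "Patient B,2,https://example.org/b,Cambodia,Partner Y,225,"
    "2015-02-01,2015-02-01,2015-02-08,receipt-b,Male\n"
)


def load_patients(handle):
    """Read patient rows from an open CSV handle and convert Cost to a number."""
    rows = []
    for row in csv.DictReader(handle):
        row["Cost"] = float(row["Cost"])
        rows.append(row)
    return rows


patients = load_patients(io.StringIO(SAMPLE_CSV))
print(list(patients[0].keys()))
print(len(patients), "rows loaded")
```

With a real file, the same function could be called on `open(path, newline="")` instead of the in-memory sample.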

As we see, some variables in the data set, such as “Case.ID”, might not be useful to us, while others, such as “Profile.Url”, point to much more information. For example, from the profile we can extract the gender of the patient, or whether the patient is an adult. We will use this information later in the exploration.

We take a look at the cost distribution. Following are two histograms of cost, with bin widths of 500 and 50 respectively. With bin width 500, the 0-500 bin is the most populated, i.e. the most common funding need is below USD 500.

With bin width 50, we see that the most common costs are between USD 200 and 250, with a surprising peak near the end of the distribution at USD 1500-1550.
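
To make the effect of the bin width concrete, the sketch below counts values per bin for widths 500 and 50. The cost values are hypothetical, and the bins are assumed to be left-closed intervals starting at 0, which may not match how the plotting library aligned the bins in the original histograms.

```python
from collections import Counter


def bin_counts(values, width):
    """Count values per bin [k * width, (k + 1) * width), keyed by the bin's lower edge."""
    counts = Counter(int(v // width) * width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical costs in USD, for illustration only.
costs = [70, 225, 230, 480, 512, 980, 1500, 1520]

print(bin_counts(costs, 500))  # {0: 4, 500: 2, 1500: 2}
print(bin_counts(costs, 50))   # {50: 1, 200: 2, 450: 1, 500: 1, 950: 1, 1500: 2}
```

The wide bins show only the overall shape, while the narrow bins expose local peaks such as the cluster at 200-250 in this toy sample.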

As we see below, the median is USD 512 and the mean is USD 640.5.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    70.0   225.0   512.0   640.5   980.0  3000.0
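
A summary of this shape can be reproduced along the following lines. This is a sketch in Python, not the R call used for the output above. The `five_number_summary` helper is hypothetical, and `method="inclusive"` is chosen on the assumption that the quartiles should use the same linear interpolation as R's default quantile type.

```python
import statistics


def five_number_summary(values):
    """Return min, quartiles, mean and max, similar in shape to R's summary()."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "mean": statistics.fmean(data),
        "q3": q3,
        "max": data[-1],
    }


# Toy input, not the Watsi costs.
summary = five_number_summary([1, 2, 3, 4, 5])
print(summary)
```

A mean that sits well above the median, as in the cost summary, indicates a right-skewed distribution with a few expensive cases.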

Now we take a look at the patients grouped by country. As we see in the figure below, Cambodia has the highest number of patients funded at 1695, followed by Kenya and Tanzania at 1183 and 640 patients respectively.

Now we take a look at the costs grouped by country. Cost is the total funds transferred at the end of the successful funding of a patient. The total cost per country is the sum of all funds transferred to patients in that country.

The total costs funded per country closely resemble the previous graph, but Cambodia no longer has the highest funds transferred. Kenya tops the beneficiary list, receiving USD 814,505, followed by Tanzania and Guatemala at USD 543,612 and USD 452,301 respectively.

The mean cost by country is the mean amount transferred to a patient of that country at the end of a successful funding campaign.

A look at mean costs by country shows Nigeria as the country with the highest mean cost in this data set. Cambodia has the lowest mean cost and, as we saw earlier, also the largest number of patients funded of any country in this data set.
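
The three per-country views discussed above (patient count, total cost, mean cost) come from the same grouping step. The sketch below shows that step on hypothetical `(country, cost)` pairs; the `cost_by_country` helper and the numbers are made up for illustration and are not taken from the Watsi data.

```python
from collections import defaultdict


def cost_by_country(rows):
    """Group (country, cost) pairs and report patient count, total and mean cost."""
    grouped = defaultdict(list)
    for country, cost in rows:
        grouped[country].append(cost)
    return {
        country: {
            "patients": len(costs),
            "total": sum(costs),
            "mean": sum(costs) / len(costs),
        }
        for country, costs in grouped.items()
    }


# Hypothetical rows: many cheap cases in one country, one expensive case in another.
rows = [("Cambodia", 200), ("Cambodia", 300), ("Kenya", 900)]
stats = cost_by_country(rows)
print(stats)
```

Even in this toy sample, the country with the most patients has the lower total and the lower mean, which is the same pattern that separates the patient-count ranking from the total-cost ranking.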

Now we explore the number of days it took to fund patients. Using “Date.Posted” and “Date.Funded”, we can calculate the number of days it took to fund each patient. We call this column “numDaysFunded” and look at how it changes with various factors. We also add two more columns to the data set, “yearFunded” and “monthFunded”, which let us explore the number of days it took to fund a patient by seasonality (year, month).
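
The derived columns can be sketched as follows. The column names “numDaysFunded”, “yearFunded” and “monthFunded” come from the text; the `funding_fields` helper is hypothetical, and the ISO `YYYY-MM-DD` date format is an assumption, since the raw format of “Date.Posted” and “Date.Funded” is not shown here.

```python
from datetime import date


def funding_fields(date_posted, date_funded):
    """Derive the three columns described above from two ISO date strings."""
    posted = date.fromisoformat(date_posted)
    funded = date.fromisoformat(date_funded)
    return {
        "numDaysFunded": (funded - posted).days,
        "yearFunded": funded.year,
        "monthFunded": funded.month,
    }


# Hypothetical dates that cross a year boundary.
fields = funding_fields("2015-12-30", "2016-01-02")
print(fields)  # {'numDaysFunded': 3, 'yearFunded': 2016, 'monthFunded': 1}
```

A patient posted and funded on the same day gets a value of 0, which is consistent with a minimum of 0 days in a summary of this column.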

Following is a summary of the number of days it took to successfully fund patients. On average, it took just 3.271 days to fund a patient.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   2.000   3.271   5.000  77.000

Following is a consolidated view of all patients funded during Watsi’s existence, grouped by month. December seems to be the most philanthropic month, which is not too surprising considering that December is the biggest fundraising month for charities and other non-profit organizations.

A bit surprising is September, which is the second most philanthropic month in this data set.
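
The monthly view amounts to counting funded patients per calendar month across all years. A minimal sketch, assuming a list of “monthFunded” values is already available (the values below are hypothetical):

```python
import calendar
from collections import Counter

# Hypothetical monthFunded values (1 = January ... 12 = December).
months_funded = [12, 12, 9, 2, 12, 9]

per_month = Counter(months_funded)

# Print the months from most to least patients funded.
for month, patients in per_month.most_common():
    print(calendar.month_name[month], patients)
```

Counting by `(yearFunded, monthFunded)` pairs instead of the month alone would give the year-by-year breakdown.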

The same chart as above, but grouped by year.

Considering that the year has just started, Watsi seems to be off to a good start, especially given the uptick in the number of patients funded in February 2016 compared to the previous year.

The upward trend of number of patients funded year-over-year is clear in the following chart.

Finally, we look at gender data. We collected the gender data by counting pronouns on the patients’ profile pages. The exercise is straightforward; the script is on GitHub.
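
The pronoun-counting idea can be sketched as below. This is not the script referenced above; the pronoun lists, the `guess_gender` helper and the tie-breaking rule (“Unknown” when the counts are equal) are all assumptions made for illustration.

```python
import re

# Assumed pronoun lists; the original script may use different ones.
FEMALE_PRONOUNS = {"she", "her", "hers", "herself"}
MALE_PRONOUNS = {"he", "him", "his", "himself"}


def guess_gender(profile_text):
    """Guess gender from whichever group of pronouns appears more often in the text."""
    words = re.findall(r"[a-z]+", profile_text.lower())
    female = sum(word in FEMALE_PRONOUNS for word in words)
    male = sum(word in MALE_PRONOUNS for word in words)
    if female > male:
        return "Female"
    if male > female:
        return "Male"
    return "Unknown"


# Hypothetical profile snippets.
print(guess_gender("She needs surgery so that her leg can heal."))  # Female
print(guess_gender("He hopes his treatment works."))                # Male
print(guess_gender("The patient needs care."))                      # Unknown
```

Matching whole words avoids counting the “he” inside words such as “the”. Profiles that mention relatives of the other gender can still be misclassified, so a few results are worth checking by hand.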

Following are the mean numbers of days it took to fund a male and a female patient. The mean for male patients is slightly higher than the mean for female patients; the difference between the two means is 0.047 days.

One of the bigger surprises in Watsi’s data is how quickly each patient is funded: 50% of the patients get funded within 2 days! Another surprise is that, after December, September is the most philanthropic month.

Watsi is on a growth trajectory helping people and improving their quality of life.