
Wednesday, 27 November 2013

Small Businesses cancelling Health Insurance plans for Employees


The US government has promised that HealthCare.gov will be ready and running without glitches by the end of this week. However, CBS News reports that employees of small businesses are losing their insurance coverage.

The government estimated that millions of workers would be dropped from their work insurance under the Affordable Care Act, and it is already happening.

Nancy Clark, who owns a small business in New Hampshire and was featured last year in a White House video blog, said that things are not going well for her plan. Her insurer will increase her rates by 39% starting next year, which will cost her an additional $30,000.

Because of this, she decided to terminate the insurance she has offered her 8 employees and turn to Obamacare, but there has been one problem after another.

"We're experiencing technical difficulties. That's the nature of the beast," said Clark.

Betsy Atkinson, who owns a business in Virginia Beach, is also cancelling company insurance because her plan doesn't meet the new Obamacare requirements and she can't afford to offer employees one that does.

"They're going to have to go find their own insurance," she said. "I'm sorry."

Monday, 25 November 2013

Not only verbs but also beliefs can be conjugated

Following on from last week, where I presented a simple example of a Bayesian network with discrete probabilities to predict the number of claims for a motor insurance customer, I will look at continuous probability distributions today. Here I follow example 16.17 in Loss Models: From Data to Decisions [1].

Suppose there is a class of risks that incurs random losses following an exponential distribution (density \(f(x) = \Theta {e}^{- \Theta x}\)) with mean \(1/\Theta\). Further, I believe that \(\Theta\) varies according to a gamma distribution (density \(f(x)= \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha \,-\, 1} e^{- \beta x } \)) with shape \(\alpha=4\) and rate \(\beta=1000\).

In the same way as I had good and bad drivers in my previous post, here I have clients with different characteristics, reflected by the gamma distribution. I shall call the gamma distribution with the above parameters my prior parameter distribution and the exponential distribution the prior predictive distribution.

The textbook tells me that the unconditional mixed distribution of an exponential distribution with parameter \(\Theta\), whereby \(\Theta\) has a gamma distribution, is a Pareto II distribution (density \(f(x) = \frac{\alpha \beta^\alpha}{(x+\beta)^{\alpha+1}}\)) with parameters \(\alpha,\, \beta\). Its k-th moment is given in the general case by
\[
E[X^k] = \frac{\beta^k\Gamma(k+1)\Gamma(\alpha - k)}{\Gamma(\alpha)},\; -1 < k < \alpha.
\]
Thus, I can calculate the prior expected loss (\(k=1\)) as \(\frac{\beta}{\alpha-1} = 333.33\).

Now suppose I have three independent observations, namely losses of $100, $950 and $450 over the last 3 years. The mean loss is $500, which is higher than the $333.33 of my model.
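
The prior expected loss and the observed mean quoted above can be checked numerically. The following is a minimal sketch in plain Python that evaluates the moment formula for the Pareto II distribution; the function name `pareto2_moment` is my own and not part of any package.

```python
from math import gamma


def pareto2_moment(k, alpha, beta):
    """k-th raw moment of a Pareto II distribution, valid for -1 < k < alpha."""
    if not -1 < k < alpha:
        raise ValueError("the k-th moment only exists for -1 < k < alpha")
    return beta**k * gamma(k + 1) * gamma(alpha - k) / gamma(alpha)


alpha, beta = 4, 1000            # shape and rate of the gamma prior
losses = [100, 950, 450]         # the three observed losses

prior_mean = pareto2_moment(1, alpha, beta)   # equals beta / (alpha - 1)
observed_mean = sum(losses) / len(losses)

print(round(prior_mean, 2), observed_mean)
```

For k = 1 the formula collapses to beta / (alpha - 1), which is the 333.33 used in the text.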

Question: How should I update my belief about the client's risk profile to predict the expected loss cost for year 4 given those 3 observations?

Visually I can regard this scenario as a graph, with evidence set for years 1 to 3 that I want to propagate through to year 4.
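
One way to answer the question is sketched below. It assumes the standard conjugate result for this pair of distributions: with a gamma prior (shape \(\alpha\), rate \(\beta\)) on the rate of an exponential likelihood, the posterior after n observations is again a gamma distribution with shape \(\alpha + n\) and rate \(\beta + \sum x_i\). The variable names are mine and the code is plain Python rather than a Bayesian network package.

```python
losses = [100, 950, 450]          # observed losses for years 1 to 3
alpha_prior, beta_prior = 4, 1000  # gamma prior: shape and rate

# Conjugate update (assumption stated above): add the number of
# observations to the shape and the total loss to the rate.
alpha_post = alpha_prior + len(losses)
beta_post = beta_prior + sum(losses)

# The predictive distribution is Pareto II, with mean beta / (alpha - 1).
prior_mean = beta_prior / (alpha_prior - 1)
posterior_mean = beta_post / (alpha_post - 1)

# The same number written as a credibility-weighted average of the
# observed mean and the prior mean.
n = len(losses)
z = n / (n + alpha_prior - 1)
observed_mean = sum(losses) / n
credibility_mean = z * observed_mean + (1 - z) * prior_mean

print(alpha_post, beta_post, round(posterior_mean, 2), z)
```

Under these assumptions the expected loss for year 4 is about 416.67, which lies between the prior mean of 333.33 and the observed mean of 500, with a weight of 0.5 on each.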


Wednesday, 20 November 2013

Chao: Health Insurance Marketplace is Still Incomplete



Henry Chao, the deputy chief information officer at the Centers for Medicare and Medicaid Services, said that the federal health insurance marketplace is not yet complete. He said that they are still building the "back office systems."

"We still have to build the financial management aspects of the system, which includes our accounting system and payment system and reconciliation system," he said. "This part is still being developed and will be tested."

He admitted Tuesday that up to 40 percent of IT systems supporting the exchange still need to be built.

The Obama administration completed the online system that allows consumers to apply for insurance, compare health plans and enroll. However, many parts of the system were still being repaired and were not performing as well as officials had hoped.

"It's not that it's not working," Chao told lawmakers at an Energy and Commerce Oversight and Investigations subcommittee hearing. "It's still being developed and tested."

Financial management tools are not yet done, he said, particularly the process that will deliver payments to insurers.

Monday, 18 November 2013

Predicting claims with a Bayesian network

Here is a little Bayesian network to predict the claims for two different types of drivers over the next year; see also example 16.15 in [1].

Let's assume there are good and bad drivers. The probabilities that a good driver will have 0, 1 or 2 claims in any given year are set to 70%, 20% and 10%, while for bad drivers the probabilities are 50%, 30% and 20% respectively.

Further, I assume that 75% of all drivers are good drivers and only 25% would be classified as bad drivers. Therefore the average number of claims per policyholder across the whole customer base would be:

0.75*(0*0.7 + 1*0.2 + 2*0.1) + 0.25*(0*0.5 + 1*0.3 + 2*0.2) = 0.475

Now a customer of two years asks for his renewal. Suppose he had no claims in the first year and one claim last year. How many claims should I predict for next year? Or in other words, how much credibility should I give him?
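
The portfolio average quoted above can be reproduced with a few lines of plain Python. This is only a sketch of the arithmetic; the dictionary and function names are my own.

```python
claims = [0, 1, 2]  # possible number of claims in a year

# Claim probabilities per driver type and the share of each type.
p_claims = {"good": [0.7, 0.2, 0.1], "bad": [0.5, 0.3, 0.2]}
p_driver = {"good": 0.75, "bad": 0.25}


def expected_claims(probs):
    """Expected number of claims for one driver type."""
    return sum(n * p for n, p in zip(claims, probs))


# Weight each type's expected claims by its share of the customer base.
portfolio_mean = sum(p_driver[d] * expected_claims(p_claims[d]) for d in p_driver)

print(round(portfolio_mean, 3))  # 0.475
```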


To answer the above question I present the data here as a Bayesian Network using the gRain package [2]. I start with the contingency probability tables for the driver type and the conditional probabilities for 0, 1 and 2 claims in year 1 and 2. As I assume independence between the years I set the same probabilities. I can now review my model as a mosaic plot (above) and as a graph (below) as well.




Next, I set the client's evidence (0 claims in year one and 1 claim in year two) and propagate it back through my network to estimate the probabilities that the customer is either a good (73.68%) or a bad (26.32%) driver. Knowing that a good driver has on average 0.4 claims a year and a bad driver 0.7 claims, I predict the number of claims for my customer with the given claims history as 0.4789.
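
For such a small network the propagation step can also be written out by hand with Bayes' rule. The sketch below uses plain Python instead of gRain, so it is not the code behind the figures above, but under the same assumptions (independent years given the driver type) it should give the same numbers. All names are my own.

```python
p_driver = {"good": 0.75, "bad": 0.25}
p_claims = {"good": [0.7, 0.2, 0.1], "bad": [0.5, 0.3, 0.2]}
evidence = [0, 1]  # claims observed in year 1 and year 2

# Joint probability of driver type and evidence: prior times likelihood,
# where the years are independent given the driver type.
joint = {}
for driver, prior in p_driver.items():
    likelihood = 1.0
    for n in evidence:
        likelihood *= p_claims[driver][n]
    joint[driver] = prior * likelihood

# Normalise to get the posterior probability of each driver type.
total = sum(joint.values())
posterior = {driver: value / total for driver, value in joint.items()}

# Expected claims per type, then weighted by the posterior.
mean_claims = {d: sum(n * p for n, p in enumerate(p_claims[d])) for d in p_claims}
predicted = sum(posterior[d] * mean_claims[d] for d in posterior)

print({d: round(p, 4) for d, p in posterior.items()}, round(predicted, 4))
```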


Alternatively I could have added a third node for year 3 and queried the network for the probabilities of 0, 1 or 2 claims given that the customer had zero claims in year 1 and one claim in year 2. The sum product of the number of claims and probabilities gives me again an expected claims number of 0.4789.
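
The query on a third-year node corresponds to mixing the two claim distributions with the posterior driver-type weights. A minimal sketch in plain Python (again not gRain, and with names of my own choosing) is:

```python
p_driver = {"good": 0.75, "bad": 0.25}
p_claims = {"good": [0.7, 0.2, 0.1], "bad": [0.5, 0.3, 0.2]}

# Unnormalised posterior weights after 0 claims in year 1 and 1 claim in year 2.
weights = {d: p_driver[d] * p_claims[d][0] * p_claims[d][1] for d in p_driver}
norm = sum(weights.values())

# Predictive probabilities of 0, 1 or 2 claims in year 3.
predictive = [
    sum(weights[d] / norm * p_claims[d][n] for d in p_driver)
    for n in range(3)
]

# Sum product of claim counts and their probabilities.
expected = sum(n * p for n, p in enumerate(predictive))

print([round(p, 4) for p in predictive], round(expected, 4))
```

The predictive probabilities sum to one, and their sum product with the claim counts gives the same expected claims number as weighting the per-type averages.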




References

[1] Klugman, S. A., Panjer, H. H. & Willmot, G. E. (2004), Loss Models: From Data to Decisions, Wiley Series in Probability and Statistics.

[2] Søren Højsgaard (2012). Graphical Independence Networks with the gRain Package for R. Journal of Statistical Software, 46(10), 1-26. URL http://www.jstatsoft.org/v46/i10/

Session Info

R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base

other attached packages:
[1] Rgraphviz_2.6.0 gRain_1.2-2 gRbase_1.6-12 graph_1.40.0

loaded via a namespace (and not attached):
[1] BiocGenerics_0.8.0 igraph_0.6.6 lattice_0.20-24 Matrix_1.1-0
[5] parallel_3.0.2 RBGL_1.38.0 stats4_3.0.2 tools_3.0.2

Monday, 11 November 2013

googleVis 0.4.7 with RStudio integration on CRAN

In my previous post, I presented a preview version of googleVis that provided an integration with RStudio's Viewer pane (introduced with version 0.98.441).

Over 80% of respondents in my little survey favoured the new default output mechanism of googleVis within RStudio. Hence, I uploaded googleVis 0.4.7 to CRAN over the weekend.

However, there were also some thoughtful comments suggesting that the RStudio Viewer pane is not always the best option. Indeed, Flash charts and gvisMerge output will still be displayed in your default browser. Also, if you work on larger charts or with a smaller screen, the browser might still be a better option than the Viewer pane. Of course, you can launch the browser from the Viewer pane as well.

Hence, googleVis gained a new option 'googleVis.viewer' that controls the default output of the googleVis plot method. On package load it is set to getOption("viewer") and if you use RStudio, then its viewer pane will be used for displaying non-Flash and un-merged charts. You can set options("googleVis.viewer" = NULL) and the googleVis plot function will open all output in the default browser again. Thanks to J.J. from RStudio for the tip.

The screenshot below shows a geo chart within the RStudio Viewer pane of the track of the devastating typhoon Haiyan, which hit Southeast Asia last week.



Session Info

RStudio v0.98.456 and R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base

other attached packages:
[1] googleVis_0.4.7 XML_3.95-0.2

loaded via a namespace (and not attached):
[1] RJSONIO_1.0-3 tools_3.0.2

Goodbye Trans Fat?

The U.S. Food and Drug Administration said last Thursday it will phase out trans fats, a move that would eliminate these artery-clogging fats from our diet. This will cause manufacturers and restaurants some problems and may cause prices to shoot up, but it will be beneficial to our health.

Dr. Margaret A. Hamburg, commissioner of the FDA, said that this move will prevent 20,000 heart attacks and 7,000 deaths each year. But critics say that this is just a political move, since the real threat to our health is not trans fat but the increased use of pesticides on our food and genetically engineered foods. Trans fats are identified and labeled on our food, but for genetically engineered and pesticide-laden foods no information is given to consumers.

Chris Shanahan of the market research firm Frost & Sullivan said that if the FDA bans trans fat, "in the long term, prices of certain foods will increase and different foods will be discontinued."


-----------------------

Check out my friend's blog at http://maverikmaven.blogspot.com/ which covers a variety of topics including consumer goods.

Thursday, 7 November 2013

Will You Buy Twitter IPO shares?

The New York Stock Exchange welcomes Twitter on Thursday, November 7, 2013. It will trade under the symbol TWTR, and its shares are priced at $26 each to raise around $2.1 billion.

Twitter is part of everyday life for most of us and it gets a lot of media attention, which is why there is great demand for its shares. However, be very cautious. Remember Facebook? A lot of people were burned by it. The best thing to do is wait. There's no guarantee the stock will trade higher, and if you look at several recent social media IPOs, the stocks actually dropped, as Facebook's did.

Monday, 4 November 2013

Display googleVis charts within RStudio

The preview version 0.98.441 of RStudio introduced a new viewer pane to render local web content and with that it allows me to display googleVis charts within RStudio rather than in a separate browser window.


I think this is a rather nice feature and hence I have updated the plot method in googleVis to use the RStudio viewer pane as the default output. If you use another editor, or if the plot uses one of the Flash-based charts, then the browser is still the default display.

The behaviour can also be controlled via the option viewer. Set options("viewer"=NULL) and googleVis will plot all output in the browser again.

Of course shiny apps can also run in the viewer pane. Here is the example of the renderGvis help page of googleVis. For more information about the new viewer pane see the online RStudio documentation.


For the time being you can get the next version 0.4.6 of googleVis from our project site only. Please get in touch if you find any issues or bugs with this version, or add them to our issues list.

Is this a step in the right direction? Please use the voting buttons below.

Session Info

R Under development (unstable) (2013-10-25 r64109)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] googleVis_0.4.6

loaded via a namespace (and not attached):
[1] RJSONIO_1.0-3 tools_3.1.0
