Wednesday, April 13, 2016

Machine learning vs. Statistics



Basic concepts


Machine learning: an algorithm that can learn from data without relying on rule-based programming.
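To make "learning from data without rule-based programming" concrete, here is a minimal sketch of a one-feature decision stump. Nobody writes the cutoff by hand; it is chosen from labeled examples. The function names (fit_stump, predict) and the toy data are hypothetical and serve only as an illustration; this is not a method taken from the cited sources.

```python
# Illustrative sketch: learn a one-feature threshold rule (a "decision stump")
# from labeled examples instead of hard-coding the rule.

def fit_stump(xs, ys):
    """Return the threshold t with the fewest errors for 'predict 1 if x > t'."""
    points = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    best_t, best_errors = None, len(xs) + 1
    for t in candidates:
        # Count the examples the rule 'x > t' would misclassify.
        errors = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


def predict(t, x):
    """Apply the learned rule to a new observation."""
    return 1 if x > t else 0


# Hypothetical toy data: hours studied -> passed (1) or failed (0).
hours = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
passed = [0, 0, 0, 1, 1, 1]

threshold = fit_stump(hours, passed)
print(threshold)                # 4.5
print(predict(threshold, 5.0))  # 1
```

A rule-based program would hard-code something like "pass if hours > 4"; here the cutoff comes out of the data and would shift if the examples changed.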


Statistical modeling: the formalization of relationships between variables in the form of mathematical equations.
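As a small illustration of a relationship "in the form of a mathematical equation", the sketch below fits the simple linear model y = b0 + b1 * x by ordinary least squares using the closed-form formulas. The function name fit_line and the toy data are assumptions made for this example only.

```python
# Illustrative sketch: simple linear regression y = b0 + b1 * x,
# estimated with the closed-form ordinary least squares formulas.

def fit_line(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of cross-deviations of x and y / sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy data lying exactly on the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

b0, b1 = fit_line(xs, ys)
print(f"y = {b0:.1f} + {b1:.1f} * x")  # y = 1.0 + 2.0 * x
```

The result is an explicit equation whose coefficients can be read and interpreted directly, which is the emphasis of the statistical modeling definition above.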


Differences
Table 1. A selection of methods from each field

Statistics                       | Machine learning
Hypothesis testing               | Support vector machines
Experimental design              | Neural networks
ANOVA                            | Decision trees
Linear and logistic regression   | Rule induction
Generalized linear models        | Clustering methods
Principal component analysis     | Association rules
Factor analysis                  | Genetic algorithms
Discriminant analysis, etc.      | Feature selection, etc.



Expert opinions

Brendan O'Connor: I know that I’m interested in quantitative information science, including statistics and data analysis. Machine learning has many strengths, but it is definitely an odd way to go about analysis. But there’s a good case that statistics, as traditionally defined, is only going to have a smaller role in the future. “Data mining” sounds more relevant, but does it even exist as a coherent subject?

Simon Blomberg: From R's fortunes package: To paraphrase provocatively, 'machine learning is statistics minus any checking of models and assumptions'.


Andrew Gelman: In that case, maybe we should get rid of checking of models and assumptions more often. Then maybe we'd be able to solve some of the problems that the machine learning people can solve but we can't!



References
[1] Srivastava, Tavish. (2015). Difference between Machine Learning & Statistical Modeling. Analytics Vidhya. Available at: http://www.analyticsvidhya.com/blog/2015/07/difference-machine-learning-statistical-modeling/

[2] O'Connor, Brendan. (2009). Statistics vs. Machine Learning, fight! AI and Social Science. Available at: http://brenocon.com/blog/2008/12/statistics-vs-machine-learning-fight/