What is the difference between a ROC curve and a precision-recall curve?

November 2, 2017 no comments Posted in Análise de Dados

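Remember, a ROC curve shows the relation between sensitivity (recall) and specificity; it is usually drawn as the true positive rate against the false positive rate (1 − specificity). Sensitivity is just another name for recall, but specificity is not precision. A precision-recall curve plots precision against recall instead.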
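To make the distinction concrete, here is a minimal sketch in plain Python that computes all three quantities from the same confusion counts. The function names (`confusion_counts`, `rates`) and the toy labels and scores are made up for illustration and do not come from the linked Quora answer.

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, TN, FN for binary labels at a given score threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def rates(tp, fp, tn, fn):
    """Derive the quantities behind ROC and precision-recall curves."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # sensitivity, TPR
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0    # positive predictive value
    return {
        "recall": recall,
        "specificity": specificity,
        "fpr": 1.0 - specificity,  # x-axis of a ROC curve
        "precision": precision,    # y-axis of a precision-recall curve
    }


# Hypothetical imbalanced toy data: 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.8, 0.7, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1]

result = rates(*confusion_counts(y_true, scores, threshold=0.5))
print(result)
```

On this toy data, specificity is 0.75 while precision is only about 0.33: the two false positives barely dent specificity because negatives are plentiful, but they outnumber the single true positive. This is one reason the two curves can tell different stories on imbalanced data. Sweeping `threshold` over the score range and collecting the pairs (`fpr`, `recall`) or (`recall`, `precision`) would trace out the respective curves.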

Figure By kakau (Selbstgepinselt mit PowerPoint) [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0/)], via Wikimedia Commons

Read more at: What is the difference between a ROC curve and a precision-recall curve? When should I use each? – Quora

Not Even Scientists Can Easily Explain P-values | FiveThirtyEight

July 18, 2017 no comments Posted in Análise de Dados

What I learned by asking all these very smart people to explain p-values is that I was on a fool’s errand. Try to distill the p-value down to an intuitive concept and it loses all its nuances and complexity, said science journalist Regina Nuzzo, a statistics professor at Gallaudet University. “Then people get it wrong, and this is why statisticians are upset and scientists are confused.” You can get it right, or you can make it intuitive, but it’s all but impossible to do both.

Source: Not Even Scientists Can Easily Explain P-values | FiveThirtyEight

thesis – The difference between literature review and background sections of a dissertation – Academia Stack Exchange

July 17, 2017 no comments Posted in Análise de Dados

I was assuming these two were the same. What is the difference between them? How does one relate to the other? The context here is an MA (Master of Arts) level social science dissertation in the UK education system. Should the background talk about relevant research done? If so, why does the literature review exist? Or should the background choose only a few pieces of literature and discuss them thoroughly?

Source: thesis – The difference between literature review and background sections of a dissertation – Academia Stack Exchange

The Value of Empirical Evidence for Practitioners and Researchers

July 17, 2017 no comments Posted in Análise de Dados

The empirical software engineering research community has two general aims: (1) to understand how software is actually developed and maintained; and (2) to understand what improvements should be made to software development and maintenance, and how those improvements should be implemented. Empirical software engineering research, therefore, is about both contemplation and action. It is a discipline which attempts to understand phenomena whilst at the same time trying to change those very phenomena (in order to improve them). And it is a discipline that, by definition, promotes empirical evidence as the primary source of reliable knowledge for achieving these two general aims.

Austen Rainer

Source: The Value of Empirical Evidence for Practitioners and Researchers

Streaming data processing is a big deal in big data these days, and for good reasons.

July 17, 2017 no comments Posted in Análise de Dados

Streaming data processing is a big deal in big data these days, and for good reasons. Amongst them:

  • Businesses crave ever more timely data, and switching to streaming is a good way to achieve lower latency.
  • The massive, unbounded data sets that are increasingly common in modern business are more easily tamed using a system designed for such never-ending volumes of data.
  • Processing data as they arrive spreads workloads out more evenly over time, yielding more consistent and predictable consumption of resources.

Despite this business-driven surge of interest in streaming, the majority of streaming systems in existence remain relatively immature compared to their batch brethren, which has resulted in a lot of exciting, active development in the space recently.

Source: The world beyond batch: Streaming 101 – O’Reilly Media

Recommended Reading: “Reading and Understanding More Multivariate Statistics”

July 15, 2017 no comments Posted in Análise de Dados, Publicando e Lendo

I stumbled upon this book in the library of the Universidade do Minho in Portugal and I am enjoying it a lot.

A perfect combination of conceptual clarification, some rigor, and references for rounding out one's studies.

Source: Reading and Understanding More Multivariate Statistics: 9781557986986: Medicine & Health Science Books @ Amazon.com