Analysing raw data requires us to find and understand complex patterns in that data.
We all have a toolbox of techniques and methodologies that we use; the
more tools we have, the better we are at the job of analysis. Some of these tools,
such as data mining, are well known. This talk covers some of the less well-known techniques
that are still directly applicable to this kind of analytics.
Last year at SQLBits I gave a two-hour session on four
such topics:
- Monte Carlo simulations (MCS)
- Nyquist’s Theorem
- Benford’s Law
- Simpson’s paradox
I will not assume that you attended last year’s
talk, although if you did and enjoyed it, it is highly likely that you will
enjoy this one! This session will focus
on more of these invaluable techniques.
For example, we’ll talk about:
- Dark Data
- Probability calculations
- RFI
In each case I will try to give you an understanding, not of
the maths behind these techniques, but of how they work, why they work and
(most importantly) why it is to your advantage to know about them. I have chosen only techniques that
I have genuinely found invaluable in my commercial work.