Using Apache Spark to predict attack vectors among billions of users and trillions of events

The O’Reilly Data Show podcast: Fang Yu on data science in security, unsupervised learning, and Apache Spark.

In this episode of the O’Reilly Data Show, I spoke with Fang Yu, co-founder and CTO of DataVisor.

We discussed her days as a researcher at Microsoft, the application of data science and distributed computing to security, and hiring and training data scientists and engineers for the security domain.

DataVisor is a startup that uses data science and big data to detect fraud and malicious users across many different application domains in the U.S. and China. Founded by security researchers from Microsoft, the startup has developed large-scale unsupervised algorithms on top of Apache Spark to (as Yu notes in our chat) "predict attack vectors early among billions of users and trillions of events."

Several years ago, I found myself immersed in the security space, and at that time tools that employed machine learning and big data were still rare. More recently, with the rise of tools
