Fault Prediction in the Crowd

Context and Role: Dissertation/Thesis for Big Data master's at Warwick. Sole contributor.

Background: The nice people at Cisco let me have some network device event data. There is a GitHub page, a PDF of the thesis/dissertation, and a Prezi.

Abstract: An investigation was conducted into a 40 GB event dataset of 326 million records. This dataset contained anonymized event information representing performance, availability, and security issues of 172,000 network devices from approximately 150 Cisco Systems customers. It was hypothesized that network device event data gathered from one customer environment could be used to predict events in another customer environment. After analysis of the dataset, a binary classification model was developed to predict when a process might request too many compute resources on a device. The model was developed on one set of customer data and tested on another, unseen set of customer data. On the unseen test data, the Matthews correlation coefficient for the model was 0.66, the F1 score was 0.72, and the false negative rate was 27%. This was a substantial improvement over a model with no skill.
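For readers unfamiliar with the three metrics quoted in the abstract, here is a minimal sketch of how each is computed from a binary confusion matrix. The function name `binary_metrics` and the counts in the usage lines are hypothetical and chosen only for illustration; they are not taken from the thesis or its dataset.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Return MCC, F1, and false negative rate from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. Undefined ratios (zero denominators) fall back to 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # False negative rate: share of actual positives the model missed.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0

    # Matthews correlation coefficient: ranges from -1 to 1, where 0 is chance.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"mcc": mcc, "f1": f1, "fnr": fnr}


# Illustrative counts only (hypothetical, not thesis results).
perfect = binary_metrics(tp=50, fp=0, fn=0, tn=50)
no_skill = binary_metrics(tp=25, fp=25, fn=25, tn=25)
```

With these made-up counts, the balanced coin-flip case (`no_skill`) gives an MCC of 0, which is the sort of no-skill baseline the abstract compares against.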


More selected projects

COVID and Air Quality
Warwick

Politics, Elections and Geography
Warwick

Australia Wildfires
Warwick

One Internet with Many Perspectives
Warwick

Writing Samples
Writing

Competitive Analysis
VMware

Information Architecture
Cisco

vCenter 6.5 Install
VMware

Dashboards
General

Timeline
General

Application Network Manager
Cisco

Solaris Management Console
Sun

Next Gen VPN Client Prototype
Cisco

UX Process
General