Enron Demo

This demo uses the Pingar API to analyze a public dataset of thousands of emails circulated by Enron employees in the year before the company's collapse. We use the API to track which topics were trending and to visualize how the key talking points within Enron changed over this period. For more information, read this blog post or click here to learn more.
[Graph: Number of Emails Sent, plotted over Time]
Keywords for month
Click a point on the graph to get keywords.
Where did the dataset come from?
Enron was a large US energy corporation that went bankrupt in late 2001. The Enron dataset was made public by the Federal Energy Regulatory Commission. We used a pre-cleaned Enron MySQL dump released by the University of Southern California.
How is the data scaled and what is a TrendScore?
The graphs are scaled by the average frequency of a keyword over all months. This average is represented as 1.0 and is marked by a green line. The graph falls below the green line in months when a keyword is used less than usual and rises above it when the keyword is used more. The TrendScore is the ratio of the current month's frequency to that average. For example, a TrendScore of 2.5 means that a keyword was used 2.5 times as frequently as its average. If two keywords are compared, the first keyword's average is treated as 1.0 and the TrendScores of both keywords are expressed relative to it. Keyword frequencies are normalized by the total number of keywords extracted from all emails in that month.
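The scaling described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the demo's actual code: the function names, the month labels, and all counts are invented for the example, and the average is assumed to be a plain mean of the normalized monthly frequencies.

```python
from typing import Dict, Optional


def normalized_frequencies(keyword_counts: Dict[str, int],
                           month_totals: Dict[str, int]) -> Dict[str, float]:
    """Divide a keyword's monthly count by the total number of keywords
    extracted from all emails in that month."""
    return {
        month: (keyword_counts.get(month, 0) / total if total else 0.0)
        for month, total in month_totals.items()
    }


def trend_scores(freqs: Dict[str, float],
                 baseline: Optional[float] = None) -> Dict[str, float]:
    """Express each month's frequency relative to a baseline average.

    With no baseline given, the keyword's own mean frequency is used,
    so a score of 1.0 corresponds to the green line."""
    if baseline is None:
        baseline = sum(freqs.values()) / len(freqs)
    if baseline == 0:
        return {month: 0.0 for month in freqs}
    return {month: freq / baseline for month, freq in freqs.items()}


# Invented example data (not taken from the Enron dataset).
month_totals = {"2001-08": 1000, "2001-09": 1000, "2001-10": 2000}
first_counts = {"2001-08": 10, "2001-09": 20, "2001-10": 120}
second_counts = {"2001-08": 30, "2001-09": 30, "2001-10": 60}

first_freqs = normalized_frequencies(first_counts, month_totals)
second_freqs = normalized_frequencies(second_counts, month_totals)

# Single keyword: scaled by its own average.
first_scores = trend_scores(first_freqs)

# Comparison: both keywords are scaled by the FIRST keyword's average.
first_average = sum(first_freqs.values()) / len(first_freqs)
second_scores = trend_scores(second_freqs, baseline=first_average)
```

Passing the first keyword's average as the baseline for the second keyword is what puts both curves on the same scale, so they can be compared on one graph.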
How can I interact with this demo?
The demo lets you view the trending keywords for a particular month, analyze how the TrendScore for a given keyword has changed over time, and compare the TrendScore graphs for two keywords. If a keyword you search for is not found, it was not identified as a keyword anywhere in the email dataset.
How is the Enron data processed using the Pingar API?
For each email, we extract keywords using the GetEntities method of the Pingar API. When searching for a given keyword in the emails, we return query-based summaries using the DocumentPreview method.
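As a rough illustration of the per-email step, the sketch below groups extracted keywords by the month in which each email was sent. The real extraction is done by the Pingar API's GetEntities method, whose request format is not described here, so the code takes the extractor as a parameter and uses a hypothetical toy function in its place. The function names and the sample emails are assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import date
from typing import Callable, Dict, Iterable, List, Tuple


def monthly_keyword_counts(
    emails: Iterable[Tuple[date, str]],
    extract_keywords: Callable[[str], List[str]],
) -> Dict[str, Counter]:
    """Count extracted keywords per month, keyed by 'YYYY-MM'."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for sent_on, body in emails:
        month = sent_on.strftime("%Y-%m")
        counts[month].update(extract_keywords(body))
    return dict(counts)


def toy_extractor(text: str) -> List[str]:
    """Hypothetical stand-in for a keyword extractor. This is NOT the
    Pingar API; it simply keeps lowercased words longer than 6 letters."""
    words = (word.strip(".,!?").lower() for word in text.split())
    return [word for word in words if len(word) > 6]


# Invented sample emails (not taken from the Enron dataset).
emails = [
    (date(2001, 10, 1), "Bankruptcy rumors"),
    (date(2001, 10, 5), "bankruptcy filing"),
    (date(2001, 11, 2), "Dividend payment"),
]

counts = monthly_keyword_counts(emails, toy_extractor)
# Total keywords per month, the denominator used for normalization.
month_totals = {month: sum(c.values()) for month, c in counts.items()}
```

Keeping the extractor as a parameter means the grouping logic stays the same whichever extraction service supplies the keywords.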
How are the graphs generated?
The graphs were generated using the Flot plotting library for jQuery.