Article · May 1, 2012 · Closed access

The Science of Guessing: Analyzing an Anonymized Corpus of 70 Million Passwords

University of Cambridge

Indexed in Crossref

Abstract

We report on the largest corpus of user-chosen passwords ever studied, consisting of anonymized password histograms representing almost 70 million Yahoo! users, mitigating privacy concerns while enabling analysis of dozens of subpopulations based on demographic factors and site usage characteristics. This large data set motivates a thorough statistical treatment of estimating guessing difficulty by sampling from a secret distribution. In place of previously used metrics such as Shannon entropy and guessing entropy, which cannot be estimated with any realistically sized sample, we develop partial guessing metrics including a new variant of guesswork parameterized by an attacker's desired success rate. Our new…
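To make the idea of a guessing metric parameterized by an attacker's desired success rate more concrete, here is a minimal sketch in Python. It is not taken from the abstract, which is truncated before it defines its metrics. It assumes one common formulation: the attacker guesses passwords in decreasing order of frequency and stops once a fraction `alpha` of accounts is cracked, and the metric is the expected number of guesses spent per account. The function names `alpha_guesswork` and `beta_success_rate` and the toy histogram are hypothetical.

```python
def beta_success_rate(counts, beta):
    """Fraction of accounts cracked by trying only the `beta` most common passwords."""
    total = sum(counts)
    top = sorted(counts, reverse=True)[:beta]
    return sum(top) / total


def alpha_guesswork(counts, alpha):
    """Expected guesses per account for an attacker who stops after cracking
    a fraction `alpha` of accounts (assumed formulation, see lead-in).

    `counts` is a password histogram: one occurrence count per distinct password.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    total = sum(counts)
    cracked = 0      # number of accounts cracked so far
    expected = 0.0   # running sum of p_i * i over guessed passwords
    rank = 0
    for rank, count in enumerate(sorted(counts, reverse=True), start=1):
        cracked += count
        expected += (count / total) * rank
        if cracked >= alpha * total:
            break
    # Accounts that were never cracked still cost `rank` guesses each.
    return expected + (1 - cracked / total) * rank


# Toy histogram: four distinct passwords shared by ten accounts.
histogram = [5, 3, 1, 1]
half = alpha_guesswork(histogram, 0.5)
full = alpha_guesswork(histogram, 1.0)
top2 = beta_success_rate(histogram, 2)
```

With `alpha = 1.0` the sketch reduces to the ordinary expected number of guesses over the whole distribution; smaller values of `alpha` model an attacker who gives up on the hardest accounts, which is why such a metric can be estimated from the frequent end of a histogram.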

Citation impact

Total citations: 685
FWCI: 127.11
Percentile: 100%
References: 64
Authors

1

Topics & keywords

Keywords
  • Password
  • Computer science
  • Dictionary attack
  • Entropy (arrow of time)
  • Computer security
  • Password cracking
  • Set (abstract data type)
  • Password strength