On the Semantic Patterns of Passwords and their Security Impact


We present the first framework for segmentation, semantic classification, and semantic generalization of passwords, together with a model that captures the semantic essence of password samples. Researchers have only scratched the surface of patterns in password creation, and the semantics of passwords remain largely unexplored, leaving a gap in our understanding of their characteristics and, consequently, their security. In this paper, we begin to fill this gap by employing Natural Language Processing techniques to extract semantic patterns from passwords and put them to use. Our results demonstrate that the knowledge captured by our model can be used to crack more passwords than the state-of-the-art approach. In experiments limited to 3 billion guesses, our approach guesses approximately 67% more passwords from the LinkedIn leak and 32% more passwords from the MySpace leak.

Proceedings of the 2014 Network and Distributed System Security Symposium (NDSS)