The emergence of storage buckets from providers such as Google Cloud Storage, Amazon S3, and various S3-compatible alternatives has been something of a revolution. Designed to be scalable, "inherently secure", and cost-effective, these platforms have become foundational to businesses that rely on the cloud for data management. Yet, as we will see, these storage solutions, while robust, are not immune to misconfiguration, particularly of Access Control Lists (ACLs). This kickoff post for our OSINT series looks at the benefits and the potential pitfalls of storage buckets.
A Closer Look at Storage Buckets
Imagine storage buckets as vast, flexible containers in the cloud, architected to house data. Their allure lies in their durability, security (when correctly configured), and their adaptability to various business needs.
Navigating the Major Players
- Google Cloud Storage: Stands out for its scalability and performance capabilities.
- Amazon S3 (AWS): Noteworthy for its robust security measures and reliability.
- Other S3-Compatible Buckets: Includes platforms like IBM Cloud Object Storage and Oracle Cloud Storage, which implement the S3 API for compatibility and interoperability.
The Upside of Embracing Storage Buckets
1. Scalability: These platforms allow for seamless scaling of storage needs, accommodating business growth without the hassle.
2. Conditional Security: While these platforms are built with security in mind, the real test lies in how they are configured and managed.
3. Economic Efficiency: The pay-as-you-go model eliminates the need for upfront investment in physical storage infrastructure.
4. Universal Accessibility: Providing anytime, anywhere access to data, these platforms support the dynamics of remote work and global operations.
The Hidden Perils: Misconfigured ACLs
Despite their advantages, storage buckets can become a source of cybersecurity vulnerabilities if ACLs are not meticulously managed.
Deciphering ACLs
At their core, ACLs dictate who gets access to what data. A slip-up here can inadvertently expose sensitive data to the public, and it is a risk that is commonly overlooked.
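To make this concrete, here is a minimal sketch of how an ACL could be checked for public grants. It assumes the ACL is a dictionary shaped like the response of boto3's get_bucket_acl() for S3; the function name find_public_grants and the example ACL are hypothetical and only for illustration. The two group URIs are the ones S3 uses for "everyone" and "any authenticated AWS account".

```python
# URIs that S3 uses for its predefined broad-audience groups.
PUBLIC_GROUP_URIS = {
    "http://acs.amazonaws.com/groups/global/AllUsers": "anyone on the internet",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers": "any authenticated AWS account",
}


def find_public_grants(acl):
    """Return (audience, permission) pairs for grants that reach a public group."""
    findings = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        audience = PUBLIC_GROUP_URIS.get(grantee.get("URI"))
        if audience is not None:
            findings.append((audience, grant.get("Permission")))
    return findings


# Hypothetical ACL, shaped like the response of boto3's get_bucket_acl().
example_acl = {
    "Owner": {"ID": "owner-id-placeholder"},
    "Grants": [
        {
            "Grantee": {"Type": "CanonicalUser", "ID": "owner-id-placeholder"},
            "Permission": "FULL_CONTROL",
        },
        {
            "Grantee": {
                "Type": "Group",
                "URI": "http://acs.amazonaws.com/groups/global/AllUsers",
            },
            "Permission": "READ",
        },
    ],
}

for audience, permission in find_public_grants(example_acl):
    print(f"{permission} granted to {audience}")
```

In this example the owner's FULL_CONTROL grant is ignored and the READ grant to the AllUsers group is flagged, which is exactly the kind of slip-up that makes a bucket's contents listable by anyone.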
The Ripple Effects of Misconfiguration
- Data Breaches: Such incidents not only lead to financial loss but can severely damage a company's reputation.
- Compliance Violations: Missteps here can result in hefty fines under data protection laws.
- Intellectual Property Risks: Accidental exposure of proprietary data could give competitors an unfair advantage.
Identifying Misconfigured Buckets
While the landscape is dotted with numerous tools designed for this purpose, we will focus on a select few that stand out for their efficacy and ease of use.
The Power of urlscan.io
One of the first tools in our arsenal is urlscan.io, a versatile platform for scanning and analyzing URLs (and one that we will be covering extensively in this series). What makes urlscan.io particularly powerful is its support for regex searches.
For instance, to identify subdomains of s3.amazonaws.com, we can use the regex search:
page.domain:/.*s3\.amazonaws\.com.*/
This query helps pinpoint buckets that might be publicly accessible or misconfigured. To refine the search further, we can add criteria for specific types of files, such as those with "password" in the URL. A full query might look like this:
page.domain:/.*s3\.amazonaws\.com.*/ AND page.url:/.*password\.*.*/
This approach lets us unearth buckets that are not only misconfigured but also contain potentially sensitive information. You can play with this query until you hit the specific kind of result you are looking for.
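If you prefer to script this rather than use the web interface, the same query can be sent to urlscan.io's search API. The following is a rough sketch using only the Python standard library; the helper names build_search_url and search_urlscan are my own, and it assumes the public search endpoint at /api/v1/search/ with the optional API-Key header. Check the current API documentation for rate limits and result-size limits before relying on it.

```python
import json
import urllib.parse
import urllib.request

SEARCH_ENDPOINT = "https://urlscan.io/api/v1/search/"

# The full query from the text, kept as a raw string so the backslashes survive.
QUERY = r"page.domain:/.*s3\.amazonaws\.com.*/ AND page.url:/.*password\.*.*/"


def build_search_url(query, size=100):
    """URL-encode the query and return the full search URL."""
    params = urllib.parse.urlencode({"q": query, "size": size})
    return f"{SEARCH_ENDPOINT}?{params}"


def search_urlscan(query, api_key=None, size=100):
    """Run the search and return the page URLs of the matching scans."""
    request = urllib.request.Request(build_search_url(query, size))
    if api_key:
        # Optional: an API key raises the quota compared to anonymous use.
        request.add_header("API-Key", api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return [
        hit["page"]["url"]
        for hit in payload.get("results", [])
        if "url" in hit.get("page", {})
    ]


# Only builds and prints the request URL; no network traffic happens here.
print(build_search_url(QUERY, size=10))
```

Calling search_urlscan(QUERY) performs the actual request and returns a list of URLs that you can then review by hand.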

Leveraging grayhatwarfare.com - The GOAT
Another invaluable resource in our toolkit is grayhatwarfare.com. This platform stands out for its ability to search buckets across various cloud providers, not just AWS, making it a versatile tool for OSINT investigations. grayhatwarfare.com supports regex but also allows searches based on specific filenames or even company names, providing a targeted approach to identifying vulnerable buckets (and making things easier if you are in a rush).
What sets grayhatwarfare.com apart is its API and the option for free lookups directly through the website. This lets researchers automate their searches or run manual queries with ease, depending on their needs. The site also offers a Top Keywords page where you can see the most searched keywords.
For this occasion, we will use the simple search engine to search for "accounts xlsx", which narrows the results to files matching that keyword with the "xlsx" filename extension. (Countless times I've seen MSPs and other "IT" vendors use xlsx documents to store their clients' passwords in plaintext.)
We have a hit - last modified 08-12-2023

A look into the file:

This is obviously publicly available information but I have redacted part of it.
An Honorable Mention: S3Scanner
S3Scanner is a tool that has gained popularity among security researchers for identifying misconfigured S3 buckets, and it is adept at scanning for open ones. While we won't be using S3Scanner in our demonstrations here, its utility in the security community makes it a resource worth mentioning.
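To give an idea of what "scanning for open buckets" means in practice, here is a simplified sketch of the underlying check. This is not S3Scanner's actual implementation; it only illustrates the common technique of sending an unauthenticated request to a bucket's endpoint and reading the HTTP status. The function names are hypothetical, and the status interpretation assumes Amazon S3's usual behavior (200 for a publicly listable bucket, 403 for a bucket that exists but denies access, 404 for a bucket that does not exist).

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map an HTTP status from an anonymous bucket request to a rough verdict."""
    if status == 200:
        return "public-listing"
    if status == 403:
        return "exists-access-denied"
    if status == 404:
        return "not-found"
    return "unknown"


def probe_bucket(bucket_name, timeout=10):
    """Send one unauthenticated GET to the bucket root and classify the result."""
    url = f"https://{bucket_name}.s3.amazonaws.com/"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for 4xx/5xx responses; the code is still useful.
        return classify_status(error.code)


# No network traffic here: just show how the status codes are interpreted.
for code in (200, 403, 404):
    print(code, classify_status(code))
```

A real scanner does considerably more (other providers, permission checks beyond listing, concurrency), so treat this as an explanation of the idea rather than a replacement for the tool.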
The Implications of Misconfigured Buckets
I believe it's crucial to stress that the information revealed through these tools is publicly accessible, which underscores the importance of proper bucket configuration and security hygiene. In the wrong hands, such data can be exploited to inflict significant damage on a company, from data breaches to compliance violations and beyond.
By leveraging tools like urlscan.io and grayhatwarfare.com, security professionals and researchers can gain valuable insights into the state of storage buckets across the web. This knowledge not only aids in securing one's assets but also contributes to the broader understanding of common misconfigurations and how to avoid them.
The Road Ahead for Storage Bucket Security
As dependency on cloud storage grows, so does the sophistication of potential threats. Businesses must take a proactive stance on security, continually updating and refining their strategies. AI and machine learning promise new avenues for automating certain aspects of security and potentially minimizing human error, but they can also be used to exploit human error. We will cover this in our series as well.
-See you in the next chapter of our OSINT series-