I often use a search engine to explore and review my clients' websites and check if anything is untoward. The other week I came across a report on one client's site that was obviously intended for internal consumption only. I immediately rang my client to warn them. They explained that the report had been posted to the public website instead of the internal intranet by mistake and they'd removed it as soon as the error was discovered. Obviously, they were quite alarmed that I could still access it more than a week later.
This is a great example of the all-consuming nature of Web searches, Google searches in particular. Google takes a snapshot of each page its search crawlers examine and caches it as a backup. It's also the version used to judge whether a page is a good match for a query. My client's report was only on the Web for about three hours, and yet a copy of it ended up stored in Google's cache and was still available for anyone to read. Because sensitive information that gets crawled can remain in the public domain, data classification and content change processes are vital to prevent this type of data leakage.
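If you control the Web server, one preventive measure is to tell compliant crawlers not to index or cache sensitive paths at all, using the X-Robots-Tag response header with the noindex and noarchive directives. The following is a minimal sketch, using a small Flask application and a hypothetical /internal path prefix I've chosen for illustration. Bear in mind it only prevents future caching; content a search engine has already stored must be purged through the engine's own removal tools.

```python
from flask import Flask, Response, request

app = Flask(__name__)

@app.after_request
def block_search_caching(response: Response) -> Response:
    # Ask compliant crawlers not to index or cache anything under /internal.
    # Preventive only: this does not purge copies a search engine already holds.
    if request.path.startswith("/internal"):  # hypothetical internal prefix
        response.headers["X-Robots-Tag"] = "noindex, noarchive"
    return response

@app.route("/internal/report")
def internal_report():
    # Stand-in for a document that should never reach a public index.
    return "Quarterly report (internal distribution only)"
```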
Unfortunately, private or sensitive business information makes its way onto the public Internet all too often. In this tip, we'll discuss reasons why this happens, and some strategies to help enterprises keep private or sensitive data off the Web.
Problems that can cause website information leaks

The incident noted above gave me the opportunity to address with my client some specific information security problems that led to the report being posted on its website. The first problem was that the organization didn't properly classify its data and documents. Implementing a system of data classification and clearly labelling documents with that classification would make such an incident far less likely.
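A classification scheme only helps if something enforces it before content goes live. Below is a minimal sketch of a pre-publish check; the label names, the first-line "CLASSIFICATION:" convention and the staging-directory layout are all hypothetical conventions I've invented for illustration. The check refuses to release any document not explicitly marked PUBLIC.

```python
from pathlib import Path

# Hypothetical labels; an organization would define its own scheme.
KNOWN_LABELS = {"PUBLIC", "INTERNAL", "CONFIDENTIAL", "RESTRICTED"}
ALLOWED_FOR_WEB = {"PUBLIC"}

def classification_of(doc: Path) -> str:
    """Read the label from the document's first line, e.g. 'CLASSIFICATION: INTERNAL'."""
    lines = doc.read_text(encoding="utf-8", errors="ignore").splitlines()
    first_line = lines[0] if lines else ""
    if first_line.upper().startswith("CLASSIFICATION:"):
        label = first_line.split(":", 1)[1].strip().upper()
        if label in KNOWN_LABELS:
            return label
    # Unlabelled documents are treated as not publishable.
    return "UNLABELLED"

def publishable(staging_dir: Path) -> list[Path]:
    """Return only the documents cleared for the public website."""
    approved = []
    for doc in sorted(staging_dir.glob("*.txt")):
        label = classification_of(doc)
        if label in ALLOWED_FOR_WEB:
            approved.append(doc)
        else:
            print(f"BLOCKED: {doc.name} is {label}, not cleared for the public site")
    return approved
```

Wired into whatever step copies files to the public server, a check like this makes a mislabelled or unlabelled document fail loudly before a crawler ever sees it.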