With the recent coverage of the exposure of personal information within Google+ and the announcement of its closure, one might be tempted to discuss how the existence of vulnerabilities and subsequent failure to disclose them can lead to the closure of your service… except that many suspected, and Google has even said, that the real reason was poor user adoption of the social network...
…so instead, we focus on what is interesting about the story: Google is unable to determine how many users were impacted because it retains API logs for only two weeks. This raises the questions of what the best practices for log management are, and whether they change in different contexts. To address the implications of the answers, we’ll also discuss the business case for allocating storage to retain those logs, and how long they should be stored.
It’s common knowledge that certain records (bank records and other tax- or income-related documents) are required to be retained for seven years. Is this how long you should retain logs from your servers? From your security solutions? The answer is that it depends on what you need the records for. What are those use cases? You can use log data to…
1) Prove it:
To an auditor or outside body, in which case the logs don’t have to be kept in online storage, and not all logs need to be stored. Which logs you keep, and for how long, will be dictated mostly by the governing body you need to comply with. Operations stakeholders will want to consider implementing a ‘cushion’ of retention that exceeds the mandated minimum, to achieve a level of redundancy in keeping with your organizational risk mitigation strategy. At its core, this is a compliance-driven use case.
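As a rough illustration of the ‘cushion’ idea, here is a minimal Python sketch. The log types, the mandated minimums, and the 25% cushion are all hypothetical placeholders; the real numbers come from whichever governing body applies to you and from your own risk appetite.

```python
from datetime import date, timedelta

# Hypothetical mandated minimums in days. Replace these with the
# figures your own regulator or auditor actually requires.
MANDATED_MIN_DAYS = {
    "financial_records": 7 * 365,
    "api_access": 365,
}


def retention_days(log_type: str, cushion_percent: int = 25) -> int:
    """Mandated minimum plus a percentage cushion, rounded up to a whole day."""
    minimum = MANDATED_MIN_DAYS[log_type]
    # Integer arithmetic avoids floating-point rounding surprises.
    return (minimum * (100 + cushion_percent) + 99) // 100


def may_delete(log_type: str, log_date: date, today: date,
               cushion_percent: int = 25) -> bool:
    """True only once a log is older than the cushioned retention period."""
    keep_for = timedelta(days=retention_days(log_type, cushion_percent))
    return today - log_date > keep_for
```

With these placeholder values, a log type with a 365-day minimum would be kept for 457 days before it becomes eligible for deletion.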
2) Investigate it:
Investigating a threat a year later is probably not the best use case to dictate your retention policy. To determine how long a log is typically useful for such investigations, look at how far back you have needed logs in the past, and try to anticipate future use cases. The most recent study by the Ponemon Institute indicates that the Mean Time To Identification (MTTI) of a breach is 197 days. Keep in mind that this is just the average: the spread could be far broader, and some breaches take much longer to identify. The figure also doesn’t account for the amount of time that threats can lie dormant within your environment, so if you are going to conduct forensics on a long-lurking breach, a longer log retention period will serve you well. Of course, you’ll need a SIEM or other log management solution to be able to use those logs effectively.
If you subscribe to an MDR service like ours, where logs are prioritized, examined, and acted upon immediately, retaining them for later investigation becomes less of a concern. There are still other reasons you may need to store logs, as discussed both above and below, and we adhere to our clients’ specific requirements when such reasons apply.
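One way to turn "look at how you have used logs in the past" into a number is sketched below in Python. It takes a list of how many days back each past investigation had to reach and returns a high percentile of those values. The function name, the 95th percentile, and the use of the 197-day MTTI figure as a floor are illustrative assumptions, not recommendations.

```python
def suggested_retention_days(past_lookbacks_days, percentile=95, floor_days=197):
    """Nearest-rank percentile of past lookback depths, never below floor_days.

    past_lookbacks_days: for each past investigation, how many days old
    the oldest log you needed was (your own historical data).
    floor_days: assumed lower bound; 197 is the MTTI average cited above.
    """
    if not past_lookbacks_days:
        return floor_days
    ordered = sorted(past_lookbacks_days)
    # Nearest-rank method, computed with integers and rounded up.
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    return max(ordered[rank - 1], floor_days)
```

Because an average hides the long tail, a percentile of your own history is usually a more cautious guide than the mean, and the floor keeps a short history from producing an unrealistically small window.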
3) Trend it:
Making sweeping assertions about performance, changes, or impact requires metadata, so if you don’t need a log for other reasons, at least keep a log that counts the logs you’re discarding. This use case is most common for organizations with a web app, SaaS, or another software product, though if you are collecting and analyzing sensitive data from multiple endpoints (healthcare comes to mind), it may also be relevant to your business.
Ultimately, if these reasons to retain logs dictate an extended retention period for your business, you should invest in the storage, backup, and security solutions required to ensure you can use them when and how you need to. Consulting your senior Ops leaders about how much redundancy should be in place can help you reduce operational risk and comply with governing bodies. Speaking of which, perhaps enterprises that control vast amounts of personal, proprietary, or otherwise sensitive data should be regulated by those governing bodies in such a way that the “prove it” reason above also covers specific “investigate it” and “trend it” scenarios? We aren’t lobbyists; we are just trying to help you secure your environment and select the right log retention strategy for your business. Given the impact Google’s log retention policy may have had, we hope questions like these start a discussion.
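Keeping "a log which counts the logs you’re discarding" can be as simple as the Python sketch below. It assumes each log record is a dictionary with `timestamp`, `source`, and `level` fields; those field names are hypothetical and should be mapped to whatever your logs actually contain.

```python
from collections import Counter
from datetime import datetime


def summarize_discarded(records):
    """Count records about to be discarded, keyed by (day, source, level)."""
    counts = Counter()
    for rec in records:
        day = rec["timestamp"].date().isoformat()
        counts[(day, rec["source"], rec["level"])] += 1
    return counts


# Example: summarize a batch just before deleting it.
batch = [
    {"timestamp": datetime(2018, 10, 1, 9, 0), "source": "api", "level": "INFO"},
    {"timestamp": datetime(2018, 10, 1, 9, 5), "source": "api", "level": "INFO"},
    {"timestamp": datetime(2018, 10, 1, 9, 7), "source": "api", "level": "ERROR"},
]
summary = summarize_discarded(batch)
```

The summary is a tiny fraction of the size of the raw logs, yet it still supports trend questions such as how error volume per source changed over time.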
So, how long do you retain your logs?
What are your business, regulatory, and analysis reasons for doing so?
What outcomes have you been able to drive as a result of your policies?