Dark data is all the information that companies collect, process and store but never reuse. It comes from sources that generate data automatically. Most of it is incomplete, so companies forget about it while continuing to store it.
It can also include scanned paper documents, photos, videos, or information that was neglected because it did not seem important at first.
What is dark data?
Dark data is all the unused, unknown and untapped data of a company, generated by everyday user interactions. It is also generated by countless machines and data systems, and it includes unstructured data derived from social networks.
This data may be considered too old to have any value, partial, repetitive, or stored in a format that the available tools cannot read. In many cases, its existence is not even known. Nevertheless, dark data can also be an important untapped resource for a company.
In addition, this data cannot be properly protected. It is not classified in the information system and can be the target of cyberattacks. Besides, the law requires companies to delete some data. This principle of limited data retention is laid down by the GDPR (RGPD in French) and the French Data Protection Act (Loi Informatique et Libertés).
Overall, dark data represents about 55% of the data stored worldwide. For a single company, this can amount to millions or even billions of files. Companies around the world are working to reduce their digital carbon emissions, but they often fail to take those of dark data into account.
What is the carbon impact of dark data?
Dark data has a very significant environmental footprint, which involves not only huge volumes of data but also high energy consumption.
Indeed, even unused data has to be stored. Dark data takes up space on servers, and the environmental impact of data centers is significant because they consume a lot of electricity and need to be cooled 24 hours a day, 7 days a week.
According to a study by Veritas, data that is stored and never used was responsible for 6.4 million tons of CO2 in 2020. This is equivalent to the carbon footprint of a car traveling 575,000 times around the Earth, every year!
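As a rough plausibility check of this comparison, the short Python sketch below derives the per-kilometre car emissions that the two figures imply. The Earth's equatorial circumference (about 40,075 km) and the reading of "tons" as metric tons are assumptions that do not come from the study itself.

```python
# Plausibility check of the car comparison (assumptions are marked).
CO2_TONNES = 6.4e6               # from the text: CO2 attributed to dark data in 2020
LAPS_AROUND_EARTH = 575_000      # from the text
EARTH_CIRCUMFERENCE_KM = 40_075  # assumption: equatorial circumference
GRAMS_PER_TONNE = 1_000_000      # assumption: "tons" means metric tons


def implied_car_emissions_g_per_km(co2_tonnes, laps):
    """Return the grams of CO2 per kilometre implied by the comparison."""
    distance_km = laps * EARTH_CIRCUMFERENCE_KM
    return co2_tonnes * GRAMS_PER_TONNE / distance_km


if __name__ == "__main__":
    factor = implied_car_emissions_g_per_km(CO2_TONNES, LAPS_AROUND_EARTH)
    print(f"Implied emission factor: {factor:.0f} g CO2 per km")
```

The result is a few hundred grams of CO2 per kilometre, the same order of magnitude as a passenger car, so the two figures are at least consistent with each other.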
How can the impact of dark data be reduced?
Within a company, dark data can be reduced or even eliminated. A good starting point is to train teams on the issue and to establish a data management policy.
To reduce the impact of dark data more effectively, you can also follow these tips from Veritas:
Identify all storage locations:
Discover and map data. These are the first steps toward understanding how data flows within a company. Gaining visibility into where data is stored, who has access to it and how long it is kept is an essential foundation for managing dark data.
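To give an idea of what a first mapping pass could look like on a single file server, here is a minimal Python sketch that lists files left untouched for a long time. The function names, the one-year threshold and the use of modification time as a stand-in for "last used" are all assumptions made for illustration; this is not a Veritas tool or method.

```python
import os
import time
from pathlib import Path

SECONDS_PER_DAY = 86_400


def find_stale_files(root, max_age_days=365, now=None):
    """Yield (path, size_bytes, age_days) for files not modified recently.

    Modification time is used as a proxy for "last used". This is an
    assumption: access times are often disabled or unreliable.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                stat = path.stat()
            except OSError:
                continue  # unreadable or vanished file: skip rather than fail
            if stat.st_mtime < cutoff:
                age_days = (now - stat.st_mtime) / SECONDS_PER_DAY
                yield path, stat.st_size, age_days


def summarize(root, max_age_days=365):
    """Return (file_count, total_bytes) of stale files under root."""
    count = total = 0
    for _path, size, _age in find_stale_files(root, max_age_days):
        count += 1
        total += size
    return count, total


if __name__ == "__main__":
    files, size = summarize(".", max_age_days=365)
    print(f"{files} files untouched for a year, {size / 1e6:.1f} MB in total")
```

A real inventory would also have to cover databases, cloud buckets, mailboxes and backups, which a file system walk like this one cannot see.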
Bring dark data to light:
Adopt a proactive approach to data management. It gives companies visibility into their data. This approach needs to integrate archiving, backup and security solutions in order to prevent data loss and ensure that data is stored in line with existing policies.
Minimize and control data:
Minimize data and limit its use. This allows companies to reduce the quantity of stored data and to make sure that what is kept serves the purpose for which it was originally collected. This approach also reduces the use of data storage and, with it, the digital carbon footprint.
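One way to picture this principle is a retention check that ties every record to the purpose it was collected for. The Python sketch below is purely illustrative: the `Record` class, the purposes and the retention periods are hypothetical, and real values must come from the company's own policy and legal advice.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods per collection purpose, in days.
RETENTION_DAYS = {
    "billing": 3650,
    "support_ticket": 730,
    "marketing": 365,
}


@dataclass
class Record:
    record_id: str
    purpose: str
    collected_on: date


def is_expired(record, today):
    """Return True if the record has outlived the retention period of its purpose.

    Records with an unknown purpose are treated as expired, on the
    assumption that data without a documented purpose should be reviewed.
    """
    days = RETENTION_DAYS.get(record.purpose)
    if days is None:
        return True
    return today > record.collected_on + timedelta(days=days)


def records_to_review(records, today=None):
    """Return the ids of records that should be deleted or reviewed."""
    today = today or date.today()
    return [r.record_id for r in records if is_expired(r, today)]


if __name__ == "__main__":
    sample = [
        Record("b1", "billing", date(2020, 1, 1)),
        Record("m1", "marketing", date(2022, 6, 1)),
        Record("u1", "unknown", date(2023, 12, 1)),
    ]
    print(records_to_review(sample, today=date(2024, 1, 1)))
```

Treating undocumented purposes as expired is a deliberately strict design choice: it forces a review of exactly the kind of data that tends to become dark.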
Ensure continued compliance with regulatory standards:
Compliance rules such as those of the GDPR require organizations to report certain types of data breach to the supervisory authority and, in some cases, to the individuals concerned. Companies need to assess their ability to detect possible breaches and put prompt reporting procedures in place to ensure compliance.
Conclusion:
According to a Veritas study, dark data represents 52% of the data stored by companies and organizations. The volume of data stored worldwide reached 33 zettabytes in 2018 and is expected to reach 175 zettabytes in 2025, of which 91 zettabytes would be dark data.
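These figures can be cross-checked with a few lines of Python. The sketch below only reuses the numbers quoted above. The average annual growth rate is derived from them under the assumption of steady compound growth between 2018 and 2025, which the study itself does not state.

```python
# Figures quoted in the text, in zettabytes.
STORED_2018_ZB = 33
STORED_2025_ZB = 175
DARK_2025_ZB = 91

# Share of dark data in the 2025 projection.
dark_share = DARK_2025_ZB / STORED_2025_ZB

# Overall growth of stored data between 2018 and 2025.
growth_factor = STORED_2025_ZB / STORED_2018_ZB

# Assumption: steady compound growth over the 7 years.
annual_growth = growth_factor ** (1 / (2025 - 2018)) - 1

print(f"Dark share in 2025: {dark_share:.0%}")
print(f"Growth 2018-2025: x{growth_factor:.1f}")
print(f"Implied average annual growth: {annual_growth:.1%}")
```

The 91 zettabytes are exactly 52% of the 175 zettabytes projected for 2025, which matches the share quoted from the study.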
The 6.4 million tons of CO2 generated by storing this data correspond to the emissions of a country such as Ivory Coast.
Companies therefore need to understand what their dark data contains, and the storage policies that surround it, to avoid generating more carbon emissions. We can also question the usefulness of some data processing. Digital sobriety and responsible digital approaches push us to question the usefulness of IT services. Could the development of AI help the data sector and significantly reduce its carbon emissions?
Even though Greenoco does not deal with dark data directly, our solution for reducing the carbon footprint of websites reduces the disk space used by a website and by its backups. In this sense, Greenoco helps reduce the storage of useless data and contributes to reducing digital pollution.