RESEARCH ON DARK DATA ANALYSIS TO REDUCE DATA
COMPLEXITY IN BIG DATA
Big data is a large amount of data that is hard to handle with
traditional systems; it requires new structures, algorithms, and techniques. As
data grows, dark data grows with it. Within a main data source there is a
portion of data that is not in regular use but can still help in decision
making and data retrieval. This portion is known as “Dark Data”.
Dark data generally sits in an idle state. The term
"dark data" appears to have been first used and defined by the consulting
company Gartner. Dark data is acquired through various operational sources but
is not used in any manner to derive insights or support decision making. It is
a subset of big data; on average, about 80% of a big data set consists of dark
data. There are
two ways to view the importance of dark data. One view is that unanalyzed data
contains undiscovered, important insights and represents a lost opportunity.
The other view is that unanalyzed data, if not handled well, can result in
serious issues, such as legal and security problems. This work introduces a
solution for the side effects of dark data on the whole data set. Dark data is
an important part of big data, but because it sits idle it can place load on
the system and its processes. It is therefore important to find a solution in
which dark data remains unchanged while no longer affecting the rest of the data.
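The abstract does not specify how dark data would be separated, but one common approach is to partition records by last-access time: records untouched beyond a threshold are treated as dark data and set aside intact, so they stop loading the processes that scan the active set. The sketch below is a hypothetical illustration of that idea; the field names, threshold, and partitioning rule are assumptions, not the paper's method.

```python
from datetime import datetime, timedelta

# Assumed threshold: records unused for over a year count as dark data.
DARK_THRESHOLD = timedelta(days=365)

def partition_dark_data(records, now=None):
    """Split records into (active, dark) lists; neither is modified,
    so the dark portion remains exactly as it was."""
    now = now or datetime.now()
    active, dark = [], []
    for rec in records:
        # "last_accessed" is a hypothetical field for illustration.
        if now - rec["last_accessed"] > DARK_THRESHOLD:
            dark.append(rec)    # archived as-is: dark data stays unchanged
        else:
            active.append(rec)  # only these records stay in the hot path
    return active, dark

# Example: one record untouched for over two years becomes dark data.
records = [
    {"id": 1, "last_accessed": datetime(2024, 1, 10)},
    {"id": 2, "last_accessed": datetime(2022, 1, 10)},
    {"id": 3, "last_accessed": datetime(2024, 3, 5)},
]
active, dark = partition_dark_data(records, now=datetime(2024, 6, 1))
print([r["id"] for r in active])
print([r["id"] for r in dark])
```

With this kind of split, routine queries touch only the active list, while the dark portion can be moved to cheaper storage without altering its contents.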
Regards!
Librarian
Rizvi
Institute of Management