Big Data

Big Data Analytics – mastering the deluge of data

The discussion about Big Data is about much more than the clever and profitable analysis of Internet data. In the age of Industry 4.0, with the emergence of cyber-physical systems and ultimately highly integrated Digital Ecosystems, the challenge is to generate tangible added value for companies and individuals from a seemingly endless stream of potentially available data. In a Digital Ecosystem, Big Data draws on three central sources: classical embedded software systems, business information systems, and humans. In this context, it is important, on the one hand, not to jeopardize the functional safety of the embedded systems involved and, on the other hand, to assure data security within the Digital Ecosystem in the long term. This is also a prerequisite for the acceptance of humans as central figures in a Digital Ecosystem. Big Data does, of course, raise many questions regarding how to deal with the users of Internet services. At the same time, there is no doubt that the analysis of Internet data makes it possible to create better offers that are more appropriate for the respective target customers. As digitalization increasingly permeates all areas of our lives, more data is generated, which gives Big Data a cross-cutting role.

Big Data applications – Our research topics

Decision models

Nowadays, a multitude of data is collected across a huge variety of systems, but it is often not clear how this data can be used in a meaningful way for strategic orientation while taking data security into account. Fraunhofer IESE is working on how decision models must be constructed systematically based on business goals and how data must be aggregated, or condensed, accordingly in order to permit efficient decision-making.
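The idea of deriving an aggregation from a business goal can be illustrated with a minimal sketch. It is not the Fraunhofer IESE method. The business goal ("keep the scrap rate per production line below 5 %"), the record fields, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical business goal: keep the scrap rate of every production line
# below a threshold. The goal determines which raw fields are condensed
# and how.

def scrap_rate_by_line(events):
    """Condense raw production events into one indicator per line."""
    produced = defaultdict(int)
    scrapped = defaultdict(int)
    for event in events:
        produced[event["line"]] += event["produced"]
        scrapped[event["line"]] += event["scrapped"]
    # Lines without any production are skipped to avoid division by zero.
    return {
        line: scrapped[line] / produced[line]
        for line in produced
        if produced[line] > 0
    }

def lines_needing_action(rates, threshold):
    """Return the lines whose indicator violates the goal, sorted by name."""
    return sorted(line for line, rate in rates.items() if rate > threshold)

# Illustrative data, not taken from any real system.
events = [
    {"line": "A", "produced": 100, "scrapped": 2},
    {"line": "A", "produced": 100, "scrapped": 4},
    {"line": "B", "produced": 200, "scrapped": 20},
]

rates = scrap_rate_by_line(events)
flagged = lines_needing_action(rates, threshold=0.05)
print(rates, flagged)
```

The decision maker only sees one number per line and a short list of lines that need attention, rather than the raw event stream.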

Data quality and information quality

Another basic problem of Big Data Analytics is trust in the data and the information derived from it. The data may stem from widely different systems of various organizations, and details about the collection method used and the quality assurance performed are not always known. The explicit modeling of the data, including the requested quality characteristics such as completeness, consistency, or up-to-dateness of the data, is based on approaches from the area of software quality modeling, which is one of the areas that Fraunhofer IESE is working on.
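As a rough illustration of how the quality characteristics named above could be made measurable, the following sketch computes completeness, consistency, and up-to-dateness as simple ratios over a set of records. The metric definitions, field names, and sample data are assumptions chosen for this example and do not represent a specific quality model.

```python
from datetime import datetime, timedelta

def completeness(records, required_fields):
    """Share of records in which every required field is present and not None."""
    if not records:
        return 0.0
    complete = sum(
        all(r.get(field) is not None for field in required_fields)
        for r in records
    )
    return complete / len(records)

def consistency(records, rule):
    """Share of records that satisfy a caller-supplied plausibility rule."""
    if not records:
        return 0.0
    return sum(1 for r in records if rule(r)) / len(records)

def up_to_dateness(records, timestamp_field, now, max_age):
    """Share of records whose timestamp is no older than max_age."""
    if not records:
        return 0.0
    fresh = sum(
        1
        for r in records
        if r.get(timestamp_field) is not None
        and now - r[timestamp_field] <= max_age
    )
    return fresh / len(records)

# Illustrative sensor readings; all values are made up.
now = datetime(2024, 1, 10, 12, 0)
records = [
    {"machine": "M1", "temp_c": 71.5, "measured_at": datetime(2024, 1, 10, 11, 0)},
    {"machine": "M2", "temp_c": None, "measured_at": datetime(2024, 1, 10, 10, 0)},
    {"machine": "M3", "temp_c": -400.0, "measured_at": datetime(2024, 1, 1, 9, 0)},
    {"machine": "M4", "temp_c": 65.0, "measured_at": datetime(2024, 1, 10, 11, 30)},
]

report = {
    "completeness": completeness(records, ["machine", "temp_c", "measured_at"]),
    # Assumed rule: a temperature below absolute zero is physically impossible.
    # Missing values are left to the completeness metric.
    "consistency": consistency(
        records, lambda r: r["temp_c"] is None or r["temp_c"] >= -273.15
    ),
    "up_to_dateness": up_to_dateness(
        records, "measured_at", now, timedelta(days=1)
    ),
}
print(report)
```

Keeping each characteristic as a separate ratio makes it possible to state a requested quality level per characteristic and to compare data from different source systems on the same scale.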

Data visualization

Not all decisions in a Digital Ecosystem are made on the basis of a control system that can be automated; rather, many decisions are made by humans. Visualization mechanisms are thus indispensable to support decision makers. Here, a rough distinction is made between approaches for the condensation and user-appropriate exploration of Big Data and approaches for the efficient algorithm-based implementation of the visualization of large amounts of data. At Fraunhofer IESE, one of the main areas of research concerns the question of which visualization mechanisms are suitable for Big Data and how users must interact with these in order to be able to make decisions efficiently.

Acceptance

An overarching issue regarding the usage of Big Data in Digital Ecosystems is the acceptance of such systems by humans as the central users. The security of the data they provide must be guaranteed, and humans must be integrated as a data source as reliably and transparently as possible. At the same time, they should not be overloaded with the wealth of information, but rather be provided with the amount of information that is appropriate for making the respective decision.

Data protection and data usage control for Big Data

Central issues for companies in the context of Digital Ecosystems are data sovereignty and data protection. Even though sensitive data such as production and quality data might offer great potential for scenarios such as cross-company production data analysis in the context of automotive manufacturing, this data is not made freely available by the stakeholders. Research in the area of data protection for Big Data in Digital Ecosystems is therefore particularly important, as it eliminates a core obstacle for cross-company analyses. Research performed at Fraunhofer IESE in the area of data usage control already allows effective protection of data leaving a company’s premises by making use of correspondingly formulated rules (so-called policies) and a modified infrastructure environment (enforcement framework). These frameworks are being optimized successively for the needs of Big Data technologies. Here it must be investigated whether the required performance in the analysis of the data can be achieved by combining existing usage control technologies with Big Data technologies.
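The interplay of policies and an enforcement framework can be pictured with a small sketch. It does not reproduce the Fraunhofer IESE framework. The policy attributes, the three possible decisions, and all names are hypothetical and serve only to show how a rule formulated by the data owner could determine what a partner company is allowed to see.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Policy:
    """Hypothetical usage rule that the data owner attaches to a data set."""
    allowed_partners: frozenset
    allowed_purposes: frozenset
    aggregate_only: bool  # if True, raw rows never leave the owner

@dataclass(frozen=True)
class Request:
    """Hypothetical data request from a partner company."""
    partner: str
    purpose: str

def decide(policy, request):
    """Decision point: map a request to 'deny', 'aggregate', or 'allow'."""
    if request.partner not in policy.allowed_partners:
        return "deny"
    if request.purpose not in policy.allowed_purposes:
        return "deny"
    return "aggregate" if policy.aggregate_only else "allow"

def enforce(policy, request, rows, field):
    """Enforcement point: release only what the decision permits."""
    decision = decide(policy, request)
    if decision == "deny":
        raise PermissionError(
            f"{request.partner} may not use this data for {request.purpose}"
        )
    if decision == "aggregate":
        values = [row[field] for row in rows]
        mean = sum(values) / len(values) if values else None
        return {"count": len(values), "mean": mean}
    return rows

# Illustrative policy and data; names and values are made up.
policy = Policy(
    allowed_partners=frozenset({"supplier_a"}),
    allowed_purposes=frozenset({"quality_analysis"}),
    aggregate_only=True,
)
rows = [{"defect_rate": 1.0}, {"defect_rate": 2.0}, {"defect_rate": 3.0}]

result = enforce(policy, Request("supplier_a", "quality_analysis"), rows, "defect_rate")
print(result)
```

Separating the decision from the enforcement keeps the rules declarative, so a policy can change without touching the code that releases the data. The performance question raised above concerns exactly this extra step on every access when data volumes are large.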

Big Data – the biggest challenges

It is imperative to master some central aspects of Big Data in Digital Ecosystems if such systems are to become reality. One of the challenges in practice will be, for example, to find suitable infrastructures:

In Digital Ecosystems, companies of various sizes collaborate. For the analysis of the data, a Big Data infrastructure with appropriate computing power is provided in these companies. This entails several challenges for the companies.

Small and medium-sized companies, in particular, might not be willing or able to purchase a dedicated infrastructure. Here, new solution strategies must be found:

The trend towards storing data in the Cloud offers first approaches for Big Data analytics. To examine sensitive data, temporary infrastructure lease approaches are also conceivable. Both approaches can be thought of as Big Data analytics “on-demand”.

Another problematic issue continues to be the progressive establishment of different, partly incompatible technologies for Big Data Analytics:

Currently, a heterogeneous landscape of Big Data providers is developing, which manifests itself in different ways in the Digital Ecosystems of different companies. For cross-company analyses, however, it is necessary to establish de-facto standards in an ecosystem, compatible interfaces between provider systems, and a suitable, highly performant intermediate layer for data exchange.

Establishment of innovative business models

It is generally accepted that innovative business models are a central issue in Digital Ecosystems. In order to realize these, new service providers, some of which do not yet exist, would need to become established. One example would be the provision of the above-mentioned “on-demand” analyses. However, potential stakeholders in such an innovative business model in the Digital Ecosystem are often faced with the question of risk management and of the viability of such a business model. If research develops simulation environments for such Digital Ecosystems in the sense of rapid-prototyping environments, the question of added value can be answered better. Once processes and Big Data technologies are more mature and can guarantee data protection for analytics and its data, the inhibiting barriers will come down.

Standardization

To achieve an efficient exchange of data among companies and of analysis results obtained with different technologies, standardization of the data, of their modeling processes, and of the specification of data qualities would be very helpful. Many standardization processes are nowadays taking place in specific domains, e.g., in mechanical engineering, in the automotive industry, or in the financial sector. However, it is characteristic of the Digital Ecosystems of tomorrow that stakeholders from a wide variety of domains are involved in an ecosystem, with the number of stakeholders in the ecosystem varying widely. This can make it very hard to achieve standardization quickly and thus constitutes one of the greatest challenges.

The numerous conceivable application scenarios in a Digital Ecosystem can no longer be regarded independently of each other, but should rather be considered parts of a continually evolving system. In this system, new services and organizations are added and replace others over time. This intertwining can already be observed today in the energy sector and in the area of electromobility, where electric vehicles consume energy on the one hand, but can also be used for decentralized energy storage on the other hand. Another example is the interconnection between production technology and smart mobility systems. Here, the goal is to reduce the transportation and waiting times of goods and to be able to react flexibly to the re-planning of production processes.

Big Data application examples and projects

Automation in production and plant engineering

Gebr. Pfeiffer SE

Fraunhofer IESE supported Gebr. Pfeiffer in the development and implementation of intelligent algorithms for mill protection and predictive maintenance. 

PRO-OPT

BMWi project for the exploration and evaluation of Big Data solutions for companies aimed at production optimization while complying with data security aspects.

Become a Data Scientist

Data Scientist Basic Level (certified) – Fraunhofer offers a three-level certificate program to become a Data Scientist. 

Seminar: Analyzing the Potential of Big Data
