Data Intensive Software Technologies and Applications (DISTA)

The research group Data Intensive Software Technologies and Applications studies data-driven approaches, such as machine learning, artificial intelligence, and big data, to automate and improve software development stages. DISTA is a core research area within the Linnaeus University Centre for Data Intensive Sciences and Applications (DISA).

Our research

The research group Data Intensive Software Technologies and Applications (DISTA) studies data-driven approaches such as:

Machine learning (ML) and artificial intelligence (AI) use (big) data to automate tasks such as reasoning, planning, deciding, and predicting.
Software and information analysis considers IT systems as data and reflects on their quality.
Scalable computing technologies allow coping with the volume of big data sets and the velocity of big data streams.

Altogether, data-intensive technologies turn data into information and actionable knowledge. They help automate the implementation of smarter systems and can even generate components of these systems.

Machine learning and artificial intelligence

With the concept of decision algebras, DISTA has suggested a unifying theory for classification approaches. With the concept of aggregation that condenses multi-dimensional, partially ordered data to a totally ordered score, DISTA has opened a new branch of unsupervised machine learning.
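To illustrate what condensing partially ordered data into a totally ordered score can mean, here is a minimal Python sketch. It is not DISTA's aggregation method: the scoring rule (the mean of per-dimension empirical CDF values), the function names, and the sample data are all assumptions chosen for illustration.

```python
from bisect import bisect_right
from statistics import mean


def dominates(a, b):
    """Pareto dominance: a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def aggregate_scores(points):
    """Map each point to the mean of its per-dimension empirical CDF values.

    Illustrative scoring rule only (an assumption, not the group's method).
    Higher values are treated as better in every dimension.
    """
    n = len(points)
    columns = [sorted(col) for col in zip(*points)]
    return [
        mean(bisect_right(col, value) / n for col, value in zip(columns, point))
        for point in points
    ]


# Hypothetical two-dimensional quality measurements, higher is better.
points = [(90, 40), (55, 80), (60, 30), (95, 85)]

# (90, 40) and (55, 80) are incomparable under dominance (a partial order) ...
incomparable = not dominates(points[0], points[1]) and not dominates(points[1], points[0])

# ... but the aggregated scores can always be compared (a total order).
scores = aggregate_scores(points)
ranking = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
```

With this rule, a point that dominates another always receives the higher score, so the ranking refines the partial order rather than contradicting it. Ties between scores remain possible for other data.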

Our research also focuses on applying techniques such as statistical methods, supervised and unsupervised learning, classification, regression, and clustering to real-world problems in engineering, science, and society.

Software and information analysis ...

... assesses software engineering artifacts such as processes, specifications, documentation, and code. This may include estimating the effort of developing and maintaining software, helping new users understand software structure and behavior, finding dependencies between sub-systems and the impact of changing them, assessing software quality, and finding the relevant components for re-engineering.

DISTA pioneered the application of measurement and testing techniques to assure information quality (IQ). With our industry partners, Sigma Technology and Softwerk, DISTA has applied this quantitative approach to IQ to numerous real-world documentation sets.

Scalable computing technologies ...

... such as parallel computing refer to technologies that let systems scale to large problems or data sets, e.g., by executing a program on more than one processor or core. We are interested in high-performance computing, distributed computing, and stream processing as used in, e.g., scientific and technical data mining.

With context-aware composition, a technique for self-optimizing software systems based on profiling data and ML, and with the first-ever fully block-free garbage collector, we have contributed significantly to scalable computing.

Connection to Linnaeus University Centre for Data Intensive Sciences and Applications (DISA)

The Linnaeus University Centre for Data Intensive Sciences and Applications (DISA) is one of the university’s profiled research areas. DISA studies open questions in the collection, analysis, and use of large data sets. With its core in computer science, it takes a multidisciplinary approach and collaborates with researchers from e.g. forestry, mechanical engineering, and e-health. The DISTA research group contributes to the DISA core technologies enhancing e.g. machine learning and scalable computing.

We directly contribute to the DISA research areas of

AdaptWise with context-aware composition,
e-health with computer vision and deep learning for the analysis of human motions used in eldercare and high-performance sports, and with efficient data selection preprocessing methods for medical researchers,
Forestry with computer vision and deep learning for the analysis of ancient remains so that forestry activities can avoid these sites, and for the identification of log fingerprints, and with deep learning for strength grading of sawn timber,
Smart industry with data analysis and machine learning for predictive maintenance, and
Digital humanities with the Nordic Tweet Stream providing high-quality data for research in the languages used in Scandinavia.

As an essential part of this research, we build and maintain the High-Performance Computing Center (HPCC) of DISA.

To connect with the research needs of industry and society, we have established and organized the industry graduate school Data Intensive Applications (DIA).

Connection to Smarter Systems

Smarter Systems is a complete knowledge environment focusing on systems and systems engineering. Modern computing systems are complex to engineer and operate in uncertain and continuously changing environments. Both the systems and the way we engineer them must become more intelligent. That is, they need to adapt and evolve through a perpetual process that continuously improves their capabilities to deal with the uncertainties and changes they face.

Humans learn from experience – machines from data. Hence, data-intensive technologies are core to making systems more intelligent. In addition to its contributions to these technologies, e.g. in machine learning and scalable computing, DISTA contributes to three challenges of Smarter Systems:

Software technology processes and tools putting together data-driven technologies for smarter engineering of smarter systems.
Verified guarantees of applications based on data-driven models regarding the accuracy, performance, response time, safety, etc., persisting over time.
Understanding data-driven models for mastering data-driven applications and for turning artificial back into human intelligence.

Connection to education

DISTA is responsible for the following courses:

Bachelor courses

2DV516, Introduction to Machine Learning, 7.5 credits
2DV605, Parallel Computing, 7.5 credits
2DV50E, BSc Thesis, 15 credits

Master courses

4DV507, Code transformation and interpretation, 5 credits
4DV652, Project in Data Intensive Systems, 10 credits
4DV657, Parallel Computing, 5 credits
4DV660, Statistical and Machine Learning, 5 credits
4DV661, Deep Machine Learning, 5 credits
4DV50E, MSc Thesis, 15 credits
5DV50E, MSc Thesis, 30 credits
