Big Data 2019
Welcome to the 5th annual Big Data Conference at Linnaeus University.
This year's host for the conference is Linnaeus University Centre for Data Intensive Sciences and Applications (DISA). During the conference days you will meet invited speakers from other universities and industry, and learn about results and ongoing research within DISA. On the evening of December 5th we will gather for a conference dinner.
Please note...
that a tutorial on Open Science with Jupyter, Zenodo and Binder will be arranged in close connection to the Big Data Conference (December 4th, 1.30 - 5 pm)!
About the conference
On December 5th the conference opens with coffee, registration and a poster mingle in the open area. During the two days, longer talks with an academic or industry focus will be mixed with poster mingles during meals. We invite everyone who has an interest in Big Data and data intensive applications. The presentations during the conference will be held in English. In the evening there will be a social activity.
Programme
Thursday, December 5
09.30 Conference Introduction
09.45 Keynote 1: Flaminio Squazzoni, Professor of Sociology at the University of Milan, Italy: When ready-made data must be tailored and repurposed. The challenge of creating big confidential dataset in science in a public-private partnership.
10.45 Coffee break
11.15 Fast-forward + poster session about Ongoing research
This session will give you an idea of ongoing research related to data intensive sciences and applications; you will hear quick presentations from different researchers and research fields. Directly after these presentations you will be able to go out and meet and talk to these researchers in the poster exhibition.
Ongoing research:
- From population bomb to production paradigm: 50 years of scientific literature on how to feed the world Lucia Tamburino, Giangiacomo Bravo, Yann Clough, Kimberly A. Nicholas
- Data-intensive tools for effective carbon mitigation in forestry Jorge L. Zapico, Rafael M. Martins, Johan Bergh, Örjan Vorrei
- Data analysis for the ALTO project at Linnaeus University Mohanraj Senniappan, Yvonne Becherini, Michael Punch, Tomas Bylund
- ”A Quantitative Benchmark for the Evaluation of Dimensionality Reduction” Rafael M Martins
- Visual Learning Analytics of Multidimensional Student Behavior in Self-Regulated Learning Marcelo Milrad, Rafael M. Martins
- Data analysis leading to the discovery of two Active Galaxies in the VHE gamma-ray range Tomas Bylund, Yvonne Becherini, Michael Punch, Mohanraj Senniappan
- Using Data Mining Techniques to Assess Students’ Answer Predictions Alisa Lincke, Marc Jansen, Marcelo Milrad and Elias Berge
- Less interest in election results and a bandwagon effect due to poll exposure; an online experiment Mike Farjam
- Demo of a Hybrid Asymmetric Collaborative Immersive Analytics System Nico Reski, Aris Alissandrakis, Jukka Tyrkkö, Mikko Laitinen, and Andreas Kerren
12.30 Lunch
14.00 Keynote 2: Giuseppe La Rocca, Customer and Technical Outreach Manager for the EGI Foundation, The Netherlands: The ascent of Open Science and the European Open Science Cloud
15.00 Coffee break
15.30 Fast-forward + poster session about New ideas
This session will focus on new research, plans or simply new ideas related to data intensive sciences and applications; you will hear quick presentations from different researchers and research fields. Listen carefully because the researchers might be looking for new partners and others who are interested in the same questions. Directly after these presentations you will be able to go out and meet and talk to these researchers in the poster exhibition.
New Ideas:
- Skeleton Avatar camera Technique as measurement of functional ability in elderly persons Amanda Hellström, Sofia Backåberg, Welf Löwe,
- Feature selection in machine learning compared to statistical analysis performed on real-world data Olof Björneld
- New Methods for Community Detection and Analysis of Big Social Networks Masoud Fatemi, Jonas Lundberg, Pasi Franti & Mikko Laitinen
- New tools for measuring weak and strong social ties in social media? Mikko Laitinen & Masoud Fatemi
- Analyzing the effect of political trust on hashtag activism and protests Elizaveta Kopacheva
- Datasets available for machine learning with ALTO Mohanraj Senniappan, Yvonne Becherini, Michael Punch, Tomas Bylund
- Neural networks for ALTO: prospects for energy reconstruction Tomas Bylund, Yvonne Becherini, Michael Punch, Mohanraj Senniappan
- Data challenges in the COMET project Yvonne Becherini, Michael Punch, Mohanraj Senniappan, Tomas Bylund
- Heat and electricity production in the region of Kalmar - reducing greenhouse gas emissions by machine learning methods Fredrik Ahlgren
- Towards a Methodology for Interpretable Dimensionality Reduction in Exploratory Visual Analysis Rafael M. Martins
- Bayesian Regression on segmented data using Kernel Density Estimation Sebastian Hönel, Morgan Ericsson, Anna Wingkvist, Welf Löwe
18.30 Conference dinner
Friday, December 6
09.00 Day intro
09.15 Keynote 3: Antonina Danylenko, Head of Applied Machine Learning at Nordic Entertainment Group: Machine Learning for better entertainment recommendations: A Nordic perspective
10.15 Coffee break
10.45 DISA group presentations (15' per group)
Here you get a chance to learn more about the seven research groups within DISA, ongoing research and plans for the coming year.
- 10.45 Computational Social Sciences @DISA 2019: highlights of the year. Giangiacomo Bravo, Professor Social Sciences (https://lnu.se/en/research/searchresearch/computational-social-sciences/)
- 11.00 Gamma-Ray Astronomy at DISA. Yvonne Becherini, Associate Professor Astrophysics
- 11.15 DISA Digital Humanities – new answers to old questions. Mikko Laitinen, Professor of English Linguistics. (https://lnu.se/en/staff/mikko.laitinen/)
- 11.30 Aims, Progress, and Preliminary Results of Visual Analytics Research at DISA. Andreas Kerren, Professor Computer Science
- 11.45 Research on eHealth at Linnaeus University – improving data to and from patients. Tora Hammar, Senior Lecturer eHealth (https://lnu.se/en/research/searchresearch/ehealth--improved-data-to-and-from-patients/)
- 12.00 Advances with forestry, wood and building technology. Magnus Persson, Doctoral Student, Forestry and Wood Technology (https://lnu.se/en/research/searchresearch/forestry-wood-and-building-technologies/)
- 12.15 Data-driven software quality - Year in review. Morgan Ericsson, Associate Professor Computer Science (https://lnu.se/en/research/searchresearch/data-driven-software-and-information-quality/)
12.30 Lunch
14.00 Keynote 4: Tobias Wagenknecht, Head of Data & Analytics at Aftonbladet: The false truth about everything being data-driven
15.00 Coffee break
15.20 Open Science Hack
As a result of the 4th annual Big Data Conference, we organized the first Open Science Hack in April 2019. More than 50 participants, a mix of students, business professionals and researchers, came together to programme and be inspired by each other. Learn more about the Open Science Hack and the winning work: best technical execution by the Drop table user team and best idea by A-Team. And don’t miss the launch of the Open Science Hack 2020. Morgan Ericsson, Associate Professor in Computer Science and Media Technology, Linnaeus University
15.50 Closing of the 5th Annual Big Data Conference – Welf Löwe, Professor in Computer Science and Media Technology, and Director of DISA at Linnaeus University
Conference speakers
Keynote 1: Flaminio Squazzoni (University of Milan, Italy): When ready-made data must be tailored and repurposed. The challenge of creating big confidential dataset in science in a public-private partnership.
Abstract: Research on science relies on available data. However, while we have plenty of data on publications and citations, which help measure the prestige of scientists and their institutions, we lack data on internal processes of peer review at journals and funding agencies.
These data are crucial to understand whether allocation of resources and merit in science is biased and assess if science is still a cooperative, civilized game between disinterested experts or a corrupted race towards hyper-competition and the 'publish-or-perish' mentality. In this talk, I will share my experience as leader of a large-scale European project that developed a protocol for data sharing of journal data with a group of publishers representing the vast majority of the current scholarly communication market. This experience testifies to the nexus of technological, legal and organisational aspects involved in data sharing between stakeholders, the power of hybridization of data sharing models and the beauty of the digital age. And it tells you that science is not corrupted!
Bio: Flaminio Squazzoni is a full professor of Sociology at the University of Milan, Department of Social and Political Sciences, where he teaches Behavioural Sociology. He is the head of BEHAVE (www.behavelab.org), and also editor of JASSS - Journal of Artificial Societies and Social Simulation, co-editor of Sociologica - International Journal for Sociological Debate and member of the editorial board of Research Integrity and Peer Review, Sistemi Intelligenti and Socio-Cognitive Systems. He is advisory editor of the Wiley Series in Computational and Quantitative Social Science and the Springer Series in Computational Social Science. He is former President of the European Social Simulation Association (Sept 2012/Sept 2016) and former Director of the NASP ESLS PhD Programme in Economic Sociology and Labour Studies (2015-2016). E-mail: flaminio.squazzoni@unimi.it
Keynote 2: Giuseppe La Rocca (The EGI Foundation, The Netherlands): The ascent of Open Science and the European Open Science Cloud
Abstract:
The notion of a distributed infrastructure offering advanced resources and services for data-intensive processing in research and innovation has been part of the EGI mission and vision since the EGI design and implementation that started with the DataGrid project back in 2000 under the leadership of CERN. Distributed processing of data supported by a pan-European broadband network infrastructure, solutions for trust and identity management and the Grid middleware, have been the enablers of two Nobel prizes in Physics (2013 and 2017), and many more data-driven scientific discoveries in high energy physics, astronomy and astrophysics, health and medicine, and earth sciences resulting in more than 3,000 open access scientific publications enabled each year.
EGI has been supporting Open Science by enabling the sharing of data, by federating peer digital infrastructures in North and South America, Africa and Arabia, and the Asia Pacific regions with Europe, and by supporting open access to a rich portfolio of scientific data analytics tools. The services provided by the EGI Federation are now at the heart of the EOSC-hub project.
This talk introduces how EGI supports big-data-based Open Science and contributes to the implementation of the EOSC vision. (Abstract as pdf)
Bio: Giuseppe La Rocca has worked as Technical Outreach Expert at the EGI Foundation since December 2016. His main focus is assisting research communities to connect compute and data intensive applications with EGI services in order to reach more scalable, sustainable and secure setups on top of international infrastructures. Since January 2018 Giuseppe has led training and skill development activities in the EOSC-hub H2020 project. Before joining EGI, Giuseppe worked as a technologist for the Catania division of the Italian National Institute of Nuclear Physics (INFN). Since 2004, he has been involved in several projects co-funded by the European Commission, at both national and European level. During these years, he has developed strong skills and competences in Grid and Cloud technologies, working on ICT scientific developments for supporting both emerging and already established VRCs. Giuseppe holds an MSc in Computer Science Engineering from the University of Catania (Italy).
Keynote 3: Antonina Danylenko (Nordic Entertainment Group): Machine Learning for better entertainment recommendations: A Nordic perspective
Abstract: The entertainment industry is transforming at a rapid rate. This is driven by new trends, growing customer expectations and AI technologies allowing for more innovation, disruption and opportunities for growth. At the same time, the industry is getting increasingly crowded – as the use of streaming services is on the rise, and the Nordic region spends more time online than ever before. Nearly four out of ten people watch video content on a daily basis, with three-quarters of the 16-24 year-old age bracket streaming that content from subscription-based services. We are seeing a new phenomenon emerge known as ‘stacking’ behaviour – where households typically subscribe to more than one service, just to keep their options open when it comes to deciding what to watch. With so many options out there, people can be paralysed by what’s known as the ‘paradox of choice.’ Personalising every aspect of the customer journey has become our main focus in the recommendation space, as consumers of entertainment have never been more spoilt for choice. Serving up relevant content recommendations at the right time is key to making the decision process as easy as possible. However, building and maintaining the lifecycle of recommender systems to capture customers' behavior and use different algorithms to guide them towards something they will enjoy watching is not easy. In this presentation, I will outline the end-to-end process of building a recommender system utilising Big Data and Machine Learning to address this challenge. (Abstract as pdf)
Bio: Antonina Danylenko holds a PhD in Computer Science from Linnaeus University, Sweden, where she wrote a dissertation on the topic of “Decision Algebra: A General Approach to Learning and Using Classifiers”. After several years working at IKEA within the Solution Architecture and Data Science domains, she joined the Nordic Entertainment Group, where she is now the Head of Applied Machine Learning. The Nordic Entertainment Group offers video-on-demand streaming, linear TV channels and radio broadcasting, and is probably best known for its Viaplay, Viafree & Viasat platforms. It is responsible for connecting over 1.4 million subscribers to the content they love, with more than 1900 employees across the Nordics and the UK.
Keynote 4: Tobias Wagenknecht (Aftonbladet Hierta AB): The false truth about everybody being data-driven
Abstract: Everybody is stressing out; they all feel the urgency to become data-driven. Established businesses disappear, and unicorns disrupt the market and question well-established workflows. There is a hysteria about the need to change and to do it all at once across each and every business area. This presentation is meant to put things into perspective: I will speak about my own mistakes and how the general perception of everybody else succeeding tricks us into feeling bad. In the end you will realise that you are not alone and that changes take time, no matter how fast-paced we have become.
Bio: Born in Germany, raised in Spain, migrated to Sweden in 2011 - I consider myself a European data-nerd, who loves the beauty of numbers and charts as much as the satisfaction of being able to come up with an actionable decision instead of just another report. I spent almost half of my life within travel & hospitality and learned a lot about the eternal struggle of making a conservative industry more data-driven. It is a story about many failures, learnings and iterations - so let's have a talk and then try again!
Call for papers
Two fast-forward (FF) + poster (P) sessions will be organized as part of the Big Data Conference 2019. Each session will be introduced by fast-forward presentations where each participant will have 3 minutes to briefly summarize her/his research (max 3 slides incl. the title one). Directly after the FF part, participants will be directed to their posters, where they will be able to interact with the interested public. There will be two FF+P sessions:
- Ongoing research. The first of the two FF+P sessions will focus on research just published or at an advanced stage of elaboration. The main goal here is to present research results of general interest to the public and possibly receive feedback on ongoing work.
- New ideas. The second FF+P session will focus on future research, plans, or simply new ideas. The goal here is for participants to share their plans with the public, receive feedback and possibly find synergies to develop future research together.
To submit to either session, participants should send a 500-word abstract briefly presenting the research to Diana Unander <diana.unander@lnu.se> by October 20, 2019. Each participant can submit at most one abstract for each of the two sessions.
If accepted, a maximum of three slides outlining the research and a poster presenting it in more detail should be sent to the same address by November 24, 2019.
Registration
The Big Data Conference is free of charge, but if you register and don't show up you will be charged a fee that covers food etc.
Registration has closed!