Title: Building metro maps of scientific topics using hierarchy alignments
Speaker: Ian Jeantet
Presentation of my joint work with Griffith University, carried out during my mobility stay in Australia. I’ll explain how we ended up building metro maps of scientific topics to study the evolution of science over time.
Place and Time: 14 January 2020
Title: TBA (EGC presentation)
Speaker: Constance Thierry
Place and Time: 21 January 2020
Title: Feedback from the Shonan Meeting on Crowdsourcing/Future of Work
Speaker: David Gross-Amblard
Place and Time: Rennes - Aurigny (D165), 17 December 2019
Title: Crowdsourcing the database course with HEADWORK
Speaker: Adrien Wacquet (2019 Summer Internship)
Place and Time: Rennes - Aurigny (D165), 03 December 2019
Title: Web crawler and the DIFFIX attack
Speaker: Antonin Voyez
Place and Time: Rennes - Aurigny (D165), 19 November 2019
Presentation of a web crawler built for the PROFILE project, followed by a short presentation of the current work for my upcoming thesis with ENEDIS: linear reconstruction applied to the DIFFIX system.
Title: The anonymization of personal data: myth, limits, and successes
Speaker: Tristan Allard, Joris Duguépéroux, Tompoariniaina Andriamilanto
Place and Time: Rennes - Oleron (A008), 12 March 2019
Title: Overlapping hierarchical clustering
Speaker: Ian Jeantet
Place and Time: Rennes - Oleron (A008), 12 February 2019
Agglomerative clustering methods have been widely used by many research communities to explore hierarchical structures in their data. The cluster hierarchies they produce help in understanding the hierarchical structures present in complex data. However, agglomerative methods necessarily produce a tree structure, which forces a split decision too early in the construction process and can affect the conclusions drawn about the resulting hierarchy. In various settings, a richer hierarchical structure is needed to describe the clusters of the data. Moreover, clusters might also overlap. In this paper, we propose a framework for computing hierarchical structures represented as directed acyclic graphs rather than trees. Our bottom-up method creates clusters with density-based merging criteria, such that the various clusters can overlap.
Title: Integrating uncertain data using user feedback in crowdsourcing applications
Speaker: Marion Tommasi
Place and Time: Rennes - Oleron (A008), 22 January 2019
Crowdsourcing applications are used in many domains to perform tasks that are difficult for computers, or to gather knowledge from a crowd of people. To execute a task in a crowdsourcing application, human workers perform micro-tasks, and the resulting data is integrated into the system to proceed with the completion of the global task. However, the data provided by workers is uncertain, as human workers can make mistakes or even intentionally give a wrong result. We want to use the feedback of other workers to evaluate the trust in the data at any point in the workflow. Ultimately, we want to use this trust to obtain a workflow that adapts itself to the data and the perceived trust in it, in order to improve data quality. I will first present a model for crowdsourcing applications, then the model for user feedback.
Title: Data-Centric workflow for Complex Crowdsourcing Applications
Speaker: Rituraj Singh
Place and Time: Rennes - Oleron (A008), 15 January 2019
Crowdsourcing has emerged as a major paradigm for accomplishing work by paying small sums of money and attracting workers from across the globe. However, the tasks targeted by crowdsourcing platforms are relatively simple and independent of one another. In this work, we propose a novel data-centric workflow model for the design of complex crowdsourcing tasks with dependencies. The model allows orchestration of simple tasks, handles data and crowd workers, allows concurrency, and in addition provides high-level constructs allowing decomposition of complex tasks into orchestrations of simpler subtasks. We first define the syntax and semantics of the model, and then consider its formal properties, starting with the question of termination of a complex workflow (i.e., whether a system has non-terminating runs). Unsurprisingly, termination is undecidable even for the simplest models. However, under restrictions that are sensible in the context of crowdsourcing (namely that a crowd worker only has a bounded number of contributions in a workflow), termination becomes decidable. We then extend the termination question to address the correctness of a workflow, i.e., the question of whether a terminating workflow always satisfies a constraint expressed as a relation between the input of the workflow and its output.