Institute for Logic and Data Science

Data Science Seminar in 2022-2023

Below you can see all the talks held at the Data Science Seminar in the 2022-2023 season. For forthcoming talks, see the main page.


Wednesday, April 12, 2023 at 14:30 (Popa Tatu 18)

Mihai Cucuringu (University of Oxford)

Machine learning on signed networks and time series analysis with applications to finance

Abstract:

We discuss scalable spectral methods for detecting hidden structures in large signed/directed networks, with an eye towards robustness under sampling sparsity and noise perturbation. As an application, we consider the problem of propagating news sentiment in a financial network. When considering the universe of S&P 500 instruments (stocks), only about one third of the instruments have news sentiment released on a typical trading day. This raises the question of how the disseminated news sentiment impacts the remaining set of instruments. We propose fast algorithms for understanding how news sentiment propagates through a financial correlation network. Our approaches are broadly applicable to instances where a sparse signal is available (e.g., news sentiment for a subset of nodes) and one would like to understand how the available signal measurements propagate through the network to the remaining nodes. We formulate this problem as an instance of the group synchronization problem over Z2 with anchor information. Time permitting, we discuss potential extensions that leverage directed graph clustering algorithms from the lead-lag detection literature.
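For readers unfamiliar with the formulation, the following is a minimal sketch of synchronization over Z2 with anchors, not the speaker's algorithm. It assumes a symmetric signed matrix W whose positive entries suggest two nodes share a sign and whose negative entries suggest opposite signs, and it uses a plain top-eigenvector relaxation; the function name, the toy matrix, and the anchor dictionary are all hypothetical.

```python
import numpy as np

def sync_z2_with_anchors(W, anchors):
    """Estimate a +1/-1 label per node from a signed similarity matrix.

    W       : symmetric (n, n) array; W[i, j] > 0 suggests nodes i and j
              share a sign, W[i, j] < 0 suggests opposite signs.
    anchors : dict mapping node index -> known sign (+1 or -1).
    This is an illustrative spectral relaxation, assumed for this sketch.
    """
    W = np.asarray(W, dtype=float)
    # eigh returns eigenvalues in ascending order, so the last column
    # is the eigenvector of the largest eigenvalue.
    _, vecs = np.linalg.eigh(W)
    v = vecs[:, -1]
    z = np.where(v >= 0, 1, -1)
    # The eigenvector is only defined up to a global sign flip;
    # the anchors resolve that ambiguity.
    agreement = sum(z[i] * s for i, s in anchors.items())
    if agreement < 0:
        z = -z
    # Anchor signs are known, so they override the estimate.
    for i, s in anchors.items():
        z[i] = s
    return z

# Toy, noiseless example: pairwise products of a hidden sign vector.
truth = np.array([1, 1, -1, -1, 1, -1])
W = np.outer(truth, truth).astype(float)
np.fill_diagonal(W, 0.0)
estimate = sync_z2_with_anchors(W, {0: 1, 2: -1})
```

In the news-sentiment setting described above, the anchors would play the role of instruments with observed sentiment, and W would come from the correlation network; how those are actually built is not specified in the abstract.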


Thursday, March 16, 2023 at 11:00 (FMI, Hall 214 “Google”)

Eduard C. Drăguț (Temple University)

Continuous, Gradual Entity Mining from Web Data Streams

Abstract:

Named Entity Recognition (NER) is a key component in many intelligent systems, such as knowledge graphs, question answering, information retrieval, and early prediction of emerging events. NER systems have been studied and developed for decades. Nevertheless, NER is a continuous, never-ending learning process because language and its usage evolve over time. For example, the emergence of social media with colloquial user content exposed the limitations of the previous state-of-the-art NER systems, which expected long documents written in formal language. In this talk, I present our work on entity mining from microblog streams, where we advocate for continuous, gradual entity mining with revisits. It needs to be continuous because the system stays with a topic for its duration in a social media stream. It is gradual because the system begins with easy instances, which can be labeled with high accuracy, and then gradually labels more challenging instances. The system revisits difficult instances that were encountered ahead of easy instances in a stream. If these three conditions are met, then (near) real-time NER can be achieved over microblogs. I will also introduce our work on recognizing entities that follow or closely resemble a regular expression (regex) pattern, their applications to other (unexpected) domains, and how we use it to seed our work on human-in-the-loop mining.

Institute for Logic and Data Science
Str. Popa Tatu nr. 18
010805 Bucharest, Romania
contact@ilds.ro
  
