Community in Focus: Professor Tim Weninger on how we consume and curate information

By Chris Walker

In addition to publishing technical articles and tutorials, here at Overleaf and ShareLaTeX we like to share some of the interesting work and innovative research that people in our community are involved in.

Today, we'd like to introduce you to Tim Weninger, who is an assistant professor in the Department of Computer Science and Engineering at the University of Notre Dame. Amongst other things (which we’ll get into), Tim is working to make AI assistants like Siri and Alexa smarter, and has given a TED talk on social media manipulation.

But first, let’s take a quick look at where it all began.

In the beginning…

Tim grew up in a small town in Kansas, United States, and after high school went to Kansas State University. He enjoyed building things, but because he was “too impatient to wait for concrete to dry”, Tim chose to major in computer science. The year after graduating, Tim stayed in Kansas and was introduced to machine learning and data mining. Having shown a knack for the subject, he joined the University of Illinois Urbana-Champaign to work with Jiawei Han, who is considered by those in the field to be one of the fathers of data mining.

Four years later, around the time his son Caleb was born, Tim joined the faculty of the University of Notre Dame, where he is now an assistant professor in the Department of Computer Science and Engineering.

An atypical computer scientist

Tim describes himself as “an atypical computer scientist who doesn’t study programming languages or compilers or circuits”. Instead, Tim says his goal is to “employ the tools and methods developed in computing to help understand and solve some of our most pressing challenges.”

When I asked about his work, Tim told me “My students and I study how people consume and curate information”.

With the sheer volume of information created and shared every day, the way we consume and curate information has become a core part of society. Looking at what we do with the masses of available data helps us to understand, and then to shape and define, how we interact with each other and understand the broader world around us.

Tim has three specific projects which are aiming to tackle this topic from different points of view:

  • knowledge graphs
  • social media manipulation
  • building blocks of a network

and we’ll look at each of these in turn.

Knowledge graphs

The goal of Tim’s first project is to use modern machine learning tools to discover whether information is missing from a knowledge graph and, where it is, to identify what is missing and fill the gaps.

Wikipedia makes a great case study, and is the subject of this project. Tim and the team are looking to learn the underlying meaning behind the relationships that are encoded in the knowledge graph.

Tim gave me a simple scenario as an example: “If a Wikipedia editor correctly identifies 47 of the 50 US state capitals, but incorrectly enters Chicago as the capital of Illinois and forgets Alaska and Hawaii, then our system will learn a model of what it means to be the capital of a US state.

“The system will then reason about the knowledge graph to fix the incorrect relationships and fill in the missing state capitals.

“This is especially useful for services like Siri or Alexa that need to have reasoning capabilities and a more complete view of the state of the world.”

If you’d like to dig a little deeper into this, you might find the papers on this work interesting.
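To make the state-capital scenario a little more concrete, here is a toy sketch in Python. It is not the team’s system. It is a simple rule-mining heuristic on a tiny hand-made graph, and the `hosts` / `state_legislature` relation, the function names, and the `min_support` threshold are all assumptions invented for this illustration. The idea is to learn which properties most known capitals share, flag a listed capital that lacks them, and propose a replacement where exactly one city in the state fits.

```python
from collections import Counter

# Toy knowledge graph as (subject, relation, object) triples.
# A hand-picked sample for illustration only, not real Wikipedia data.
# The "hosts" relation is a hypothetical supporting fact.
TRIPLES = [
    ("Springfield", "located_in", "Illinois"),
    ("Chicago", "located_in", "Illinois"),
    ("Topeka", "located_in", "Kansas"),
    ("Wichita", "located_in", "Kansas"),
    ("Juneau", "located_in", "Alaska"),
    ("Anchorage", "located_in", "Alaska"),
    ("Honolulu", "located_in", "Hawaii"),
    ("Indianapolis", "located_in", "Indiana"),
    ("Columbus", "located_in", "Ohio"),
    ("Cleveland", "located_in", "Ohio"),
    ("Springfield", "hosts", "state_legislature"),
    ("Topeka", "hosts", "state_legislature"),
    ("Juneau", "hosts", "state_legislature"),
    ("Honolulu", "hosts", "state_legislature"),
    ("Indianapolis", "hosts", "state_legislature"),
    ("Columbus", "hosts", "state_legislature"),
    ("Topeka", "capital_of", "Kansas"),
    ("Indianapolis", "capital_of", "Indiana"),
    ("Columbus", "capital_of", "Ohio"),
    ("Chicago", "capital_of", "Illinois"),  # the editor's mistake
    # Alaska and Hawaii have no capital_of entry at all.
]


def features(triples, entity, skip_relation):
    """All (relation, object) facts about an entity, except the target relation."""
    return {(r, o) for s, r, o in triples if s == entity and r != skip_relation}


def learn_rule(triples, relation, min_support):
    """Keep the features shared by at least min_support of the relation's subjects."""
    subjects = [s for s, r, _ in triples if r == relation]
    counts = Counter(f for s in subjects for f in features(triples, s, relation))
    return {f for f, n in counts.items() if n / len(subjects) >= min_support}


def repair(triples, relation="capital_of", min_support=0.7):
    """Return the learned rule, the suspicious entries, and the proposed fills."""
    rule = learn_rule(triples, relation, min_support)
    states = {o for _, r, o in triples if r == "located_in"}
    # An existing entry is suspicious if its subject lacks a learned feature.
    suspicious = {
        (s, o) for s, r, o in triples
        if r == relation and not rule <= features(triples, s, relation)
    }
    trusted = {
        o: s for s, r, o in triples
        if r == relation and (s, o) not in suspicious
    }
    proposed = {}
    for state in sorted(states - set(trusted)):
        candidates = [
            s for s, r, o in triples
            if r == "located_in" and o == state
            and rule <= features(triples, s, relation)
        ]
        if len(candidates) == 1:  # only propose when the choice is unambiguous
            proposed[state] = candidates[0]
    return rule, suspicious, proposed


if __name__ == "__main__":
    rule, suspicious, proposed = repair(TRIPLES)
    print("learned rule:", rule)
    print("suspicious entries:", suspicious)
    print("proposed capitals:", proposed)
```

On this toy data the sketch flags Chicago and proposes Springfield, Juneau and Honolulu. A real system would have to learn from far noisier data and would not have such a convenient supporting relation to lean on.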

Social media manipulation

Tim’s next project investigates how the design of social media platforms might impact the spread of information. As a part of this work, the team are investigating how and why people upvote, like, or retweet news on social media.

Tim presented some of his work at a TED event a few years ago, which you can see here. As Tim explains in his talk, he found that just one quarter of one percent of viewers on Reddit determine what the rest of the site’s readers see.

For anyone familiar with Reddit, that might seem a surprising conclusion to draw about a platform that, at the very least, gives the impression that every user can equally influence a post’s potential exposure. If this is you, watch the video, where Tim explains his work.

Building blocks of a network

Behind every social, technological, and biological system is an intricate network that encodes the relationship between the objects in the system. Extracting the useful and interesting building blocks of a network is critical to the understanding of many systems.

And so Tim’s third project investigates a newfound relationship between graph theory and formal language theory wherein, “for lack of better terminology, we parse a graph into its small pieces.”

“This project is directly analogous to how a high school student might learn to parse a sentence into its subject, predicate, nouns and verbs, except, instead of discovering the parts-of-speech, we parse a graph into triangles, squares and other shapes to understand how the graph is constructed.”
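As a rough illustration of what “triangles, squares and other shapes” means in practice, the Python sketch below counts two such shapes in a small, made-up graph. This is plain shape counting, not the graph-grammar technique the project develops, and the example graph and function names are invented for this illustration. Squares here are cycles of four nodes, whether or not they also have a diagonal edge.

```python
from itertools import combinations


def build_adjacency(edges):
    """Turn a list of undirected edges into a node -> set-of-neighbours map."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_triangles(adj):
    """Count sets of three nodes that are all connected to each other."""
    return sum(
        1
        for a, b, c in combinations(sorted(adj), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )


def count_squares(adj):
    """Count cycles of length four.

    Two nodes with k common neighbours are opposite corners of
    k * (k - 1) / 2 squares. Every square has two pairs of opposite
    corners, so the total over all node pairs is halved.
    """
    total = 0
    for u, v in combinations(sorted(adj), 2):
        k = len(adj[u] & adj[v])
        total += k * (k - 1) // 2
    return total // 2


# A made-up graph: a triangle a-b-c sharing node c with a square c-d-e-f.
EDGES = [
    ("a", "b"), ("b", "c"), ("c", "a"),
    ("c", "d"), ("d", "e"), ("e", "f"), ("f", "c"),
]

if __name__ == "__main__":
    adj = build_adjacency(EDGES)
    print("triangles:", count_triangles(adj))
    print("squares:", count_squares(adj))
```

Checking every triple of nodes is only sensible for very small graphs. It keeps the sketch short, but real networks need more careful algorithms.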

The focus of this work is on developing and evaluating principled techniques that learn the building blocks of real-world networks. The team have put together a paper with preliminary results, which you can find here: Growing Graphs from Hyperedge Replacement Graph Grammars. Proc. of CIKM, 2016.

Consume and curate

As Tim said earlier, his studies are focussed on how people consume and curate information. The understanding coming out of work like this has a direct impact on the world around us.

Tim explained:

“Reasoning about knowledge graphs will make AI assistants like Siri and Alexa smarter. A principled understanding of the structure of graphs will provide valuable insight into the networks that govern natural phenomena, and a deep understanding of the role that social media plays in our world is critical to the maintenance of our society.”

Our thanks to Tim

I’d like to thank Tim for his time in talking to me and sharing some details of his fascinating research projects.

We’re keen to highlight other interesting work from anyone using our platforms over the coming months. Please get in touch if you have a story to share.

If you're interested in reading more about connections within collaborations, check out this research report on collaboration we published with Digital Science earlier this year.

Chris Walker

Marketing and Content Development

Chris develops new marketing content and ideas for Overleaf. With a technical background as a former systems engineer, he has a data-driven approach and is always looking to test new ideas before diving in.