Hal Daumé III is a Language Person

As a Volpi-Cupal Professor of computer science and language science at the University of Maryland and a senior principal researcher in the machine learning and fairness groups at Microsoft Research NYC, Hal Daumé III thinks a lot about how machines learn language. He also thinks a lot about trust: he leads TRAILS, the Institute for Trustworthy AI in Law & Society. Roadmap spoke with him about all of the above—plus béchamel sauce, toddlers, and The Terminator.

This interview has been edited for length and clarity.

What are you working on right now?

In a professional capacity, I’m trying to understand how we can make AI systems work better for people and work better for society. A lot of that is trying to understand the technical piece, the traditional computer science stuff—what I’ve been trained to do the longest. More recently, it’s also trying to understand how people make sense of technology and how technology is reshaping society in various ways. 

Outside of work, I have an almost three-and-a-half-year-old, so that’s a big part of how I spend my time. I’m a language person, I really love language. And with a three-and-a-half-year-old, I’m super excited to see all the language development. My kid is growing up speaking French, which I don’t really speak, so I’m trying to learn it at the same time he is. That’s been a bit of a challenge for me—it’s now getting to the point where his French is surpassing mine in some places, vocabulary in particular. He’ll say things, and I’m like, I don’t know if that’s a nonsense word or if it’s just French. 

What does work mean to you?

The first thing that comes to mind is what I do to earn a paycheck. I’m in a pretty fortunate position where [that’s] also what I really like doing. It also comes with downsides, in the sense that sometimes it can be hard to separate yourself from work. I do really like most aspects of my professional career.

Work at home is kind of similar, right? I love my kid, by the time it’s 2 p.m. and I’ve been on Zoom all day, I’d much rather go pick him up from school than—sorry, it’s 4 p.m. and we’re having a Zoom!

It’s okay, I have a stroller ready at the door too. It’s what I’m doing as soon as we sign off.

I certainly enjoy it, but it is also certainly work. There are things that have to get done because that’s the responsibility of being a parent. So maybe there’s some notion of responsibility [to how I think about work]. I have a responsibility to the people who pay me and to the students who take my classes and to the research organizations that fund me as well as to my colleagues. I also have a responsibility to my family.

Do you have a framework (or ideals or goals) for your career, or for the kind of work you do? Have those goals changed over time?

For a while, to be honest, I just wanted to do things out of intellectual curiosity.

Broadly I work in AI. More narrowly, I work in natural language processing, which everyone now knows as ChatGPT. Up until a year ago, no one on Earth had any idea what it was that I did. When I was in grad school forever ago, what I did was [because] I love language, I love math, I love computers. It put these things together in a fun way. It was like, “Let’s do fun stuff.”

What type of work can I do because I am in this position of relative freedom?

At some point in the last five to 10 years, the technologies that I had been building for fun started having impacts in the world. I still do things just because I’m intellectually curious, but it’s increasingly important to think about the broader questions of what this technology means for society as a whole. Are we developing [technologies] that are promoting human well-being? Or not?

As an academic, in particular, I’m not beholden to some company bottom line or to shareholders. I remember when I first started as a professor, I was talking to this much more senior AI guy about this project I wanted to do that was totally out there, off the beaten path. I didn’t know if I should [pursue it].

His advice was: Anything that a company is doing, don’t bother doing because they’re going to do it faster. They have different incentives. You’re an academic. Do the things that are interesting. Do the things that are quote unquote “good,” whatever that means. 

I’ve thought about that a lot over my career. What type of work can I do because I am in this position of relative freedom? A lot of that comes down to stuff that companies won’t do because it doesn’t move the bottom line. But a lot of that stuff is important.

One of the questions you pose in your research is “How can we get computers to learn through natural interaction with people/users?” How do you define “learn”? 

I could spend an hour talking about this one question. Fundamentally, learning is about having a system where past interactions impact its future behavior. Microsoft Word doesn’t learn; it behaves the same no matter what. Spotify learns what kind of songs you like: it adjusts its behavior through experience.
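To make that distinction concrete, here is a minimal, purely illustrative sketch in Python (the class names and scoring scheme are invented for this example, not taken from Word or Spotify): a static system behaves the same way regardless of history, while a learning one lets past feedback change what it does next.

```python
# Illustrative only: a toy contrast between a static system and one that "learns"
# in the sense above -- past interactions change future behavior.
from collections import defaultdict


class StaticPlayer:
    """Behaves the same way no matter what happened before (the Word analogy)."""

    def next_song(self, catalog):
        return catalog[0]


class LearningPlayer:
    """Adjusts future recommendations based on past feedback (the Spotify analogy)."""

    def __init__(self):
        self.genre_scores = defaultdict(float)

    def feedback(self, song, liked):
        # Past interaction: nudge the score for this song's genre up or down.
        self.genre_scores[song["genre"]] += 1.0 if liked else -1.0

    def next_song(self, catalog):
        # Future behavior now depends on everything the user did before.
        return max(catalog, key=lambda s: self.genre_scores[s["genre"]])


catalog = [{"title": "A", "genre": "jazz"}, {"title": "B", "genre": "metal"}]
player = LearningPlayer()
player.feedback({"title": "C", "genre": "metal"}, liked=True)
print(player.next_song(catalog)["title"])  # "B": behavior shifted by experience
```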

It’s easy to slip into thinking about memory in human rather than computer terms. There’s a conceptual link, but we’re not talking about a theory of mind here.

There’s a big debate between the behaviorists and cognitivists. The behaviorists are like, “Well, if it acts smart, it’s smart.” The cognitivists say, “Well no, it has to think like people think.” These debates are kind of silly—I honestly just don’t care. What I want to do is build technology that helps people do things that people want to do. Whether it’s quote unquote “intelligent,” whether it’s understanding all these things—that is just irrelevant to me. It turns out that a pretty effective way to help people do the things they want to do is to have a system that can learn rather than something that I just try to pre-program and anticipate everything that could happen. I say I’m an AI person, but I actually don’t care about intelligence per se.

You brought up Spotify as an example. Is that what you mean by “natural interaction”? A user interacting with a piece of technology in the normal course of their lives?

I used to do a lot of work on simultaneous translation, trying to build systems that would translate in real time from one language to another as someone is speaking. Suppose you’re speaking German, and I’m speaking English. I say something, and [the software tries to] translate it into German, but it comes out as gobbledygook because the translation system messed up. Then you’re going to say, “Oh, sorry, say that again?” or “What did you say?” Then I’ll repeat, or maybe paraphrase what I said, and then maybe it comes through. While you are communicating with me, in principle the system could say “Ah ha, she asked him to repeat himself. That suggests I got it wrong last time.” [The instruction is] implicit. 

What I want to do is build technology that helps people do things that people want to do. Whether it’s “intelligent,” whether it’s understanding all these things—that is just irrelevant to me.

Using Spotify, you explicitly tell the system: “I like this song,” “I don’t like this song.” We’re pretty good at learning from explicit feedback. Recommender systems are probably the most significant form of AI technology that people interact with on a daily basis, but we’re not perfect at it. When you give feedback to these systems, it tends to be very short term: “I don’t want to listen to this song RIGHT NOW.” It’s important [information], and you want to prevent people from going away from YouTube or Spotify right now. But what you actually care about is whether they are going to come back a month later. Even when you get this very explicit feedback, the challenge is how [to tie] this short-term context into longer-term engagement. This is also a challenge for implicit feedback, but the bigger challenge there is recognizing when someone’s natural behavior with a system is indicative of that system working or not working.
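One way to picture the gap he describes is a sketch like the following; the field names and the 30-day window are assumptions made up for illustration, not a description of any real recommender system:

```python
# Illustrative sketch of explicit vs. implicit feedback and short- vs. long-term signals.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interaction:
    explicit_like: Optional[bool]  # explicit feedback: "I like this" / "I don't like this"
    skipped: bool                  # implicit feedback: the user skipped within seconds
    returned_within_30_days: bool  # the longer-term signal the platform really cares about


def short_term_reward(x: Interaction) -> float:
    """Easy to measure: did the person respond well right now?"""
    if x.explicit_like is not None:
        return 1.0 if x.explicit_like else -1.0
    return -0.5 if x.skipped else 0.5


def long_term_reward(x: Interaction) -> float:
    """Hard to optimize: did the person come back a month later?"""
    return 1.0 if x.returned_within_30_days else 0.0


logs = [
    Interaction(explicit_like=None, skipped=True, returned_within_30_days=True),
    Interaction(explicit_like=True, skipped=False, returned_within_30_days=False),
]
# The tension described above: the two objectives can disagree on the same interaction.
print([(short_term_reward(x), long_term_reward(x)) for x in logs])
```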

One thing that’s really cool about language is that we can talk at lots of levels of abstraction. I’m into cooking, so I can say step one of this recipe is “make béchamel sauce.” For some people that’s a totally reasonable first step. But for other people [that recipe needs to be]: “melt butter, add flour, add heavy cream.” And you can go lower level: “Take the butter out of the fridge. Cut off one tablespoon. Put it in a pan. Turn on the heat.” At a super low level you might even say something like: “Move your hand six inches to the right. Grab the yellow stick in front of you.” Language allows us to communicate in all these ways based on what we both know in common. In human communication, if I tell you to “make béchamel sauce,” and you don’t know how—you’ll just ask.

If you give a little digital assistant some instruction that’s the equivalent of “make béchamel sauce,” this high-level thing decomposes into all of these other things that [the assistant] may or may not know how to do. If it doesn’t know the recipe, we need to build up that common ground. If it doesn’t know what butter is, then we need to build up even more common ground. We’re not going to have a deployed system doing these things any time soon, but [based on our and others’ research] this sort of approach could work.
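A toy sketch of that idea, with skill names and a decomposition table invented purely for illustration: the assistant expands a high-level instruction into steps it already knows, and when the shared vocabulary runs out, it falls back to asking, the way a person would.

```python
# Illustrative only: the skills, recipe, and clarification behavior below are
# made up to show the "common ground" idea, not any deployed assistant.
KNOWN_SKILLS = {"melt butter", "add flour", "add heavy cream"}

# Hypothetical shared knowledge: how one high-level step breaks down.
DECOMPOSITIONS = {
    "make béchamel sauce": ["melt butter", "add flour", "add heavy cream"],
}


def execute(instruction: str) -> list[str]:
    """Decompose an instruction until every step is a known skill,
    asking the human for help when the common ground runs out."""
    if instruction in KNOWN_SKILLS:
        return [f"do: {instruction}"]
    if instruction in DECOMPOSITIONS:
        plan = []
        for step in DECOMPOSITIONS[instruction]:
            plan.extend(execute(step))
        return plan
    # No shared understanding yet: ask, then (in a real system) remember the
    # answer so the common ground grows over time.
    return [f"ask: how do I '{instruction}'?"]


print(execute("make béchamel sauce"))
print(execute("make hollandaise"))  # falls back to asking, like a person would
```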

Language allows us to communicate in all these ways based on what we both know in common.

A dream that lots of AI people have is to have a little robot in your house to help your elderly parent with physical tasks. My elderly parent is going to want the system to do different things than what your elderly parent is going to want, and they both need to be able to teach the robot to do new things. But my parent is not an AI expert, right? So they’re going to need to be able to teach the system to do new things in ways that are natural to the person—not the AI.

Another one of your research goals is to minimize harm in learned models, which you’ve defined as promoting fairness, transparency, and explainability. What does that look like to you at this moment in time?

If I could wave my magic wand and change one thing, I would get people to stop worrying about quote unquote “autonomous AI systems” and start thinking more about building systems that help people. This is often called augmented intelligence instead of artificial intelligence.

If you go back to the 1950s, when AI was starting as a field, all of the goals of the two dozen people, or however many were at Dartmouth, were like, “We want fully automated systems that can do intelligent-sounding things like play chess.” It turns out that playing chess is not actually that hard. Making a cup of coffee is much harder than playing chess.

If I could wave my magic wand and change one thing, I would get people to stop worrying about “autonomous AI systems” and start thinking more about building systems that help people.

I get it. If what you’re interested in is [the concept of] intelligence and understanding—then yeah, thinking about autonomy maybe makes sense. But I’m interested in systems that help people do the things that they want to do, and that fundamentally means not autonomous. It means a system working with a person to solve some task.

In some ways a lot of the technology required for automation and augmentation is similar, but the emphasis is not on trying to fully automate a system. It’s on trying to find some need that people have and then to automate the parts of it that they don’t like, or help them with the parts that are hard but not fun. It’s important to remember that people actually like doing hard things. 

Like making béchamel sauce.

And once you start thinking about augmented intelligence rather than autonomous systems, you immediately confront this question: What do people actually want? Is this system making people’s lives better or worse? If your objective is to help people, you take different steps than what you’d take if you’re trying to prove that you’ve made an “intelligent” system, whatever that means.

The recent drama at OpenAI has again brought the doomsday-like concerns—a.k.a. the Terminator scenario—commonly associated with the effective altruism movement into mainstream conversation. But this is a very different kind of danger than the one represented by, say, algorithmic bias. (To say nothing of energy-use concerns, or AI’s potential to impact wages and job stability.) How do you think about these many different types of alarm bells, and how do you weigh their likelihood (or priority)?

I definitely don’t subscribe to the existential risk camp.

Some things are undeniably true: People are being harmed today by AI systems. We need to do something about that, and by we I mean everyone. There are technical challenges, regulation challenges, civil society challenges. These are well-known [and long-standing] issues. Some of this requires continued research. Some, committing to best practices. Some of it, probably some form of regulation—or enforcing official rules that are already on the books.

[As for] the “may happen someday in the future in my science fiction novel” stuff? I see zero evidence that we’re even remotely close to some situation where some rogue AI decides we are an impediment to its continued progress and therefore kills us all. It boggles my mind that somehow people look at ChatGPT and jump from there to killer robots. We can barely get a robot arm to pick up a plate.

It boggles my mind that somehow people look at ChatGPT and jump from there to killer robots. We can barely get a robot arm to pick up a plate.

The real concern is a bad guy with an AI, not bad AI. Everyone talks about deep fakes, misinformation, the national security risks of hacking: these are totally realistic concerns that have nothing to do with some rogue AI system.

Where I do agree with the existential risk crowd is that we know that the AI systems we’ve built today are often pretty underconstrained. If you are using an image generator tool, you write like five words, and it produces an amazing image. That tool did a lot of things that you didn’t tell it to do, including following a lot of social norms. If you ask for “a wintery scene with a fireplace” but don’t say “don’t put a cat in it,” you are probably okay if a cat is in the picture. But if you don’t say, “don’t put a naked person in it,” you probably aren’t okay if a naked person is in the picture. If I had to tell the tool all the things I expect from it, what’s the point? This goes back to the issue of shared common ground, or the alignment problem. How do you communicate common ground to machines in a way they can adhere to?

Tell me about your work with TRAILS.

TRAILS is this new institute at the University of Maryland, George Washington University, Morgan State University, and Cornell that looks at trustworthy AI.

So what is trust? Roughly, it has something to do with me being willing to turn over something to someone else when there is a real risk to me. When I use Google Maps I’m trusting that it won’t drive me into a river. If there is no downside to me, then there’s no requirement for trust. I have to put myself in a vulnerable position. 

Trust is a really personal thing; risk is a really personal thing. You can’t ignore the person. We’re trying to understand what the things are that, if people knew about them, would enable them to make better personal decisions about AI systems—whether to make themselves vulnerable to those systems. Can that system communicate when it’s not sure about something so the person can make a better-informed decision? Can the system demonstrate that it doesn’t behave erratically?

Trust is not just the person using the technology, not just the technology itself, but the social infrastructure that ensures some kind of accountability.

People are very unlikely to trust things that they have no say in. So how do we involve society writ large in the development of AI? OpenAI is made up of a couple hundred pretty demographically homogeneous people who are making a lot of decisions that impact a lot of people. Especially when we’re talking about these questions of norms, common ground, expected and unexpected behavior—doing that without broad input is concerning. 

But trust is more than just the AI system. When I’m driving down the highway and there’s an 18-wheeler passing me, I trust that it isn’t going to switch lanes and hit me. Why do I trust that? I don’t know the driver, but I do know that there’s a licensing process, there’s a number on the back of the truck I can call if something goes wrong, there’s car insurance, there are regulations around the construction of the truck. All of this stuff that we’ve built up as a society means that I am willing to let this truck pass me.

Trust is not just the person using the technology, not just the technology itself, but the social infrastructure that ensures some kind of accountability. And it’s built up over time: We didn’t used to have stop lights. We didn’t used to have licensing. We didn’t used to have seatbelts. It’s harder for infrastructure like this to keep up with AI, but it’s what we need.

We talked a little bit about how labor efforts—this remarkable past summer of successful union-led strikes—have been more effective in setting guardrails around AI’s usage than any kind of government intervention. How do you think the regulation of AI, whether by labor or by our elected officials, will proceed going forward?

This is very U.S. centric. The EU has the EU AI Act. China has a data privacy law. The fact that the U.S. hasn’t even passed anything remotely like a data privacy law makes me extremely pessimistic about whether we’re going to pass anything that looks like an AI law. I was honestly shocked by the Biden executive order. I was surprised that anything came out, and then I was surprised by what it actually ordered. Of course [those orders] require money, which is controlled by Congress, so I remain somewhat pessimistic.

When I’ve read interviews with people who are striking, it’s made me realize that all of these professions are a lot more [complex] than what the automated systems that claim to replace them can actually do. Okay, a self-driving truck can drive, but it turns out that the job of someone who drives trucks is also to repair those trucks. Not to harp on this point, but this [is what happens when] you’re trying to build a system to automate someone away versus help someone. And if you are trying to help them, you should probably understand what their job actually is.

What about your work—or your field—is most surprising to you right now?

Image and video synthesis blows my mind at how good it is. If you asked me two years ago if this is where we’d be, I’d be like “No way.”

What is the toughest challenge you—or your field—currently face? 

How do we build AI systems that help people rather than replace people?

This interview was published by Roadmap.

