Linguistics in AI-Powered Text Analytics (feat. Michelle McSweeney)
Alumni Aloud Episode 33
Michelle McSweeney is director of data quality at Converseon, a social media analytics and consulting agency. Michelle earned her PhD in linguistics at the Graduate Center in 2016.
In this episode, Michelle tells us about how to negotiate a job offer and earn what you’re really worth; the rewards and drawbacks of shifting your professional identity after academia; and how following your interests—even when they might seem incompatible at face value—can ultimately empower and even transform the value of your work and help you stand out on the job market.
VOICE-OVER: This is Alumni Aloud, a podcast by Graduate Center students for Graduate Center students. In each episode we talk with a GC graduate about their career path, the ins and outs of their current position, and the career advice they have for students. This series is sponsored by the Graduate Center’s Office of Career Planning & Professional Development.
ANDERS WALLACE, HOST: I’m Anders Wallace, a PhD Candidate in the Anthropology program at The Graduate Center. In this episode I sit down with Michelle McSweeney, who is Director of Data Quality at Converseon, a social media analytics and consulting agency. Michelle earned her PhD in linguistics at The Graduate Center in 2016. In this episode Michelle and I talk about how to negotiate a job offer and earn what you’re really worth, the rewards and drawbacks of shifting your professional identity after academia, and how following your interests, even when they might seem incompatible at face value, can ultimately empower and even transform the value of your work and help you stand out on the job market. So what’s your name and what are you currently doing for a living?
MICHELLE McSWEENEY, GUEST: So my name is Michelle McSweeney and I am the Director of Data Quality at a small social media analytics company called Converseon. And I oversee most of our data pipelines, our annotation processes and do some of the backend engineering to make sure that we get the most out of our data in preparation for machine learning.
WALLACE: And this kind of data is social media data?
McSWEENEY: It’s primarily social media data so it comes from like Twitter and Reddit and all kinds of different places and we do get some longer form like news articles and things like that but it’s all data that you’d find on the internet.
WALLACE: And it’s for companies that enroll you to do this for them?
McSWEENEY: Yes so my company develops machine learning algorithms. I often joke that we annotate the internet or are assigning sentiment to the internet. We have a real problem with negativity. *laughs*
WALLACE: *laughs* Yeah I was going to say.
McSWEENEY: But what we sell are these algorithms and machine learning approaches to annotate data. To identify what is customer experience or what is trust or to do sentiment analysis. So we get all of this data in, we get it all labeled and then we use those labels to develop our algorithms. We do some engineering on the other side as well.
WALLACE: Can you tell me a bit more about the path that led you to this job?
McSWEENEY: So I finished my PhD in 2016 in linguistics with a certificate in interactive technology and pedagogy, which was the best thing I ever did for my education was to combine the two. So I finish that and then I had a postdoc at Columbia for 2 years where I kind of was trying to figure out if I wanted to stay in academia.
WALLACE: And that was in linguistics?
McSWEENEY: That was not in linguistics that was in architecture.
WALLACE: Interesting. How did that happen?
McSWEENEY: I applied. *laughs* Quite honestly Matt Gold connected me with the job opening. And I have a non-negligible technical background which I developed completely at The Graduate Center: Python and GIS. So I learned Python in the linguistics program. I took one course and then was largely self-taught. And then I learned GIS because I really wanted to do this project about languages in the subway for my ITP project. So I had both of those technical skills and they actually complement each other really nicely. Python for text analysis and GIS. Not because they help each other, but because they give you a really well rounded view into technology because you know you need to know web design for both of them. And the text analysis gives you this world into like machine learning and programming for a purpose and the GIS gives you this view into like how software packages work like this. So combining the two was like really just incredibly beneficial and that’s how I ended up at Columbia, is because I had this breadth rather than depth of that technical experience.
WALLACE: As far as architecture goes I can see a similarity in mapping.
McSWEENEY: It’s a little funny because quite honestly, I was a linguist in an architecture department. But they had a grant for architecture urbanism in the humanities and I brought a humanities bent to a lot of the projects, considering language and considering like text analysis. I did my first project there on language in the city which was really great. And my second big project there was about gun control and immigration and how these politically polarizing topics are framed in the media. So I was able to use my text analysis skills there and having the freedom to do that project was amazing.
WALLACE: So the postdoc was really nice as far as giving you a lot of freedom to explore your interests.
McSWEENEY: Absolutely. The postdoc also gave me the time to figure out what I was going to do; if I wanted to stay or if I wanted to go.
WALLACE: Tell me more about the transition out of that postdoc.
McSWEENEY: So towards the end of it I realized that even though I loved some parts of an academic career… it’s not that I don’t love an academic career, it’s that I wanted to know what other things felt like you know. I had been in higher education at this point for 15 years. I worked as a something for 15 years in higher ed. Not all was my dissertation, there are other things. But I was like I have no idea what the experience of the majority of people is. Like the people I pass on the street, I have no idea what their day would look like or like how business works or how the private sector works at all. So in order to come back to academia I felt that I needed to have that experience.
WALLACE: That’s very honest to admit that you were genuinely curious.
McSWEENEY: Like if you’re doing a masters or PhD, you’re a curious person so yeah I just really wanted to know what life looked like.
WALLACE: Finding your fit with this company makes a lot of sense given your skills and your interests in language. Did it take a lot of work for you then to find this opportunity at Converseon?
McSWEENEY: Somehow, if a lot of work looks like setting up an alert for every possible search combination you can imagine. I had to set up alerts for key words I was interested in: data science, natural language processing, linguistics. I had all of these terms. And then I was searching those from like Glassdoor and Indeed and Google so I was getting tons of emails every day. But I think I only applied to two private sector jobs.
WALLACE: Just two?
McSWEENEY: Just two. I had also applied to a data science bootcamp. So it was one of the free ones for PhDs who have a math background and I got accepted and I was planning to do that. And then I got offered this job at Converseon and I was like alright and I decided that I would rather work for some time and develop my skills there rather than just do a boot camp that was going to be like 8 weeks and then at the end of it you have a bunch of projects. They’re kind of a head hunter’s boot camp type of thing: if you have a PhD and some math background, they train you to be a data scientist. And ultimately, even though I have this really cool job right now I probably want to go more into the data science track rather than the data engineering.
WALLACE: Interesting, what’s the difference?
McSWEENEY: Yes so data science is more research and super cool super interesting. Data engineering is a lot of cleaning and preparing the data to build algorithms with. So I get to ask fewer questions but I get to work with languages like Thai and Vietnamese and Arabic to prepare all of that data to build like sentiment classifiers which is you know unheard of in a job. So even though this is a cool data engineering job I would prefer to be more on the like research side.
WALLACE: Yeah that makes sense. And can you say the name of this boot camp, I imagine people might be interested.
McSWEENEY: Yes, Insight Data Science.
WALLACE: This opportunity with Converseon worked out in the end.
McSWEENEY: I mean this also speaks to why this was a really good decision. Is that I have under me an entire research team right now. And it is three Master’s students in the computational linguistics program who we’ve hired on as our interns. And we have them building out classifiers and like working with data and it’s directly applicable to what they’re learning. And being in this position I have the freedom to build up this team and do really cool things.
WALLACE: So there’s still an educational, mentoring component?
McSWEENEY: Totally, and that is probably the most meaningful part of my job is guiding one of my interns to build a neural net in Chinese. Which is you know the first neural net that we’ve built at this company and it’s really exciting to watch her develop those skills as she’s building it.
WALLACE: Also give me a bit more of an overview about your academic background; I noticed that you did a BA in chemistry?
McSWEENEY: My Bachelor’s is in chemistry and math. My final project, my bachelor’s thesis, was an art installation on the relationship between 0 and god. I did chemistry in part because I wanted to become a doctor but in part because I thought that like snowflakes were really beautiful right. And like I really wanted to like understand how snowflakes came to be so I wanted to go into the hard sciences too. Learned that thing. Very cool. I encourage you to look it up; this is pre-Wikipedia. *laughs* Yes so then after my bachelor’s I worked at the local community college and I taught math in a program for first generation low-income college students and loved that job, so cool. And then I went to Peace Corps where I taught English and wrote a grammar for Lusaka which is an East African Bantu language. But before that I had this like full-time job and I was you know I’m in my mid-twenties and did not need the money that came with a full-time job at all, which is a great problem to have.
So I took classes at night at the local university. And I took a Tagalog class and I bring my Tagalog instructor this like grid and I’m like look what Japanese does, look what French does, look what Tagalog does, look what English does. What is going on with Tagalog and its verb conjugations? I was like, “this doesn’t fit any of the patterns” and he looked at me and he’s like, “oh I thought you taught math.” I was like, “yeah I teach math.” And he’s like, “not linguistics?” I kid you not I was like, “what’s linguistics?” So then I took the intro to linguistics classes and was like, “oh my gosh this is the next question.” Like now I know how snowflakes happen, now I want to know how language happens right. So then when I went to Peace Corps I had that frame of mind. And I was like, “I’m going to do a PhD in linguistics when I get back.”
WALLACE: So the light bulb had gone off and then you’re off to this fascinating place for linguistic diversity.
WALLACE: And then you came to your PhD.
McSWEENEY: So I came to do my PhD. I was really committed at the beginning to study the syntax-semantics interface of East African Bantu languages. You don’t need to know what that is to know that it’s dry. And I came to this point where I was like you know I’ve been back in the states for a while. I feel weird working on a language that is not mine. Like I have no claim to this language whatsoever. It didn’t feel appropriate anymore so I decided to work on text messaging. *laughs* Because… it was a few years ago and this was before emojis had different skin tones and understanding how bilinguals text because it’s computationally a really interesting problem. So I took the intro to programming class in the linguistics department. I started the ITP program and I was like re-inspired by all of these questions and making these questions more contemporary, situated in New York, situated in what the lived experience of the people around me was.
WALLACE: Came alive for you in the context and with the tools that you were learning.
McSWEENEY: Totally and was like deeply inspired by the ITP program. And not only in terms of like there’s all these questions but also we don’t necessarily need to answer the questions in the most technologically advanced way possible. Sometimes the question should be answered with pencil and paper and that’s a technology. Sometimes a question should be answered in more advanced ways, computer is actually better. That’s something that I’ve carried with me, really at the front of my mind ever since then. Even at this company, we built this neural net and that’s a very new it’s like the very edge of AI. So it’s like the new exciting thing, they’re a couple years old, everyone’s developing them like everyone’s excited about them. But they don’t actually solve every problem very well. Sometimes you need what we’ve been doing for 15 years; that’s actually the better model. So that kernel has really guided the past 5 years of my career.
WALLACE: Now could you walk me through your job now? Can you tell me for instance like a typical day?
McSWEENEY: So I’ll give you a typical month. So we’re currently building out classifiers in Hindi, Japanese and Arabic. So I have hired a ton of freelancers who code this data. So last month I hired freelancers who speak some of these languages and I work with them sending data out, getting data back from them. I’ve written a bunch of programs to arbitrate these data and get statistical reliability scores from them by comparing how much is agreed upon, how much is not. So a lot of my day is just passing data back and forth, that’s probably about 30 percent of my month, is just sending this data around. Another 30 percent of my month is meetings. There are so many meetings. Right before this I was in a meeting and it was about two hours long and it happens every two weeks, it’s our development meeting. And it’s a great meeting because there’s one person from the business side who’s selling the products, the developers and then the data science/data pipeline people. So it’s that full range so we can all get on the same page together as we’re building out this product.
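[Editor's illustration: McSweeney does not describe her programs in detail, so what follows is only a minimal sketch of what comparing "how much is agreed upon" between two annotators can look like. It assumes two freelancers labeled the same items; the label names, the sample data, and the function names are all hypothetical, and Cohen's kappa is one common reliability score, not necessarily the one her team uses.]

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Share of items on which two annotators chose the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(labels_a)
    observed = percent_agreement(labels_a, labels_b)
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both annotators pick the same label
    # if each labeled at random according to their own label frequencies.
    expected = sum(
        counts_a[label] * counts_b[label] for label in set(labels_a) | set(labels_b)
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def items_to_arbitrate(labels_a, labels_b):
    """Indices where the annotators disagree and a third opinion is needed."""
    return [i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a != b]


# Hypothetical sentiment labels from two freelancers on the same six posts.
annotator_a = ["pos", "neg", "neu", "pos", "neg", "pos"]
annotator_b = ["pos", "neg", "pos", "pos", "neg", "neu"]

print(f"agreement: {percent_agreement(annotator_a, annotator_b):.2f}")  # 0.67
print(f"kappa:     {cohens_kappa(annotator_a, annotator_b):.2f}")       # 0.45
print("arbitrate:", items_to_arbitrate(annotator_a, annotator_b))       # [2, 5]
```

[In this sketch the two annotators agree on four of six posts, but kappa is lower than raw agreement because some of that agreement would be expected by chance; the disagreeing items are the ones an arbiter would review.]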
WALLACE: This is a meeting to share what you’ve been doing and where you’re heading to align yourselves.
McSWEENEY: Yeah and it’s also a meeting to say okay we need this feature, how are we going to get to this feature? If we build it this way we have to make those sacrifices and if we build it that way we have to make these sacrifices. And being able to have that dialogue is really invaluable because it means that we can quickly change directions if something breaks or something isn’t working. And we can all be on the same page about how things are going to get built out. And I think that something like that can only really happen at a small company because otherwise it would just be all the senior level people coming together interpreting what the people below them said. So about 30 percent is meetings and then the other 30 percent is language engineering. So I am extremely lucky that I have a lot of autonomy and I’ll notice or one of our report writers will say, “there’s something crazy going on with German, like German sentiment is all out of whack.” And you know I can go into it and be like “oh I know how we should fix that, like let’s try this, let’s run these three experiments and see what we get out of it.” That’s the most fun, problem-solving, fixing things.
WALLACE: And you were talking about this as well that you have a smaller company that you’re at now. What’s the atmosphere like at your company, can you talk more about that?
McSWEENEY: It’s an interesting company, there’s a pretty large C-suite. C-suite being like the chief people and then there’s two people at the director level that I’m at and then it’s pretty flat after that in terms of structure like hierarchy. Now it’s really cool being a small company because I get to talk to everybody. Like I understand the problems and the pain points that everyone’s having and it does feel like we’re all trying to achieve this thing together. You know there’s a real feeling of like, “all right we’re all going to work together to like reach this goal and make this thing exist,” which is really awesome. That said, some of the drawbacks are that there can be cases where other people around me say “stay in your lane.” It’s something that I don’t say because I haven’t been in business long enough to say things like this. So this idea of staying in your lane like, “I shouldn’t be doing things that you know the person who is basically my assistant should be doing.” But then in a really small company, you just are trying to get things done. So we’re kind of in an awkward spot where there is hierarchy, like it’s not a start-up, not new. But you know that hierarchy is hard to preserve because there’s just so much to do.
WALLACE: What do you find the most rewarding about your job or the most enjoyable parts of your job?
McSWEENEY: Have you ever talked to anybody who gets paid to make machine learning classifiers in languages like Thai and Vietnamese and Hindi and Arabic and like German, French, Spanish? I could go on, we have 15 languages. So definitely aside from working with my interns I find building classifiers in all these languages so rewarding.
WALLACE: Because it’s the intellectual stimulation?
McSWEENEY: It’s half like it’s just really interesting because I get to learn a lot about… Vietnamese intensification can be done with doubling the adjective. I’m “happy happy” today. I didn’t know that! But what I think is more interesting even than that is that by working in these languages and building out ways to “listen” in these languages, while there’s a real problem with the quantification of language and the quantification of communication, ultimately building out classifiers or building out any kind of machine learning for languages that are less commonly supported democratizes who we’re listening to. So now the companies that come to us, the companies that we work with, are now listening to Vietnamese speakers, they’re listening to Thai speakers. And before those voices were not being heard at the same level. Like, full eyes open, they’re doing this because they believe there’s a market there. But I think it’s really valuable to get to work in all of these languages and try to give more people a voice.
WALLACE: People who use Twitter, many of them would like that the company would listen to what they’re saying.
McSWEENEY: Exactly. The reason that I value that I think comes a lot from what I learned in the ITP program. And just like putting these things into context about who’s being listened to and the hopes of the 1997 internet right. Like in 1997 we thought that the Internet was going to democratize the world and it didn’t. But it feels like building out these algorithms that can listen to more people, even though it aggregates all of these people together, listening is still a nod to that hope.
WALLACE: That’s fascinating! So what about the challenges, what are some things that frustrate you?
McSWEENEY: I miss having my autonomy, I cannot even tell you. I really miss being in control of my day. I didn’t think that I would because when I was in an academic setting I worked a 9-to-5. But now that I have to it feels different.
WALLACE: It’s more of an effect than you thought it was.
McSWEENEY: Yeah and there’s a loss of identity. I’m still adjuncting, so I adjunct at Pratt and here at the GC. Which I find really valuable because I feel like I have more to bring to the classroom now than I ever had before.
WALLACE: Because of the context of real world work?
McSWEENEY: Absolutely, because I fully recognize that the vast majority of my students are not going to become academics. And now I can contextualize everything that I’m talking about. I feel like I’m such a better teacher. So I have one foot in but there’s still this like loss of identity of being “an academic.”
WALLACE: Do you feel like it’s part of taking on a new professional identity that you’re in the middle so to speak?
McSWEENEY: Absolutely, the awkward teenage years. *laughs* I do feel like I’m transitioning that identity. I mean that’s what I wanted anyhow like I wanted to fully understand and I don’t think that I would understand it without accepting the transition. I’ve been at the company about 6 months, 8 months something like that. And about a month ago something shifted for me and I stopped seeing myself even in my classroom as being an academic who went into private sector and rather as a tech professional who also has these things to bring back to the classroom. And I still have some like… you know I have a book contract right now and like I have a podcast. So like I have one big foot in but I’m starting to actually feel like a professional who has a foot in academia as opposed to an academic who has a foot in the professional.
WALLACE: That’s very encouraging to students who would want to make that shift and feel apprehensive about just that loss of identity. That there is another horizon once you get there.
McSWEENEY: Yeah, it was hard. It was really hard. It was probably one of the like top 10 hardest transitions I’ve made in my life.
WALLACE: Any particular aspects of that that were challenging?
McSWEENEY: When you give your opinion on something and you are a PhD student or a PhD candidate or a postdoc at xyz, you have that entire institutional affiliation behind you, validating whatever you say right. It lends a certain credibility of course and people reach out to you to ask your opinion. And I still have a little bit of that because of the podcast that I have but it’s not like it was before. So losing that position in society, basically not being a pundit, it felt like a loss of status.
WALLACE: What do you think are some important keys to be successful? You’ve been in your job now for 6-8 months, have you noticed any elements of the transition that you feel like are keys to thriving?
McSWEENEY: Make sure you like the culture that you are coming in to. Really pay attention to that. I think I got really lucky and I don’t think that I paid attention to it as much as I would suggest others to. But honestly there’s not a single day that goes by that I’m not like “wow I am so lucky to get to do this stuff.” And I think finding a job that you’re genuinely interested in is essential. The other thing that I’m going to say may or may not be totally related but research what you’re worth and research what the position is worth. Because you know when I went into this job I had an idea in my mind of what this position was worth and I was offered considerably less than that and I negotiated back up to something that I thought was fair. And then recently I just got another raise after again positioning myself, saying this is what this position is worth, I need you guys to pay me this. And knowing that I feel like has given me a lot more confidence. Now coming out of academia, that number seemed astronomical. Like people make that much? But like you know I’d never made much. I was like a post doc and I was lucky to have my post doc. But understanding what my fair market value was, was an essential thing to transitioning.
WALLACE: And a sense of giving you the confidence to ask for what you were worth. That’s good news for people to hear. Did you have any mentors or relationships that helped you in your career switch?
McSWEENEY: You know somehow I think it was kind of hard for my committee, which sounds so weird. When I was making the decision I think it was hard for my dissertation committee because like they just hadn’t had that experience you know. And they wanted to be really supportive and they were and I was so grateful for that, I can’t even tell you. But I don’t think that they had had that experience. And then I also taught a course with a woman who had started her own business. And you know it’s just a really small business, Data Dozen, and she’s great. I drew a lot of strength from her because you know she had gone out on her own, she didn’t have a PhD, she didn’t come from an academic background. But she was really inspired by this, went out and did it and made it happen. Teaching people Tableau, teaching people about data and data visualization and like workshops.
And seeing her do that was actually really inspiring because you know she was the first person, I’m almost embarrassed to say, that I knew who worked in the private sector. Everyone that I know works either in the public sector or nonprofits or academic, except one brother-in-law had done the PhD data science position. And he was incredibly helpful and incredibly supportive. I cannot say that I actually had a mentor and it kind of made it terrifying. I had people around me who were really courageous and really strong. And seeing them live this life that they had wanted to create in this particular way you know, like I’ve always had people around me who like have this internal locus of control and are like shaping their life to be what it is but not in this exact direction. Having these people, these 2 people, as inspiration was really helpful. Spent a lot of time on forums.
WALLACE: Was there anything that you felt was particularly helpful in your job search? Either an action that you took or resource or channel that you used?
McSWEENEY: Honestly setting up all the filters and all the searches was like really… that’s what did it. Aside from that yeah it was kind of brazen of me to only apply for 2 jobs and I definitely would’ve applied for more. But I also didn’t apply for anything I knew I wasn’t going to get because I knew I was coming out of my PhD program, I had all this like language experience, but I didn’t really have data science experience. My weakest link there was definitely like my programming skills. So going into the private sector I kind of understood that like what I was bringing was like insight, understanding, maturity, all these things. I was not necessarily a 25-year-old computer science major. That’s never what I’m going to be and that’s cool you know. So I wasn’t going to apply to things that were looking for that.
WALLACE: Are there any specific skills that you got from your PhD that have been especially helpful in your new job?
McSWEENEY: My entire dissertation! *laughs* I think I got really, really lucky.
WALLACE: *laughs* Other than hitting the nail on the head topic-wise.
McSWEENEY: Aside from that you know and this is something I’ve seen at a couple other places since then… is that a lot of companies need to hire for maturity. They need to hire someone who has like some technical experience and some research experience but not necessarily be the expert in the technical experience or the expert in the domain. But rather they need to hire somebody who can like manage the team that is going to either build the thing or manage the team who’s going to be the expert in the domain. And I think that coming out with a PhD, that’s someone who has that maturity. That’s someone who may understand the technical stuff and be able to do some of it very slowly. Research code is not development code. Or might understand the domain but like just not be a hyper-narrow expert or be willing to like take that entry-level position. So I think coming out of the PhD, the one thing that it taught me is how to see how things connect really well and the maturity to weather basically any storm. Like there can be drama and there can be problems but it’s like, “okay we’re just going to stay the course.” I think we will address the problems as we can.
WALLACE: Maturity and systems level thinking. Obviously mentoring a team and project management.
McSWEENEY: Yeah, yeah. And those are definitely skills I learned in the PhD.
WALLACE: Yeah, wonderful. What do you know now that you wish you’d known as a graduate student? Any perspectives on how you might have done things differently if you could have spoken to your 10-years-younger self?
McSWEENEY: That outside of an academic setting, nobody has a PhD. It’s actually a really like impressive credential once you get outside of your PhD program. It is a credential that people are like, “oh you know things, you can think through things.” Also that my skills are worth way more than I thought they were. I definitely think in a PhD program you are surrounded by people who are exceedingly skilled, amazingly like out-of-this-world competent people right. And you don’t need to be that competitively, intensely competent and skilled to do a job really well. And I wish I would have known that because I was terrified. I was like, “I’m not going to be able to hack it outside of an academic institution.” And that wasn’t it at all.
WALLACE: So that’s like an imposter syndrome, many people will relate.
McSWEENEY: Totally, yeah, it’s like the canonical imposter syndrome. So I wish I could talk to my younger self and say that. And the other thing that I would say would be, you know, enjoy this. Take the classes that you’re interested in because you’re interested in them. That thing that you know you want to develop, don’t push it to be second rate. Figure out some way to manipulate your projects, manipulate your research, so that that thing that you want to develop about yourself is center stage. Because never again do you get the freedom, even if you stay in an academic setting, like you never get the freedom that you have in a PhD program to shape whatever you’re going to research. You know you have your committee and all of these things like yeah.
WALLACE: Take advantage of the time and space.
WALLACE: Not get bogged down too much with the imposed wisdom of the requirements of the perceived expectations.
McSWEENEY: Exactly. And don’t try to be strategic about your dissertation in terms of getting you a job. Be strategic about your dissertation or your research or your projects based on what you’re most interested in and excited about.
WALLACE, VOICE-OVER: That does it for this episode of Alumni Aloud. I want to thank Michelle for coming on the show to share what it’s like to apply your research and data analysis skills in the private sector. Remember to stay tuned for more episodes of Alumni Aloud, published every 2 weeks during the fall and spring semesters. Subscribe on iTunes and you’ll automatically be notified of new episodes. Also check out our Facebook, Twitter and career planning website at cuny.is/careerplan for more updates from our office or to make appointments with our career counselors. Thanks for listening and see you next time!
This entry is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.