This instrument is a guitalele!
Giannis Karamanolakis

Ph.D. Student in CS @ Columbia

Hello, world!

I am a third-year Ph.D. student in Computer Science at Columbia University under the supervision of Prof. Luis Gravano and Prof. Daniel Hsu. My research interests lie in the fields of Machine Learning, Information Extraction, and Natural Language Processing.

Currently, I am developing weakly supervised learning frameworks for knowledge extraction from text. I am interested in training deep neural networks for real-world tasks with limited or no training labels, using alternative supervision signals such as noisy/proxy labels and logical rules. I have applied these frameworks to sentiment analysis, topic extraction, and mining social media for rare events related to public health.

In the past, I have worked on human emotion recognition from conversational speech data, music information retrieval, and multimodal word embeddings grounded in the visual and auditory sense modalities.

In addition to doing research, I love playing the bass guitar, windsurfing, taking photos and traveling.
Update: Last summer in Seattle I got my sailing certificate!

For more information about me, see my CV or contact me.


Columbia University

Ph.D. in Computer Science

Columbia University

M.Sc. in Computer Science

National Technical University of Athens

M.Eng. in Electrical and Computer Engineering

Professional Experience

Amazon Research (Product Graph Team)

Applied Scientist / Machine Learning Scientist Intern

Behavioral Signals

Machine Learning Engineer


For more details, please see my full CV (PDF).


A Yelp restaurant review discussing food poisoning.

Information Extraction from Social Media for Public Health

Joint work with Tom Effland and Lampros Flokas

Advised by Luis Gravano and Daniel Hsu

We are collaborating with the NYC Department of Health and Mental Hygiene (DOHMH) on processing social media data for public health applications. Our current focus is on detecting and acting on foodborne illness outbreaks in restaurants by building systems that track user complaints on social media (e.g., Yelp reviews, tweets).

Recently, we proposed weakly supervised deep learning models for this task (see Papers).

[project page] [paper]

A product review with manual aspect annotations.

Weakly Supervised Aspect Detection in Online Reviews

Advised by Luis Gravano and Daniel Hsu

We are developing deep learning models that annotate online reviews (e.g., product reviews, restaurant reviews) with aspects (e.g., price, image, food quality). Manually collecting aspect labels for training is expensive, so we propose a weakly supervised learning framework, which only requires the user to provide a few descriptive keywords (seed words) for each aspect.

[paper1] [paper2]
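To make the seed-word idea concrete, here is a minimal, purely illustrative sketch (not the actual model): a bag-of-words "teacher" labels each sentence with the aspect whose seed words it mentions most, and such weak labels can then be used to train a neural "student" classifier. The seed words below are hypothetical examples.

```python
# Illustrative seed-word weak supervision: the SEED_WORDS dictionary is
# the only supervision the user provides (hypothetical examples here).
SEED_WORDS = {
    "price": {"price", "cheap", "expensive", "value", "prices"},
    "food": {"food", "delicious", "tasty", "pizza"},
    "service": {"service", "waiter", "staff", "friendly"},
}

def weak_label(sentence: str) -> str:
    """Assign the aspect whose seed words overlap the sentence most."""
    tokens = set(sentence.lower().split())
    scores = {a: len(tokens & seeds) for a, seeds in SEED_WORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"  # no seed matched

print(weak_label("The pizza was delicious but the waiter was rude."))
```

A neural student trained on these noisy labels can then generalize beyond the literal seed words, which is where the weakly supervised framework earns its keep.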

Deep Learning for Personalized Item Recommendation

Joint work with Kevin Cherian and Ananth Narayan

Advised by Tony Jebara

We developed deep learning models for recommending items (e.g., restaurants, movies) to users in online platforms. In our recent paper, we show how to extend Variational Autoencoders (VAEs) for collaborative filtering with side information in the form of user reviews. We incorporate user preferences into the VAE model as user-dependent priors.

[link] [paper] [slides]
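The key change relative to a standard VAE is in the KL regularizer: the prior N(0, I) is replaced by a user-dependent prior. The sketch below shows this assumed form (not the paper's exact model), with a Gaussian prior N(mu_u, I) whose mean mu_u would be inferred from side information such as the user's reviews.

```python
import numpy as np

def kl_user_prior(mu, logvar, prior_mu):
    """KL( N(mu, diag(exp(logvar))) || N(prior_mu, I) ), summed over dims."""
    var = np.exp(logvar)
    return 0.5 * np.sum(var + (mu - prior_mu) ** 2 - 1.0 - logvar)

# Hypothetical posterior parameters for one user.
mu = np.array([0.5, -0.2])
logvar = np.zeros(2)  # unit variance

standard = kl_user_prior(mu, logvar, np.zeros(2))  # prior N(0, I)
personal = kl_user_prior(mu, logvar, mu)           # prior centered on the user
print(standard, personal)
```

With a unit-variance posterior, the KL reduces to half the squared distance between the posterior mean and the prior mean, so a prior centered on the user's side-information embedding penalizes that user's latent code less than the generic standard-normal prior would.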

Transfer Learning for Style-Specific Text Generation

Joint work with Katy Ilonka Gero

We trained deep language models (LSTMs) for generating text of a specific literary style (e.g., poetry). Training these models is challenging, because most stylistic literary datasets are very small. In our paper, we demonstrate that generic pre-trained language models can be effectively fine-tuned on small stylistic corpora to generate coherent and expressive text.

[link] [paper]
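As a toy illustration of the transfer-learning recipe (not the paper's LSTM), one can "pre-train" a bigram count model on a generic corpus and then "fine-tune" it on a tiny stylistic corpus by interpolating the two sets of counts; the corpora and the mixing weight below are made up.

```python
from collections import Counter

def bigrams(text):
    toks = text.lower().split()
    return Counter(zip(toks, toks[1:]))

# "Pre-training" corpus (generic) and tiny "fine-tuning" corpus (stylistic).
generic = bigrams("the cat sat on the mat . the dog sat on the rug .")
style = bigrams("the moon rose . the moon sat silver on the silent sea .")

# Fine-tuning as interpolation: style counts dominate, generic counts act
# as a smoothing prior so the tiny corpus does not overfit.
alpha = 0.7
finetuned = {bg: alpha * style[bg] + (1 - alpha) * generic[bg]
             for bg in set(generic) | set(style)}

# The fine-tuned model prefers the stylistic continuation of "the".
the_next = {b: w for (a, b), w in finetuned.items() if a == "the"}
print(max(the_next, key=the_next.get))
```

The actual work fine-tunes the weights of a pre-trained neural language model on the small stylistic corpus, but the intuition is the same: the generic model supplies fluency, the stylistic data steers the distribution.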

"Sobrite" Mobile Health App

Joint work with John Bosco, Mark Chu, Lampros Flokas, and Fatima Koli

We are developing a mobile app that is powered by Machine Learning and provides holistic tools to patients receiving treatment for opioid addiction, in an effort to help them maintain sobriety beyond formal treatment. We were one of the winning teams in the "Addressing the Opioid Epidemic" challenge (Columbia Engineering, 12/2017).

[link] [Android app] [iOS app]

The NAO humanoid robot demonstrating dance skills.

NAO Dance! CNNs for Real-time Beat Tracking

Joint work with Myrto Damianou, Christos Palivos, and Stelios Stavroulakis

Advised by Aggelos Gkiokas and Vassilis Katsouros

We embedded real-time beat tracking and music genre classification algorithms into the NAO humanoid robot. While music plays, NAO's choreography dynamically adapts to the genre and the dance moves are synchronized with the output of the beat tracking system. We submitted our system to the Signal Processing Cup Challenge 2017.

[demo] [paper]

Content-based Music Similarity and Auto-tagging

Advised by Alexandros Potamianos

We embedded audio clips and the corresponding descriptive tags into the same multimodal vector space by representing tags and clips as bags-of-audio-words. In this way, we can easily (1) annotate audio clips with descriptive tags (by comparing audio vectors to tag vectors), or (2) estimate the similarity between audio clips or music songs (by optionally enhancing audio vectors with semantic information, and comparing audio vectors).
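A minimal sketch of the auto-tagging side of this idea (illustrative, with made-up data): represent each tag as a vector over a codebook of "audio words" and annotate a new clip with the tag whose vector is closest in cosine similarity.

```python
import math

def cosine(u, v):
    """Cosine similarity between two same-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical 4-word codebook; each row is a bag-of-audio-words histogram.
tag_vectors = {
    "rock": [0.6, 0.3, 0.1, 0.0],
    "jazz": [0.1, 0.2, 0.4, 0.3],
}
new_clip = [0.5, 0.4, 0.1, 0.0]  # histogram of the clip to annotate

best_tag = max(tag_vectors, key=lambda t: cosine(tag_vectors[t], new_clip))
print(best_tag)
```

Clip-to-clip similarity works the same way, by comparing two clip histograms directly; the semantic enhancement mentioned above would reweight those histograms before the comparison.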


What comes to your mind when you read the word 'guitar'?

Grounding Natural Language to Perceptual Modalities

Advised by Alexandros Potamianos

We created multimodal word embeddings as an attempt to ground word semantics to the acoustic and visual sensory modalities. We modeled the acoustic and visual properties of words by associating words to audio clips and images, respectively. We fused textual, acoustic, and visual features into a joint semantic vector space in which vector similarities correlate with human judgements of semantic word similarity.

[paper 1] [paper 2]
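One simple fusion scheme in this spirit (an assumed sketch, not necessarily the paper's exact method) is to L2-normalize each modality's vector for a word, concatenate them, and measure word similarity with cosine in the fused space. All vectors below are hypothetical.

```python
import math

def l2norm(v):
    """Scale a vector to unit length (leave zero vectors unchanged)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def fuse(text_v, audio_v, visual_v):
    """Concatenate per-modality vectors after normalizing each one."""
    return l2norm(text_v) + l2norm(audio_v) + l2norm(visual_v)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# Hypothetical textual, acoustic, and visual vectors for three words.
guitar = fuse([0.9, 0.1], [0.8, 0.2], [0.7, 0.3])
violin = fuse([0.8, 0.2], [0.7, 0.3], [0.6, 0.4])
car = fuse([0.1, 0.9], [0.2, 0.8], [0.3, 0.7])

print(cosine(guitar, violin) > cosine(guitar, car))
```

Normalizing before concatenation keeps any one modality from dominating the fused similarity simply because its features have larger magnitudes.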

Urban Soundscape Event Detection and Quality Estimation

Advised by Theodoros Giannakopoulos

We collected hundreds of recordings of urban soundscapes, i.e., sounds produced by mixed sound sources within a given urban area. We developed Machine Learning algorithms that analyze audio recordings to (1) detect acoustic events (e.g., car horns, human voices, birds), and (2) estimate the soundscape quality in different urban areas.



Leveraging Just a Few Keywords for Fine-Grained Aspect Detection Through Weakly Supervised Co-Training

Giannis Karamanolakis, Daniel Hsu, and Luis Gravano
EMNLP-IJCNLP 2019, Hong Kong, China (Oral Presentation)
[PDF] [Slides]

Weakly Supervised Attention Networks for Fine-Grained Opinion Mining and Public Health

Giannis Karamanolakis, Daniel Hsu, and Luis Gravano
EMNLP-IJCNLP 2019, 5th Workshop on Noisy User-generated Text (W-NUT 2019), Hong Kong, China (Oral Presentation)
[PDF] [Poster] [Slides]

Training Neural Networks for Aspect Extraction Using Descriptive Keywords Only

Giannis Karamanolakis, Daniel Hsu, and Luis Gravano
ICLR 2019, 2nd Workshop on Learning from Limited Labeled Data (LLD 2019), New Orleans, LA
[PDF] [Poster]


Transfer Learning for Style-Specific Text Generation

Katy Ilonka Gero, Giannis Karamanolakis, and Lydia Chilton
NeurIPS 2018, Workshop on Machine Learning for Creativity and Design, Montreal, QC, Canada
[PDF] [Poster]

Item Recommendation with Variational Autoencoders and Heterogenous Priors

Giannis Karamanolakis, Kevin Cherian, Ananth Narayan, Jie Yuan, Da Tang, and Tony Jebara
RecSys 2018, 3rd Workshop on Deep Learning for Recommender Systems (DLRS 2018), Vancouver, BC, Canada (Oral Presentation)
[PDF] [slides]


Audio-Based Distributional Semantic Models for Music Auto-tagging and Similarity Measurement

Giannis Karamanolakis, Elias Iosif, Athanasia Zlatintsi, Aggelos Pikrakis, and Alexandros Potamianos
EUSIPCO 2017, Multimodal processing, modeling and learning approaches for human-computer/robot interaction (Multi-Learn) workshop, Kos Island, Greece (Oral Presentation)

Sensory-Aware Multimodal Fusion for Word Semantic Similarity Estimation

Georgios Paraskevopoulos, Giannis Karamanolakis, Elias Iosif, Aggelos Pikrakis, and Alexandros Potamianos
EUSIPCO 2017, Multimodal processing, modeling and learning approaches for human-computer/robot interaction (Multi-Learn) workshop, Kos Island, Greece (Oral Presentation)


Audio-Based Distributional Representations of Meaning Using a Fusion of Feature Encodings

Giannis Karamanolakis, Elias Iosif, Athanasia Zlatintsi, Aggelos Pikrakis, and Alexandros Potamianos
INTERSPEECH 2016, San Francisco, California (Oral Presentation)
[PDF] [slides]


E-mail: <x>, where x=gkaraman.
Office: Mudd 406, Data Science Institute (map).

Extra: My first name (Giannis) is pronounced as y aa n ih s.