About

Who Am I?

Hi! I'm Falaah. I'm an Engineer/Scientist by training and an Artist by nature, broadly interested in Reliable and Responsible AI. Towards this end, I conduct fundamental research on Robust Deep Learning, Fairness in Machine Learning and AI Ethics, and create scientific comics (and other technical art) to disseminate the nuances of this work in a way that is more democratic and accessible to the general public. I'm currently an Artist-in-Residence at the Center for Responsible AI at NYU and at the Montreal AI Ethics Institute (MAIEI). I'm extremely lucky to get to do two things I absolutely love: fundamental research and creating scientific comics!

At NYU, I run the 'Data, Responsibly' Comic series, along with Prof Julia Stoyanovich. The first volume, titled 'Mirror, Mirror', deals with questions such as: What work are we funding? Who are we building models for? What problems are we solving? We delve into digital accessibility and the impact of poorly designed systems on marginalized demographics. We tie these insights to broader issues in the Machine Learning landscape, such as problems with operationalizing fairness, misguided incentive structures in scholarship, exclusionary discourse and questions of culpability when things go wrong. We conclude by recommending an approach that grounds the design of automated systems in the people most affected by them, in a way that is democratic and equitable.

At MAIEI, I conduct creative explorations into the socio-political underpinnings of data-driven technology. I'm especially interested in the role of Power and how it influences the design of Ethical AI. My first piece, 'Decoded Reality', is an artistic exploration of the power dynamics that shape the design, development and deployment of ML systems. We present dystopian realizations of how algorithmic interventions manifest in society, in order to provoke the viewer to think critically about the socio-political underpinnings of each step of the engineering process.

I also run the 'Superheroes of Deep Learning' comic series with Prof Zack Lipton, which documents the thrilling tales and heroic feats of ML's larger-than-life champions.

I recently concluded a Fellowship at the Bhasha group of the CVIT Lab at the International Institute of Information Technology, Hyderabad, where I worked on Neural Machine Translation of Indic Languages, supervised by Prof CV Jawahar and Prof Vinay Namboodiri.

Previously, I've worked as a Research Engineer at Dell EMC, Bangalore where I designed and built data-driven models for Identity and Access Management (IAM). I worked on behavior-based Authentication, online learning for CAPTCHA design and (Graph) Signal Processing for Dynamic Threat Modelling.

My work in industry showed me firsthand the pressing challenges of building 'production-ready' models. Contrary to the media narrative around AI, we have yet to figure out how to build models that are robust, dependable, unbiased and designed to thrive in the wild. These challenges have informed my interest in exploring the theoretical foundations of generalization and robustness, and in translating these insights into algorithms with provable guarantees. I'm also interested in critically assessing how AI impacts, and is in turn impacted by, the underlying social setting in which it was formulated.

Curriculum Vitae | Google Scholar

News

Jan 2021: RDS Comics, Vol 1: 'Mirror, Mirror' has been translated into French!!!

Jan 2021: Our tutorial titled 'Fairness and Friends' has been accepted to FAccT 2021!

Dec 2020: I've been invited to speak at the 'Beyond the Research Paper' workshop @ICLR 2021!

Dec 2020: I'll be facilitating the MAIEI x RAIN-Africa collaboration 'Perspectives on the future of Responsible AI in Africa' workshop.

Dec 2020: The Spanish edition of RDS Comics, Volume 1: 'Mirror, Mirror' is out now!!!

Nov 2020: I'll be facilitating the 'Privacy in AI' Workshop, by MAIEI and the AI4Good Lab.

Nov 2020: Excited to be speaking at the 'Ethics in AI Panel' by the McGill AI Society

Nov 2020: I'll be giving an invited talk on 'Ethics in AI', based on the 'Decoded Reality' piece, at the TechAide Montreal AI4Good Conference + Hackathon.

Nov 2020: The amazing Julia Stoyanovich and I speak about our 'Data, Responsibly' Comic books at the Rutgers IIPL Algorithmic Justice Webinar!

Oct 2020: 'Mirror, Mirror' and 'Decoded Reality' have been accepted to the Resistance AI Workshop at NeurIPS 2020!

My Work

Meta-Security Research

The fundamental research problem was to investigate the efficacy of a novel "who I am/how I behave" authentication paradigm. Conventional authentication works on a "what I know" (username/password) or "what I have" (device) model. Our system would study the user's behavior while typing their username and use the activity profile as the key against which access was granted. This eliminated the need for the user to remember a password or have access to a registered device. Conversely, even if a password is cracked or a device is stolen, the bad actor would not be able to penetrate the system, because their behavior would intrinsically differ from that of the genuine user.

Paper: Arif Khan F., Kunhambu S., G K.C. (2019) Behavioral Biometrics and Machine Learning to Secure Website Logins

US Patent: Arif Khan, Falaah, Kunhambu, Sajin and Chakravarthy G, K. Behavioral Biometrics and Machine Learning to secure Website Logins. US Patent 16/257650, filed January 25, 2019
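A minimal sketch of the "who I am/how I behave" idea, assuming keystroke timings as the behavioral signal and a per-user anomaly detector (the feature set, model choice and thresholds here are illustrative, not the patented system):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def keystroke_features(events):
    """events: list of (key, press_time, release_time) recorded while the
    user types their username. Features are dwell times (how long each key
    is held) and flight times (gaps between consecutive keys)."""
    dwell = [release - press for _, press, release in events]
    flight = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
    return np.array(dwell + flight)

class BehaviouralAuthenticator:
    """One anomaly detector per user, fit on that user's enrolment samples
    only; at login, access is granted iff the typing profile matches."""

    def __init__(self, enrolment_samples):
        X = np.vstack([keystroke_features(s) for s in enrolment_samples])
        self.model = IsolationForest(contamination=0.05, random_state=0).fit(X)

    def grant_access(self, login_events):
        x = keystroke_features(login_events).reshape(1, -1)
        return self.model.predict(x)[0] == 1  # 1 = inlier, i.e. genuine user
```

Even with a stolen password, an attacker's typing rhythm would register as an outlier against the genuine user's profile, so access is denied.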

CAPTCHAs, short for Completely Automated Public Turing tests to tell Computers and Humans Apart, have been around since 2003 as the simplest human-user identification test. They can be understood as Reverse Turing Tests because, in solving a CAPTCHA challenge, a human subject is proving their humanness to a computer program.

Over the years we have seen CAPTCHA challenges evolve from a string of characters for the user to decipher, to an image-selection challenge, to something as simple as ticking a checkbox. As each new CAPTCHA scheme hits the market, it is inevitably followed by research on new techniques to break it. Engineers must then go back to the drawing board and design a new, more secure CAPTCHA scheme, which, upon deployment and subsequent use, is again inevitably subject to adversarial scrutiny. This arduous cycle of designing, breaking and then redesigning to strengthen against subsequent breaking has become the de-facto lifecycle of a secure CAPTCHA scheme. This raises the question: Are our CAPTCHAs truly "Completely Automated"? Is the labor involved in designing each new secure scheme outweighed by the speed with which a suitable adversary can be designed? Is the fantasy of creating a truly automated reverse Turing test dead?

Reminding ourselves of why we count CAPTCHAs as such an essential tool in our security toolbox, we characterize CAPTCHAs along a robustness/user experience/feasibility trichotomy. With this characterization, we introduce a novel framework that leverages Adversarial Learning and Human-in-the-Loop Bayesian Inference to design CAPTCHA schemes that are truly automated. We apply our framework to character CAPTCHAs and show that it does in fact generate a scheme that steadily moves closer to our design objectives of maximizing robustness while maintaining user experience and minimizing allocated resources, without requiring manual redesign.

US Patent: Arif Khan, Falaah and Sharma, Hari Surender. Framework to Design Completely Automated Reverse Turing Tests. US Patent 16/828520, filed March 24, 2020 and US Patent (Provisional) 62/979500, filed February 21, 2020
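A toy sketch of that design loop, under stated assumptions: the scheme is reduced to a single distortion knob, the adversary's solve rate is stubbed out, and human feedback enters through a Beta posterior (all of this is illustrative, not the patented framework):

```python
import random

# Hypothetical single-knob scheme: 'distortion' parameterizes how warped the
# character CAPTCHA is. Higher distortion = harder for bots AND for humans.
distortion = 0.1

for design_round in range(20):
    # 1. Train/evaluate an adversarial solver against the current scheme.
    #    Stubbed here; in the real loop this is a learned attack model.
    bot_solve_rate = max(0.0, 0.9 - distortion)

    # 2. Human-in-the-loop signal: outcomes of 50 live challenges update a
    #    Beta(1, 1) prior over the human solve rate.
    successes = sum(random.random() < 1.0 - 0.4 * distortion for _ in range(50))
    human_solve_rate = (1 + successes) / (2 + 50)  # posterior mean

    # 3. Keep hardening while bots still succeed and humans stay comfortable:
    #    maximize robustness subject to a user-experience floor, with no
    #    manual redesign anywhere in the loop.
    if bot_solve_rate > 0.05 and human_solve_rate > 0.8:
        distortion = min(1.0, distortion + 0.05)
```

The point of the Bayesian, human-in-the-loop piece is that hardening is constrained by observed human performance, so the scheme strengthens itself against bots without a designer in the loop.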

Threat modelling is the process of identifying vulnerabilities in an application. The standard practice today involves drawing out the architecture of the product, examining the structure and nature of the calls being made, and determining which components could be vulnerable to which kinds of attacks.

Threat modelling is an extremely important step in the software development lifecycle, but in practice teams usually construct and evaluate the threat model only once, before deploying the application. Industrial offerings also cater to this approach, by designing tools that generate static models suitable for one-time reference. The major drawback of this approach is that software is not a static entity: it is subject to dynamic changes in the form of incremental feature enhancements and routine re-design for optimization. Threat models, hence, should be imparted the same dynamism, and our work attempts to enable this.

Application logs are used to model the product as a weighted directed graph, where vertices are code elements and edges indicate function calls between elements. Unsupervised learning models are used to set edge weights as indicators of vulnerability to a specific attack. Graph filters are then created and nodes that pass through the filter form the vulnerable subgraph. Superimposing all the vulnerable subgraphs with respect to the different attacks gives rise to a threat model, which is dynamic in nature and evolves as the product grows.
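A hedged sketch of this pipeline, assuming networkx for the graph and an isolation forest as the unsupervised scorer (the log format and per-attack thresholds are hypothetical):

```python
import numpy as np
import networkx as nx
from sklearn.ensemble import IsolationForest

def build_call_graph(log_records):
    """log_records: (caller, callee, feature_vector) tuples parsed from
    application logs. Vertices are code elements; edges are function calls,
    weighted by an unsupervised anomaly score (a vulnerability indicator)."""
    X = np.array([features for _, _, features in log_records])
    scores = -IsolationForest(random_state=0).fit(X).score_samples(X)
    G = nx.DiGraph()
    for (caller, callee, _), weight in zip(log_records, scores):
        G.add_edge(caller, callee, weight=float(weight))
    return G

def vulnerable_subgraph(G, threshold):
    """Graph filter: elements involved in calls whose weight exceeds the
    threshold form the vulnerable subgraph for that attack."""
    risky = [(u, v) for u, v, d in G.edges(data=True) if d["weight"] > threshold]
    return G.edge_subgraph(risky).copy()

def threat_model(G, threshold_per_attack):
    """Superimpose the per-attack vulnerable subgraphs; rebuilding the graph
    from fresh logs keeps the threat model in step with the product."""
    return nx.compose_all(
        [vulnerable_subgraph(G, t) for t in threshold_per_attack.values()]
    )
```

Because the graph is rebuilt from current logs rather than drawn once by hand, the resulting threat model evolves with the product instead of going stale.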

The event-based search engine is an enhancement to conventional image search. When performing a search for an object, such as a person, an image search using facial recognition may not yield many results, especially if there are relatively few pictures of that person. We address this limitation by indexing objects based on their occurrence at events. Bipartite graphs are used for search optimization and complexity minimization, while propensity scoring models are used to maximize the precision of information retrieval performed on the graph.

As an example, a server hosting a search engine may receive a search query and determine a searched time interval, a searched object, and a searched event. The server may select, based on the searched time interval, a portion of an object-event bipartite graph that was created using information gathered from social media sites. The server may compare attributes of individual events in the portion with attributes of the searched event to identify a set of relevant events. The server may determine objects associated with the relevant events and compare attributes of individual objects with the attributes of the searched object to identify a set of relevant objects. The search engine may provide search results that include the set of relevant objects ordered according to their similarity to the searched object.

US Patent: Arif Khan, Falaah, Mohammed, Tousif, Gupta, Shubham, Dinh, Hung and Kannapan, Ramu. Event-Based Search Engine, US Patent 16/752775, filed January 27, 2020
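A minimal sketch of the bipartite lookup described above (the graph layout and the similarity function standing in for the propensity-scoring models are illustrative, not the patented implementation):

```python
import networkx as nx

def event_based_search(G, searched_event, searched_object, interval,
                       similarity, top_k=10):
    """G: bipartite graph with object nodes (kind='object') and event nodes
    (kind='event', each carrying a 'time' plus attribute dict), built from
    information gathered from social media sites."""
    start, end = interval
    # 1. Restrict to events inside the searched time interval.
    events = [n for n, d in G.nodes(data=True)
              if d.get("kind") == "event" and start <= d["time"] <= end]
    # 2. Keep events whose attributes resemble the searched event.
    relevant = [e for e in events if similarity(G.nodes[e], searched_event) > 0.5]
    # 3. Walk the bipartite edges to the objects seen at those events.
    candidates = {o for e in relevant for o in G.neighbors(e)
                  if G.nodes[o].get("kind") == "object"}
    # 4. Order results by similarity to the searched object.
    return sorted(candidates,
                  key=lambda o: similarity(G.nodes[o], searched_object),
                  reverse=True)[:top_k]
```

Pivoting through events is what rescues the sparse case: an object with few direct matches can still be surfaced through the events it attended.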

Stuff

Articles, Talks and More!

Visit my blog https://thefaladox.wordpress.com/ for the entire archive of essays.

November 11, 2020 | Interview

RIIPL Algorithmic Justice Webinar Series

We (the amazing Julia Stoyanovich and I) sat down with Ellen Goodman, from the Rutgers Institute for Information Policy and Law, to discuss the comedic treatment of AI bias, normativity and exclusion, in the context of our 'Data, Responsibly' Comic books!

September 30, 2020 | Interview

MetroLab "Innovation of the Month" Feature

"Mirror, Mirror" was featured as the MetroLab Network+ Government Technology "Innovation of the Month". In this interview we discuss the origins of the project, our creative process and the future of Data, Responsibly Comics!

September 15, 2020 | Article (Satire)

Hope Returns to the Machine Learning Universe

According to witnesses, Earth's been visited by the ***Superheroes of Deep Learning***. What do they want? What powers do they possess? Will they fight for good or for evil? Read to learn more!

June 11th, 2020 | Interview

Interview with AI Hub

I sat down with the folks at AIHub to chat about my work and art. We talk (meta-)security, scientific comics and demystifying the hype around AI.

(BONUS!) What didn't make it into the transcript: ideating how we would conduct a global (Reverse) Turing Competition where it's GANs vs. artists, and pondering which problem humanity will solve first: creating AGI or disposing of the media hype.

February 20, 2020 | Talk

The Impossibility of Productizable AI: Problems and Potential Solutions

In my talk at the Sparks Tech Forum at Dell, Bangalore, I present a social and technical perspective on the most pressing problems in Machine Learning today, the sources of these problems and some potential solutions.

Slides
July 11th, 2020 | Article (Satire)

What is Meta-Security?

In this seminal essay, I explain the hottest up-and-coming sub-field of Machine Learning: Meta-Security!

January 4, 2020 | Article

Deep Learning Perspectives from Death Note: Another Approximately Inimitable Exegesis

Masked under a binge-worthy anime lies an adept critique of the ongoing deep learning craze in the industry. Here’s my commentary on the technical symbols in Death Note.

March 24, 2020 | Talk

We Don't Need No Bot Infestation: Machine Learning for Cyber Security

In this talk for Dell's Technology and Innovation Pillar, I explore the applicability of machine intelligence and data-driven modelling for enterprise security and illustrate the best approaches to building 'intelligent security'.

May 28th, 2020 | Talk

The Hitchhiker's Guide to Technology: A Conversation on Careers in Tech

In this invited talk for the CETI Group's career counselling initiative, I share some friendly advice with undergraduate students from India on how to navigate the current industrial landscape, with special emphasis on prospects in AI/ML research.

Slides
Get in Touch

Contact

Get in touch if you want to collaborate on an interesting project, want some custom cartoons for your presentations or some personalized art for your thesis/book covers, or simply want to discuss something wonderfully esoteric!