The fundamental research problem was to investigate the efficacy of a novel “who I am/how I behave” authentication paradigm. Conventional authentication works on a “what I know” (username/password) or “what I have” (device) model. Our system would study the user’s behavior while they typed their username and use the resulting activity profile as the key against which access was granted. This eliminated the need for the user to remember a password or have access to a registered device. Moreover, even if a password were cracked or a device stolen, the bad actor would not be able to penetrate the system, because their behavior would intrinsically differ from that of the genuine user.
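The idea of using typing behavior as the key can be illustrated with a minimal sketch. It is not the system described above: it assumes the activity profile is just the timing between consecutive key presses, and the function names, the z-score comparison and the threshold are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev

def timing_features(key_times):
    """Intervals (in seconds) between consecutive key presses."""
    return [b - a for a, b in zip(key_times, key_times[1:])]

def enroll(samples):
    """Build a profile from several typings of the same username.

    Returns one (mean, standard deviation) pair per inter-key interval.
    Needs at least two samples, since stdev is undefined for one value.
    """
    features = [timing_features(s) for s in samples]
    return [(mean(col), stdev(col)) for col in zip(*features)]

def matches(profile, key_times, threshold=2.0):
    """Accept the attempt if its average absolute z-score is below threshold.

    The threshold is an assumed value, not one taken from the original work.
    """
    features = timing_features(key_times)
    if len(features) != len(profile):
        return False
    # Floor the deviation so a very consistent typist does not divide by ~0.
    z_scores = [abs(f - m) / max(s, 1e-3) for f, (m, s) in zip(features, profile)]
    return mean(z_scores) < threshold
```

An attempt whose rhythm is close to the enrolled samples is accepted, while the same username typed with a different rhythm is rejected, even though the characters typed are identical.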
CAPTCHAs, short for Completely Automated Public Turing tests to tell Computers and Humans Apart, have been around since 2003 as the simplest human-user identification test. They can be understood as Reverse Turing Tests because, in solving a CAPTCHA challenge, it is a human subject who must prove their human-ness to a computer program.
Over the years we have seen CAPTCHA challenges evolve from being a string of characters for the user to decipher, to being an image selection challenge, to being as simple as ticking a checkbox. As each new CAPTCHA scheme hits the market, it is inevitably followed by research on new techniques to break it. Engineers must then go back to the drawing board and design a new and more secure CAPTCHA scheme, which, upon deployment and subsequent use, is again inevitably subject to adversarial scrutiny. This arduous cycle of designing, breaking and then redesigning to strengthen against subsequent breaking has become the de facto lifecycle of a secure CAPTCHA scheme. This raises the questions: Are our CAPTCHAs truly “Completely Automated”? Is the labor involved in designing each new secure scheme outweighed by the speed with which a suitable adversary can be designed? Is the fantasy of creating a truly automated reverse Turing test dead?
Reminding ourselves of why we count CAPTCHAs as such an essential tool in our security toolbox, we characterize CAPTCHAs in a robustness-user experience-feasibility trichotomy. With such a characterization, we introduce a novel framework that leverages Adversarial Learning and Human-in-the-Loop Bayesian Inference to design CAPTCHA schemes that are truly automated. We apply our framework to character CAPTCHAs and show that it does in fact generate a scheme that steadily moves closer to our design objectives of maximizing robustness while maintaining user experience and minimizing allocated resources, without requiring manual redesign.
US Patent: Arif Khan, Falaah and Sharma, Hari Surender. Framework to Design Completely Automated Reverse Turing Tests. US Patent 16/828520, filed March 24, 2020 and US Patent (Provisional) 62/979500, filed February 21, 2020
Threat modelling is the process of identifying vulnerabilities in an application. The standard practice of threat modelling today involves drawing out the architecture of the product and then looking at the structure and nature of calls being made and determining which components could be vulnerable to which kinds of attacks.
Threat modelling is an extremely important step in the software development lifecycle, but current practice shows that teams usually construct and evaluate the threat model only once, before deploying the application. Industrial offerings also cater to this approach by designing tools that generate static models, suitable for one-time reference. The major drawback of this approach is that software is not a static entity: it is subject to dynamic changes in the form of incremental feature enhancements and routine re-design for optimization. Threat modelling should therefore be given the same dynamism, and our work attempts to enable this.
Application logs are used to model the product as a weighted directed graph, where vertices are code elements and edges indicate function calls between elements. Unsupervised learning models are used to set edge weights as indicators of vulnerability to a specific attack. Graph filters are then created and nodes that pass through the filter form the vulnerable subgraph. Superimposing all the vulnerable subgraphs with respect to the different attacks gives rise to a threat model, which is dynamic in nature and evolves as the product grows.
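The pipeline above can be sketched in a few lines. This is only an illustration under stated assumptions: log records are assumed to be already parsed into (caller, callee) pairs, normalized call frequency stands in for the unsupervised model that sets edge weights, and the function names and the 0.5 threshold are hypothetical.

```python
from collections import Counter, defaultdict

def call_graph(log_records):
    """Count calls per directed edge: {(caller, callee): number of calls}."""
    return Counter(log_records)

def normalized_weights(edge_counts):
    """Stand-in for a learned vulnerability score: call frequency scaled to [0, 1]."""
    top = max(edge_counts.values())
    return {edge: count / top for edge, count in edge_counts.items()}

def vulnerable_subgraph(weights, threshold):
    """Graph filter: keep only the edges whose weight reaches the threshold."""
    return {edge: w for edge, w in weights.items() if w >= threshold}

def threat_model(weights_by_attack, threshold=0.5):
    """Superimpose the per-attack subgraphs: {edge: set of attacks it is exposed to}."""
    model = defaultdict(set)
    for attack, weights in weights_by_attack.items():
        for edge in vulnerable_subgraph(weights, threshold):
            model[edge].add(attack)
    return dict(model)
```

Because the graph is rebuilt from logs, re-running these steps on fresh logs yields an updated model, which is the sense in which the threat model evolves with the product.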
The event-based search engine is an enhancement to conventional image search. When searching for an object, such as a person, an image search using facial recognition may not yield many results, especially if there are relatively few pictures of the person. We address this limitation by indexing objects based on their occurrence at events. Bipartite graphs are used for search optimization and complexity minimization, while propensity scoring models are used to maximize the precision of information retrieval performed on the graph.
As an example, a server hosting a search engine may receive a search query and determine a searched time interval, a searched object, and a searched event. The server may select, based on the searched time interval, a portion of an object-event bipartite graph that was created using information gathered from social media sites. The server may compare attributes of individual events in the portion with attributes of the searched event to identify a set of relevant events. The server may determine objects associated with the relevant events and compare attributes of individual objects with the attributes of the searched object to identify a set of relevant objects. The search engine may provide search results that include the set of relevant objects ordered according to their similarity to the searched object.
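The query flow in this example can be sketched as follows. It is a simplified illustration, not the patented method: attributes are assumed to be plain sets of tags, Jaccard similarity stands in for the propensity scoring models, and the `Event` class, the `search` function and the `min_event_sim` cutoff are hypothetical names and values.

```python
from dataclasses import dataclass

def jaccard(a, b):
    """Similarity of two tag sets: size of intersection over size of union."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 0.0

@dataclass(frozen=True)
class Event:
    name: str
    time: int              # e.g. a year or a timestamp
    attributes: frozenset  # tags describing the event
    objects: tuple         # names of objects linked to this event (bipartite edges)

def search(events, objects, interval, event_attrs, object_attrs, min_event_sim=0.3):
    """Return object names ordered by similarity to the searched object.

    `objects` maps each object name to its set of attribute tags.
    """
    start, end = interval
    # 1. Select the portion of the graph that falls inside the searched interval.
    portion = [e for e in events if start <= e.time <= end]
    # 2. Keep the events that resemble the searched event.
    relevant = [e for e in portion if jaccard(e.attributes, event_attrs) >= min_event_sim]
    # 3. Collect the objects attached to those events.
    candidates = {name for e in relevant for name in e.objects}
    # 4. Score the candidates against the searched object and rank them.
    scored = [(jaccard(objects[name], object_attrs), name) for name in candidates]
    ranked = sorted((p for p in scored if p[0] > 0), key=lambda p: (-p[0], p[1]))
    return [name for _, name in ranked]
```

Filtering by time and event first means object similarity is computed only for objects attached to relevant events, rather than for every object in the index.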
Visit my blog https://thefaladox.wordpress.com/ for the entire archive of essays.
According to witnesses, Earth's been visited by the ***Superheroes of Deep Learning***. What do they want? What powers do they possess? Will they fight for good or for evil? Read to learn more!
I sat down with the folks at AIHub to chat about my work and art. We talk (meta-)security, scientific comics and demystifying the hype around AI.
(BONUS!) What didn't make it into the transcript: ideating how we would conduct a global (Reverse) Turing Competition where it's GANs vs. artists, and pondering which problem humanity will solve first: creating AGI or disposing of the media hype.
In my talk at the Sparks Tech Forum at Dell, Bangalore, I present a social and technical perspective on the most pressing problems in Machine Learning today, the sources of these problems and some potential solutions. Slides
In this invited talk for the CETI Group's Career counselling initiative, I share some friendly advice to undergraduate students from India on how to navigate the current industrial landscape, with special emphasis on prospects in AI/ML research. Slides
In this talk for Dell's Technology and Innovation Pillar, I explore the applicability of machine intelligence and data-driven modelling for enterprise security and illustrate the best approaches to building 'intelligent security'.
Masked under a binge-worthy anime lies an adept critique of the ongoing deep learning craze in the industry. Here’s my commentary on the technical symbols in Death Note.