Selected projects from my time at IIT Kanpur


Relevant Information Retrieval from Text Documents

Under Prof. Arnab Bhattacharya, CSE, IIT Kanpur Jan '22 - Apr '22

  • Implemented and fine-tuned text summarization algorithms: extractive (TextRank, LSA) and abstractive (Google PEGASUS, T5-small)
  • Preprocessed corpora using NLP techniques such as tokenization, stemming, and lemmatization
  • Used ROUGE-N to evaluate machine-generated summaries against human summaries on a corpus of size 230k
  • Benchmarked the algorithms; the abstractive models achieved SOTA F1 scores on our datasets after fine-tuning
  • Built a Flask web application that lets users run the algorithms and get summaries of articles or blogs
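To illustrate the ROUGE-N evaluation mentioned above, here is a minimal sketch and not the project's actual code. It assumes whitespace tokenization and lowercase matching; the function names `ngrams` and `rouge_n` are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, f1) of n-gram overlap between two strings.

    Assumption: simple lowercase whitespace tokenization, no stemming.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, `rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2)` scores the bigram overlap between a machine summary and a human reference.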

Racial Disparity in COVID-19 Vaccination in the US

Under Prof. Shankar Prawesh, IME, IIT Kanpur Aug '21 - Nov '21

  • Analysed COVID-19 vaccination rates by race in US counties, and their association with socioeconomic factors, using a number of statistical techniques
  • Transformed and pre-processed the data with z-score normalization and PCA dimensionality reduction of the feature matrix
  • Implemented a variety of regression algorithms, including multiple linear regression (MLR), Support Vector Regression, and Random Forest
  • Tuned the hyperparameters of these algorithms using train and test MSE as the evaluation metric
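As a rough illustration of two of the steps above, the sketch below shows column-wise z-score normalization and the MSE metric in plain Python. It is not the project's code: PCA and the regression models are omitted, and the names `zscore_columns` and `mse` are hypothetical.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column of a list-of-rows matrix to mean 0 and std 1."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Assumption: constant columns are left at 0 by dividing by 1 instead of 0.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

Standardizing before PCA matters because PCA is sensitive to feature scale; without it, features measured in large units dominate the leading components.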

Indian Agriculture Data Mining

Under Prof. Arnab Bhattacharya, CSE, IIT Kanpur Aug '21 - Nov '21

  • Analysed seasonal crop data for the last 20 years and unearthed surprising information about Indian agriculture
  • Cleaned and pre-processed 2.5 lakh (250,000) data points to obtain a robust dataset for subsequent steps
  • Derived new categorical variables for zone and crop to make the analysis easier to interpret and compare
  • Used the geopandas library to map the geographical distribution of Indian agriculture by zone and crop

Modelling and Forecasting of Time Series Data

Under Prof. Amit Mitra, MTH, IIT Kanpur Sep '20 - Nov '20

  • Fitted a seasonal ARIMA model to forecast next year’s temperatures from historical data
  • Checked stationarity with the Dickey-Fuller test and applied seasonal differencing to obtain a stationary time series
  • Selected the model parameters using the Box-Jenkins method for forecasting purposes
  • Performed residual analysis and information-criterion checks to assess model adequacy on the dataset
  • Achieved an MAE of 2.2% on test data with the optimized seasonal ARIMA model
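The seasonal differencing and error metric named above can be sketched as follows. This is an illustrative simplification, not the project's code: it does not implement the Dickey-Fuller test or the ARIMA fit, and the function names are hypothetical.

```python
def seasonal_difference(series, period):
    """Return y[t] - y[t - period] for every t where both values exist."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


def mean_absolute_error(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical toy series: a repeating 4-step seasonal pattern plus a linear trend.
pattern = [10.0, 20.0, 30.0, 20.0]
series = [pattern[t % 4] + 0.5 * t for t in range(12)]

# Differencing at lag 4 removes the seasonal pattern and leaves only the
# constant contribution of the trend over one season.
differenced = seasonal_difference(series, period=4)
```

On real temperature data the period would match the seasonality of the sampling (for example 12 for monthly observations), which is an assumption the source does not state.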

Repelling–Attracting Metropolis Algorithm

Under Prof. Dootika Vats, MTH, IIT Kanpur Jan '21 - Apr '21

  • Implemented the Repelling–Attracting Metropolis (RAM) algorithm, a variant of the Metropolis-Hastings (MH) algorithm designed for multimodal distributions
  • Used an auxiliary-variable approach to derive the steps of the algorithm
  • Demonstrated that the RAM sampler outperforms the MH sampler by generating MCMC samples for real-life numerical examples such as sensor network localization and strong-lens time delay estimation
  • Showed that RAM samples multimodal distributions better, as seen in ACF plots, trace plots, acceptance rates, and the average number of downhill and uphill proposals for the generated samples

Expectation-Maximization & Metropolis-Hastings Algorithms

Under Prof. Dootika Vats, MTH, IIT Kanpur Aug '19 - Nov '19

  • Developed expectation-maximization (EM) algorithms to fit multivariate Gaussian mixture models with latent variables
  • Cross-validated the model on test data to determine optimal values of the hyperparameters
  • Used the MH algorithm to perform Markov chain Monte Carlo (MCMC) sampling for a Bayesian logistic regression model
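A minimal sketch of random-walk Metropolis-Hastings for Bayesian logistic regression is given below. It is a simplified illustration under stated assumptions, not the project's implementation: the model has a single coefficient with no intercept, the prior is a zero-mean normal with a hypothetical standard deviation of 10, and the data are made up.

```python
import math
import random


def log_posterior(beta, xs, ys, prior_sd=10.0):
    """Unnormalized log posterior of a one-coefficient logistic model."""
    log_prior = -0.5 * (beta / prior_sd) ** 2
    log_lik = 0.0
    for x, y in zip(xs, ys):
        z = beta * x
        # y*z - log(1 + exp(z)), written in a numerically stable form.
        log_lik += y * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return log_prior + log_lik


def metropolis_hastings(log_target, start, n_samples, step=1.0, seed=0):
    """Random-walk MH with a Gaussian proposal; returns (samples, accept rate)."""
    rng = random.Random(seed)
    current, current_lp = start, log_target(start)
    samples, accepted = [], 0
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_target(proposal)
        # The proposal is symmetric, so the acceptance probability is
        # min(1, target(proposal) / target(current)).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
            accepted += 1
        samples.append(current)
    return samples, accepted / n_samples


# Hypothetical toy data: larger x tends to go with y = 1.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]
samples, accept_rate = metropolis_hastings(
    lambda b: log_posterior(b, xs, ys), start=0.0, n_samples=5000
)
kept = samples[1000:]  # discard an assumed burn-in of 1000 draws
posterior_mean = sum(kept) / len(kept)
```

The step size and burn-in length here are arbitrary choices; in practice they would be tuned using diagnostics such as trace plots and the acceptance rate.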

Critical Comparative Appraisal of Monetary Policy Tools and Strategies

Under Prof. Sukumar Vellakkal, ECO, IIT Kanpur Aug '21 - Nov '21

  • Conducted a structural comparison of the monetary authorities in India and the United States
  • Compared the monetary policies of the two countries, including reserve requirements, the discount rate, and open market operations
  • Critically analyzed the monetary policy strategies used to control economic variables such as the unemployment rate, inflation rate, interest rate, GDP, and money supply