on data science

The coding bug bit me some years ago and damn, I was transformed. I didn’t become Spider-Man, but then spiders aren’t bugs. I started coding with much enthusiasm back in college. Then, I moved towards all things data.

For a quick summary of everything, you can visit my LinkedIn. Or, find the timeline and long-form story below.


TIMELINE

June 2022 – Present

Appsilon

R/Shiny Developer | Full-Time

June 2021 – October 2021

Indian School of Business, Hyderabad

Data Analyst cum TA | Full-Time

– Developed R Shiny Apps for econometrics analyses like Conjoint, MaxDiff, and Forecasting
– Facilitated courses on Deep Learning, Big Data and Data Collection
– Conducted tutorials on building end-to-end Data Streaming pipelines on AWS

July 2019 – July 2020

Townscript (BookMyShow)

Data Modelling Engineer | Full-Time

– Aided Data-Driven Design (Increased KYC success rate to over 90%)
– Day-to-day Analysis + Visualisation
– Built a Spam Classifier, integrated Elasticsearch
– Built a Topic Architecture for the B2C Platform (Natural Language Processing)
– Managed Product & Tech Blogs

December 2019 – January 2021

Udacity

Technical Mentor | Part-Time

– Reviewer for the Machine Learning – Introduction and Machine Learning Engineer Nanodegrees
– Technical Mentor on the Knowledge forums for multiple projects in both Nanodegrees

June 2019 – April 2020

Udacity

Technical 1:1 Mentor | Part-Time

– Mentor for the Machine Learning Engineer and Machine Learning Introduction Nanodegrees
– Provided 1-on-1 assistance for learning via chat and calls, troubleshooting and overall guidance
– Maintained an overall average rating of 4.73 out of 5.00 throughout this endeavour

January 2019 – June 2020

Townscript (BookMyShow)

Data Engineering Intern | Full-Time

– Worked on different day-to-day analysis and data requirements
– Built an approach for Townscript’s data lake and flow
– Used Boto3 to sync data to S3 (reducing transfer costs by 96%)
– Worked on building product metrics and trackable KPIs
– Visualised the KPIs and metrics on dashboards by using multiple services in conjunction


The Story So Far

During my Bachelor’s, I was exposed to several programming languages, and Python became my weapon of choice, although I did get hands-on experience with Java during the Android Development Nanodegree by Udacity. Later, I pursued a Postgraduate Diploma in Data Science, which exposed me to R scripting. I also learned other tools of the trade such as SQL, Microsoft Excel and Tableau.

I would be lying if I said it isn’t fun. I owe the geek in me that much, at least. I find data to be the perfect coding island for me. It has everything I like doing and more.

Townscript (BookMyShow)

My professional career started as a Data Engineering Intern at Townscript, a BookMyShow startup, where I worked on data flows, pipelines, and different analysis projects. Then, I moved to a full-time role as a Data Modelling Engineer at the same place, where my responsibilities covered the entire data process. I worked on everything from how the data starts to flow to how it’s analysed to how it helps the product.

At Townscript,

  • I worked on day-to-day analysis tasks, which involved writing complex queries and building reports and visualisations.
  • I built Townscript’s Data Lake using the AWS stack. For this, I wrote a Boto3 project to move data from multiple AWS services (Kinesis, RDS et al.) to AWS S3, reducing transfer costs by 96%.
  • I was also an individual contributor on Townscript’s Dashboarding project, where we built multiple dashboards visualising countless KPIs at both B2C and B2B levels.
  • With the design & product team, I supported a data-driven design approach that improved the B2B KYC success rate to over 90%.
  • I also implemented a multi-level spam classifier that combined domain knowledge with several machine learning algorithms. The classifier was built after rigorous feature engineering.
  • With a colleague, I integrated Elasticsearch at Townscript while also working with the design team on the overall search flow and experience.
  • We also built a Topic Architecture for the B2C platform using methods such as LDA, user studies, and textual data analysis.
  • I managed the product and tech blogs at Townscript, and was also responsible for improving the coffee situation at the office by getting a proper machine leased from Coffee Day Beverages.

Udacity (Mentorship)

I also worked part-time as a 1-on-1 Mentor at Udacity for the Machine Learning and Machine Learning Engineer Nanodegrees, handling cohorts of students, helping solve their queries, and guiding them towards a roadmap for their future studies. Throughout this endeavour, I maintained a rating of 4.73 out of 5 from the students. Udacity sunsetted the mentorship program in March 2020. I continued to actively serve as a project Reviewer for the same Nanodegrees till January 2021.

Indian School of Business, Hyderabad

After a gap of almost a year during the COVID-19 pandemic, I started working as a Data Analyst cum TA at the Indian School of Business, Hyderabad.

At Indian School of Business,

  • While my stint at ISB, Hyderabad was short, I worked with Prof. Manish Gangwar to develop R Shiny Apps for various econometrics analyses like Conjoint Analysis, MaxDiff Analysis, and Time Series Forecasting (using Facebook’s Prophet).
  • I also helped facilitate courses on Deep Learning, Big Data Analytics, and Data Collection as a Teaching Assistant.
  • I also conducted tutorials on building end-to-end Data Streaming applications on AWS with Prof. Aashish Chandra.

Independent & Freelance Data Projects

Here are the projects I’ve worked on of my own accord, along with notable ones I’ve done on platforms like Fiverr. While I can’t share files for all Fiverr projects, I would love to discuss them with discretion.

  • CloudSimplifieR package for R (CRAN Package) (link)
  • Fiverr: Cryptocurrency Sentiment Analysis (Coingecko, Google Trends, Reddit & Twitter)
  • Fiverr: Kucoin API Data Aggregation
  • Fiverr: Factor Analysis for Stock Data in R
  • Fiverr: Dolores River Fish Species Analysis (link)
  • Facebook Birthdays Data Crawling using Python (link) (blog)
  • Fiverr: Malicious URL Detector (Regression)
  • Fiverr: XML Data Reformatting & Aggregation
  • Kaggle Survey: Is There Someone Else Like Me? (link)
  • Fiverr: Nigeria – State & Federal Earning Analysis (link)

You may view more of my projects on my GitHub.