Introduction

Hello! I am Utkarsh Mathur, a 24-year-old new grad with an MS in Data Science from the University at Buffalo, a B.Tech. from IIT Roorkee, and 2 years of experience in Data Science, Software Engineering, and Data Engineering.

I am passionate about leveraging my skills to develop scalable AI/ML software and services as well as picking up new skills on the way. My expertise spans programming (Python, C++, SQL, R, Perl, and JavaScript), machine learning, data engineering, and software development, and I am eager to apply these skills in an IT setting to drive innovation and deliver impactful solutions.

I am committed to continuous learning and passionate about creating impactful and practical solutions. I believe that with dedication, hard work, and a positive mindset, anything is achievable - a principle I've demonstrated throughout my academic and professional career. I am actively seeking opportunities in Data Science, Software Engineering, Data Engineering, and Data Analysis.

Career Highlights

Education

University at Buffalo, State University of New York
Master of Science in Data Science (Engineering Science)
January 2023 – June 2024

Indian Institute of Technology Roorkee (IIT Roorkee)
Bachelor of Technology in Polymer Science (Chemical Engineering)
July 2018 – July 2022

Work Experience

  1. Data Scientist at Atriano (September 2024 - Present)
  2. Data Engineer, Imaging at ImagoAI (October 2024 - January 2025)
  3. Data Scientist at Quinbay (May 2022 - October 2022)
  4. Machine Learning Engineer at Hono (July 2021 - April 2022)
  5. Data Scientist at ImagoAI (April 2021 - May 2021)
  6. Research Intern under Dr. Kusum Deep (September 2021 - December 2021)
  7. Research Intern under Dr. Mayank Goswami (September 2020 - June 2021)
  8. Research Intern under Dr. Gaurav Manik (July 2019)

Key Skills and Expertise

  1. Python Programming: Leveraged Python to develop solutions and deliver analysis across various industries.
  2. Machine Learning: Developed and deployed ML solutions for Big Data, Computer Vision, and NLP using sklearn, PyTorch, TensorFlow, and JAX.
  3. Software Development: Experienced in full-stack deployment of developed solutions using Python, JavaScript (Node, Angular, React, MongoDB, Django), C++, and HTML/CSS.
  4. Data Analysis: Performed data analysis with PowerBI, Tableau, Qlik, and R to extract actionable insights.
  5. ETL/ELT Pipelines: Proficient in developing ETL pipelines for big data using Hadoop, Snowflake, and AWS Glue.
Technical Skills
  • Programming Languages: Python, C, C++, SQL, JavaScript, Java, C#, HTML, CSS, R, MATLAB, Perl
  • Data Science Skills: PyTorch, TensorFlow, Keras, Tableau, PowerBI, PySpark, Databricks, Qlik, Snowflake, Hadoop
  • Software Skills: Node, React, MongoDB, Django, Kafka, Jenkins, Selenium, Git, Kubernetes, Docker, AWS, GCP, Jira, Confluence
Crucial MOOC Courses
  • Design and Analysis of Algorithms (NPTEL)
  • Artificial Intelligence Nanodegree (Udacity)
  • Deep Learning Specialization (Coursera)
  • Machine Learning A-Z (Udemy)
  • Sports Analytics Using Python (Mad About Sports)
  • Data Science using Python (EICT IIT Roorkee)

Key Projects

  1. Graph Neural Networks and Large Language Models: A Literature Review (Capstone Project)
  2. Metaheuristic Optimization v/s Backpropagation (Course Project)
  3. Top Spotify Tracks Database (Course Project)
  4. Statistical Analysis (Course Project)
  5. Clustering and Time Series Analysis (Course Project)

Contact

Feel free to reach out at utkarsh.datamathur@gmail.com or datamathur@outlook.com to explore potential opportunities and collaborations.
For more information regarding means to contact me, please visit the contact page.

Personal Interests

Hobbies

I am a man of varied tastes. I like sports, especially football (soccer) and cricket. I am a big fan and follower of Real Madrid and Cristiano Ronaldo.

I am fond of many kinds of music, and I really cherish Coke Studio - Pakistan, classic rock bands (like Pink Floyd, AC/DC, and Nirvana), and independent Indian artists (like The Local Train, Indian Ocean, and Parvaaz).
I also like to play guitar and sing along. I mostly sing Bollywood songs and romantic English songs because I'm just an amateur guitarist.

I often spend my free time binge-watching TV series and critically acclaimed movies. I am a huge fan of The Lord of the Rings trilogy and the tales of Middle-earth.
I try to maintain a reading habit too. The Art of Thinking Clearly, Sapiens, and The Diary of a Young Girl are some of my favorite books. I prefer non-fiction, but as a devoted fan of fantasy storylines I end up reading plenty of fiction too.

Extra-Curricular

I like to explore and take part in different extracurricular activities.
Here are some of my extracurriculars and some of the Positions of Responsibility I've held at IIT Roorkee:

  • Member, Team Inclusion, IIT Roorkee
  • Manager, TEDx IITROORKEE
  • Placement Associate, Placement and Internship Cell (PIC), IIT Roorkee
  • Core Team Member, Cognizance – Technical Festival of IIT Roorkee
  • Volunteer, Prahari Kaksh, NSS (3 UK CTR), IIT Roorkee
  • Friend of Section (FOS), Music Section, IIT Roorkee

Work Experiences

1. Data Scientist at Atriano

As a Data Scientist at Atriano since September 2024, I am developing an educational story generation web app for kids using Large Language Models (LLMs) and recommendation algorithms.

2. Data Engineer, Imaging at ImagoAI

I joined ImagoAI as a Data Engineer in October 2024. I successfully deployed Galaxy, the world’s fastest mycotoxin test, for eight clients by gathering and processing client data and fine-tuning AI models using Python.
Additionally, I automated the hyperspectral image camera calibration process using Python and Shell scripting, improving overall efficiency by 80% to 85%.
This role has allowed me to deepen my expertise in automation, data-driven solutions, and client-specific AI model enhancements.

3. Data Scientist at Quinbay

I joined Quinbay as a full-time Data Scientist in May 2022, after completing my B.Tech. As part of a Data Science team that catered to the e-commerce needs of BliBli.com, I contributed to some of the most exciting and comprehensive Deep Learning projects.

  • In an effort to reduce order return time, I deployed a quantized version of the Object Detection model in use, increasing its performance (measured by inference speed) by 30%.
  • In another project, our team was asked to reduce the human effort involved in recognizing counterfeit products for certain brands. Applying my Computer Vision and Natural Language Processing skills, I analyzed product images and product details and built an Image Recognition model for counterfeit identification. We created a product screening pipeline to flag potential counterfeit products, reducing human screening by 30% and the counterfeit product identification process by 70%.

4. Machine Learning Engineer at Hono

I joined Hono in July 2021 as an ML Engineering Intern. I worked on real-life HR-related data and performed predictive analysis on specific aspects of these datasets.
I created automated pipelines for Machine Learning model training by leveraging Data Analytics, Machine Learning algorithms, Statistical Analysis techniques, and Time Series Analysis to add personalized features for our clients.
During my tenure at Hono, I worked on the following features for the Hono HRMS software:

  • Employee Attrition Forecasting - Forecasting the period of time when an employee might resign.
  • Promotion Suggestion - Suggesting a list of employees for promotion, along with salary increments, to aid HR.
  • Hike Estimation - Estimating hikes for employees based on performance and allocated budget to maximize employee satisfaction and retention.
  • Attendance Forecasting - Forecasting employee attendance patterns from employee discipline and seasonal trends in past years, to help managers plan the next few weeks.
  • Organization Hierarchy - Constructing the organization hierarchy from the employee and manager directory to clearly identify teams, and populating them with team projects and employee contributions. This helped management analyze team performance.
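
The Organization Hierarchy feature above starts from a flat employee and manager directory. The sketch below shows one way such a directory could be turned into teams. It is an illustration only, not the Hono implementation: the function names, the `(employee, manager)` pair format, and the sample names are all hypothetical, and the directory is assumed to contain no reporting cycles.

```python
from collections import defaultdict


def build_hierarchy(directory):
    """Group employees by manager.

    directory: iterable of (employee, manager) pairs, where manager is
    None for someone at the top of the organization (assumed format).
    Returns (roots, reports), where reports maps a manager to the list
    of their direct reports.
    """
    reports = defaultdict(list)
    roots = []
    for employee, manager in directory:
        if manager is None:
            roots.append(employee)
        else:
            reports[manager].append(employee)
    return roots, dict(reports)


def team_of(manager, reports):
    """Return all direct and indirect reports of a manager, depth-first.

    Assumes the reporting structure is acyclic; a cycle would loop forever.
    """
    team = []
    stack = list(reversed(reports.get(manager, [])))
    while stack:
        person = stack.pop()
        team.append(person)
        # Push this person's own reports so they are visited next.
        stack.extend(reversed(reports.get(person, [])))
    return team


# Made-up directory used only to exercise the functions.
sample_directory = [("ceo", None), ("a", "ceo"), ("b", "ceo"), ("c", "a")]
sample_roots, sample_reports = build_hierarchy(sample_directory)
```

Calling `team_of("ceo", sample_reports)` walks the whole organization, while `team_of("a", sample_reports)` returns only the people under that manager.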

5. Research Associate
    (Dr. Kusum Deep at IIT Roorkee)

From September 2021 to December 2021, I worked as a research assistant with Ms. Preeti (PhD candidate at IIT Roorkee) under Dr. Kusum Deep, Department of Mathematics, IIT Roorkee.
I provided programming support for Preeti's ongoing research on using Random Walk Grey Wolf Optimization (RW-GWO) and its variants for feature selection in large-scale multi-dimensional datasets.

6. Research Associate
    (Dr. Mayank Goswami at IIT Roorkee)

From September 2020 to June 2021, I worked under Dr. Mayank Goswami, Assistant Professor, Department of Physics, IIT Roorkee as a research intern.
I developed and trained U-Net models (based on architectures such as ResNet and VGG) on OCT B-Scans, exploring different loss functions (Intersection over Union, Dice Loss), optimizers (SGD, Adam), and hyperparameters to analyze tumor growth trends.
The learning opportunities in this Deep Learning project were tremendous. I acquired valuable data science skills such as data annotation and hyperparameter tuning.
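
For readers unfamiliar with the two segmentation metrics named above, here is a minimal sketch of the Dice coefficient and Intersection over Union on flat binary masks. It uses plain Python lists rather than the tensors a real U-Net training loop would use, and the function names and the smoothing term `eps` are assumptions made for illustration, not the code from this project.

```python
def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2 * |A and B| / (|A| + |B|) for flat binary masks (0/1 values)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)


def iou(pred, target, eps=1e-7):
    """Intersection over Union = |A and B| / |A or B| for flat binary masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) - intersection
    return (intersection + eps) / (union + eps)


def dice_loss(pred, target):
    """Loss form used for training: 0 for a perfect overlap, near 1 for none."""
    return 1.0 - dice_coefficient(pred, target)
```

Both metrics reward overlap between the predicted and annotated masks. For the same partial overlap, Dice is always at least as large as IoU, since it counts the intersection twice.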

7. Data Scientist at ImagoAI

I joined ImagoAI as a Data Science Intern from April to May 2021. My main responsibility was building ML models for hyperspectral data, which was both challenging and exciting. I contributed by creating ML models for 4 features of the ImagoAI software.
I acquired many valuable skills like data preprocessing, teamwork, time management, and adaptability.

8. Research Associate
    (Dr. Gaurav Manik at IIT Roorkee)

In June 2019, I worked under Dr. Gaurav Manik, Associate Professor, Department of Polymer and Packaging, IIT Roorkee as a research intern.
My project brief was simple: to help Mr. Sushanta Sethi, Ph.D. student at the Department of Polymer and Packaging, IIT Roorkee, with his research in the field of super-hydrophobic polymers. To do so, I scripted an extended library for the Forcite module of Materials Studio that helped calculate the contact angle and motion of a liquid droplet on an inclined surface coated with super-hydrophobic polymers.
While working there, I learned how to work with a research team and how to tailor software to requirements.

Education

1. MS in Data Science
    University at Buffalo

I graduated with an MS in Engineering Science (Data Science) from the School of Engineering and Applied Sciences at the University at Buffalo (State University of New York) in June 2024.

SUBJECTS
EAS 501 Numerical Mathematics, EAS 502 Introduction to Probability Theory, EAS 503 Introduction to Programming and Databases, EAS 508 Statistical Learning and Data Mining I, EAS 509 Statistical Learning and Data Mining II, CSE 531 Analysis + Design of Algorithms, CSE 560 Data Models and Query Languages, CSE 574 Introduction to Machine Learning, CSE 573 Computer Vision and Image Processing, EAS 504 Application of Data Science: Industrial Overview.

During the course of 3 semesters, I completed 5 semester projects, 5 technical presentations, and 2 essays.

SEMESTER PROJECTS

  1. Titanic Survival Prediction and Analysis (Introduction to Programming and Databases)
  2. Statistical Analysis (Statistical Learning and Data Mining I)
  3. Time Series Analysis (Statistical Learning and Data Mining II)
  4. Top Spotify Tracks Database (Data Models and Query Languages)
  5. Metaheuristic Optimization v/s Backpropagation (Computer Vision and Image Processing)

TECHNICAL PRESENTATIONS

  1. Convolutional Neural Networks and Computer Vision (Statistical Learning and Data Mining I)
  2. The Birth and Rise of Generative AI (Statistical Learning and Data Mining II)
  3. Paper Presentation: U-Net: Convolutional Networks for Biomedical Image Segmentation (Computer Vision and Image Processing)
  4. Paper Presentation: End-to-End Object Detection with Transformers (Computer Vision and Image Processing)
  5. Paper Presentation: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (Computer Vision and Image Processing)

ESSAYS

  1. Graph Neural Networks and Large Language Models (Capstone Essay)
  2. Ethics in AI: Autonomous Driving (Computer Vision and Image Processing)

2. Bachelor of Technology
    IIT Roorkee

I hold a B.Tech. degree in Polymer Science and Engineering from the Department of Chemical Engineering at IIT Roorkee, which I completed in July 2022.

SUBJECTS
IEE 03 Artificial Neural Networks, MAN 303 Mathematical Statistics, MAN 204 Database Management System, PEN 103 Computer Programming and Numerical Methods, MAN 001 & MAN 002 Mathematics, PEN 302 Modeling and Simulations of Polymers.

During my 4 years at IIT Roorkee, I took part in various extracurricular activities. Here are some of my extracurriculars and some of the Positions of Responsibility I've held at IIT Roorkee:

  • Member, Team Inclusion, IIT Roorkee
  • Manager, TEDx IITROORKEE
  • Placement Associate, Placement and Internship Cell (PIC), IIT Roorkee
  • Core Team Member, Cognizance – Technical Festival of IIT Roorkee
  • Volunteer, Prahari Kaksh, NSS (3 UK CTR), IIT Roorkee
  • Friend of Section (FOS), Music Section, IIT Roorkee

Projects

Course Projects

1. Metaheuristic Optimization v/s Backpropagation

I worked on this project for the CSE 573 - Computer Vision and Image Processing course. The aim of this project was to compare the performance of metaheuristic optimization algorithms with gradient-based backpropagation techniques for training Convolutional Neural Network models.
I used PyTorch to develop custom optimizers based on Particle Swarm Optimization and Grey Wolf Optimization. These optimizers were applied to train a LeNet-5 model for image classification, and their performance was evaluated against the standard Stochastic Gradient Descent (Backpropagation) optimizer.
Project repo
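
To give a feel for how a gradient-free optimizer of this kind searches, here is a minimal Particle Swarm Optimization sketch in plain Python. It minimizes a toy sphere function rather than training LeNet-5, and it is not the PyTorch optimizer from the project repo: the function names, the inertia and acceleration values (`w`, `c1`, `c2`), and the search bounds are all assumptions chosen for illustration.

```python
import random


def pso_minimize(objective, dim, n_particles=20, iters=200,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimize `objective` over a box using a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random starting positions, zero starting velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position so far.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_position, best_value = pso_minimize(sphere, dim=3)
```

When a neural network is trained this way, the position vector would hold the flattened model weights and the objective would be the training loss, so no gradients are needed. The cost is one full loss evaluation per particle per iteration.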

2. Top Spotify Tracks Database

I focused on Distributed and Relational Databases through the project for the CSE 560 - Data Models and Query Languages course.
I designed and normalized a database with multiple tables linked by foreign keys to store the top 50 songs playlists from 67 countries on Spotify. This setup enabled efficient execution of complex SQL queries directly on the database, which was populated using the Spotify Web API.
Project repo
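
A normalized design of the kind described above can be sketched with Python's built-in `sqlite3` module. The table and column names below are hypothetical, the rows are made-up placeholders rather than real chart data, and the actual project schema and database engine may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE country (
    code TEXT PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE track (
    track_id TEXT PRIMARY KEY,
    title    TEXT NOT NULL,
    artist   TEXT NOT NULL
);
-- One row per (country, chart position); track details live only in `track`.
CREATE TABLE chart_entry (
    country_code TEXT NOT NULL REFERENCES country(code),
    track_id     TEXT NOT NULL REFERENCES track(track_id),
    position     INTEGER NOT NULL CHECK (position BETWEEN 1 AND 50),
    PRIMARY KEY (country_code, position)
);
""")

# Placeholder rows for illustration only.
conn.executemany("INSERT INTO country VALUES (?, ?)",
                 [("US", "United States"), ("IN", "India")])
conn.executemany("INSERT INTO track VALUES (?, ?, ?)",
                 [("t1", "Song A", "Artist X"), ("t2", "Song B", "Artist Y")])
conn.executemany("INSERT INTO chart_entry VALUES (?, ?, ?)",
                 [("US", "t1", 1), ("US", "t2", 2), ("IN", "t1", 1)])

# Example query: in how many countries does each track chart?
rows = conn.execute("""
    SELECT t.title, COUNT(DISTINCT c.country_code) AS n_countries
    FROM chart_entry AS c
    JOIN track AS t ON t.track_id = c.track_id
    GROUP BY t.track_id, t.title
    ORDER BY n_countries DESC, t.title
""").fetchall()
```

Because each track is stored once and referenced by key, a cross-country question like this reduces to a single join and aggregation, and the foreign keys reject chart entries that point at an unknown country or track.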

3. Statistical Analysis

During my first semester at UB (February 2023 - May 2023), I worked on the group project and presentation required for EAS 508 - Statistical Learning and Data Mining I. We were assigned 2 projects, on Regression Analysis and Classification Analysis. The aim of these projects was to implement various Statistical Learning techniques in their respective scopes and analyze the results to get a better understanding of these algorithms. In addition to these projects, we chose "Convolutional Neural Networks and Computer Vision" as our presentation topic, for which we scored 110% marks.
The project and presentation reports can be found here.

4. Clustering and Time Series Analysis

I worked on the group project and presentation required for EAS 509 - Statistical Learning and Data Mining II. We completed 2 projects, on Clustering Analysis and Time Series Analysis. For the clustering analysis, we used the Land Mine dataset to derive insights that can aid in data annotation for multiclass classification. For the Time Series project, we were assigned Oil Sales data, which we used to perform Time Series Analysis and Forecasting using ARIMA and ETS models. In addition to these projects, we chose "The Birth and Rise of Generative AI" as our presentation topic, for which we scored 120% marks and were invited to present in multiple courses.
The project and presentation reports can be found here.

5. Titanic Survival Prediction and Analysis

This was my group project for EAS 503 - Introduction to Programming and Databases. The aim of the project was to apply the knowledge acquired in the course to real-life data and to experiment with Python and its libraries to understand them better. Using the Kaggle dataset on the sinking of the RMS Titanic, we analyzed survival patterns based on socioeconomic class, age, and gender, and created prediction models to see how well these models capture the aforementioned patterns.
Project Notebook

6. Breast Cancer Classification

In January 2021, I started working on my course project for IEE 03 - Artificial Neural Networks with my classmate Aman Arora.
The project aimed to develop a classifier that labels breast tumors as Malignant (cancerous) or Benign (non-cancerous) using features obtained from several cell images. The dataset we used for this purpose was the Breast Cancer Wisconsin dataset from scikit-learn. The classification was carried out by three models: a Support Vector Machine (SVM), a Neural Network (with Particle Swarm Optimizer), and a Neural Network (with Gradient Descent). The main objective was to compare these three models and find the most suitable one.
For more information, refer to the Project Report or visit the Project Repository.

Capstone Projects

1. Graph Neural Networks and Large Language Models

I selected "Graph Neural Networks and Large Language Models: A Literature Review" as the topic of my capstone essay for the MS in Data Science at UB.
The main objectives of this literature review were:

  1. Understand the foundational concepts of GNNs and LLMs.
  2. Explore the open problems in the research of GNNs and LLMs.
  3. Explore the possibility of reducing hallucinations in LLMs using the GNN representations as a knowledge map.

Upon submission of my capstone essay, I received 100% marks for the task and was awarded my degree.

2. Production of Sustainable Aviation Fuel

As a part of the B.Tech. curriculum, I was teamed with 4 batchmates to pursue this project for 2 semesters (August 2021 - April 2022) under the mentorship of Dr. PK Jha, Associate Professor, Chemical Department, IIT Roorkee. The aim of the project was to design a production plant that optimizes the production of Sustainable Aviation Fuel in India.
After working for more than 8 months as a team, we completed the project and passed the academic requirements with flying colors. We were able to satisfy our reviewers in 6 reviews, and in our final B.Tech. Project Presentation the panel found our work reasonable and our answers convincing.
Above all, I found that this project not only expanded my knowledge of Chemical Engineering but also rekindled my spirit of research and camaraderie.
